diff --git a/about.html b/about.html index 0ebe1d0..ff3daca 100644 --- a/about.html +++ b/about.html @@ -1,18 +1,16 @@ -
The content of Dive Into Python 3 is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
The chardet library referenced in Case study: porting chardet to Python 3 is licensed under the LGPL 2.1 or later. All other example code is licensed under the MIT license. Full licensing terms are included in each source code file.
diff --git a/case-study-porting-chardet-to-python-3.html b/case-study-porting-chardet-to-python-3.html
index 33aac87..d24d831 100644
--- a/case-study-porting-chardet-to-python-3.html
+++ b/case-study-porting-chardet-to-python-3.html
@@ -1,19 +1,21 @@
-
chardet to Python 3❝ Words, words. They’re all we have to go on. ❞
— Rosencrantz and Guildenstern are Dead @@ -49,7 +51,7 @@ body{counter-reset:h1 20}Summary Diving in
-Unknown or incorrect character encoding is the #1 cause of gibberish text on the web, in your inbox, and indeed across every computer system ever written. In Chapter 3, I talked about the history of character encoding and the creation of Unicode, the “one encoding to rule them all.” I’d love it if I never had to see a gibberish character on a web page again, because all authoring systems stored accurate encoding information, all transfer protocols were Unicode-aware, and every system that handled text maintained perfect fidelity when converting between encodings. +
Unknown or incorrect character encoding is the #1 cause of gibberish text on the web, in your inbox, and indeed across every computer system ever written. In Chapter 3, I talked about the history of character encoding and the creation of Unicode, the “one encoding to rule them all.” I’d love it if I never had to see a gibberish character on a web page again, because all authoring systems stored accurate encoding information, all transfer protocols were Unicode-aware, and every system that handled text maintained perfect fidelity when converting between encodings.
I’d also like a pony.
A Unicode pony.
A Unipony, as it were. @@ -98,8 +100,8 @@ body{counter-reset:h1 20}
We’re going to migrate the
chardet module from Python 2 to Python 3. Python 3 comes with a utility script called 2to3, which takes your actual Python 2 source code as input and auto-converts as much as it can to Python 3. In some cases this is easy (a function was renamed or moved to a different module), but in other cases it can get pretty complex. To get a sense of all that it can do, refer to the appendix, Porting code to Python 3 with 2to3. In this chapter, we’ll start by running 2to3 on the chardet package, but as you’ll see, there will still be a lot of work to do after the automated tools have performed their magic.
The main chardet package is split across several different files, all in the same directory. The 2to3 script makes it easy to convert multiple files at once: just pass a directory as a command line argument, and 2to3 will convert each of the files in turn. -
C:\home\chardet> python c:\Python30\Tools\Scripts\2to3.py -w chardet\ +C:\home\chardet> python c:\Python30\Tools\Scripts\2to3.py -w chardet\ RefactoringTool: Skipping implicit fixer: buffer RefactoringTool: Skipping implicit fixer: idioms RefactoringTool: Skipping implicit fixer: set_literal @@ -566,8 +568,8 @@ RefactoringTool: chardet\sjisprober.py RefactoringTool: chardet\universaldetector.py RefactoringTool: chardet\utf8prober.pyNow run the
2to3script on the testing harness,test.py. -C:\home\chardet> python c:\Python30\Tools\Scripts\2to3.py -w test.py +C:\home\chardet> python c:\Python30\Tools\Scripts\2to3.py -w test.py RefactoringTool: Skipping implicit fixer: buffer RefactoringTool: Skipping implicit fixer: idioms RefactoringTool: Skipping implicit fixer: set_literal @@ -602,8 +604,8 @@ RefactoringTool: test.pyFixing what
2to3can’t
Falseis invalid syntaxNow for the real test: running the test harness against the test suite. Since the test suite is designed to cover all the possible code paths, it’s a good way to test our ported code to make sure there aren’t any bugs lurking anywhere. -
C:\home\chardet> python test.py tests\*\* +C:\home\chardet> python test.py tests\*\* Traceback (most recent call last): File "test.py", line 1, in <module> from chardet.universaldetector import UniversalDetector @@ -612,7 +614,7 @@ RefactoringTool: test.py^ SyntaxError: invalid syntaxHmm, a small snag. In Python 3,
Falseis a reserved word, so you can’t use it as a variable name. Let’s look atconstants.pyto see where it’s defined. Here’s the original version fromconstants.py, before the2to3script changed it: -import __builtin__ if not hasattr(__builtin__, 'False'): False = 0 @@ -629,8 +631,8 @@ else:Ah, wasn’t that satisfying? The code is shorter and more readable already.
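If you want to see this snag in isolation, here is a minimal sketch. It is not part of chardet, and the `legacy_snippet` name is made up for illustration. It assumes Python 3 and simply asks the compiler whether the Python 2 style fallback assignment is still legal.

```python
import keyword

# Hypothetical one-line stand-in for the Python 2 fallback in constants.py.
legacy_snippet = "False = 0"

try:
    compile(legacy_snippet, "<legacy>", "exec")
    outcome = "compiled"
except SyntaxError:
    # Python 3 refuses at compile time: False is a keyword, not a name.
    outcome = "SyntaxError"

print(outcome)
print(keyword.iskeyword("False"))
```

Because the failure happens at compile time, merely importing a module that contains such an assignment is enough to trigger it, which matches the traceback pointing at the import on line 1 of test.py.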
No module named
constantsTime to run
test.pyagain and see how far it gets. -C:\home\chardet> python test.py tests\*\* +C:\home\chardet> python test.py tests\*\* Traceback (most recent call last): File "test.py", line 1, in <module> from chardet.universaldetector import UniversalDetector @@ -649,8 +651,8 @@ import sysOnward!
Name 'file' is not defined
And here we go again, running
-test.pyto try to execute our test cases…C:\home\chardet> python test.py tests\*\* +C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml Traceback (most recent call last): File "test.py", line 9, in <module> @@ -662,8 +664,8 @@ NameError: name 'file' is not definedAnd that’s all I have to say about that.
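The traceback points at the built-in file(), which no longer exists in Python 3. Here is a minimal sketch of the situation, assuming the replacement is open() (which is what the harness calls in the listings later in this chapter). The temporary file and its name are hypothetical.

```python
import builtins
import os
import tempfile

# Python 3 has no built-in named file; open() returns the file object instead.
file_builtin_exists = hasattr(builtins, "file")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")  # hypothetical file name
    with open(path, "wb") as handle:
        handle.write(b"<feed/>\n")
    with open(path, "rb") as handle:
        first_line = handle.readline()

print(file_builtin_exists, first_line)
```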
Can’t use a string pattern on a bytes-like object
Now things are starting to get interesting. And by “interesting,” I mean “confusing as all hell.” -
C:\home\chardet> python test.py tests\*\* +C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml Traceback (most recent call last): File "test.py", line 10, in <module> @@ -673,14 +675,14 @@ NameError: name 'file' is not definedTypeError: can't use a string pattern on a bytes-like object
To debug this, let’s see what self._highBitDetector is. It’s defined in the __init__ method of the UniversalDetector class: -
class UniversalDetector:
    def __init__(self):
        self._highBitDetector = re.compile(r'[\x80-\xFF]')
This pre-compiles a regular expression designed to find non-ASCII characters in the range 128–255 (0x80–0xFF). Wait, that’s not quite right; I need to be more precise with my terminology. This pattern is designed to find non-ASCII bytes in the range 128–255.
And therein lies the problem.
In Python 2, a string was an array of bytes whose character encoding was tracked separately. If you wanted Python 2 to keep track of the character encoding, you had to use a Unicode string (
u'') instead. But in Python 3, a string is always what Python 2 called a Unicode string — that is, an array of Unicode characters (of possibly varying byte lengths). Since this regular expression is defined by a string pattern, it can only be used to search a string — again, an array of characters. But what we’re searching is not a string, it’s a byte array. Looking at the traceback, this error occurred inuniversaldetector.py: -if self._mInputState == ePureAscii: if self._highBitDetector.search(aBuf):def feed(self, aBuf): . . @@ -688,7 +690,7 @@ TypeError: can't use a string pattern on a bytes-like objectAnd what is aBuf? Let’s backtrack further to a place that calls
UniversalDetector.feed(). One place that calls it is the test harness,test.py. -u = UniversalDetector() . . @@ -698,7 +700,7 @@ for line in open(f, 'rb'):And here we find our answer: in the
UniversalDetector.feed() method, aBuf is a line read from a file on disk. Look carefully at the parameters used to open the file: 'rb'. 'r' is for “read”; OK, big deal, we’re reading the file. Ah, but 'b' is for “binary.” Without the 'b' flag, this for loop would read the file, line by line, and convert each line into a string (an array of Unicode characters) according to the system default character encoding. (You could override the system encoding with another parameter to open(), but never mind that for now.) But with the 'b' flag, this for loop reads the file, line by line, and stores each line exactly as it appears in the file, as an array of bytes. That byte array gets passed to UniversalDetector.feed(), and eventually gets passed to the pre-compiled regular expression, self._highBitDetector, to search for high-bit… characters. But we don’t have characters; we have bytes. Oops.
What we need this regular expression to search is not an array of characters, but an array of bytes.
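The difference between the two open modes can be reproduced outside chardet. This is a minimal sketch, assuming Python 3; the temporary file and its contents are hypothetical, and only the string pattern is taken from the listing above.

```python
import os
import re
import tempfile

# String pattern, as in the original UniversalDetector.__init__().
high_bit = re.compile(r'[\x80-\xFF]')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file
    with open(path, "wb") as f:
        f.write("caf\u00e9\n".encode("utf-8"))
    with open(path, "r", encoding="utf-8") as f:
        text_line = f.readline()   # decoded on the way in: a str
    with open(path, "rb") as f:
        raw_line = f.readline()    # stored exactly as on disk: bytes

text_match = high_bit.search(text_line)  # str pattern on a str: fine

try:
    high_bit.search(raw_line)            # str pattern on bytes
    bytes_search_failed = False
except TypeError:
    bytes_search_failed = True

print(type(text_line).__name__, type(raw_line).__name__, bytes_search_failed)
```

Same file, same pattern; the only thing that changed is the 'b' in the mode, and that is enough to turn a working search into a TypeError.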
Once you realize that, the solution is not difficult. Regular expressions defined with strings can search strings. Regular expressions defined with byte arrays can search byte arrays. To define a byte array pattern, we simply change the type of the argument we use to define the regular expression to a byte array. (There is one other case of this same problem, on the very next line.) -
class UniversalDetector: def __init__(self):- self._highBitDetector = re.compile(b'[\x80-\xFF]')@@ -709,7 +711,7 @@ for line in open(f, 'rb'): self._mCharSetProbers = [] self.reset()Searching the entire codebase for other uses of the
remodule turns up two more instances, incharsetprober.py. Again, the code is defining regular expressions as strings but executing them on aBuf, which is a byte array. The solution is the same: define the regular expression patterns as byte arrays. -class CharSetProber: . . @@ -726,8 +728,8 @@ for line in open(f, 'rb'):Can't convert
'bytes'object tostrimplicitlyCuriouser and curiouser… -
C:\home\chardet> python test.py tests\*\* +C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml Traceback (most recent call last): File "test.py", line 10, in <module> @@ -736,12 +738,12 @@ for line in open(f, 'rb'): elif (self._mInputState == ePureAscii) and self._escDetector.search(self._mLastChar + aBuf): TypeError: Can't convert 'bytes' object to str implicitlyThere's an unfortunate clash of coding style and Python interpreter here. The
TypeErrorcould be anywhere on that line, but the traceback doesn't tell you exactly where it is. It could be in the first conditional or the second, and the traceback would look the same. To narrow it down, you should split the line in half, like this: -elif (self._mInputState == ePureAscii) and \ self._escDetector.search(self._mLastChar + aBuf):And re-run the test:
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml Traceback (most recent call last): File "test.py", line 10, in <module> @@ -751,7 +753,7 @@ TypeError: Can't convert 'bytes' object to str implicitlyTypeError: Can't convert 'bytes' object to str implicitlyAha! The problem was not in the first conditional (
self._mInputState == ePureAscii) but in the second one. So what could cause a TypeError there? Perhaps you're thinking that the search() method is expecting a value of a different type, but that wouldn't generate this traceback. Python functions can take any value; if you pass the right number of arguments, the function will execute. It may crash if you pass it a value of a different type than it's expecting, but if that happened, the traceback would point to somewhere inside the function. But this traceback says it never got as far as calling the search() method. So the problem must be in that + operation, as it's trying to construct the value that it will eventually pass to the search() method.
We know from previous debugging that aBuf is a byte array. So what is
self._mLastChar? It's an instance variable, defined in thereset()method, which is actually called from the__init__()method. -self._mLastChar = ''class UniversalDetector: def __init__(self): self._highBitDetector = re.compile(b'[\x80-\xFF]') @@ -769,7 +771,7 @@ TypeError: Can't convert 'bytes' object to str implicitlyAnd now we have our answer. Do you see it? self._mLastChar is a string, but aBuf is a byte array. And you can't concatenate a string to a byte array — not even a zero-length string.
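The failing + can be reproduced without any of chardet's machinery. A minimal sketch, assuming Python 3; `a_buf` and `last_char` are illustrative stand-ins for aBuf and self._mLastChar, not the library's own variables.

```python
a_buf = b'\xef\xbb\xbf'   # bytes, like a line read with 'rb'
last_char = ''            # str, like the value assigned in reset()

try:
    last_char + a_buf     # str + bytes: refused, even though the str is empty
    mixed_concat_failed = False
except TypeError:
    mixed_concat_failed = True

# Concatenation only works when both operands are the same kind of sequence.
same_type_results = ('' + 'abc', b'' + a_buf)

print(mixed_concat_failed, same_type_results)
```

The exact wording of the TypeError message varies between Python 3 releases, so the sketch checks only the exception type.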
So what is self._mLastChar anyway? The answer is in the
feed() method, just a few lines down from where the traceback occurred. -self._mLastChar = aBuf[-1]if self._mInputState == ePureAscii: if self._highBitDetector.search(aBuf): self._mInputState = eHighbyte @@ -779,7 +781,7 @@ TypeError: Can't convert 'bytes' object to str implicitly
The calling function calls this
feed()method over and over again with a few bytes at a time. The method processes the bytes it was given (passed in as aBuf), then stores the last byte in self._mLastChar in case it's needed during the next call. (In a multi-byte encoding, thefeed()method might get called with half of a character, then called again with the other half.) But because aBuf is now a byte array instead of a string, self._mLastChar needs to be a byte array as well. Thus: -def reset(self): . . @@ -787,7 +789,7 @@ TypeError: Can't convert 'bytes' object to str implicitly- self._mLastChar = ''+ self._mLastChar = b''Searching the entire codebase for
"mLastChar"turns up a similar problem inmbcharsetprober.py, but instead of tracking the last character, it tracks the last two characters. TheMultiByteCharSetProberclass uses a list of 1-character strings to track the last two characters; in Python 3, it needs to use a list of integers. -+ self._mLastChar = [0, 0]class MultiByteCharSetProber(CharSetProber): def __init__(self): @@ -807,8 +809,8 @@ TypeError: Can't convert 'bytes' object to str implicitlyUnsupported operand type(s) for +:
'int'and'bytes'I have good news, and I have bad news. The good news is we're making progress… -
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml Traceback (most recent call last): File "test.py", line 10, in <module> @@ -819,7 +821,7 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'…The bad news is it doesn't always feel like progress.
But this is progress! Really! Even though the traceback calls out the same line of code, it's a different error than it used to be. Progress! So what's the problem now? The last time I checked, this line of code didn't try to concatenate an
intwith a byte array (bytes). In fact, you just spent a lot of time ensuring that self._mLastChar was a byte array. How did it turn into anint?The answer lies not in the previous lines of code, but in the following lines. -
self._mLastChar = aBuf[-1]if self._mInputState == ePureAscii: if self._highBitDetector.search(aBuf): self._mInputState = eHighbyte @@ -829,24 +831,24 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'This error doesn't occur the first time the
feed() method gets called; it occurs the second time, after self._mLastChar has been set to the last byte of aBuf. Well, what's the problem with that? Getting a single element from a byte array yields an integer, not a byte array. To see the difference, follow me to the interactive shell:
->>> aBuf = b'\xEF\xBB\xBF' ①
->>> len(aBuf)
+>>> aBuf = b'\xEF\xBB\xBF' ①
+>>> len(aBuf)
 3
->>> mLastChar = aBuf[-1]
->>> mLastChar ②
+>>> mLastChar = aBuf[-1]
+>>> mLastChar ②
 191
->>> type(mLastChar) ③
+>>> type(mLastChar) ③
 <class 'int'>
->>> mLastChar + aBuf ④
+>>> mLastChar + aBuf ④
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 TypeError: unsupported operand type(s) for +: 'int' and 'bytes'
->>> mLastChar = aBuf[-1:] ⑤
->>> mLastChar
+>>> mLastChar = aBuf[-1:] ⑤
+>>> mLastChar
 b'\xbf'
->>> mLastChar + aBuf ⑥
+>>> mLastChar + aBuf ⑥
 b'\xbf\xef\xbb\xbf'
-
- Define a byte array of length 3. @@ -864,8 +866,8 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes' + self._mLastChar = aBuf[-1:]
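To see why the slice matters across repeated calls, here is a minimal sketch of the pattern. `LastByteTracker` is a hypothetical stand-in written for this illustration, not chardet's UniversalDetector; it only mimics the "remember the last byte between calls" idea.

```python
class LastByteTracker:
    """Hypothetical stand-in that remembers the last byte between feed() calls."""

    def __init__(self):
        self.last = b''   # bytes from the start, so + always works

    def feed(self, chunk):
        joined = self.last + chunk   # bytes + bytes
        self.last = chunk[-1:]       # a slice stays bytes; chunk[-1] would be an int
        return joined


tracker = LastByteTracker()
first = tracker.feed(b'\xef\xbb')    # first half of a three-byte sequence
second = tracker.feed(b'\xbf<')      # the rest arrives in the next call

print(first, second, tracker.last)
```

A side benefit of the slice: on an empty chunk, chunk[-1:] quietly yields b'', whereas chunk[-1] would raise an IndexError.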
ord()expected string of length 1, butintfoundTired yet? You're almost there… -
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0 tests\Big5\0804.blogspot.com.xml Traceback (most recent call last): @@ -881,28 +883,28 @@ tests\Big5\0804.blogspot.com.xml byteCls = self._mModel['classTable'][ord(c)] TypeError: ord() expected string of length 1, but int foundOK, so c is an
int, but theord()function was expecting a 1-character string. Fair enough. Where is c defined? -# codingstatemachine.py def next_state(self, c): # for each byte we get its class # if it is first byte, we also get byte length byteCls = self._mModel['classTable'][ord(c)]That's no help; it's just passed into the function. Let's pop the stack. -
# utf8prober.py def feed(self, aBuf): for c in aBuf: codingState = self._mCodingSM.next_state(c)And now we have the answer. Do you see it? In Python 2, aBuf was a string, so c was a 1-character string. (That's what you get when you iterate over a string — all the characters, one by one.) But now, aBuf is a byte array, so c is an
int, not a 1-character string. In other words, there's no need to call theord()function because c is already anint!Thus: -
def next_state(self, c): # for each byte we get its class # if it is first byte, we also get byte length- byteCls = self._mModel['classTable'][ord(c)]+ byteCls = self._mModel['classTable'][c]Searching the entire codebase for instances of
"ord(c)"uncovers similar problems insbcharsetprober.py… -# sbcharsetprober.py def feed(self, aBuf): if not self._mModel['keepEnglishLetter']: @@ -913,14 +915,14 @@ def feed(self, aBuf): for c in aBuf: order = self._mModel['charToOrderMap'][ord(c)]…and
latin1prober.py… -# latin1prober.py def feed(self, aBuf): aBuf = self.filter_with_english_letters(aBuf) for c in aBuf: charClass = Latin1_CharToClass[ord(c)]c is iterating over aBuf, which means it is an integer, not a 1-character string. The solution is the same: change
ord(c)to just plainc. -# sbcharsetprober.py def feed(self, aBuf): if not self._mModel['keepEnglishLetter']: @@ -941,8 +943,8 @@ def feed(self, aBuf):Unorderable types:
int()>=str()Let's go again. -
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0 tests\Big5\0804.blogspot.com.xml Traceback (most recent call last): @@ -961,7 +963,7 @@ tests\Big5\0804.blogspot.com.xml TypeError: unorderable types: int() >= str()Did you notice? This time around, the code passed the first test case (
tests\ascii\howto.diveintomark.org.xml). You're making real progress here.So what's this all about? “Unorderable types”? Once again, the difference between byte arrays and strings is rearing its ugly head. Take a look at the code: -
else: charLen = 1class SJISContextAnalysis(JapaneseContextAnalysis): def get_order(self, aStr): if not aStr: return -1, 1 @@ -972,7 +974,7 @@ TypeError: unorderable types: int() >= str()And where does aStr come from? Let's pop the stack: -
def feed(self, aBuf, aLen): . . @@ -983,7 +985,7 @@ TypeError: unorderable types: int() >= str()Oh look, it's our old friend, aBuf. As you might have guessed from every other issue we've encountered in this chapter, aBuf is a byte array. Here, the
feed() method isn't just passing it on wholesale; it's slicing it. But as you saw earlier in this chapter, slicing a byte array returns a byte array, so the aStr parameter that gets passed to the get_order() method is still a byte array.
And what is this code trying to do with aStr? It's taking the first element of the byte array and comparing it to a string of length 1. In Python 2, that worked, because aStr and aBuf were strings, and aStr[0] would be a string, and you can compare strings with >= and <=. But in Python 3, aStr and aBuf are byte arrays, aStr[0] is an integer, and you can't use ordering comparisons like >= and <= between integers and strings without explicitly coercing one of them.
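The comparison failure is easy to reproduce on its own. A minimal sketch, assuming Python 3; the two-byte value in `a_str` is made up for illustration, and the bounds 0x81 and 0x9F are borrowed from the comparison shown in the chardistribution.py traceback later in this chapter.

```python
a_str = b'\x85\x40'   # hypothetical two-byte slice of a buffer
first = a_str[0]      # indexing bytes yields an int (here 0x85, i.e. 133)

try:
    first >= '\x81'   # int >= str: Python 3 refuses to order them
    ordering_failed = False
except TypeError:
    ordering_failed = True

# Equality between an int and a str does not raise; it is simply False.
equal_to_str = (first == '\x85')

# Comparing against integer constants works as intended.
in_range = 0x81 <= first <= 0x9F

print(ordering_failed, equal_to_str, in_range)
```

The quiet False from the equality check is worth keeping in mind: an == against a 1-character string would not crash after porting, it would just never match.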
In this case, there's no need to make the code more complicated by adding an explicit coercion. aStr[0] yields an integer; the things you're comparing to are all constants. Let's change them from 1-character strings to integers. -
return -1, charLenclass SJISContextAnalysis(JapaneseContextAnalysis): def get_order(self, aStr): if not aStr: return -1, 1 @@ -1037,8 +1039,8 @@ TypeError: unorderable types: int() >= str()Searching the entire codebase for occurrences of the
ord() function uncovers the same problem in chardistribution.py:
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0 tests\Big5\0804.blogspot.com.xml Traceback (most recent call last): @@ -1056,7 +1058,7 @@ tests\Big5\0804.blogspot.com.xml if (aStr[0] >= '\x81') and (aStr[0] <= '\x9F'): TypeError: unorderable types: int() >= str()The fix is the same: -
return -1class EUCTWDistributionAnalysis(CharDistributionAnalysis): def __init__(self): CharDistributionAnalysis.__init__(self) @@ -1163,8 +1165,8 @@ TypeError: unorderable types: int() >= str()Global name
'reduce'is not definedOnce more into the breach… -
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0 tests\Big5\0804.blogspot.com.xml Traceback (most recent call last): @@ -1177,14 +1179,14 @@ tests\Big5\0804.blogspot.com.xml NameError: global name 'reduce' is not definedAccording to the official What's New In Python 3.0 guide, the
reduce() function has been moved out of the global namespace and into the functools module. Quoting the guide: "Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable."
OK then, let's refactor it to use a
forloop. -def get_confidence(self): if self.get_state() == constants.eNotMe: return 0.01 total = reduce(operator.add, self._mFreqCounter)The
reduce()function takes two arguments — a function and a list (strictly speaking, any iterable object will do) — and applies the function cumulatively to each item of the list. In other words, this is a fancy and roundabout way of adding up all the items in a list and returning the result. It looks much more readable as aforloop. -+ for frequency in self._mFreqCounter: + total += frequencydef get_confidence(self): if self.get_state() == constants.eNotMe: return 0.01 @@ -1194,8 +1196,8 @@ NameError: global name 'reduce' is not definedI CAN HAZ TESTZ? -
C:\home\chardet> python test.py tests\*\* tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0 tests\Big5\0804.blogspot.com.xml Big5 with confidence 0.99 tests\Big5\blog.worren.net.xml Big5 with confidence 0.99 @@ -1239,6 +1241,6 @@ tests\EUC-JP\arclamp.jp.xml EUC-JP with confide- You need to understand your program. Thoroughly. Preferably because you wrote it, but at the very least, you need to be comfortable with all its quirks and musty corners. The bugs are everywhere.
- Test cases are essential. Don't port anything without them. Don't even try. The only reason I have any confidence at all that
chardetworks in Python 3 is because I had a test suite that exercised every line of code in the entire library. I never would have found half of these problems with manual spot-checking.© 2001–4, 2009 ℳark Pilgrim • open standards • open content • open source +
© 2001–4, 2009 ℳark Pilgrim • open standards • open content • open source diff --git a/dip2 b/dip2 index f7572d8..3c911c4 100644 --- a/dip2 +++ b/dip2 @@ -254,7 +254,7 @@ several months behind in updating their ActivePython installer when new version PythonWin 2.2.2 (#37, Nov 26 2002, 10:24:37) [MSC 32 bit (Intel)] on win32. Portions Copyright 1994-2001 Mark Hammond (mhammond@skippinet.com.au) - see 'Help/About PythonWin' for further copyright information. ->>> +>>>
Procedure 1.2. Option 2: Installing Python from Python.org
@@ -289,7 +289,7 @@ Type "copyright", "credits" or "license()" for more information. **************************************************************** IDLE 1.0 ->>> +>>>
1.3. Python on Mac OS X
On Mac OS X, you have two choices for installing Python: install it, or don't install it. You probably want to install it.
Mac OS X 10.2 and later comes with a command-line version of Python preinstalled. If you are comfortable with the command line, you can use this version for the first third of the book. However, @@ -316,12 +316,12 @@ interactive shell.
Try it out:
Welcome to Darwin! -[localhost:~] you% python +[localhost:~] you% python Python 2.2 (#1, 07/14/02, 23:25:09) [GCC Apple cpp-precomp 6.14] on darwin Type "help", "copyright", "credits", or "license" for more information. ->>> [press Ctrl+D to get back to the command prompt] -[localhost:~] you% +>>> [press Ctrl+D to get back to the command prompt] +[localhost:~] you%Procedure 1.4. Installing the Latest Version of Python on Mac OS X
Follow these steps to download and install the latest version of Python: @@ -358,21 +358,21 @@ Window->Python Interactive (Cmd-0). The opening window [GCC 3.1 20020420 (prerelease)] Type "copyright", "credits" or "license" for more information. MacPython IDE 1.0.1 ->>> +>>>
Note that once you install the latest version, the pre-installed version is still present. If you are running scripts from the command line, you need to be aware which version of Python you are using.
Example 1.1. Two versions of Python
-[localhost:~] you% python +[localhost:~] you% python Python 2.2 (#1, 07/14/02, 23:25:09) [GCC Apple cpp-precomp 6.14] on darwin Type "help", "copyright", "credits", or "license" for more information. ->>> [press Ctrl+D to get back to the command prompt] -[localhost:~] you% /usr/local/bin/python +>>> [press Ctrl+D to get back to the command prompt] +[localhost:~] you% /usr/local/bin/python Python 2.3 (#2, Jul 30 2003, 11:45:28) [GCC 3.1 20020420 (prerelease)] on darwin Type "help", "copyright", "credits", or "license" for more information. ->>> [press Ctrl+D to get back to the command prompt] -[localhost:~] you% +>>> [press Ctrl+D to get back to the command prompt] +[localhost:~] you%1.4. Python on Mac OS 9
Mac OS 9 does not come with any version of Python, but installation is very simple, and there is only one choice.
@@ -407,34 +407,34 @@ Window->Python Interactive (Cmd-0). You'll see a scree [GCC 3.1 20020420 (prerelease)] Type "copyright", "credits" or "license" for more information. MacPython IDE 1.0.1 ->>> +>>>1.5. Python on RedHat Linux
Installing under UNIX-compatible operating systems such as Linux is easy if you're willing to install a binary package. Pre-built binary packages are available for most popular Linux distributions. Or you can always compile from source.
Download the latest Python RPM by going to http://www.python.org/ftp/python/ and selecting the highest version number listed, then selecting the
rpms/directory within that. Then download the RPM with the highest version number. You can install it with the rpm command, as shown here:Example 1.2. Installing on RedHat Linux 9
-localhost:~$ su - -Password: [enter your root password] -[root@localhost root]# wget http://python.org/ftp/python/2.3/rpms/redhat-9/python2.3-2.3-5pydotorg.i386.rpm +localhost:~$ su - +Password: [enter your root password] +[root@localhost root]# wget http://python.org/ftp/python/2.3/rpms/redhat-9/python2.3-2.3-5pydotorg.i386.rpm Resolving python.org... done. Connecting to python.org[194.109.137.226]:80... connected. HTTP request sent, awaiting response... 200 OK Length: 7,495,111 [application/octet-stream] ... -[root@localhost root]# rpm -Uvh python2.3-2.3-5pydotorg.i386.rpm +[root@localhost root]# rpm -Uvh python2.3-2.3-5pydotorg.i386.rpm Preparing... ########################################### [100%] 1:python2.3 ########################################### [100%] -[root@localhost root]# python ① +[root@localhost root]# python ① Python 2.2.2 (#1, Feb 24 2003, 19:13:11) [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-4)] on linux2 Type "help", "copyright", "credits", or "license" for more information. ->>> [press Ctrl+D to exit] -[root@localhost root]# python2.3 ② +>>> [press Ctrl+D to exit] +[root@localhost root]# python2.3 ② Python 2.3 (#1, Sep 12 2003, 10:53:56) [GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2 Type "help", "copyright", "credits", or "license" for more information. ->>> [press Ctrl+D to exit] -[root@localhost root]# which python2.3 ③ +>>> [press Ctrl+D to exit] +[root@localhost root]# which python2.3 ③ /usr/bin/python2.3@@ -444,9 +444,9 @@ Type "help", "copyright", "credits", or "license" for more information.
1.6. Python on Debian GNU/Linux
If you are lucky enough to be running Debian GNU/Linux, you install Python through the apt command.
Example 1.3. Installing on Debian GNU/Linux
-localhost:~$ su - -Password: [enter your root password] -localhost:~# apt-get install python +localhost:~$ su - +Password: [enter your root password] +localhost:~# apt-get install python Reading Package Lists... Done Building Dependency Tree... Done The following extra packages will be installed: @@ -458,7 +458,7 @@ The following NEW packages will be installed: 0 upgraded, 2 newly installed, 0 to remove and 3 not upgraded. Need to get 0B/2880kB of archives. After unpacking 9351kB of additional disk space will be used. -Do you want to continue? [Y/n] Y +Do you want to continue? [Y/n] Y Selecting previously deselected package python2.3. (Reading database ... 22848 files and directories currently installed.) Unpacking python2.3 (from .../python2.3_2.3.1-1_i386.deb) ... @@ -468,32 +468,32 @@ Setting up python (2.3.1-1) ... Setting up python2.3 (2.3.1-1) ... Compiling python modules in /usr/lib/python2.3 ... Compiling optimized python modules in /usr/lib/python2.3 ... -localhost:~# exit +localhost:~# exit logout -localhost:~$ python +localhost:~$ python Python 2.3.1 (#2, Sep 24 2003, 11:39:14) [GCC 3.3.2 20030908 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. ->>> [press Ctrl+D to exit] +>>> [press Ctrl+D to exit]1.7. Python Installation from Source
If you prefer to build from source, you can download the Python source code from http://www.python.org/ftp/python/. Select the highest version number listed, download the
.tgz file, and then do the usual configure, make, make install dance.
Example 1.4. Installing from source
-localhost:~$ su - -Password: [enter your root password] -localhost:~# wget http://www.python.org/ftp/python/2.3/Python-2.3.tgz +localhost:~$ su - +Password: [enter your root password] +localhost:~# wget http://www.python.org/ftp/python/2.3/Python-2.3.tgz Resolving www.python.org... done. Connecting to www.python.org[194.109.137.226]:80... connected. HTTP request sent, awaiting response... 200 OK Length: 8,436,880 [application/x-tar] ... -localhost:~# tar xfz Python-2.3.tgz -localhost:~# cd Python-2.3 -localhost:~/Python-2.3# ./configure +localhost:~# tar xfz Python-2.3.tgz +localhost:~# cd Python-2.3 +localhost:~/Python-2.3# ./configure checking MACHDEP... linux2 checking EXTRAPLATDIR... checking for --without-gcc... no ... -localhost:~/Python-2.3# make +localhost:~/Python-2.3# make gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Modules/python.o Modules/python.c gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes @@ -501,19 +501,19 @@ gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes gcc -pthread -c -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -I. -I./Include -DPy_BUILD_CORE -o Parser/grammar1.o Parser/grammar1.c ... -localhost:~/Python-2.3# make install +localhost:~/Python-2.3# make install /usr/bin/install -c python /usr/local/bin/python2.3 ... -localhost:~/Python-2.3# exit +localhost:~/Python-2.3# exit logout -localhost:~$ which python +localhost:~$ which python /usr/local/bin/python -localhost:~$ python +localhost:~$ python Python 2.3.1 (#2, Sep 24 2003, 11:39:14) [GCC 3.3.2 20030908 (Debian prerelease)] on linux2 Type "help", "copyright", "credits" or "license" for more information. ->>> [press Ctrl+D to get back to the command prompt] -localhost:~$ +>>> [press Ctrl+D to get back to the command prompt] +localhost:~$1.8. The Interactive Shell
Now that you have Python installed, what's this interactive shell thing you're running?
It's like this: Python leads a double life. It's an interpreter for scripts that you can run from the command line or run like applications, by @@ -521,13 +521,13 @@ double-clicking the scripts. But it's also an interactive shell that can evaluat This is extremely useful for debugging, quick hacking, and testing. I even know some people who use the Python interactive shell in lieu of a calculator!
Launch the Python interactive shell in whatever way works on your platform, and let's dive in with the steps shown here:
Example 1.5. First Steps in the Interactive Shell
->>> 1 + 1 ① +>>> 1 + 1 ① 2 ->>> print 'hello world' ② +>>> print 'hello world' ② hello world ->>> x = 1 ③ ->>> y = 2 ->>> x + y +>>> x = 1 ③ +>>> y = 2 +>>> x + y 3@@ -575,8 +575,8 @@ if __name__ == "__main__":
Some quick observations before you get to the
Like C, Python uses == for comparison and = for assignment. Unlike C, Python does not support in-line assignment, so there's no chance of accidentally assigning the value you thought you were comparing.
So why is this particular
ifstatement a trick? Modules are objects, and all modules have a built-in attribute__name__. A module's__name__depends on how you're using the module. If youimportthe module, then__name__is the module's filename, without a directory path or file extension. But you can also run the module directly as a standalone program, in which case__name__will be a special default value,__main__. ->>> import odbchelper ->>> odbchelper.__name__+>>> import odbchelper +>>> odbchelper.__name__'odbchelper'Knowing this, you can design a test suite for your module within the module itself by putting it in this
ifstatement. When you run the module directly,__name__is__main__, so the test suite executes. When you import the module,__name__is something else, so the test suite is ignored. This makes it easier to develop and debug new modules before integrating them into a larger program.@@ -620,35 +620,35 @@ if __name__ == "__main__": a matter of style.
Third, you never declared the variable myParams, you just assigned a value to it. This is like VBScript without the
option explicit option. Luckily, unlike VBScript, Python will not allow you to reference a variable that has never been assigned a value; trying to do so will raise an exception.
3.4.1. Referencing Variables
-Example 3.18. Referencing an Unbound Variable
>>> x +Example 3.18. Referencing an Unbound Variable
>>> x Traceback (innermost last): File "<interactive input>", line 1, in ? NameError: There is no variable named 'x' ->>> x = 1 ->>> x +>>> x = 1 +>>> x 1You will thank Python for this one day.
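The same behavior can be reproduced outside the interactive shell. This is a minimal sketch, written to run on a current Python 3 interpreter (an assumption; the transcript above comes from an older session, and the exact wording of the error message differs between versions). The name never_assigned is hypothetical.

```python
def read_unbound():
    """Try to read a name that has never been assigned a value."""
    try:
        return never_assigned  # hypothetical name: no assignment exists anywhere
    except NameError as exc:
        # Python refuses to invent a value; it raises NameError instead.
        return "NameError: %s" % exc


message = read_unbound()
print(message)
```

Catching the exception here is only for demonstration; in real code a NameError usually points at a typo that should be fixed rather than handled.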
3.4.2. Assigning Multiple Values at Once
One of the cooler programming shortcuts in Python is using sequences to assign multiple values at once. -
Example 3.19. Assigning multiple values at once
>>> v = ('a', 'b', 'e') ->>> (x, y, z) = v ① ->>> x +Example 3.19. Assigning multiple values at once
>>> v = ('a', 'b', 'e') +>>> (x, y, z) = v ① +>>> x 'a' ->>> y +>>> y 'b' ->>> z +>>> z 'e'
- v is a tuple of three elements, and (x, y, z) is a tuple of three variables. Assigning one to the other assigns each of the values of v to each of the variables, in order.
This has all sorts of uses. I often want to assign names to a range of values. In C, you would use
enumand manually list each constant and its associated value, which seems especially tedious when the values are consecutive. In Python, you can use the built-inrangefunction with multi-variable assignment to quickly assign consecutive values. -Example 3.20. Assigning Consecutive Values
>>> range(7) ① +Example 3.20. Assigning Consecutive Values
>>> range(7) ① [0, 1, 2, 3, 4, 5, 6] ->>> (MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY) = range(7) ② ->>> MONDAY ③ +>>> (MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY) = range(7) ② +>>> MONDAY ③ 0 ->>> TUESDAY +>>> TUESDAY 1 ->>> SUNDAY +>>> SUNDAY 6
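The two examples above can be combined into one short script. This is a sketch that assumes a current Python 3 interpreter, where range no longer returns a list but can still be unpacked the same way. The helper unpack_two is hypothetical and only shows what happens when the number of names and values differs.

```python
v = ('a', 'b', 'e')
(x, y, z) = v  # each name receives the value in the same position

# Consecutive constants, assigned in one statement.
(MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY) = range(7)


def unpack_two(values):
    """Unpack exactly two values; return None if the count does not match."""
    try:
        first, second = values
    except ValueError:
        # Raised when there are too many or too few values for the names.
        return None
    return (first, second)


mismatch = unpack_two(v)      # three values, two names
match = unpack_two((1, 2))    # two values, two names
```

The number of names on the left must equal the number of values on the right; otherwise Python raises ValueError rather than silently dropping or padding values.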
- The built-in
rangefunction returns a list of integers. In its simplest form, it takes an upper limit and returns a zero-based list counting @@ -668,13 +668,13 @@ NameError: There is no variable named 'x'3.6. Mapping Lists
One of the most powerful features of Python is the list comprehension, which provides a compact way of mapping a list into another list by applying a function to each of the elements of the list. -
Example 3.24. Introducing List Comprehensions
>>> li = [1, 9, 8, 4] ->>> [elem*2 for elem in li] ① +Example 3.24. Introducing List Comprehensions
>>> li = [1, 9, 8, 4] +>>> [elem*2 for elem in li] ① [2, 18, 16, 8] ->>> li ② +>>> li ② [1, 9, 8, 4] ->>> li = [elem*2 for elem in li] ③ ->>> li +>>> li = [elem*2 for elem in li] ③ +>>> li [2, 18, 16, 8]
- To make sense of this, look at it from right to left. li is the list you're mapping. Python loops through li one element at a time, temporarily assigning the value of each element to the variable elem. Python then applies the function
elem*2and appends that result to the returned list. @@ -683,12 +683,12 @@ NameError: There is no variable named 'x'Here are the list comprehensions in the
buildConnectionStringfunction that you declared in Chapter 2:["%s=%s" % (k, v) for k, v in params.items()]First, notice that you're calling the
itemsfunction of the params dictionary. This function returns a list of tuples of all the data in the dictionary. -Example 3.25. The
keys,values, anditemsFunctions>>> params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"} ->>> params.keys() ① +Example 3.25. The
keys,values, anditemsFunctions>>> params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"} +>>> params.keys() ① ['server', 'uid', 'database', 'pwd'] ->>> params.values() ② +>>> params.values() ② ['mpilgrim', 'sa', 'master', 'secret'] ->>> params.items() ③ +>>> params.items() ③ [('server', 'mpilgrim'), ('uid', 'sa'), ('database', 'master'), ('pwd', 'secret')]
- The
keysmethod of a dictionary returns a list of all the keys. The list is not in the order in which the dictionary was defined @@ -697,14 +697,14 @@ NameError: There is no variable named 'x'- The
itemsmethod returns a list of tuples of the form(key, value). The list contains all the data in the dictionary.Now let's see what
buildConnectionString does. It takes a list, params.items(), and maps it to a new list by applying string formatting to each element. The new list will have the same number of elements as params.items(), but each element in the new list will be a string that contains both a key and its associated value from the params dictionary. -Example 3.26. List Comprehensions in
buildConnectionString, Step by Step>>> params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"} ->>> params.items() +Example 3.26. List Comprehensions in
buildConnectionString, Step by Step>>> params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"} +>>> params.items() [('server', 'mpilgrim'), ('uid', 'sa'), ('database', 'master'), ('pwd', 'secret')] ->>> [k for k, v in params.items()] ① +>>> [k for k, v in params.items()] ① ['server', 'uid', 'database', 'pwd'] ->>> [v for k, v in params.items()] ② +>>> [v for k, v in params.items()] ② ['mpilgrim', 'sa', 'master', 'secret'] ->>> ["%s=%s" % (k, v) for k, v in params.items()] ③ +>>> ["%s=%s" % (k, v) for k, v in params.items()] ③ ['server=mpilgrim', 'uid=sa', 'database=master', 'pwd=secret']
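The steps in this example can be put together in a small function. This is a sketch, not the book's buildConnectionString: the name build_pairs, the ";" separator, and the sorting (added only so the result is predictable) are assumptions made for illustration.

```python
def build_pairs(params):
    """Map each (key, value) tuple of a dictionary to a 'key=value' string."""
    return ["%s=%s" % (k, v) for k, v in params.items()]


params = {"server": "mpilgrim", "database": "master", "uid": "sa", "pwd": "secret"}

pairs = build_pairs(params)

# Sorting is an assumption added here to get a stable order for display.
connection_string = ";".join(sorted(pairs))
print(connection_string)
```

The mapped list always has exactly as many elements as params.items(), because a list comprehension without a filter produces one output element per input element.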
- Note that you're using two variables to iterate through the
params.items() list. This is another use of multi-variable assignment. The first element of params.items() is ('server', 'mpilgrim'), so in the first iteration of the list comprehension, k will get 'server' and v will get 'mpilgrim'. In this case, you're ignoring the value of v and only including the value of k in the returned list, so this list comprehension ends up being equivalent to params.keys(). @@ -789,9 +789,9 @@ if __name__ == "__main__": ④ ⑤ if statements use == for comparison, and parentheses are not required.
The
infofunction is designed to be used by you, the programmer, while working in the Python IDE. It takes any object that has functions or methods (like a module, which has functions, or a list, which has methods) and prints out the functions and theirdocstrings. -Example 4.2. Sample Usage of
apihelper.py>>> from apihelper import info ->>> li = [] ->>> info(li) +Example 4.2. Sample Usage of
apihelper.py>>> from apihelper import info +>>> li = [] +>>> info(li) append L.append(object) -- append object to end count L.count(value) -> integer -- return number of occurrences of value extend L.extend(list) -- extend list by appending list elements @@ -801,12 +801,12 @@ pop L.pop([index]) -> item -- remove and return item at index (default la remove L.remove(value) -- remove first occurrence of value reverse L.reverse() -- reverse *IN PLACE* sort L.sort([cmpfunc]) -- sort *IN PLACE*; if given, cmpfunc(x, y) -> -1, 0, 1By default the output is formatted to be easy to read. Multi-line
docstrings are collapsed into a single long line, but this option can be changed by specifying0for thecollapseargument. If the function names are longer than 10 characters, you can specify a larger value for thespacingargument to make the output easier to read. -Example 4.3. Advanced Usage of
apihelper.py>>> import odbchelper ->>> info(odbchelper) +Example 4.3. Advanced Usage of
apihelper.py>>> import odbchelper +>>> info(odbchelper) buildConnectionString Build a connection string from a dictionary Returns string. ->>> info(odbchelper, 30) +>>> info(odbchelper, 30) buildConnectionString Build a connection string from a dictionary Returns string. ->>> info(odbchelper, 30, 0) +>>> info(odbchelper, 30, 0) buildConnectionString Build a connection string from a dictionary Returns string. @@ -846,16 +846,16 @@ time, you'll call functions the “normal” way, but you always have th cough, Visual Basic).4.3.1. The
typeFunctionThe
typefunction returns the datatype of any arbitrary object. The possible types are listed in thetypesmodule. This is useful for helper functions that can handle several types of data. -Example 4.5. Introducing
type>>> type(1) ① +Example 4.5. Introducing
type>>> type(1) ① <type 'int'> ->>> li = [] ->>> type(li) ② +>>> li = [] +>>> type(li) ② <type 'list'> ->>> import odbchelper ->>> type(odbchelper) ③ +>>> import odbchelper +>>> type(odbchelper) ③ <type 'module'> ->>> import types ④ ->>> type(odbchelper) == types.ModuleType +>>> import types ④ +>>> type(odbchelper) == types.ModuleType True
typetakes anything -- and I mean anything -- and returns its datatype. Integers, strings, lists, dictionaries, tuples, functions, @@ -866,17 +866,17 @@ True4.3.2. The
str Function
The str function coerces data into a string. Every datatype can be coerced into a string.
Example 4.6. Introducing
str->>> str(1) ① +>>> str(1) ① '1' ->>> horsemen = ['war', 'pestilence', 'famine'] ->>> horsemen +>>> horsemen = ['war', 'pestilence', 'famine'] +>>> horsemen ['war', 'pestilence', 'famine'] ->>> horsemen.append('Powerbuilder') ->>> str(horsemen) ② +>>> horsemen.append('Powerbuilder') +>>> str(horsemen) ② "['war', 'pestilence', 'famine', 'Powerbuilder']" ->>> str(odbchelper) ③ +>>> str(odbchelper) ③ "<module 'odbchelper' from 'c:\\docbook\\dip\\py\\odbchelper.py'>" ->>> str(None) ④ +>>> str(None) ④ 'None'
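The last step, str(None), matters whenever a value might be missing. The sketch below assumes a hypothetical helper, collapse_whitespace, that calls a string method on whatever it is given; wrapping the argument in str makes it safe to pass None.

```python
def collapse_whitespace(value):
    """Coerce value to a string, then squeeze every run of whitespace to one space."""
    # None has no split method, so the str() call is what keeps this from failing.
    return " ".join(str(value).split())


doc_present = collapse_whitespace("Build a connection string\n    from a dictionary.")
doc_missing = collapse_whitespace(None)

none_has_split = hasattr(None, "split")
```

Without the str call, passing None would raise AttributeError, because None has no split method.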
- For simple datatypes like integers, you would expect
strto work, because almost every language has a function to convert an integer to a string. @@ -886,15 +886,15 @@ True- A subtle but important behavior of
stris that it works onNone, the Python null value. It returns the string'None'. You'll use this to your advantage in theinfofunction, as you'll see shortly.At the heart of the
infofunction is the powerfuldirfunction.dirreturns a list of the attributes and methods of any object: modules, functions, strings, lists, dictionaries... pretty much anything. -Example 4.7. Introducing
dir>>> li = [] ->>> dir(li) ① +Example 4.7. Introducing
dir>>> li = [] +>>> dir(li) ① ['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'] ->>> d = {} ->>> dir(d) ② +>>> d = {} +>>> dir(d) ② ['clear', 'copy', 'get', 'has_key', 'items', 'keys', 'setdefault', 'update', 'values'] ->>> import odbchelper ->>> dir(odbchelper) ③ +>>> import odbchelper +>>> dir(odbchelper) ③ ['__builtins__', '__doc__', '__file__', '__name__', 'buildConnectionString']
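A short sketch of how dir can be combined with getattr and callable to list only the public methods of an object. It assumes a current Python 3 interpreter, where dir also reports double-underscore names that the older output above does not show, so those are filtered out explicitly.

```python
li = []

# dir returns names as strings; getattr turns each name back into the attribute.
public_methods = [
    name
    for name in dir(li)
    if not name.startswith("__") and callable(getattr(li, name))
]

print(public_methods)
```

The result is a list of strings, not the methods themselves, which is why getattr is needed before callable can test each one.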
- li is a list, so dir(li) returns a list of all the methods of a list. Note that the returned list contains the names of the methods as strings, not @@ -903,16 +903,16 @@ True
- This is where it really gets interesting. odbchelper is a module, so dir(odbchelper) returns a list of all kinds of stuff defined in the module, including built-in attributes, like __name__, __doc__, and whatever other attributes and methods you define. In this case, odbchelper has only one user-defined method, the buildConnectionString function described in Chapter 2.
Finally, the
callablefunction takes any object and returnsTrueif the object can be called, orFalseotherwise. Callable objects include functions, class methods, even classes themselves. (More on classes in the next chapter.)Example 4.8. Introducing
callable->>> import string ->>> string.punctuation ① +>>> import string +>>> string.punctuation ① '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~' ->>> string.join② +>>> string.join② <function join at 00C55A7C> ->>> callable(string.punctuation) ③ +>>> callable(string.punctuation) ③ False ->>> callable(string.join) ④ +>>> callable(string.join) ④ True ->>> print string.join.__doc__ ⑤ +>>> print string.join.__doc__ ⑤ join(list [,sep]) -> string Return a string composed of the words in list, with @@ -932,9 +932,9 @@ TrueThe advantage of thinking like this is that you can access all the built-in functions and attributes as a group by getting information about the
__builtin__ module. And guess what, you already have a function for browsing it: info, from apihelper.py. Try it yourself and skim through the list now. We'll dive into some of the more important functions later. (Some of the built-in error classes, like AttributeError, should already look familiar.) -Example 4.9. Built-in Attributes and Functions
>>> from apihelper import info ->>> import __builtin__ ->>> info(__builtin__, 20) +Example 4.9. Built-in Attributes and Functions
>>> from apihelper import info +>>> import __builtin__ +>>> info(__builtin__, 20) ArithmeticError Base class for arithmetic errors. AssertionError Assertion failed. AttributeError Attribute not found. @@ -957,17 +957,17 @@ IOError I/O operation failed.4.4. Getting Object References With
getattrYou already know that Python functions are objects. What you don't know is that you can get a reference to a function without knowing its name until run-time, by using the
getattrfunction. -Example 4.10. Introducing
getattr>>> li = ["Larry", "Curly"] ->>> li.pop ① +Example 4.10. Introducing
getattr>>> li = ["Larry", "Curly"] +>>> li.pop ① <built-in method pop of list object at 010DF884> ->>> getattr(li, "pop") ② +>>> getattr(li, "pop") ② <built-in method pop of list object at 010DF884> ->>> getattr(li, "append")("Moe") ③ ->>> li +>>> getattr(li, "append")("Moe") ③ +>>> li ["Larry", "Curly", "Moe"] ->>> getattr({}, "clear") ④ +>>> getattr({}, "clear") ④ <built-in method clear of dictionary object at 00F113D4> ->>> getattr((), "pop") ⑤ +>>> getattr((), "pop") ⑤ Traceback (innermost last): File "<interactive input>", line 1, in ? AttributeError: 'tuple' object has no attribute 'pop'@@ -980,21 +980,21 @@ AttributeError: 'tuple' object has no attribute 'pop'In theory,getattrwould work on tuples, except that tuples have no methods, sogetattrwill raise an exception no matter what attribute name you give.4.4.1.
getattrwith Modules
getattrisn't just for built-in datatypes. It also works on modules. -Example 4.11. The
getattrFunction inapihelper.py>>> import odbchelper ->>> odbchelper.buildConnectionString ① +Example 4.11. The
getattrFunction inapihelper.py>>> import odbchelper +>>> odbchelper.buildConnectionString ① <function buildConnectionString at 00D18DD4> ->>> getattr(odbchelper, "buildConnectionString") ② +>>> getattr(odbchelper, "buildConnectionString") ② <function buildConnectionString at 00D18DD4> ->>> object = odbchelper ->>> method = "buildConnectionString" ->>> getattr(object, method) ③ +>>> object = odbchelper +>>> method = "buildConnectionString" +>>> getattr(object, method) ③ <function buildConnectionString at 00D18DD4> ->>> type(getattr(object, method)) ④ +>>> type(getattr(object, method)) ④ <type 'function'> ->>> import types ->>> type(getattr(object, method)) == types.FunctionType +>>> import types +>>> type(getattr(object, method)) == types.FunctionType True ->>> callable(getattr(object, method)) ⑤ +>>> callable(getattr(object, method)) ⑤ True
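The same lookup works with any module. The sketch below uses the standard math module in place of odbchelper (an assumption, so that the code runs without the book's example files); call_by_name is a hypothetical helper.

```python
import math


def call_by_name(module, name, *args):
    """Look up an attribute by its string name and call it if it is callable."""
    func = getattr(module, name, None)  # None when the attribute does not exist
    if not callable(func):
        raise AttributeError("%r is not a callable attribute" % name)
    return func(*args)


root = call_by_name(math, "sqrt", 16.0)
print(root)
```

Checking callable first separates functions such as math.sqrt from plain data attributes such as math.pi, which getattr returns just as happily.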
- This returns a reference to the
buildConnectionStringfunction in theodbchelpermodule, which you studied in Chapter 2, Your First Python Program. (The hex address you see is specific to my machine; your output will be different.) @@ -1040,12 +1040,12 @@ def output(data, format="text"):Here is the list filtering syntax:
[mapping-expression for element in source-list if filter-expression]
This is an extension of the list comprehensions that you know and love. The first two thirds are the same; the last part, starting with the
if, is the filter expression. A filter expression can be any expression that evaluates true or false (which in Python can be almost anything). Any element for which the filter expression evaluates true will be included in the mapping. All other elements are ignored, so they are never put through the mapping expression and are not included in the output list. -Example 4.14. Introducing List Filtering
>>> li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"] ->>> [elem for elem in li if len(elem) > 1] ① +Example 4.14. Introducing List Filtering
>>> li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"] +>>> [elem for elem in li if len(elem) > 1] ① ['mpilgrim', 'foo'] ->>> [elem for elem in li if elem != "b"] ② +>>> [elem for elem in li if elem != "b"] ② ['a', 'mpilgrim', 'foo', 'c', 'd', 'd'] ->>> [elem for elem in li if li.count(elem) == 1] ③ +>>> [elem for elem in li if li.count(elem) == 1] ③ ['a', 'mpilgrim', 'foo', 'c']
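A small sketch showing that filtered-out elements never reach the mapping expression. The function shout and the list seen are hypothetical names used only to record which elements were actually mapped.

```python
li = ["a", "mpilgrim", "foo", "b", "c", "b", "d", "d"]

seen = []  # records every element the mapping expression was applied to


def shout(elem):
    """Mapping expression with a visible side effect."""
    seen.append(elem)
    return elem.upper()


# The filter runs first; only elements that pass it are handed to shout().
shouted = [shout(elem) for elem in li if len(elem) > 1]
```

After this runs, seen contains only the two elements that passed the filter, so the mapping function was called twice rather than eight times.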
- The mapping expression here is simple (it just returns the value of each element), so concentrate on the filter expression. @@ -1074,11 +1074,11 @@ the
popmethod of a list) and user-defined (like thebuildCon4.6. The Peculiar Nature of
andandorIn Python,
andandorperform boolean logic as you would expect, but they do not return boolean values; instead, they return one of the actual values they are comparing. -Example 4.15. Introducing
and>>> 'a' and 'b' ① +Example 4.15. Introducing
and>>> 'a' and 'b' ① 'b' ->>> '' and 'b' ② +>>> '' and 'b' ② '' ->>> 'a' and 'b' and 'c' ③ +>>> 'a' and 'b' and 'c' ③ 'c'
- When using
and, values are evaluated in a boolean context from left to right.0,'',[],(),{}, andNoneare false in a boolean context; everything else is true. Well, almost everything. By default, instances of classes are @@ -1086,16 +1086,16 @@ thepopmethod of a list) and user-defined (like thebuildCon learn all about classes and special methods in Chapter 5. If all values are true in a boolean context,andreturns the last value. In this case,andevaluates'a', which is true, then'b', which is true, and returns'b'.- If any value is false in a boolean context,
andreturns the first false value. In this case,''is the first false value.- All values are true, so
andreturns the last value,'c'. -Example 4.16. Introducing
or>>> 'a' or 'b' ① +Example 4.16. Introducing
or>>> 'a' or 'b' ① 'a' ->>> '' or 'b' ② +>>> '' or 'b' ② 'b' ->>> '' or [] or {} ③ +>>> '' or [] or {} ③ {} ->>> def sidefx(): -... print "in sidefx()" -... return 1 ->>> 'a' or sidefx() ④ +>>> def sidefx(): +... print "in sidefx()" +... return 1 +>>> 'a' or sidefx() ④ 'a'
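The short-circuit behavior of or and and can be checked without relying on printed output. In this sketch the list calls is a hypothetical stand-in for the print statement in sidefx above.

```python
calls = []  # grows by one entry each time sidefx() actually runs


def sidefx():
    calls.append("in sidefx()")
    return 1


first = 'a' or sidefx()     # 'a' is true, so sidefx() is never evaluated
calls_after_first = len(calls)

second = '' or sidefx()     # '' is false, so sidefx() runs and its value is returned
calls_after_second = len(calls)

third = '' and sidefx()     # '' is false, so and stops immediately
calls_after_third = len(calls)
```

Both operators return one of their operands rather than True or False, and both stop evaluating as soon as the result is decided.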
- When using
or, values are evaluated in a boolean context from left to right, just likeand. If any value is true,orreturns that value immediately. In this case,'a'is the first true value. @@ -1105,11 +1105,11 @@ thepopmethod of a list) and user-defined (like thebuildCon is important if some values can have side effects. Here, the functionsidefxis never called, becauseorevaluates'a', which is true, and returns'a'immediately.If you're a C hacker, you are certainly familiar with the
bool ? a : b expression, which evaluates to a if bool is true, and b otherwise. Because of the way and and or work in Python, you can accomplish the same thing.
4.6.1. Using the
-and-orTrickExample 4.17. Introducing the
and-orTrick>>> a = "first" ->>> b = "second" ->>> 1 and a or b ① +Example 4.17. Introducing the
and-orTrick>>> a = "first" +>>> b = "second" +>>> 1 and a or b ① 'first' ->>> 0 and a or b ② +>>> 0 and a or b ② 'second'@@ -1117,17 +1117,17 @@ the
pop method of a list) and user-defined (like the buildCon
0 and 'first' evaluates to 0, and then 0 or 'second' evaluates to 'second'.
However, since this Python expression is simply boolean logic, and not a special construct of the language, there is one extremely important difference between this
and-ortrick in Python and thebool ? a : bsyntax in C. If the value of a is false, the expression will not work as you would expect it to. (Can you tell I was bitten by this? More than once?) -Example 4.18. When the
and-orTrick Fails>>> a = "" ->>> b = "second" ->>> 1 and a or b ① +Example 4.18. When the
and-orTrick Fails>>> a = "" +>>> b = "second" +>>> 1 and a or b ① 'second'
- Since a is an empty string, which Python considers false in a boolean context, 1 and '' evaluates to '', and then '' or 'second' evaluates to 'second'. Oops! That's not what you wanted.
The and-or trick, bool and a or b, will not work like the C expression bool ? a : b when a is false in a boolean context.
The real trick behind the
and-or trick, then, is to make sure that the value of a is never false. One common way of doing this is to turn a into [a] and b into [b], then take the first element of the returned list, which will be either a or b. -Example 4.19. Using the
and-orTrick Safely>>> a = "" ->>> b = "second" ->>> (1 and [a] or [b])[0] ① +Example 4.19. Using the
and-orTrick Safely>>> a = "" +>>> b = "second" +>>> (1 and [a] or [b])[0] ① ''
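The three variants can be compared side by side. The function names below are hypothetical; the third uses the conditional expression that Python has offered since version 2.5, which makes the trick unnecessary in newer code.

```python
def choose_unsafe(cond, a, b):
    """Plain and-or trick: wrong whenever a is false in a boolean context."""
    return cond and a or b


def choose_safe(cond, a, b):
    """Wrap both values in one-element lists, which are always true."""
    return (cond and [a] or [b])[0]


def choose_conditional(cond, a, b):
    """Conditional expression, available in Python 2.5 and later."""
    return a if cond else b


unsafe = choose_unsafe(1, "", "second")
safe = choose_safe(1, "", "second")
conditional = choose_conditional(1, "", "second")
```

Only the unsafe version returns "second" here; the other two return the empty string, which is what a C programmer would expect from bool ? a : b.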
- Since
[a]is a non-empty list, it is never false. Even if a is0or''or some other false value, the list[a]is true because it has one element. @@ -1142,15 +1142,15 @@ thepopmethod of a list) and user-defined (like thebuildCon4.7. Using
lambdaFunctionsPython supports an interesting syntax that lets you define one-line mini-functions on the fly. Borrowed from Lisp, these so-called
lambdafunctions can be used anywhere a function is required. -Example 4.20. Introducing
lambdaFunctions>>> def f(x): -... return x*2 -... ->>> f(3) +Example 4.20. Introducing
lambdaFunctions>>> def f(x): +... return x*2 +... +>>> f(3) 6 ->>> g = lambda x: x*2 ① ->>> g(3) +>>> g = lambda x: x*2 ① +>>> g(3) 6 ->>> (lambda x: x*2)(3) ② +>>> (lambda x: x*2)(3) ② 6
- This is a
lambdafunction that accomplishes the same thing as the normal function above it. Note the abbreviated syntax here: there are no @@ -1170,13 +1170,13 @@ alambdafunction; if you need something more complex, define a norHere are the
lambda functions in apihelper.py:
processFunc = collapse and (lambda s: " ".join(s.split())) or (lambda s: s)
Notice that this uses the simple form of the
and-ortrick, which is okay, because alambdafunction is always true in a boolean context. (That doesn't mean that alambdafunction can't return a false value. The function is always true; its return value could be anything.)Also notice that you're using the
splitfunction with no arguments. You've already seen it used with one or two arguments, but without any arguments it splits on whitespace. -Example 4.21.
splitWith No Arguments>>> s = "this is\na\ttest" ① ->>> print s +Example 4.21.
splitWith No Arguments>>> s = "this is\na\ttest" ① +>>> print s this is a test ->>> print s.split() ② +>>> print s.split() ② ['this', 'is', 'a', 'test'] ->>> print " ".join(s.split()) ③ +>>> print " ".join(s.split()) ③ 'this is a test'
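The split and join steps are what the processFunc assignment quoted earlier relies on. In this sketch, make_process_func is a hypothetical wrapper around that same expression, so that both outcomes can be seen in one place.

```python
def make_process_func(collapse):
    """Return a whitespace-collapsing function, or one that changes nothing."""
    # Both lambdas are true in a boolean context, so the plain and-or form is safe.
    return collapse and (lambda s: " ".join(s.split())) or (lambda s: s)


s = "this   is\na\ttest"

collapsed = make_process_func(1)(s)
unchanged = make_process_func(0)(s)
```

With no arguments, split treats any run of spaces, tabs, and newlines as a single separator, so joining the pieces with one space normalizes all of the whitespace.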
- This is a multiline string, defined by escape characters instead of triple quotes. \n is a newline, and \t is a tab character. @@ -1212,12 +1212,12 @@ a test square brackets.
Now, let's take it from the end and work backwards. The
for method in methodListshows that this is a list comprehension. As you know, methodList is a list of all the methods you care about in object. So you're looping through that list with method. -
Example 4.22. Getting a
docstringDynamically>>> import odbchelper ->>> object = odbchelper ① ->>> method = 'buildConnectionString' ② ->>> getattr(object, method) ③ +Example 4.22. Getting a
docstringDynamically>>> import odbchelper +>>> object = odbchelper ① +>>> method = 'buildConnectionString' ② +>>> getattr(object, method) ③ <function buildConnectionString at 010D6D74> ->>> print getattr(object, method).__doc__ ④ +>>> print getattr(object, method).__doc__ ④ Build a connection string from a dictionary of parameters. Returns string.@@ -1227,13 +1227,13 @@ for method in methodListshows that this is a Using the
getattr function, you're getting a reference to the method function in the object module.
- Now, printing the actual docstring of the method is easy.
The next piece of the puzzle is the use of
straround thedocstring. As you may recall,stris a built-in function that coerces data into a string. But adocstringis always a string, so why bother with thestrfunction? The answer is that not every function has adocstring, and if it doesn't, its__doc__attribute isNone. -Example 4.23. Why Use
stron adocstring?>>> >>> def foo(): print 2 ->>> >>> foo() +Example 4.23. Why Use
stron adocstring?>>> >>> def foo(): print 2 +>>> >>> foo() 2 ->>> >>> foo.__doc__ ① ->>> foo.__doc__ == None ② +>>> >>> foo.__doc__ ① +>>> foo.__doc__ == None ② True ->>> str(foo.__doc__) ③ +>>> str(foo.__doc__) ③ 'None'@@ -1245,17 +1245,17 @@ True
In SQL, you must use IS NULLinstead of= NULLto compare a null value. In Python, you can use either== Noneoris None, butis Noneis faster.Now that you are guaranteed to have a string, you can pass the string to processFunc, which you have already defined as a function that either does or doesn't collapse whitespace. Now you see why it was important to use
strto convert aNonevalue into a string representation. processFunc is assuming a string argument and calling itssplitmethod, which would crash if you passed itNonebecauseNonedoesn't have asplitmethod.Stepping back even further, you see that you're using string formatting again to concatenate the return value of processFunc with the return value of method's
ljustmethod. This is a new string method that you haven't seen before. -Example 4.24. Introducing
ljust>>> s = 'buildConnectionString' ->>> s.ljust(30) ① +Example 4.24. Introducing
ljust>>> s = 'buildConnectionString' +>>> s.ljust(30) ① 'buildConnectionString ' ->>> s.ljust(20) ② +>>> s.ljust(20) ② 'buildConnectionString'
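A sketch of how ljust, str, and join fit together to produce two aligned columns. The helper format_row and its "%s %s" layout are assumptions for illustration, not a copy of the info function.

```python
def format_row(name, doc, spacing=10):
    """Pad the name to a fixed width, then append the collapsed docstring."""
    return "%s %s" % (name.ljust(spacing), " ".join(str(doc).split()))


rows = [
    format_row("pop", "remove and return\n    the last item"),
    format_row("clear", None),                       # a missing docstring
    format_row("buildConnectionString", "Returns string."),
]

print("\n".join(rows))
```

Names longer than the spacing value are left intact, since ljust never truncates; the second column simply starts further to the right on that row.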
ljustpads the string with spaces to the given length. This is what theinfofunction uses to make two columns of output and line up all thedocstrings in the second column.- If the given length is smaller than the length of the string,
ljustwill simply return the string unchanged. It never truncates the string.You're almost finished. Given the padded method name from the
ljustmethod and the (possibly collapsed)docstringfrom the call to processFunc, you concatenate the two and get a single string. Since you're mapping methodList, you end up with a list of strings. Using thejoinmethod of the string"\n", you join this list into a single string, with each element of the list on a separate line, and print the result. -Example 4.25. Printing a List
>>> li = ['a', 'b', 'c'] ->>> print "\n".join(li) ① +Example 4.25. Printing a List
>>> li = ['a', 'b', 'c'] +>>> print "\n".join(li) ① a b c@@ -1282,9 +1282,9 @@ def info(object, spacing=10, collapse=1): if __name__ == "__main__": print info.__doc__ -Here is the output of
apihelper.py:>>> from apihelper import info ->>> li = [] ->>> info(li) +Here is the output of
apihelper.py:>>> from apihelper import info +>>> li = [] +>>> info(li) append L.append(object) -- append object to end count L.count(value) -> integer -- return number of occurrences of value extend L.extend(list) -- extend list by appending list elements @@ -1461,15 +1461,15 @@ can import individual items or usefrom module import *
from module import * in Python is like import module.* in Java; import module in Python is like import module in Java. -Example 5.2.
import modulevs.from module import>>> import types ->>> types.FunctionType ① +Example 5.2.
import modulevs.from module import>>> import types +>>> types.FunctionType ① <type 'function'> ->>> FunctionType ② +>>> FunctionType ② Traceback (innermost last): File "<interactive input>", line 1, in ? NameError: There is no variable named 'FunctionType' ->>> from types import FunctionType ③ ->>> FunctionType ④ +>>> from types import FunctionType ③ +>>> FunctionType ④ <type 'function'>
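The difference between the two import forms can be observed from a script. This sketch assumes it runs as a standalone module in a fresh interpreter; lookup_unqualified is a hypothetical helper that reports whether the bare name is visible yet.

```python
import types


def lookup_unqualified():
    """Return FunctionType if the bare name is defined in this module, else None."""
    try:
        return FunctionType
    except NameError:
        return None


qualified = types.FunctionType        # always reachable through the module name

before_import = lookup_unqualified()  # bare name is not defined yet

from types import FunctionType        # copies the name into this namespace

after_import = lookup_unqualified()
```

After the from form runs, the bare name and the qualified name refer to the same object.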
- The
typesmodule contains no methods; it just has attributes for each Python object type. Note that the attribute,FunctionType, must be qualified by the module name,types. @@ -1586,13 +1586,13 @@ class FileInfo(UserDict):5.4. Instantiating Classes
Instantiating classes in Python is straightforward. To instantiate a class, simply call the class as if it were a function, passing the arguments that the
__init__method defines. The return value will be the newly created object. -Example 5.7. Creating a
FileInfoInstance>>> import fileinfo ->>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3") ① ->>> f.__class__ ② +Example 5.7. Creating a
FileInfoInstance>>> import fileinfo +>>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3") ① +>>> f.__class__ ② <class fileinfo.FileInfo at 010EC204> ->>> f.__doc__ ③ +>>> f.__doc__ ③ 'store file metadata' ->>> f ④ +>>> f ④ {'name': '/music/_singles/kairo.mp3'}
- You are creating an instance of the
FileInfoclass (defined in thefileinfomodule) and assigning the newly created instance to the variable f. You are passing one parameter,/music/_singles/kairo.mp3, which will end up as the filename argument inFileInfo's__init__method. @@ -1606,11 +1606,11 @@ class FileInfo(UserDict):5.4.1. Garbage Collection
If creating new instances is easy, destroying them is even easier. In general, there is no need to explicitly free instances, because they are freed automatically when the variables assigned to them go out of scope. Memory leaks are rare in Python. -
Example 5.8. Trying to Implement a Memory Leak
>>> def leakmem(): -... f = fileinfo.FileInfo('/music/_singles/kairo.mp3') ① -... ->>> for i in range(100): -... leakmem() ②+Example 5.8. Trying to Implement a Memory Leak
>>> def leakmem(): +... f = fileinfo.FileInfo('/music/_singles/kairo.mp3') ① +... +>>> for i in range(100): +... leakmem() ②
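One way to observe this is with weak references, which do not keep an object alive. The sketch below assumes CPython, where an instance is destroyed as soon as its last reference disappears; Record is a hypothetical stand-in for FileInfo, and the gc.collect call is there only as a precaution for other implementations.

```python
import gc
import weakref


class Record(object):
    """Hypothetical stand-in for FileInfo."""

    def __init__(self, filename):
        self.name = filename


refs = []  # weak references only; they do not prevent destruction


def leakmem():
    f = Record('/music/_singles/kairo.mp3')
    refs.append(weakref.ref(f))
    # f goes out of scope here, taking the only strong reference with it


for i in range(100):
    leakmem()

gc.collect()
alive = [r for r in refs if r() is not None]
```

A dead weak reference returns None when called, so an empty alive list means none of the 100 instances survived the function that created them.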
- Every time the
leakmemfunction is called, you are creating an instance ofFileInfoand assigning it to the variable f, which is a local variable within the function. Then the function ends without ever freeing f, so you would expect a memory leak, but you would be wrong. When the function ends, the local variable f goes out of scope. At this point, there are no longer any references to the newly created instance ofFileInfo(since you never assigned it to anything other than f), so Python destroys the instance for us.- No matter how many times you call the
leakmem function, it will never leak memory, because every time, Python will destroy the newly created FileInfo instance before returning from leakmem. @@ -1716,12 +1716,12 @@ there are a lot of things you can do with dictionaries besides call methods on t provide a way to map non-method-calling syntax into method calls.
5.6.1. Getting and Setting Items
Example 5.12. The
__getitem__Special Method- def __getitem__(self, key): return self.data[key]>>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3") ->>> f + def __getitem__(self, key): return self.data[key]>>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3") +>>> f {'name':'/music/_singles/kairo.mp3'} ->>> f.__getitem__("name") ① +>>> f.__getitem__("name") ① '/music/_singles/kairo.mp3' ->>> f["name"] ② +>>> f["name"] ② '/music/_singles/kairo.mp3'
- The __getitem__ special method looks simple enough. Like the normal methods clear, keys, and values, it just redirects to the dictionary to return its value. But how does it get called? Well, you can call __getitem__ directly, but in practice you wouldn't actually do that; I'm just doing it here to show you how it works. The right way
@@ -1729,13 +1729,13 @@ provide a way to map non-method-calling syntax into method calls.
- This looks just like the syntax you would use to get a dictionary value, and in fact it returns the value you would expect. But here's the missing link: under the covers, Python has converted this syntax to the method call f.__getitem__("name"). That's why __getitem__ is a special class method; not only can you call it yourself, you can get Python to call it for you by using the right syntax.
Of course, Python has a __setitem__ special method to go along with __getitem__, as shown in the next example.
Example 5.13. The __setitem__ Special Method
-    def __setitem__(self, key, item): self.data[key] = item
->>> f
+    def __setitem__(self, key, item): self.data[key] = item
+>>> f
 {'name':'/music/_singles/kairo.mp3'}
->>> f.__setitem__("genre", 31) ①
->>> f
+>>> f.__setitem__("genre", 31) ①
+>>> f
 {'name':'/music/_singles/kairo.mp3', 'genre':31}
->>> f["genre"] = 32 ②
->>> f
+>>> f["genre"] = 32 ②
+>>> f
 {'name':'/music/_singles/kairo.mp3', 'genre':32}
- Like the __getitem__ method, __setitem__ simply redirects to the real dictionary self.data to do its work. And like __getitem__, you wouldn't ordinarily call it directly like this; Python calls __setitem__ for you when you use the right syntax.
@@ -1761,17 +1761,17 @@ provide a way to map non-method-calling syntax into method calls.
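To see the item-access mapping in isolation, here is a minimal sketch written for Python 3. TinyInfo is a hypothetical stand-in, not the real FileInfo class from the fileinfo module; it only assumes that the wrapped dictionary lives in self.data, as the examples above do.

```python
# Hypothetical stand-in for FileInfo: it wraps a plain dictionary in
# self.data and defines only the two special methods discussed above.
class TinyInfo:
    def __init__(self, filename=None):
        self.data = {"name": filename}

    def __getitem__(self, key):
        # Python calls this for the expression obj[key]
        return self.data[key]

    def __setitem__(self, key, item):
        # Python calls this for the statement obj[key] = item
        self.data[key] = item


f = TinyInfo("/music/_singles/kairo.mp3")
f["genre"] = 32                      # routed through __setitem__
via_syntax = f["genre"]              # routed through __getitem__
via_method = f.__getitem__("genre")  # the same call, spelled out by hand
print(via_syntax, via_method)
```

Both lookups reach the same method, so they return the same value; the bracket syntax is simply the form you would normally write.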
When accessing data attributes within a class, you need to qualify the attribute name: self.attribute. When calling other methods within a class, you need to qualify the method name: self.method.
-Example 5.15. Setting an MP3FileInfo's name
->>> import fileinfo
->>> mp3file = fileinfo.MP3FileInfo() ①
->>> mp3file
+Example 5.15. Setting an MP3FileInfo's name
+>>> import fileinfo
+>>> mp3file = fileinfo.MP3FileInfo() ①
+>>> mp3file
 {'name':None}
->>> mp3file["name"] = "/music/_singles/kairo.mp3" ②
->>> mp3file
+>>> mp3file["name"] = "/music/_singles/kairo.mp3" ②
+>>> mp3file
 {'album': 'Rave Mix', 'artist': '***DJ MARY-JANE***', 'genre': 31, 'title': 'KAIRO****THE BEST GOA', 'name': '/music/_singles/kairo.mp3', 'year': '2000', 'comment': 'http://mp3.com/DJMARYJANE'}
->>> mp3file["name"] = "/music/_singles/sidewinder.mp3" ③
->>> mp3file
+>>> mp3file["name"] = "/music/_singles/sidewinder.mp3" ③
+>>> mp3file
 {'album': '', 'artist': 'The Cynic Project', 'genre': 18, 'title': 'Sidewinder', 'name': '/music/_singles/sidewinder.mp3', 'year': '2000', 'comment': 'http://mp3.com/cynicproject'}
@@ -1832,18 +1832,18 @@ class MP3FileInfo(FileInfo):
 "album" : ( 63, 93, stripnulls),
 "year" : ( 93, 97, stripnulls),
 "comment" : ( 97, 126, stripnulls),
-"genre" : (127, 128, ord)}
->>> import fileinfo
->>> fileinfo.MP3FileInfo ①
+"genre" : (127, 128, ord)}
+>>> import fileinfo
+>>> fileinfo.MP3FileInfo ①
 <class fileinfo.MP3FileInfo at 01257FDC>
->>> fileinfo.MP3FileInfo.tagDataMap ②
+>>> fileinfo.MP3FileInfo.tagDataMap ②
 {'title': (3, 33, <function stripnulls at 0260C8D4>),
 'genre': (127, 128, <built-in function ord>),
 'artist': (33, 63, <function stripnulls at 0260C8D4>),
 'year': (93, 97, <function stripnulls at 0260C8D4>),
 'comment': (97, 126, <function stripnulls at 0260C8D4>),
 'album': (63, 93, <function stripnulls at 0260C8D4>)}
->>> m = fileinfo.MP3FileInfo() ③
->>> m.tagDataMap
+>>> m = fileinfo.MP3FileInfo() ③
+>>> m.tagDataMap
 {'title': (3, 33, <function stripnulls at 0260C8D4>),
 'genre': (127, 128, <built-in function ord>),
 'artist': (33, 63, <function stripnulls at 0260C8D4>),
@@ -1861,26 +1861,26 @@ class MP3FileInfo(FileInfo):
There are no constants in Python. Everything can be changed if you try hard enough. This fits with one of the core principles of Python: bad behavior should be discouraged but not banned. If you really want to change the value of None, you can do it, but don't come running to me when your code is impossible to debug.
-Example 5.18. Modifying Class Attributes
->>> class counter:
-...     count = 0 ①
-...     def __init__(self):
-...         self.__class__.count += 1 ②
-...
->>> counter
+Example 5.18. Modifying Class Attributes
+>>> class counter:
+...     count = 0 ①
+...     def __init__(self):
+...         self.__class__.count += 1 ②
+...
+>>> counter
 <class __main__.counter at 010EAECC>
->>> counter.count ③
+>>> counter.count ③
 0
->>> c = counter()
->>> c.count ④
+>>> c = counter()
+>>> c.count ④
 1
->>> counter.count
+>>> counter.count
 1
->>> d = counter() ⑤
->>> d.count
+>>> d = counter() ⑤
+>>> d.count
 2
->>> c.count
+>>> c.count
 2
->>> counter.count
+>>> counter.count
 2
- count is a class attribute of the counter class.
@@ -1907,9 +1907,9 @@ call it directly (even from outside the fileinfo module) if you had
In Python, all special methods (like __setitem__) and built-in attributes (like __doc__) follow a standard naming convention: they both start with and end with two underscores. Don't name your own methods and attributes this way, because it will only confuse you (and others) later.
-Example 5.19. Trying to Call a Private Method
->>> import fileinfo
->>> m = fileinfo.MP3FileInfo()
->>> m.__parse("/music/_singles/kairo.mp3") ①
+Example 5.19. Trying to Call a Private Method
+>>> import fileinfo
+>>> m = fileinfo.MP3FileInfo()
+>>> m.__parse("/music/_singles/kairo.mp3") ①
 Traceback (innermost last):
   File "<interactive input>", line 1, in ?
 AttributeError: 'MP3FileInfo' instance has no attribute '__parse'
@@ -1969,15 +1969,15 @@ way back to the default behavior built in to Python, which is to spit out some d
many times, an exception is something you can anticipate. If you're opening a file, it might not exist. If you're connecting to a database, it might be unavailable, or you might not have the correct security credentials to access it. If you know a line of code may raise an exception, you should handle the exception using a try...except block.
-Example 6.1. Opening a Non-Existent File
->>> fsock = open("/notthere", "r") ①
+Example 6.1. Opening a Non-Existent File
+>>> fsock = open("/notthere", "r") ①
 Traceback (innermost last):
   File "<interactive input>", line 1, in ?
 IOError: [Errno 2] No such file or directory: '/notthere'
->>> try:
-...     fsock = open("/notthere") ②
-... except IOError: ③
-...     print "The file does not exist, exiting gracefully"
-... print "This line will always print" ④
+>>> try:
+...     fsock = open("/notthere") ②
+... except IOError: ③
+...     print "The file does not exist, exiting gracefully"
+... print "This line will always print" ④
 The file does not exist, exiting gracefully
 This line will always print
@@ -2041,12 +2041,12 @@ exceptions, errors occur immediately, and you can handle them in a standard way
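The try...except pattern from Example 6.1 can also be sketched as a standalone script for Python 3, where print is a function and IOError is an alias of OSError. The helper name describe_open and the temporary directory are assumptions made for this sketch (the directory guarantees the path really is missing); they are not part of the original example.

```python
import os
import tempfile


def describe_open(path):
    """Try to open path and report what happened (hypothetical helper)."""
    try:
        fsock = open(path)
    except IOError:
        # In Python 3, a missing file raises FileNotFoundError, which is a
        # subclass of OSError, and IOError is another name for OSError.
        return "The file does not exist, exiting gracefully"
    fsock.close()
    return "The file was opened and closed"


# A freshly created temporary directory is empty, so this path cannot exist.
with tempfile.TemporaryDirectory() as workdir:
    missing_path = os.path.join(workdir, "notthere")
    message = describe_open(missing_path)

print(message)
```

Because the exception is caught, the script carries on to the final print instead of stopping with a traceback.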
6.2. Working with File Objects
Python has a built-in function, open, for opening a file on disk. open returns a file object, which has methods and attributes for getting information about and manipulating the opened file.
-Example 6.3. Opening a File
->>> f = open("/music/_singles/kairo.mp3", "rb") ①
->>> f ②
+Example 6.3. Opening a File
+>>> f = open("/music/_singles/kairo.mp3", "rb") ①
+>>> f ②
 <open file '/music/_singles/kairo.mp3', mode 'rb' at 010E3988>
->>> f.mode ③
+>>> f.mode ③
 'rb'
->>> f.name ④
+>>> f.name ④
 '/music/_singles/kairo.mp3'
- The open function can take up to three parameters: a filename, a mode, and a buffering parameter. Only the first one, the filename,
@@ -2058,18 +2058,18 @@ exceptions, errors occur immediately, and you can handle them in a standard way
6.2.1. Reading Files
After you open a file, the first thing you'll want to do is read from it, as shown in the next example.
Example 6.4. Reading a File
->>> f
+>>> f
 <open file '/music/_singles/kairo.mp3', mode 'rb' at 010E3988>
->>> f.tell() ①
+>>> f.tell() ①
 0
->>> f.seek(-128, 2) ②
->>> f.tell() ③
+>>> f.seek(-128, 2) ②
+>>> f.tell() ③
 7542909
->>> tagData = f.read(128) ④
->>> tagData
+>>> tagData = f.read(128) ④
+>>> tagData
 'TAGKAIRO****THE BEST GOA ***DJ MARY-JANE***
 Rave Mix 2000http://mp3.com/DJMARYJANE \037'
->>> f.tell() ⑤
+>>> f.tell() ⑤
 7543037
- A file object maintains state about the file it has open. The tell method of a file object tells you your current position in the open file. Since you haven't done anything with this file
@@ -2086,28 +2086,28 @@ Rave Mix 2000http://mp3.com/DJMARYJANE \037'
Open files consume system resources, and depending on the file mode, other programs may not be able to access them. It's important to close files as soon as you're finished with them.
Example 6.5. Closing a File
->>> f
+>>> f
 <open file '/music/_singles/kairo.mp3', mode 'rb' at 010E3988>
->>> f.closed ①
+>>> f.closed ①
 False
->>> f.close() ②
->>> f
+>>> f.close() ②
+>>> f
 <closed file '/music/_singles/kairo.mp3', mode 'rb' at 010E3988>
->>> f.closed ③
+>>> f.closed ③
 True
->>> f.seek(0) ④
+>>> f.seek(0) ④
 Traceback (innermost last):
   File "<interactive input>", line 1, in ?
 ValueError: I/O operation on closed file
->>> f.tell()
+>>> f.tell()
 Traceback (innermost last):
   File "<interactive input>", line 1, in ?
 ValueError: I/O operation on closed file
->>> f.read()
+>>> f.read()
 Traceback (innermost last):
   File "<interactive input>", line 1, in ?
 ValueError: I/O operation on closed file
->>> f.close() ⑤
+>>> f.close() ⑤
- The closed attribute of a file object indicates whether the object has a file open or not. In this case, the file is still open (closed is False).
- To close a file, call the close method of the file object. This frees the lock (if any) that you were holding on the file, flushes buffered writes (if any)
@@ -2151,15 +2151,15 @@ ValueError: I/O operation on closed file
"if the log file doesn't exist yet, create a new empty file just so you can open it for the first time" logic. Just open it and start writing.
Example 6.7. Writing to Files
->>> logfile = open('test.log', 'w') ①
->>> logfile.write('test succeeded') ②
->>> logfile.close()
->>> print file('test.log').read() ③
+>>> logfile = open('test.log', 'w') ①
+>>> logfile.write('test succeeded') ②
+>>> logfile.close()
+>>> print file('test.log').read() ③
 test succeeded
->>> logfile = open('test.log', 'a') ④
->>> logfile.write('line 2')
->>> logfile.close()
->>> print file('test.log').read() ⑤
+>>> logfile = open('test.log', 'a') ④
+>>> logfile.write('line 2')
+>>> logfile.close()
+>>> print file('test.log').read() ⑤
 test succeededline 2
@@ -2187,13 +2187,13 @@ test succeededline 2
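The write-then-append sequence from Example 6.7 can be reproduced as a short Python 3 script. This is a sketch under a few assumptions: the built-in open replaces the old file built-in, print is a function, and the log file is created inside a temporary directory (an addition for this sketch) so that running it leaves nothing behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "test.log")

    # Mode "w" creates the file, or truncates it if it already exists.
    logfile = open(path, "w")
    logfile.write("test succeeded")
    logfile.close()

    # Mode "a" keeps the existing contents and adds to the end.
    logfile = open(path, "a")
    logfile.write("line 2")
    logfile.close()

    # Read everything back to see what the two writes produced.
    reader = open(path)
    contents = reader.read()
    reader.close()

print(contents)
```

Note that write does not add a newline for you, which is why the two pieces of text end up on the same line.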
Like most other languages, Python has for loops. The only reason you haven't seen them until now is that Python is good at so many other things that you don't need them as often.
Most other languages don't have a powerful list datatype like Python, so you end up doing a lot of manual work, specifying a start, end, and step to define a range of integers or characters or other iterable entities. But in Python, a for loop simply iterates over a list, the same way list comprehensions work.
-Example 6.8. Introducing the for Loop
->>> li = ['a', 'b', 'e']
->>> for s in li: ①
-...     print s ②
+Example 6.8. Introducing the for Loop
+>>> li = ['a', 'b', 'e']
+>>> for s in li: ①
+...     print s ②
 a
 b
 e
->>> print "\n".join(li) ③
+>>> print "\n".join(li) ③
 a
 b
 e
@@ -2203,16 +2203,16 @@ e