mirror of
https://github.com/kennethreitz/dive-into-python3.git
synced 2026-06-05 23:10:17 +00:00
markup fiddling
@@ -13,7 +13,6 @@ del{background:#f87}
|
||||
mark{background:#ff8;font-weight:bold}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=sa value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#case-study-porting-chardet-to-python-3>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Case study: porting <code>chardet</code> to Python 3</h1>
|
||||
@@ -100,7 +99,6 @@ mark{background:#ff8;font-weight:bold}
|
||||
<p>We’re going to migrate the <code>chardet</code> module from Python 2 to Python 3. Python 3 comes with a utility script called <code>2to3</code>, which takes your actual Python 2 source code as input and auto-converts as much as it can to Python 3. In some cases this is easy (a function was renamed or moved to a different module), but in other cases it can get pretty complex. To get a sense of all that it <em>can</em> do, refer to the appendix, <a href=porting-code-to-python-3-with-2to3.html>Porting code to Python 3 with <code>2to3</code></a>. In this chapter, we’ll start by running <code>2to3</code> on the <code>chardet</code> package, but as you’ll see, there will still be a lot of work to do after the automated tools have performed their magic.
|
||||
<p>The main <code>chardet</code> package is split across several different files, all in the same directory. The <code>2to3</code> script makes it easy to convert multiple files at once: just pass a directory as a command line argument, and <code>2to3</code> will convert each of the files in turn.
|
||||
<p id=noscript>[The code examples will be easier to follow if you enable JavaScript, but whatever.]
|
||||
<p class=s><a href=#skip2to3output>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python c:\Python30\Tools\Scripts\2to3.py -w chardet\</kbd>
|
||||
<samp>RefactoringTool: Skipping implicit fixer: buffer
|
||||
RefactoringTool: Skipping implicit fixer: idioms
|
||||
@@ -567,8 +565,7 @@ RefactoringTool: chardet\sbcsgroupprober.py
|
||||
RefactoringTool: chardet\sjisprober.py
|
||||
RefactoringTool: chardet\universaldetector.py
|
||||
RefactoringTool: chardet\utf8prober.py</samp></pre>
|
||||
<p id=skip2to3output>Now run the <code>2to3</code> script on the testing harness, <code>test.py</code>.
|
||||
<p class=s><a href=#skip2to3outputtest>skip over this</a>
|
||||
<p>Now run the <code>2to3</code> script on the testing harness, <code>test.py</code>.
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python c:\Python30\Tools\Scripts\2to3.py -w test.py</kbd>
|
||||
<samp>RefactoringTool: Skipping implicit fixer: buffer
|
||||
RefactoringTool: Skipping implicit fixer: idioms
|
||||
@@ -599,12 +596,11 @@ RefactoringTool: Skipping implicit fixer: ws_comma
|
||||
<ins>+print(count, 'tests')</ins>
|
||||
RefactoringTool: Files that were modified:
|
||||
RefactoringTool: test.py</samp></pre>
|
||||
<p id=skip2to3outputtest>[FIXME explain the difference in import syntax]
|
||||
<p>[FIXME explain the difference in import syntax]
|
||||
<p>Well, that wasn’t so hard. Just a few imports and print statements to convert. Time to run the new version. Do you think it’ll work?
|
||||
<h2 id=manual>Fixing what <code>2to3</code> can’t</h2>
|
||||
<h3 id=falseisinvalidsyntax><code>False</code> is invalid syntax</h3>
|
||||
<p>Now for the real test: running the test harness against the test suite. Since the test suite is designed to cover all the possible code paths, it’s a good way to test our ported code to make sure there aren’t any bugs lurking anywhere.
|
||||
<p class=s><a href=#skipinvalidsyntax>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "test.py", line 1, in <module>
|
||||
@@ -613,8 +609,7 @@ RefactoringTool: test.py</samp></pre>
|
||||
self.done = constants.False
|
||||
^
|
||||
SyntaxError: invalid syntax</samp></pre>
|
||||
<p id=skipinvalidsyntax>Hmm, a small snag. In Python 3, <code>False</code> is a reserved word, so you can’t use it as a variable name. Let’s look at <code>constants.py</code> to see where it’s defined. Here’s the original version from <code>constants.py</code>, before the <code>2to3</code> script changed it:
|
||||
<p class=s><a href=#skipbuiltincode>skip over this</a>
|
||||
<p>Hmm, a small snag. In Python 3, <code>False</code> is a reserved word, so you can’t use it as a variable name. Let’s look at <code>constants.py</code> to see where it’s defined. Here’s the original version from <code>constants.py</code>, before the <code>2to3</code> script changed it:
|
||||
<pre><code>import __builtin__
|
||||
if not hasattr(__builtin__, 'False'):
|
||||
False = 0
|
||||
@@ -622,7 +617,7 @@ if not hasattr(__builtin__, 'False'):
|
||||
else:
|
||||
False = __builtin__.False
|
||||
True = __builtin__.True</code></pre>
|
||||
<p id=skipbuiltincode>This piece of code is designed to allow this library to run under older versions of Python 2. Prior to Python 2.3 [FIXME-LINK], Python had no built-in <code>Boolean</code> type. This code detects the absence of the built-in constants <code>True</code> and <code>False</code>, and defines them if necessary.
|
||||
<p>This piece of code is designed to allow this library to run under older versions of Python 2. Prior to Python 2.3 [FIXME-LINK], Python had no built-in <code>Boolean</code> type. This code detects the absence of the built-in constants <code>True</code> and <code>False</code>, and defines them if necessary.
|
||||
<p>However, Python 3 will always have a <code>Boolean</code> type, so this entire code snippet is unnecessary. The simplest solution is to replace all instances of <code>constants.True</code> and <code>constants.False</code> with <code>True</code> and <code>False</code>, respectively, then delete this dead code from <code>constants.py</code>.
|
||||
<p>So this line in <code>universaldetector.py</code>:
|
||||
<pre><code>self.done = constants.False</code></pre>
|
||||
@@ -631,7 +626,6 @@ else:
|
||||
<p>Ah, wasn’t that satisfying? The code is shorter and more readable already.
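<p>If you want to check the reserved-word rule for yourself, here is a minimal sketch. It is not part of <code>chardet</code>; the variable names are made up for illustration. It asks Python 3 whether <code>False</code> is a keyword, then tries to compile the old assignment from <code>constants.py</code>.

```python
import keyword

# In Python 3, False is a keyword rather than an ordinary built-in name.
is_reserved = keyword.iskeyword("False")
print(is_reserved)

# Compiling the old Python 2 style assignment fails before any code runs.
try:
    compile("False = 0", "<example>", "exec")
    assignment_allowed = True
except SyntaxError:
    assignment_allowed = False
print(assignment_allowed)
```

<p>Because the failure happens at compile time, the whole module refuses to load, which is why the traceback pointed at the import chain rather than at a running function.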
|
||||
<h3 id=nomodulenamedconstants>No module named <code>constants</code></h3>
|
||||
<p>Time to run <code>test.py</code> again and see how far it gets.
|
||||
<p class=s><a href=#skipnomodulenamedconstants>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "test.py", line 1, in <module>
|
||||
@@ -639,7 +633,7 @@ else:
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 29, in <module>
|
||||
import constants, sys
|
||||
ImportError: No module named constants</samp></pre>
|
||||
<p id=skipnomodulenamedconstants>What’s that you say? No module named <code>constants</code>? Of course there’s a module named <code>constants</code>. …Oh wait, no there isn’t. Remember when the <code>2to3</code> script fixed up all those import statements? This library has a lot of relative imports — that is, modules that import other modules within the library. In Python 3, all import statements are absolute by default [FIXME-LINK PEP 0328]. To do relative imports, you need to do something like this instead:
|
||||
<p>What’s that you say? No module named <code>constants</code>? Of course there’s a module named <code>constants</code>. …Oh wait, no there isn’t. Remember when the <code>2to3</code> script fixed up all those import statements? This library has a lot of relative imports — that is, modules that import other modules within the library. In Python 3, all import statements are absolute by default [FIXME-LINK PEP 0328]. To do relative imports, you need to do something like this instead:
|
||||
<pre><code>from . import constants</code></pre>
|
||||
<p>But wait. Wasn’t the <code>2to3</code> script supposed to take care of these for you? Well, it did, but this particular import statement combines two different types of imports into one line: a relative import of the <code>constants</code> module within the library, and an absolute import of the <code>sys</code> module that is pre-installed in the Python standard library. In Python 2, you could combine these into one import statement. In Python 3, you can’t, and the <code>2to3</code> script is not smart enough to split the import statement into two.
|
||||
<p>The solution is to split the import statement manually. So this two-in-one import:
|
||||
@@ -651,20 +645,18 @@ import sys</code></pre>
|
||||
<p>Onward!
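<p>Here is a small, self-contained sketch of the split import in action. Everything in it is hypothetical (the package name <code>demo_pkg</code>, the module <code>detector.py</code>, and the <code>debug_level</code> attribute are invented for the demonstration); it builds a throwaway package in a temporary directory and imports it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny hypothetical package: demo_pkg/{__init__,constants,detector}.py
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "constants.py").write_text("debug_level = 0\n")
    (pkg / "detector.py").write_text(
        "from . import constants\n"   # relative import of a sibling module
        "import sys\n"                # absolute import from the standard library
        "level = constants.debug_level\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        detector = importlib.import_module("demo_pkg.detector")
        level = detector.level
    finally:
        sys.path.remove(tmp)

print(level)
```

<p>The two imports sit on separate lines inside <code>detector.py</code>: one relative, one absolute. That is the same shape as the manual fix described above.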
|
||||
<h3 id=namefileisnotdefined>Name <var>'file'</var> is not defined</h3>
|
||||
<p>And here we go again, running <code>test.py</code> to try to execute our test cases…</p>
|
||||
<p class=s><a href=#skipnamefileisnotdefined>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml</samp>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "test.py", line 9, in <module>
|
||||
for line in file(f, 'rb'):
|
||||
NameError: name 'file' is not defined</samp></pre>
|
||||
<p id=skipnamefileisnotdefined>This one surprised me, because I’ve been using this idiom as long as I can remember. In Python 2, the global <var>file()</var> function could be used interchangeably with <var>open()</var>, the standard way of opening files for reading (<var>file</var> was the built-in file type, and <var>open()</var> returned instances of it). In Python 3, the entire system for reading and writing files has been refactored into the <code>io</code> module. [FIXME-LINK PEP 3116] I’ll cover the new I/O module in more detail in Chapter FIXME, but for now, the important bit is that the global <var>file()</var> function no longer exists. However, the <var>open()</var> function does still exist. (Technically, it’s an alias for <var>io.open()</var>, but never mind that right now.)
|
||||
<p>This one surprised me, because I’ve been using this idiom as long as I can remember. In Python 2, the global <var>file()</var> function could be used interchangeably with <var>open()</var>, the standard way of opening files for reading (<var>file</var> was the built-in file type, and <var>open()</var> returned instances of it). In Python 3, the entire system for reading and writing files has been refactored into the <code>io</code> module. [FIXME-LINK PEP 3116] I’ll cover the new I/O module in more detail in Chapter FIXME, but for now, the important bit is that the global <var>file()</var> function no longer exists. However, the <var>open()</var> function does still exist. (Technically, it’s an alias for <var>io.open()</var>, but never mind that right now.)
|
||||
<p>Thus, the simplest solution to the problem of the missing <var>file()</var> is to call <var>open()</var> instead:
|
||||
<pre><code>for line in open(f, 'rb'):</code></pre>
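<p>As a quick sanity check, the following sketch (the temporary file and variable names are illustrative, not taken from <code>test.py</code>) confirms that Python 3 has no built-in <var>file</var>, and that <var>open()</var> supports the same line-by-line loop.

```python
import builtins
import os
import tempfile

# Python 3 no longer has a built-in called "file".
has_file_builtin = hasattr(builtins, "file")
print(has_file_builtin)

# open() still works with the same "for line in ..." idiom.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as out:
        out.write(b"first\nsecond\n")
    with open(path, "rb") as f:
        lines = [line for line in f]
finally:
    os.remove(path)

print(lines)
```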
|
||||
<p>And that’s all I have to say about that.
|
||||
<h3 id=cantuseastringpattern>Can’t use a string pattern on a bytes-like object</h3>
|
||||
<p>Now things are starting to get interesting. And by “interesting,” I mean “confusing as all hell.”
|
||||
<p class=s><a href=#skipcantuseastringpattern>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml</samp>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
@@ -673,34 +665,29 @@ NameError: name 'file' is not defined</samp></pre>
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 98, in feed
|
||||
if self._highBitDetector.search(aBuf):
|
||||
TypeError: can't use a string pattern on a bytes-like object</samp></pre>
|
||||
<p id=skipcantuseastringpattern>
|
||||
<p>To debug this, let’s see what <var>self._highBitDetector</var> is. It’s defined in the <var>__init__</var> method of the <var>UniversalDetector</var> class:
|
||||
<p class=s><a href=#skiphighbitdetectorcode>skip over this</a>
|
||||
<pre><code>class UniversalDetector:
|
||||
def __init__(self):
|
||||
self._highBitDetector = re.compile(r'[\x80-\xFF]')</code></pre>
|
||||
<p id=skiphighbitdetectorcode>This pre-compiles a regular expression designed to find non-<abbr>ASCII</abbr> characters in the range 128–255 (0x80–0xFF). Wait, that’s not quite right; I need to be more precise with my terminology. This pattern is designed to find non-<abbr>ASCII</abbr> <em>bytes</em> in the range 128–255.
|
||||
<p>This pre-compiles a regular expression designed to find non-<abbr>ASCII</abbr> characters in the range 128–255 (0x80–0xFF). Wait, that’s not quite right; I need to be more precise with my terminology. This pattern is designed to find non-<abbr>ASCII</abbr> <em>bytes</em> in the range 128–255.
|
||||
<p>And therein lies the problem.
|
||||
<p>In Python 2, a string was an array of bytes whose character encoding was tracked separately. If you wanted Python 2 to keep track of the character encoding, you had to use a Unicode string (<code>u''</code>) instead. But in Python 3, a string is always what Python 2 called a Unicode string — that is, an array of Unicode characters (of possibly varying byte lengths). Since this regular expression is defined by a string pattern, it can only be used to search a string — again, an array of characters. But what we’re searching is not a string, it’s a byte array. Looking at the traceback, this error occurred in <code>universaldetector.py</code>:
|
||||
<p class=s><a href=#skipfeedhighbitdetectorcode>skip over this</a>
|
||||
<pre><code>def feed(self, aBuf):
|
||||
.
|
||||
.
|
||||
.
|
||||
if self._mInputState == ePureAscii:
|
||||
if self._highBitDetector.search(aBuf):</code></pre>
|
||||
<p id=skipfeedhighbitdetectorcode>And what is <var>aBuf</var>? Let’s backtrack further to a place that calls <code>UniversalDetector.feed()</code>. One place that calls it is the test harness, <code>test.py</code>.
|
||||
<p class=s><a href=#skiptestharnessfeedcode>skip over this</a>
|
||||
<p>And what is <var>aBuf</var>? Let’s backtrack further to a place that calls <code>UniversalDetector.feed()</code>. One place that calls it is the test harness, <code>test.py</code>.
|
||||
<pre><code>u = UniversalDetector()
|
||||
.
|
||||
.
|
||||
.
|
||||
for line in open(f, 'rb'):
|
||||
u.feed(line)</code></pre>
|
||||
<p id=skiptestharnessfeedcode>And here we find our answer: in the <code>UniversalDetector.feed()</code> method, <var>aBuf</var> is a line read from a file on disk. Look carefully at the parameters used to open the file: <code>'rb'</code>. <code>'r'</code> is for “read”; OK, big deal, we’re reading the file. Ah, but <code>'b'</code> is for “binary.” Without the <code>'b'</code> flag, this <code>for</code> loop would read the file, line by line, and convert each line into a string — an array of Unicode characters — according to the system default character encoding. (You could override the system encoding with another parameter to <var>open()</var>, but never mind that for now.) But with the <code>'b'</code> flag, this <code>for</code> loop reads the file, line by line, and stores each line exactly as it appears in the file, as an array of bytes. That byte array gets passed to <code>UniversalDetector.feed()</code>, and eventually gets passed to the pre-compiled regular expression, <var>self._highBitDetector</var>, to search for high-bit… characters. But we don’t have characters; we have bytes. Oops.
|
||||
<p>And here we find our answer: in the <code>UniversalDetector.feed()</code> method, <var>aBuf</var> is a line read from a file on disk. Look carefully at the parameters used to open the file: <code>'rb'</code>. <code>'r'</code> is for “read”; OK, big deal, we’re reading the file. Ah, but <code>'b'</code> is for “binary.” Without the <code>'b'</code> flag, this <code>for</code> loop would read the file, line by line, and convert each line into a string — an array of Unicode characters — according to the system default character encoding. (You could override the system encoding with another parameter to <var>open()</var>, but never mind that for now.) But with the <code>'b'</code> flag, this <code>for</code> loop reads the file, line by line, and stores each line exactly as it appears in the file, as an array of bytes. That byte array gets passed to <code>UniversalDetector.feed()</code>, and eventually gets passed to the pre-compiled regular expression, <var>self._highBitDetector</var>, to search for high-bit… characters. But we don’t have characters; we have bytes. Oops.
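<p>The difference between the two modes is easy to see in isolation. This sketch writes a small UTF-8 file to a temporary location (the file contents and variable names are assumptions made for the example) and reads the same line back in binary mode and in text mode.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # "caf\u00e9" is "cafe" with an acute accent; UTF-8 stores the accent as 2 bytes.
    with open(path, "wb") as out:
        out.write("caf\u00e9\n".encode("utf-8"))

    # Binary mode: each line is a bytes object, exactly as stored on disk.
    with open(path, "rb") as f:
        binary_line = f.readline()

    # Text mode: each line is decoded into a str using the given encoding.
    with open(path, "r", encoding="utf-8") as f:
        text_line = f.readline()
finally:
    os.remove(path)

print(type(binary_line).__name__, len(binary_line))
print(type(text_line).__name__, len(text_line))
```

<p>The binary line is six bytes long and the text line is five characters long, because the accented letter occupies two bytes on disk but counts as one character once decoded.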
|
||||
<p>What we need this regular expression to search is not an array of characters, but an array of bytes.
|
||||
<p>Once you realize that, the solution is not difficult. Regular expressions defined with strings can search strings. Regular expressions defined with byte arrays can search byte arrays. To define a byte array pattern, we simply change the type of the argument we use to define the regular expression to a byte array. (There is one other case of this same problem, on the very next line.)
|
||||
<p class=s><a href=#skip-cant-use-a-string-pattern-solution>skip over this code listing</a>
|
||||
<pre><code> class UniversalDetector:
|
||||
def __init__(self):
|
||||
<del>- self._highBitDetector = re.compile(r'[\x80-\xFF]')</del>
|
||||
@@ -710,8 +697,7 @@ for line in open(f, 'rb'):
|
||||
self._mEscCharSetProber = None
|
||||
self._mCharSetProbers = []
|
||||
self.reset()</code></pre>
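<p>To see the rule outside of <code>chardet</code>, here is a minimal sketch that compiles the same character class twice, once from a string and once from a byte array, and runs both against a byte array. The sample data and variable names are assumptions made for the example.

```python
import re

str_pattern = re.compile(r'[\x80-\xFF]')    # string pattern: searches str only
bytes_pattern = re.compile(b'[\x80-\xFF]')  # bytes pattern: searches bytes only

data = b'caf\xc3\xa9'  # four ASCII-range bytes would be b'cafe'; here the last two are high-bit

# A string pattern refuses to search a bytes object.
try:
    str_pattern.search(data)
    str_search_error = None
except TypeError as exc:
    str_search_error = str(exc)
print(str_search_error)

# A bytes pattern finds the first high-bit byte.
match = bytes_pattern.search(data)
print(match.start())
```

<p>The rule is symmetric: a bytes pattern would likewise refuse to search a string, so the pattern type has to follow the type of the data being fed in.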
|
||||
<p id=skip-case-use-a-string-pattern-solution>Searching the entire codebase for other uses of the <code>re</code> module turns up two more instances, in <code>charsetprober.py</code>. Again, the code is defining regular expressions as strings but executing them on <var>aBuf</var>, which is a byte array. The solution is the same: define the regular expression patterns as byte arrays.
|
||||
<p class=s><a href=#cantconvertbytesobject>skip over this code listing</a>
|
||||
<p>Searching the entire codebase for other uses of the <code>re</code> module turns up two more instances, in <code>charsetprober.py</code>. Again, the code is defining regular expressions as strings but executing them on <var>aBuf</var>, which is a byte array. The solution is the same: define the regular expression patterns as byte arrays.
|
||||
<pre><code> class CharSetProber:
|
||||
.
|
||||
.
|
||||
@@ -728,7 +714,6 @@ for line in open(f, 'rb'):
|
||||
|
||||
<h3 id=cantconvertbytesobject>Can't convert <code>'bytes'</code> object to <code>str</code> implicitly</h3>
|
||||
<p>Curiouser and curiouser…
|
||||
<p class=s><a href=#skipcantconvertbytesobject>skip over this</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml</samp>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
@@ -737,12 +722,10 @@ for line in open(f, 'rb'):
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 100, in feed
|
||||
elif (self._mInputState == ePureAscii) and self._escDetector.search(self._mLastChar + aBuf):
|
||||
TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
<p id=skipcantconvertbytesobject>There's an unfortunate clash of coding style and Python interpreter here. The <code>TypeError</code> could be anywhere on that line, but the traceback doesn't tell you exactly where it is. It could be in the first conditional or the second, and the traceback would look the same. To narrow it down, you should split the line in half, like this:
|
||||
<p class=s><a href=#skip-split-conditional>skip over this code listing</a>
|
||||
<p>There's an unfortunate clash of coding style and Python interpreter here. The <code>TypeError</code> could be anywhere on that line, but the traceback doesn't tell you exactly where it is. It could be in the first conditional or the second, and the traceback would look the same. To narrow it down, you should split the line in half, like this:
|
||||
<pre><code>elif (self._mInputState == ePureAscii) and \
|
||||
self._escDetector.search(self._mLastChar + aBuf):</code></pre>
|
||||
<p id=skip-split-conditional>And re-run the test:</p>
|
||||
<p class=s><a href=#skip-cant-convert-bytes-object-2>skip over this command output listing</a>
|
||||
<p>And re-run the test:</p>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml</samp>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
@@ -751,9 +734,8 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 101, in feed
|
||||
self._escDetector.search(self._mLastChar + aBuf):
|
||||
TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
<p id=skip-over-cant-convert-bytes-object-2>Aha! The problem was not in the first conditional (<code>self._mInputState == ePureAscii</code>) but in the second one. So what could cause a <code>TypeError</code> there? Perhaps you're thinking that the <code>search()</code> method is expecting a value of a different type, but that wouldn't generate this traceback. Python functions can take any value; if you pass the right number of arguments, the function will execute. It may <em>crash</em> if you pass it a value of a different type than it's expecting, but if that happened, the traceback would point to somewhere inside the function. But this traceback says it never got as far as calling the <code>search()</code> method. So the problem must be in that <code>+</code> operation, as it's trying to construct the value that it will eventually pass to the <code>search()</code> method.
|
||||
<p>Aha! The problem was not in the first conditional (<code>self._mInputState == ePureAscii</code>) but in the second one. So what could cause a <code>TypeError</code> there? Perhaps you're thinking that the <code>search()</code> method is expecting a value of a different type, but that wouldn't generate this traceback. Python functions can take any value; if you pass the right number of arguments, the function will execute. It may <em>crash</em> if you pass it a value of a different type than it's expecting, but if that happened, the traceback would point to somewhere inside the function. But this traceback says it never got as far as calling the <code>search()</code> method. So the problem must be in that <code>+</code> operation, as it's trying to construct the value that it will eventually pass to the <code>search()</code> method.
|
||||
<p>We know from <a href=#cantuseastringpattern>previous debugging</a> that <var>aBuf</var> is a byte array. So what is <code>self._mLastChar</code>? It's an instance variable, defined in the <code>reset()</code> method, which is actually called from the <code>__init__()</code> method.
|
||||
<p class=s><a href=#skip-mlastchar-declaration>skip over this code listing</a>
|
||||
<pre><code>class UniversalDetector:
|
||||
def __init__(self):
|
||||
self._highBitDetector = re.compile(b'[\x80-\xFF]')
|
||||
@@ -769,9 +751,8 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
self._mGotData = False
|
||||
self._mInputState = ePureAscii
|
||||
<mark> self._mLastChar = ''</mark></code></pre>
|
||||
<p id=skip-mlastchar-declaration>And now we have our answer. Do you see it? <var>self._mLastChar</var> is a string, but <var>aBuf</var> is a byte array. And you can't concatenate a string to a byte array — not even a zero-length string.
|
||||
<p>And now we have our answer. Do you see it? <var>self._mLastChar</var> is a string, but <var>aBuf</var> is a byte array. And you can't concatenate a string to a byte array — not even a zero-length string.
|
||||
<p>So what is <var>self._mLastChar</var> anyway? The answer is in the <code>feed()</code> method, just a few lines down from where the traceback occurred.
|
||||
<p class=s><a href=#skip-mlastchar-set>skip over this code listing</a>
|
||||
<pre><code>if self._mInputState == ePureAscii:
|
||||
if self._highBitDetector.search(aBuf):
|
||||
self._mInputState = eHighbyte
|
||||
@@ -781,15 +762,13 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
|
||||
<mark>self._mLastChar = aBuf[-1]</mark></code></pre>
|
||||
<p>The calling function calls this <code>feed()</code> method over and over again with a few bytes at a time. The method processes the bytes it was given (passed in as <var>aBuf</var>), then stores the last byte in <var>self._mLastChar</var> in case it's needed during the next call. (In a multi-byte encoding, the <code>feed()</code> method might get called with half of a character, then called again with the other half.) But because <var>aBuf</var> is now a byte array instead of a string, <var>self._mLastChar</var> needs to be a byte array as well. Thus:
|
||||
<p class=s><a href=#skip-mlastchar-solution>skip over this code listing</a>
|
||||
<pre><code> def reset(self):
|
||||
.
|
||||
.
|
||||
.
|
||||
<del>- self._mLastChar = ''</del>
|
||||
<ins>+ self._mLastChar = b''</ins></code></pre>
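<p>A short sketch of why the empty string had to become an empty byte array. The sample bytes are arbitrary and the variable names are invented for the example.

```python
a_buf = b"\x1b$B"  # arbitrary sample bytes standing in for a chunk of input

# Concatenating a str with bytes fails, even when the str is empty.
try:
    "" + a_buf
    str_concat_ok = True
except TypeError:
    str_concat_ok = False
print(str_concat_ok)

# Concatenating bytes with bytes works, and b"" leaves the data unchanged.
combined = b"" + a_buf
print(combined == a_buf)
```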
|
||||
<p id=skip-mlastchar-solution>Searching the entire codebase for <code>"mLastChar"</code> turns up a similar problem in <code>mbcharsetprober.py</code>, but instead of tracking the last character, it tracks the last <em>two</em> characters. The <code>MultiByteCharSetProber</code> class uses a list of 1-character strings to track the last two characters; in Python 3, it needs to use a list of integers.
|
||||
<p class=s><a href=#skip-mbcharsetprober>skip over this code listing</a>
|
||||
<p>Searching the entire codebase for <code>"mLastChar"</code> turns up a similar problem in <code>mbcharsetprober.py</code>, but instead of tracking the last character, it tracks the last <em>two</em> characters. The <code>MultiByteCharSetProber</code> class uses a list of 1-character strings to track the last two characters; in Python 3, it needs to use a list of integers.
|
||||
<pre><code>
|
||||
class MultiByteCharSetProber(CharSetProber):
|
||||
def __init__(self):
|
||||
@@ -809,7 +788,6 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
<ins>+ self._mLastChar = [0, 0]</ins></code></pre>
|
||||
<h3 id=unsupportedoperandtypeforplus>Unsupported operand type(s) for +: <code>'int'</code> and <code>'bytes'</code></h3>
|
||||
<p>I have good news, and I have bad news. The good news is we're making progress…
|
||||
<p class=s><a href=#skip-unsupported-operand-types>skip over this command listing</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml</samp>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
@@ -818,10 +796,9 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp></pre>
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 101, in feed
|
||||
self._escDetector.search(self._mLastChar + aBuf):
|
||||
TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp></pre>
|
||||
<p id=skip-unsupported-operand-types>…The bad news is it doesn't always feel like progress.
|
||||
<p>…The bad news is it doesn't always feel like progress.
|
||||
<p>But this is progress! Really! Even though the traceback calls out the same line of code, it's a different error than it used to be. Progress! So what's the problem now? The last time I checked, this line of code didn't try to concatenate an <code>int</code> with a byte array (<code>bytes</code>). In fact, you just spent a lot of time <a href=#cantconvertbytesobject>ensuring that <var>self._mLastChar</var> was a byte array</a>. How did it turn into an <code>int</code>?
|
||||
<p>The answer lies not in the previous lines of code, but in the following lines.
|
||||
<p class=s><a href=#skip-mlastchar-highlight>skip over this code listing</a>
|
||||
<pre><code>if self._mInputState == ePureAscii:
|
||||
if self._highBitDetector.search(aBuf):
|
||||
self._mInputState = eHighbyte
|
||||
@@ -830,8 +807,7 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp></pre>
|
||||
self._mInputState = eEscAscii
|
||||
|
||||
<mark>self._mLastChar = aBuf[-1]</mark></code></pre>
|
||||
<p id=skip-mlastchar-highlight>This error doesn't occur the first time the <code>feed()</code> method gets called; it occurs the <em>second time</em>, after <var>self._mLastChar</var> has been set to the last byte of <var>aBuf</var>. Well, what's the problem with that? Getting a single element from a byte array yields an integer, not a byte array. To see the difference, follow me to the interactive shell:
|
||||
<p class=s><a href=#skip-mlastchar-interactive>skip over this interpreter listing</a>
|
||||
<p>This error doesn't occur the first time the <code>feed()</code> method gets called; it occurs the <em>second time</em>, after <var>self._mLastChar</var> has been set to the last byte of <var>aBuf</var>. Well, what's the problem with that? Getting a single element from a byte array yields an integer, not a byte array. To see the difference, follow me to the interactive shell:
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>aBuf = b'\xEF\xBB\xBF'</kbd> <span>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>len(aBuf)</kbd>
|
||||
@@ -850,7 +826,7 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp>
|
||||
<samp>b'\xbf'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>mLastChar + aBuf</kbd> <span>⑥</span></a>
|
||||
<samp>b'\xbf\xef\xbb\xbf'</samp></pre>
|
||||
<ol id=skip-mlastchar-interactive>
|
||||
<ol>
|
||||
<li>Define a byte array of length 3.
|
||||
<li>The last element of the byte array is 191.
|
||||
<li>That's an integer.
|
||||
@@ -866,7 +842,6 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp>
|
||||
<ins>+ self._mLastChar = aBuf[-1:]</ins></code></pre>
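<p>The same comparison as a standalone script, with one extra observation about empty input. The variable names are illustrative only.

```python
a_buf = b"\xEF\xBB\xBF"

last_int = a_buf[-1]     # indexing a bytes object yields an int
last_bytes = a_buf[-1:]  # slicing yields a bytes object of length 1

# int + bytes is the error from the traceback.
try:
    last_int + a_buf
    int_concat_ok = True
except TypeError:
    int_concat_ok = False

combined = last_bytes + a_buf

# Slicing is also safe on empty input, where indexing would raise IndexError.
empty_slice = b""[-1:]

print(last_int, last_bytes, int_concat_ok, combined, empty_slice)
```

<p>The slice has a second, minor advantage: if <code>feed()</code> were ever called with an empty byte array, <code>aBuf[-1:]</code> would quietly produce <code>b''</code>, whereas <code>aBuf[-1]</code> would raise an <code>IndexError</code>.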
|
||||
<h3 id=ordexpectedstring><code>ord()</code> expected string of length 1, but <code>int</code> found</h3>
|
||||
<p>Tired yet? You're almost there…
|
||||
<p class=s><a href=#skip-ord-expected-string>skip over this command output listing</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml</samp>
|
||||
@@ -882,29 +857,25 @@ tests\Big5\0804.blogspot.com.xml</samp>
|
||||
File "C:\home\chardet\chardet\codingstatemachine.py", line 43, in next_state
|
||||
byteCls = self._mModel['classTable'][ord(c)]
|
||||
TypeError: ord() expected string of length 1, but int found</samp></pre>
|
||||
<p id=skip-ord-expected-string>OK, so <var>c</var> is an <code>int</code>, but the <code>ord()</code> function was expecting a 1-character string. Fair enough. Where is <var>c</var> defined?
|
||||
<p class=s><a href=#skip-next-state>skip over this code listing</a>
|
||||
<p>OK, so <var>c</var> is an <code>int</code>, but the <code>ord()</code> function was expecting a 1-character string. Fair enough. Where is <var>c</var> defined?
|
||||
<pre><code># codingstatemachine.py
|
||||
def next_state(self, c):
|
||||
# for each byte we get its class
|
||||
# if it is first byte, we also get byte length
|
||||
byteCls = self._mModel['classTable'][ord(c)]</code></pre>
|
||||
<p id=skip-next-state>That's no help; it's just passed into the function. Let's pop the stack.
|
||||
<p class=s><a href=#skip-utf8prober-feed>skip over this code listing</a>
|
||||
<p>That's no help; it's just passed into the function. Let's pop the stack.
|
||||
<pre><code># utf8prober.py
|
||||
def feed(self, aBuf):
|
||||
for c in aBuf:
|
||||
codingState = self._mCodingSM.next_state(c)</code></pre>
|
||||
<p id=skip-utf8prober-feed>And now we have the answer. Do you see it? In Python 2, <var>aBuf</var> was a string, so <var>c</var> was a 1-character string. (That's what you get when you iterate over a string — all the characters, one by one.) But now, <var>aBuf</var> is a byte array, so <var>c</var> is an <code>int</code>, not a 1-character string. In other words, there's no need to call the <code>ord()</code> function because <var>c</var> is already an <code>int</code>!
|
||||
<p>And now we have the answer. Do you see it? In Python 2, <var>aBuf</var> was a string, so <var>c</var> was a 1-character string. (That's what you get when you iterate over a string — all the characters, one by one.) But now, <var>aBuf</var> is a byte array, so <var>c</var> is an <code>int</code>, not a 1-character string. In other words, there's no need to call the <code>ord()</code> function because <var>c</var> is already an <code>int</code>!
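<p>The difference can be sketched in a few lines, assuming a hypothetical sample buffer rather than real <code>chardet</code> input: iterating over <code>bytes</code> yields integers, iterating over a string yields 1-character strings, and <code>ord()</code> only accepts the latter.

```python
a_buf = b'abc'  # hypothetical sample buffer

# Iterating over bytes yields ints, so each item can index a table directly.
byte_values = [c for c in a_buf]

# Iterating over a str yields 1-character strings, which is where ord() is needed.
char_values = [ord(ch) for ch in 'abc']

# Calling ord() on an item that is already an int fails.
try:
    ord(a_buf[0])
    ord_accepts_int = True
except TypeError:
    ord_accepts_int = False
```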
|
||||
<p>Thus:
|
||||
<p class=s><a href=#skip-ordc-diff>skip over this code listing</a>
|
||||
<pre><code> def next_state(self, c):
|
||||
# for each byte we get its class
|
||||
# if it is first byte, we also get byte length
|
||||
<del>- byteCls = self._mModel['classTable'][ord(c)]</del>
|
||||
<ins>+ byteCls = self._mModel['classTable'][c]</ins></code></pre>
|
||||
<p>Searching the entire codebase for instances of <code>"ord(c)"</code> uncovers similar problems in <code>sbcharsetprober.py</code>…
|
||||
<p class=s><a href=#skip-sbcharsetprober-code>skip over this code listing</a>
|
||||
<pre><code># sbcharsetprober.py
|
||||
def feed(self, aBuf):
|
||||
if not self._mModel['keepEnglishLetter']:
|
||||
@@ -914,15 +885,13 @@ def feed(self, aBuf):
|
||||
return self.get_state()
|
||||
for c in aBuf:
|
||||
<mark> order = self._mModel['charToOrderMap'][ord(c)]</mark></code></pre>
|
||||
<p id=skip-sbcharsetprober-code>…and <code>latin1prober.py</code>…
|
||||
<p class=s><a href=#skip-latin1prober-code-2>skip over this code listing</a>
|
||||
<p>…and <code>latin1prober.py</code>…
|
||||
<pre><code># latin1prober.py
|
||||
def feed(self, aBuf):
|
||||
aBuf = self.filter_with_english_letters(aBuf)
|
||||
for c in aBuf:
|
||||
<mark> charClass = Latin1_CharToClass[ord(c)]</mark></code></pre>
|
||||
<p id=skip-sbcharsetprober-code-2><var>c</var> is iterating over <var>aBuf</var>, which means it is an integer, not a 1-character string. The solution is the same: change <code>ord(c)</code> to just plain <code>c</code>.
|
||||
<p class=s><a href=#unorderabletypes>skip over this code listing</a>
|
||||
<p><var>c</var> is iterating over <var>aBuf</var>, which means it is an integer, not a 1-character string. The solution is the same: change <code>ord(c)</code> to just plain <code>c</code>.
|
||||
<pre><code> # sbcharsetprober.py
|
||||
def feed(self, aBuf):
|
||||
if not self._mModel['keepEnglishLetter']:
|
||||
@@ -943,7 +912,6 @@ def feed(self, aBuf):
|
||||
</code></pre>
|
||||
<h3 id=unorderabletypes>Unorderable types: <code>int()</code> >= <code>str()</code></h3>
|
||||
<p>Let's go again.
|
||||
<p class=s><a href=#skip-unorderable-types-screen>skip over this command output listing</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml</samp>
|
||||
@@ -961,9 +929,8 @@ tests\Big5\0804.blogspot.com.xml</samp>
|
||||
File "C:\home\chardet\chardet\jpcntx.py", line 176, in get_order
|
||||
if ((aStr[0] >= '\x81') and (aStr[0] <= '\x9F')) or \
|
||||
TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
<p id=skip-unorderable-types-screen>Did you notice? This time around, the code passed the first test case (<code>tests\ascii\howto.diveintomark.org.xml</code>). You're making real progress here.
|
||||
<p>Did you notice? This time around, the code passed the first test case (<code>tests\ascii\howto.diveintomark.org.xml</code>). You're making real progress here.
|
||||
<p>So what's this all about? “Unorderable types”? Once again, the difference between byte arrays and strings is rearing its ugly head. Take a look at the code:
|
||||
<p class=s><a href=#skip-unorderable-types-1>skip over this code listing</a>
|
||||
<pre><code>class SJISContextAnalysis(JapaneseContextAnalysis):
|
||||
def get_order(self, aStr):
|
||||
if not aStr: return -1, 1
|
||||
@@ -973,8 +940,7 @@ TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
charLen = 2
|
||||
else:
|
||||
charLen = 1</code></pre>
|
||||
<p id=skip-unorderable-types-1>And where does <var>aStr</var> come from? Let's pop the stack:
|
||||
<p class=s><a href=#skip-unorderable-types-2>skip over this code listing</a>
|
||||
<p>And where does <var>aStr</var> come from? Let's pop the stack:
|
||||
<pre><code>def feed(self, aBuf, aLen):
|
||||
.
|
||||
.
|
||||
@@ -982,10 +948,9 @@ TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
i = self._mNeedToSkipCharNum
|
||||
while i < aLen:
|
||||
<mark> order, charLen = self.get_order(aBuf[i:i+2])</mark></code></pre>
|
||||
<p id=skip-unorderable-types-2>Oh look, it's our old friend, <var>aBuf</var>. As you might have guessed from every other issue we've encountered in this chapter, <var>aBuf</var> is a byte array. Here, the <code>feed()</code> method isn't just passing it on wholesale; it's slicing it. But as you saw <a href=#unsupportedoperandtypeforplus>earlier in this chapter</a>, slicing a byte array returns a byte array, so the <var>aStr</var> parameter that gets passed to the <code>get_order()</code> method is still a byte array.
|
||||
<p>Oh look, it's our old friend, <var>aBuf</var>. As you might have guessed from every other issue we've encountered in this chapter, <var>aBuf</var> is a byte array. Here, the <code>feed()</code> method isn't just passing it on wholesale; it's slicing it. But as you saw <a href=#unsupportedoperandtypeforplus>earlier in this chapter</a>, slicing a byte array returns a byte array, so the <var>aStr</var> parameter that gets passed to the <code>get_order()</code> method is still a byte array.
|
||||
<p>And what is this code trying to do with <var>aStr</var>? It's taking the first element of the byte array and comparing it to a string of length 1. In Python 2, that worked, because <var>aStr</var> and <var>aBuf</var> were strings, and <var>aStr[0]</var> would be a string, and you can compare strings for inequality. But in Python 3, <var>aStr</var> and <var>aBuf</var> are byte arrays, <var>aStr[0]</var> is an integer, and you can't compare integers and strings for inequality without explicitly coercing one of them.
|
||||
<p>In this case, there's no need to make the code more complicated by adding an explicit coercion. <var>aStr[0]</var> yields an integer; the things you're comparing to are all constants. Let's change them from 1-character strings to integers.
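<p>As a rough sketch of that idea (the helper name <code>is_two_byte_lead()</code> is hypothetical, and only the <code>0x81</code> to <code>0x9F</code> range from the traceback is covered), comparing against integer constants might look like this:

```python
def is_two_byte_lead(a_str):
    """Hypothetical helper: True if the first byte falls in 0x81..0x9F."""
    if not a_str:
        return False
    # a_str[0] is an int in Python 3, so compare it with int constants.
    return 0x81 <= a_str[0] <= 0x9F

lead = is_two_byte_lead(b'\x82\xa0')
single = is_two_byte_lead(b'A')
empty = is_two_byte_lead(b'')
```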
|
||||
<p class=s><a href=#skip-unorderable-types-3>skip over this code listing</a>
|
||||
<pre><code> class SJISContextAnalysis(JapaneseContextAnalysis):
|
||||
def get_order(self, aStr):
|
||||
if not aStr: return -1, 1
|
||||
@@ -1039,7 +1004,6 @@ TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
|
||||
return -1, charLen</code></pre>
|
||||
<p>Searching the entire codebase for occurrences of the <code>ord()</code> function uncovers the same problem in <code>chardistribution.py</code>:
|
||||
<p class=s><a href=#skip-unorderable-types-4>skip over this command output listing</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml</samp>
|
||||
@@ -1057,8 +1021,7 @@ tests\Big5\0804.blogspot.com.xml</samp>
|
||||
File "C:\home\chardet\chardet\chardistribution.py", line 174, in get_order
|
||||
if (aStr[0] >= '\x81') and (aStr[0] <= '\x9F'):
|
||||
TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
<p id=skip-unorderable-types-4>The fix is the same:
|
||||
<p class=s><a href=#reduceisnotdefined>skip over this code listing</a>
|
||||
<p>The fix is the same:
|
||||
<pre><code> class EUCTWDistributionAnalysis(CharDistributionAnalysis):
|
||||
def __init__(self):
|
||||
CharDistributionAnalysis.__init__(self)
|
||||
@@ -1165,7 +1128,6 @@ TypeError: unorderable types: int() >= str()</samp></pre>
|
||||
return -1</code></pre>
|
||||
<h3 id=reduceisnotdefined>Global name <code>'reduce'</code> is not defined</h3>
|
||||
<p>Once more into the breach…
|
||||
<p class=s><a href=#skip-reduceisnotdefined-output>skip over this command output listing</a>
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml</samp>
|
||||
@@ -1177,16 +1139,14 @@ tests\Big5\0804.blogspot.com.xml</samp>
|
||||
File "C:\home\chardet\chardet\latin1prober.py", line 126, in get_confidence
|
||||
total = reduce(operator.add, self._mFreqCounter)
|
||||
NameError: global name 'reduce' is not defined</samp></pre>
|
||||
<p id=skip-reduceisnotdefined-output>According to the official <a href=http://docs.python.org/dev/3.0/whatsnew/3.0.html#builtins>What's New In Python 3.0</a> guide, the <code>reduce()</code> function has been moved out of the global namespace and into the <code>functools</code> module. Quoting the guide: "Use <code>functools.reduce()</code> if you really need it; however, 99 percent of the time an explicit <code>for</code> loop is more readable."
|
||||
<p>According to the official <a href=http://docs.python.org/dev/3.0/whatsnew/3.0.html#builtins>What's New In Python 3.0</a> guide, the <code>reduce()</code> function has been moved out of the global namespace and into the <code>functools</code> module. Quoting the guide: "Use <code>functools.reduce()</code> if you really need it; however, 99 percent of the time an explicit <code>for</code> loop is more readable."
|
||||
<p>OK then, let's refactor it to use a <code>for</code> loop.
|
||||
<p class=s><a href=#skip-reduce-code>skip over this code listing</a>
|
||||
<pre><code>def get_confidence(self):
|
||||
if self.get_state() == constants.eNotMe:
|
||||
return 0.01
|
||||
|
||||
<mark> total = reduce(operator.add, self._mFreqCounter)</mark></code></pre>
|
||||
<p>The <code>reduce()</code> function takes two arguments — a function and a list (strictly speaking, any iterable object will do) — and applies the function cumulatively to each item of the list. In other words, this is a fancy and roundabout way of adding up all the items in a list and returning the result. It looks much more readable as a <code>for</code> loop.
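<p>To make the equivalence concrete, here is a small sketch with a hypothetical list of counts standing in for <var>self._mFreqCounter</var>. The built-in <code>sum()</code> is shown as one more option for plain addition; it is not what the port uses.

```python
import functools
import operator

freq_counter = [3, 0, 5, 2]  # hypothetical counts, not real chardet data

# Python 3 spelling of the original call.
total_reduce = functools.reduce(operator.add, freq_counter)

# The explicit loop.
total_loop = 0
for frequency in freq_counter:
    total_loop += frequency

# Built-in alternative when the operation is simple addition.
total_sum = sum(freq_counter)
```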
|
||||
<p class=s><a href=#skip-reduce-refactoring>skip over this code listing</a>
|
||||
<pre><code> def get_confidence(self):
|
||||
if self.get_state() == constants.eNotMe:
|
||||
return 0.01
|
||||
@@ -1195,8 +1155,7 @@ NameError: global name 'reduce' is not defined</samp></pre>
|
||||
<ins>+ total = 0</ins>
|
||||
<ins>+ for frequency in self._mFreqCounter:</ins>
|
||||
<ins>+ total += frequency</ins></code></pre>
|
||||
<p id=skip-reduce-refactoring>I CAN HAZ TESTZ?
|
||||
<p class=s><a href=#skip-final-output>skip over this command output listing</a>
|
||||
<p>I CAN HAZ TESTZ?
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml Big5 with confidence 0.99
|
||||
@@ -1231,7 +1190,7 @@ tests\EUC-JP\arclamp.jp.xml EUC-JP with confide
|
||||
.
|
||||
.
|
||||
316 tests</samp></pre>
|
||||
<p id=skip-final-output>Holy crap, it actually works! <em><a href=http://www.hampsterdance.com/>/me does a little dance</a></em>
|
||||
<p>Holy crap, it actually works! <em><a href=http://www.hampsterdance.com/>/me does a little dance</a></em>
|
||||
<h2 id=summary>Summary</h2>
|
||||
<p>What have we learned?
|
||||
<ol>
|
||||
|
||||
@@ -26,10 +26,6 @@ a:link,.w a{color:#26c}
|
||||
a:visited{color:#93c}
|
||||
.c a{color:inherit}
|
||||
|
||||
/* skip links */
|
||||
.s a,.s a:hover,.s a:visited{position:absolute;left:0px;top:-500px;width:1px;height:1px;overflow:hidden}
|
||||
.s a:active,.s a:focus{position:static;width:auto;height:auto}
|
||||
|
||||
/* code blocks */
|
||||
pre{white-space:pre-wrap;padding-left:2.154em;border-left:1px solid #ddd}
|
||||
.w{float:left}
|
||||
|
||||
@@ -9,7 +9,6 @@
|
||||
body{counter-reset:h1 2}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=root value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#native-datatypes>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Native datatypes</h1>
|
||||
|
||||
@@ -19,7 +19,6 @@ th,td,td pre{margin:0}
|
||||
td pre{padding:0;border:0}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=sa value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#porting-code-to-python-3-with-2to3>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Porting code to Python 3 with <code>2to3</code></h1>
|
||||
@@ -89,8 +88,7 @@ td pre{padding:0;border:0}
|
||||
<h2 id=print><code>print</code> statement</h2>
|
||||
<p>In Python 2, <code>print</code> was a statement. Whatever you wanted to print simply followed the <code>print</code> keyword. In Python 3, <code>print()</code> is a function — whatever you want to print is passed to <code>print()</code> like any other function.
|
||||
<p id=noscript>[The code examples will be easier to follow if you enable Javascript, but whatever.]
|
||||
<p class=s><a href=#skipcompareprint>skip over this table</a>
|
||||
<table id=compareprint>
|
||||
<table>
|
||||
<tr>
|
||||
<th class=notes>Notes</th>
|
||||
<th class=python2>Python 2</th>
|
||||
@@ -112,7 +110,7 @@ td pre{padding:0;border:0}
|
||||
<td><code>print >>sys.stderr, 1, 2, 3</code></td>
|
||||
<td><code>print(1, 2, 3, file=sys.stderr)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareprint>
|
||||
<ol>
|
||||
<li>To print a blank line, call <code>print()</code> without any arguments.
|
||||
<li>To print a single value, call <code>print()</code> with one argument.
|
||||
<li>To print two values separated by a space, call <code>print()</code> with two arguments.
|
||||
@@ -121,8 +119,7 @@ td pre{padding:0;border:0}
|
||||
</ol>
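<p>A short sketch of the function form, assuming an in-memory <code>io.StringIO</code> buffer as a stand-in for <code>sys.stderr</code> so the result can be inspected. The <code>sep</code> argument does not appear in the table above; it is included only to show that <code>print()</code> takes keyword arguments like any other function.

```python
import io

buffer = io.StringIO()  # stands in for sys.stderr

print(file=buffer)                     # blank line
print(1, file=buffer)                  # single value
print(1, 2, file=buffer)               # two values separated by a space
print(1, 2, 3, sep='-', file=buffer)   # custom separator

output = buffer.getvalue()
```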
|
||||
<h2 id=unicodeliteral>Unicode string literals</h2>
|
||||
<p>Python 2 had two string types: Unicode strings and non-Unicode strings. Python 3 has one string type: Unicode strings.
|
||||
<p class=s><a href=#skipcompareunicodeliteral>skip over this table</a>
|
||||
<table id=compareunicodeliteral>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -134,14 +131,13 @@ td pre{padding:0;border:0}
|
||||
<td><code>ur"PapayaWhip\foo"</code></td>
|
||||
<td><code>r"PapayaWhip\foo"</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareunicodeliteral>
|
||||
<ol>
|
||||
<li>Unicode string literals are simply converted into string literals, which, in Python 3, are always Unicode.
|
||||
<li>Unicode raw strings (in which Python does not auto-escape backslashes) are converted to raw strings. In Python 3, raw strings are always Unicode.
|
||||
</ol>
|
||||
<h2 id=unicode><code>unicode()</code> global function</h2>
|
||||
<p>Python 2 had two global functions to coerce objects into strings: <code>unicode()</code> to coerce them into Unicode strings, and <code>str()</code> to coerce them into non-Unicode strings. Python 3 has only one string type, Unicode strings, so the <code>str()</code> function is all you need. (The <code>unicode()</code> function no longer exists.)
|
||||
<p class=s><a href=#skipcompareunicode>skip over this table</a>
|
||||
<table id=compareunicode>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -150,12 +146,10 @@ td pre{padding:0;border:0}
|
||||
<td><code>unicode(anything)</code></td>
|
||||
<td><code>str(anything)</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcompareunicode>
|
||||
<h2 id=long><code>long</code> data type</h2>
|
||||
<p>Python 2 had separate <code>int</code> and <code>long</code> types for non-floating-point numbers. An <code>int</code> could not be any larger than <a href=#renames><code>sys.maxint</code></a>, which varied by platform. Longs were defined by appending an <code>L</code> to the end of the number, and they could be, well, longer than ints. In Python 3, there is only one integer type, called <code>int</code>, which mostly behaves like the <code>long</code> type in Python 2. Since there are no longer two types, there is no need for special syntax to distinguish them.
|
||||
<p>Further reading: <a href=http://www.python.org/dev/peps/pep-0237/><abbr>PEP</abbr> 237: Unifying Long Integers and Integers</a>.
|
||||
<p class=s><a href=#skipcomparelong>skip over this table</a>
|
||||
<table id=comparelong>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -176,7 +170,7 @@ td pre{padding:0;border:0}
|
||||
<td><code>isinstance(x, long)</code></td>
|
||||
<td><code>isinstance(x, int)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparelong>
|
||||
<ol>
|
||||
<li>Base 10 long integer literals become base 10 integer literals.
|
||||
<li>Base 16 long integer literals become base 16 integer literals.
|
||||
<li>In Python 3, the old <code>long()</code> function no longer exists, since longs don't exist. To coerce a variable to an integer, use the <code>int()</code> function.
|
||||
@@ -185,8 +179,7 @@ td pre{padding:0;border:0}
|
||||
</ol>
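<p>A brief illustration, using arbitrary example values: in Python 3 a plain <code>int</code> holds numbers of any size, and <code>int()</code> does the work that <code>long()</code> used to do.

```python
big = 2 ** 64                       # no trailing L needed
hex_value = 0xFFFFFFFFFFFFFFFF      # hexadecimal literal, also a plain int
converted = int("18446744073709551616")

same_type = isinstance(big, int) and isinstance(hex_value, int)
```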
|
||||
<h2 id=ne><> comparison</h2>
|
||||
<p>Python 2 supported <code><></code> as a synonym for <code>!=</code>, the not-equals comparison operator. Python 3 supports the <code>!=</code> operator, but not <code><></code>.
|
||||
<p class=s><a href=#skipcomparene>skip over this table</a>
|
||||
<table id=comparene>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -198,14 +191,13 @@ td pre{padding:0;border:0}
|
||||
<td><code>if x <> y <> z:</code></td>
|
||||
<td><code>if x != y != z:</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparene>
|
||||
<ol>
|
||||
<li>A simple comparison.
|
||||
<li>A more complex comparison between three values.
|
||||
</ol>
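<p>One detail worth sketching, with made-up values: a chained <code>!=</code> comparison only compares neighbours, so <code>x != y != z</code> does not check whether <var>x</var> and <var>z</var> differ.

```python
x, y, z = 1, 2, 1

chained = x != y != z                 # same as (x != y) and (y != z)
expanded = (x != y) and (y != z)
all_distinct = len({x, y, z}) == 3    # a stricter test, shown for contrast
```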
|
||||
<h2 id=has_key><code>has_key()</code> dictionary method</h2>
|
||||
<p>In Python 2, dictionaries had a <code>has_key()</code> method to test whether the dictionary had a certain key. In Python 3, this method no longer exists. Instead, you need to use the <code>in</code> operator.
|
||||
<p class=s><a href=#skipcomparehas_key>skip over this table</a>
|
||||
<table id=comparehas_key>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -226,7 +218,7 @@ td pre{padding:0;border:0}
|
||||
<td><code>x + a_dictionary.has_key(y)</code></td>
|
||||
<td><code>x + (y in a_dictionary)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparehas_key>
|
||||
<ol>
|
||||
<li>The simplest form.
|
||||
<li>The <code>in</code> operator takes precedence over the <code>or</code> operator, so there is no need for parentheses here.
|
||||
<li>On the other hand, you <em>do</em> need parentheses here, for the same reason: <code>in</code> takes precedence over <code>or</code>.
|
||||
@@ -235,8 +227,7 @@ td pre{padding:0;border:0}
|
||||
</ol>
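<p>The effect of operator precedence can be checked with a small example. The dictionary and key names are hypothetical; the point is that <code>in</code> binds more tightly than <code>or</code>.

```python
a_dictionary = {'a': 1, 'b': 2}
x, y = 'missing', 'b'

# Parsed as (x in a_dictionary) or (y in a_dictionary).
either_present = x in a_dictionary or y in a_dictionary

# Parentheses force "x or y" to be evaluated first, giving 'missing'.
first_truthy_present = (x or y) in a_dictionary
```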
|
||||
<h2 id=dict>Dictionary methods that return lists</h2>
|
||||
<p>In Python 2, many dictionary methods returned lists. The most frequently used methods were <code>keys()</code>, <code>items()</code>, and <code>values()</code>. In Python 3, all of these methods return dynamic views. In some contexts, this is not a problem. If the method's return value is immediately passed to another function that iterates through the entire sequence, it makes no difference whether the actual type is a list or a view. In other contexts, it matters a great deal. If you were expecting a complete list with individually addressable elements, your code will choke, because views do not support indexing.
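<p>A small sketch of the difference, using a throwaway dictionary: a view cannot be indexed and tracks later changes to the dictionary, while <code>list()</code> takes a snapshot.

```python
a_dictionary = {'a': 1, 'b': 2}
keys_view = a_dictionary.keys()

# Views do not support indexing.
try:
    keys_view[0]
    indexable = True
except TypeError:
    indexable = False

keys_list = list(keys_view)     # static copy of the current keys
a_dictionary['c'] = 3

view_length = len(keys_view)    # the view sees the new key
list_length = len(keys_list)    # the list does not
```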
|
||||
<p class=s><a href=#skipcomparedict>skip over this table</a>
|
||||
<table id=comparedict>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -257,7 +248,7 @@ td pre{padding:0;border:0}
|
||||
<td><code>min(a_dictionary.keys())</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparedict>
|
||||
<ol>
|
||||
<li><code>2to3</code> errs on the side of safety, converting the return value from <code>keys()</code> to a static list with the <code>list()</code> function. This will always work, but it will be less efficient than using a view. You should examine the converted code to see if a list is absolutely necessary, or if a view would do.
|
||||
<li>Another view-to-list conversion, with the <code>items()</code> method. <code>2to3</code> will do the same thing with the <code>values()</code> method.
|
||||
<li>Python 3 does not support the <code>iterkeys()</code> method anymore. Use <code>keys()</code>, and if necessary, convert the view to an iterator with the <code>iter()</code> function.
|
||||
@@ -268,8 +259,7 @@ td pre{padding:0;border:0}
|
||||
<p>Several modules in the Python Standard Library have been renamed. Several other modules which are related to each other have been combined or reorganized to make their association more logical.
|
||||
<h3 id=http><code>http</code></h3>
|
||||
<p>In Python 3, several related <abbr>HTTP</abbr> modules have been combined into a single package, <code>http</code>.
|
||||
<p class=s><a href=#skipcompareimporthttp>skip over this table</a>
|
||||
<table id=compareimporthttp>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -289,7 +279,7 @@ import SimpleHTTPServer
|
||||
import CGIHTTPServer</code></pre></td>
|
||||
<td><code>import http.server</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareimporthttp>
|
||||
<ol>
|
||||
<li>The <code>http.client</code> module implements a low-level library that can request <abbr>HTTP</abbr> resources and interpret <abbr>HTTP</abbr> responses.
|
||||
<li>The <code>http.cookies</code> module provides a Pythonic interface to browser cookies that are sent in a <code>Set-Cookie:</code> <abbr>HTTP</abbr> header.
|
||||
<li>The <code>http.cookiejar</code> module manipulates the actual files on disk that popular web browsers use to store cookies.
|
||||
@@ -297,8 +287,7 @@ import CGIHttpServer</code></pre></td>
|
||||
</ol>
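<p>As a quick orientation, the sketch below imports the four submodules and touches one class in each. The cookie string is made up, and nothing here opens a network connection.

```python
import http.client
import http.cookiejar
import http.cookies
import http.server

# Parse a cookie string of the kind sent in an HTTP header.
cookie = http.cookies.SimpleCookie()
cookie.load("flavor=papayawhip")
flavor = cookie["flavor"].value

# An empty in-memory cookie jar.
jar = http.cookiejar.CookieJar()
cookies_in_jar = len(jar)

# Classes that used to live in httplib and SimpleHTTPServer.
connection_class = http.client.HTTPConnection
handler_class = http.server.SimpleHTTPRequestHandler
```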
|
||||
<h3 id=urllib><code>urllib</code></h3>
|
||||
<p>Python 2 had a rat's nest of overlapping modules to parse, encode, and fetch URLs. In Python 3, these have all been refactored and combined in a single package, <code>urllib</code>.
|
||||
<p class=s><a href=#skipcompareimporturllib>skip over this table</a>
|
||||
<table id=compareimporturllib>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -326,7 +315,7 @@ from urllib2 import HTTPError</code></pre></td>
|
||||
<td><pre><code>from urllib.request import Request
|
||||
from urllib.error import HTTPError</code></pre></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareimporturllib>
|
||||
<ol>
|
||||
<li>The old <code>urllib</code> module in Python 2 had a variety of functions, including <code>urlopen()</code> for fetching data and <code>splittype()</code>, <code>splithost()</code>, and <code>splituser()</code> for splitting a <abbr>URL</abbr> into its constituent parts. These functions have been reorganized more logically within the new <code>urllib</code> package. <code>2to3</code> will also change all calls to these functions so they use the new naming scheme.
|
||||
<li>The old <code>urllib2</code> module in Python 2 has been folded into the <code>urllib</code> package in Python 3. All your <code>urllib2</code> favorites (the <code>build_opener()</code> method, <code>Request</code> objects, and <code>HTTPBasicAuthHandler</code> and friends) are still available.
|
||||
<li>The <code>urllib.parse</code> module in Python 3 contains all the parsing functions from the old <code>urlparse</code> module in Python 2.
|
||||
@@ -336,8 +325,7 @@ from urllib.error import HTTPError</code></pre></td></tr>
|
||||
</ol>
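<p>A minimal sketch of where things live in the Python 3 package, using a made-up <abbr>URL</abbr> and building a request object without sending it:

```python
from urllib.error import HTTPError
from urllib.parse import quote, urlparse
from urllib.request import Request

# Parsing functions that used to be in the urlparse module.
parts = urlparse("http://example.com/porting?chapter=2to3")
host = parts.netloc
query = parts.query

# Quoting helper that used to be in the old urllib module.
quoted = quote("papaya whip")

# Request objects that used to be in urllib2; nothing is fetched here.
request = Request("http://example.com/")
method = request.get_method()
```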
|
||||
<h3 id=dbm><code>dbm</code></h3>
|
||||
<p>All the various <abbr>DBM</abbr> clones are now in a single package, <code>dbm</code>. If you need a specific variant like <abbr>GNU</abbr> <abbr>DBM</abbr>, you can import the appropriate module within the <code>dbm</code> package.
|
||||
<p class=s><a href=#skipcompareimportdbm>skip over this table</a>
|
||||
<table id=compareimportdbm>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -359,11 +347,9 @@ from urllib.error import HTTPError</code></pre></td></tr>
|
||||
import whichdb</code></pre></td>
|
||||
<td><code>import dbm</code></td></tr>
|
||||
</table>
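<p>For illustration, here is a sketch that picks one specific variant from the package. It assumes <code>dbm.dumb</code>, the pure-Python implementation, and writes to a temporary directory under a hypothetical file name.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, 'example')   # hypothetical file name

    # Create the database and store one key.
    db = dbm.dumb.open(path, 'c')
    db[b'flavor'] = b'papayawhip'
    db.close()

    # Reopen it read-only and fetch the value back.
    db = dbm.dumb.open(path, 'r')
    stored = db[b'flavor']
    db.close()
```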
|
||||
<p id=skipcompareimportdbm>
|
||||
<h3 id=xmlrpc><code>xmlrpc</code></h3>
|
||||
<p><abbr>XML-RPC</abbr> is a lightweight method of performing <abbr>RPC</abbr> calls over <abbr>HTTP</abbr>. The <abbr>XML-RPC</abbr> client library and several <abbr>XML-RPC</abbr> server implementations are now combined in a single package, <code>xmlrpc</code>.
|
||||
<p class=s><a href=#skipcompareimportxmlrpc>skip over this table</a>
|
||||
<table id=compareimportxmlrpc>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -376,10 +362,8 @@ import whichdb</code></pre></td>
|
||||
import SimpleXMLRPCServer</code></pre></td>
|
||||
<td><code>import xmlrpc.server</code></td></tr>
|
||||
</table>
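<p>A small sketch of the combined package, with a hypothetical method name <code>add</code>: the client module can serialize a call to <abbr>XML</abbr> and parse it back without any server running.

```python
import xmlrpc.client
import xmlrpc.server

# Serialize a call to a hypothetical remote method, then parse it back.
request_xml = xmlrpc.client.dumps((1, 2), methodname='add')
params, method = xmlrpc.client.loads(request_xml)

# The server class that used to live in the SimpleXMLRPCServer module.
server_class = xmlrpc.server.SimpleXMLRPCServer
```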
|
||||
<p id=skipcompareimportxmlrpc>
|
||||
<h3 id=othermodules>Other modules</h3>
|
||||
<p class=s><a href=#skipcompareimports>skip over this table</a>
|
||||
<table id=compareimports>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -418,7 +402,7 @@ except ImportError:
|
||||
<td><code>import commands</code></td>
|
||||
<td><code>import subprocess</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareimports>
|
||||
<ol>
|
||||
<li>A common idiom in Python 2 was to try to import <code>cStringIO as StringIO</code>, and if that failed, to import <code>StringIO</code> instead. Do not do this in Python 3; the <code>io</code> module does it for you. It will find the fastest implementation available and use it automatically.
|
||||
<li>A similar idiom was used to import the fastest pickle implementation. Do not do this in Python 3; the <code>pickle</code> module does it for you.
|
||||
<li>The <code>builtins</code> module contains the global functions, classes, and constants used throughout the Python language. Redefining a function in the <code>builtins</code> module will redefine the global function everywhere. That is exactly as powerful and scary as it sounds.
|
||||
@@ -432,7 +416,6 @@ except ImportError:
|
||||
<h2 id=import>Relative imports within a package</h2>
|
||||
<p>A package is a group of related modules that function as a single entity. In Python 2, when modules within a package need to reference each other, you use <code>import foo</code> or <code>from foo import Bar</code>. The Python 2 interpreter first searches within the current package to find <code>foo.py</code>, and then moves on to the other directories in the Python search path (<code>sys.path</code>). Python 3 works a bit differently. Instead of searching the current package, it goes directly to the Python search path. If you want one module within a package to import another module in the same package, you need to explicitly provide the relative path between the two modules.
|
||||
<p>Suppose you had this package, with multiple files in the same directory:
|
||||
<p class=s><a href=#skippackageart>skip over this <abbr>ASCII</abbr> art</a>
|
||||
<pre>chardet/
|
||||
|
|
||||
+--__init__.py
|
||||
@@ -442,9 +425,8 @@ except ImportError:
|
||||
+--mbcharsetprober.py
|
||||
|
|
||||
+--universaldetector.py</pre>
|
||||
<p id=skippackageart>Now suppose that <code>universaldetector.py</code> needs to import the entire <code>constants.py</code> file and one class from <code>mbcharsetprober.py</code>. How do you do it?
|
||||
<p class=s><a href=#skipcompareimport>skip over this table</a>
|
||||
<table id=compareimport>
|
||||
<p>Now suppose that <code>universaldetector.py</code> needs to import the entire <code>constants.py</code> file and one class from <code>mbcharsetprober.py</code>. How do you do it?
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -456,14 +438,13 @@ except ImportError:
|
||||
<td><code>from mbcharsetprober import MultiByteCharSetProber</code></td>
|
||||
<td><code>from .mbcharsetprober import MultiByteCharSetProber</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareimport>
|
||||
<ol>
|
||||
<li>When you need to import an entire module from elsewhere in your package, use the new <code>from . import</code> syntax. The period is actually a relative path from this file (<code>universaldetector.py</code>) to the file you want to import (<code>constants.py</code>). In this case, they are in the same directory, thus the single period. You can also import from the parent directory (<code>from .. import anothermodule</code>) or a subdirectory.
|
||||
<li>To import a specific class or function from another module directly into your module's namespace, prefix the target module with a relative path, minus the trailing slash. In this case, <code>mbcharsetprober.py</code> is in the same directory as <code>universaldetector.py</code>, so the path is a single period. You can also import from the parent directory (<code>from ..anothermodule import AnotherClass</code>) or a subdirectory.
|
||||
</ol>
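<p>Both forms can be tried end to end with a throwaway package. Everything in this sketch is hypothetical (the package name <code>demo_pkg</code>, its module names, and the constant's value); it only mirrors the layout described above.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    package_dir = os.path.join(root, 'demo_pkg')   # hypothetical package
    os.mkdir(package_dir)
    files = {
        '__init__.py': '',
        'constants.py': 'eNotMe = 2\n',            # value chosen for illustration
        'prober.py': 'class Prober:\n    pass\n',
        # The module under test uses both relative import forms.
        'detector.py': 'from . import constants\nfrom .prober import Prober\n',
    }
    for name, source in files.items():
        with open(os.path.join(package_dir, name), 'w') as f:
            f.write(source)

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        detector = importlib.import_module('demo_pkg.detector')
        value = detector.constants.eNotMe
        prober_name = detector.Prober.__name__
    finally:
        sys.path.remove(root)
```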
|
||||
<h2 id=next><code>next()</code> iterator method</h2>
|
||||
<p>In Python 2, iterators had a <code>next()</code> method which returned the next item in the sequence. In Python 3, that method is the special method <code>__next__()</code>, and there is a global <code>next()</code> function that takes an iterator as an argument.
|
||||
<p class=s><a href=#skipcomparenext>skip over this table</a>
|
||||
<table id=comparenext>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -494,7 +475,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
for an_iterator in a_sequence_of_iterators:
|
||||
an_iterator.__next__()</code></pre></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparenext>
|
||||
<ol>
|
||||
<li>In the simplest case, instead of calling an iterator's <code>next()</code> method, you now pass the iterator itself to the global <code>next()</code> function.
|
||||
<li>If you have a function that returns an iterator, call the function and pass the result to the <code>next()</code> function. (The <code>2to3</code> script is smart enough to convert this properly.)
|
||||
<li>If you define your own class and mean to use it as an iterator, define the <code>__next__()</code> special method.
|
||||
@@ -503,8 +484,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
</ol>
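<p>Put together, a custom iterator in Python 3 might look like the following sketch. The <code>Countdown</code> class is hypothetical; the parts that matter are the <code>__next__()</code> special method and the global <code>next()</code> function.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 looks for __next__(), not next().
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

counter = Countdown(2)
first = next(counter)
second = next(counter)
finished = next(counter, 'done')   # default returned once the iterator is exhausted
```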
|
||||
<h2 id=filter><code>filter()</code> global function</h2>
|
||||
<p>In Python 2, the <code>filter()</code> function returned a list, the result of filtering a sequence through a function that returned <code>True</code> or <code>False</code> for each item in the sequence. In Python 3, the <code>filter()</code> function returns an iterator, not a list.
|
||||
<p class=s><a href=#skipcomparefilter>skip over this table</a>
|
||||
<table id=comparefilter>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -525,7 +505,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
<td><code>[i for i in filter(a_function, a_sequence)]</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparefilter>
|
||||
<ol>
|
||||
<li>In the most basic case, <code>2to3</code> will wrap a call to <code>filter()</code> with a call to <code>list()</code>, which simply iterates through its argument and returns a real list.
|
||||
<li>However, if the call to <code>filter()</code> is <em>already</em> wrapped in <code>list()</code>, <code>2to3</code> will do nothing, since the fact that <code>filter()</code> is returning an iterator is irrelevant.
|
||||
<li>For the special syntax of <code>filter(None, ...)</code>, <code>2to3</code> will transform the call into a semantically equivalent list comprehension.
|
||||
@@ -534,8 +514,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
</ol>
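<p>The difference between the iterator and the list is easiest to see by consuming the result twice. This is only a sketch; <code>is_even</code> and the sample data are made up for the example.

```python
def is_even(n):
    """Hypothetical predicate used only for this example."""
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

lazy = filter(is_even, numbers)   # an iterator in Python 3, not a list
evens = list(lazy)                # list() walks the iterator once
leftover = list(lazy)             # the iterator is now exhausted

# A list comprehension that keeps only the true values,
# the same result Python 2's filter(None, ...) produced.
truthy = [x for x in [0, 1, "", "a", None] if x]
```

<p>If the calling code indexes the result or loops over it more than once, the <code>list()</code> wrapper is what preserves the Python 2 behavior.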
|
||||
<h2 id=map><code>map()</code> global function</h2>
|
||||
<p>In much the same way as <a href=#filter><code>filter()</code></a>, the <code>map()</code> function now returns an iterator. (In Python 2, it returned a list.)
|
||||
<p class=s><a href=#skipcomparemap>skip over this table</a>
|
||||
<table id=comparemap>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -556,7 +535,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
<td><code>[i for i in map(a_function, a_sequence)]</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparemap>
|
||||
<ol>
|
||||
<li>As with <code>filter()</code>, in the most basic case, <code>2to3</code> will wrap a call to <code>map()</code> with a call to <code>list()</code>.
|
||||
<li>For the special syntax of <code>map(None, ...)</code>, the identity function, <code>2to3</code> will convert it to an equivalent call to <code>list()</code>.
|
||||
<li>If the first argument to <code>map()</code> is a lambda function, <code>2to3</code> will convert it to an equivalent list comprehension.
|
||||
@@ -565,8 +544,7 @@ for an_iterator in a_sequence_of_iterators:
|
||||
</ol>
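<p>A short sketch of the same idea for <code>map()</code>, using made-up sample data. The list comprehension is the form a lambda-based call can be rewritten into.

```python
numbers = [1, 2, 3]

lazy = map(str, numbers)             # an iterator in Python 3
as_list = list(map(str, numbers))    # wrap in list() to get the old behavior

# Equivalent of list(map(lambda x: x * x, numbers)) as a list comprehension.
squares = [x * x for x in numbers]
```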
|
||||
<h2 id=reduce><code>reduce()</code> global function (3.1+)</h2>
|
||||
<p>In Python 3, the <code>reduce()</code> function has been removed from the global namespace and placed in the <code>functools</code> module.
|
||||
<p class=s><a href=#skipcomparereduce>skip over this table</a>
|
||||
<table id=comparereduce>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -576,13 +554,12 @@ for an_iterator in a_sequence_of_iterators:
|
||||
<td><pre><code>from functools import reduce
|
||||
reduce(a, b, c)</code></pre></td></tr>
|
||||
</table>
|
||||
<blockquote id=skipcomparereduce class=note>
|
||||
<blockquote>
|
||||
<p><span>☞</span>The version of <code>2to3</code> that shipped with Python 3.0 would not fix the <code>reduce()</code> function automatically. The fix first appeared in the <code>2to3</code> script that shipped with Python 3.1.
|
||||
</blockquote>
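<p>For reference, a minimal sketch of <code>reduce()</code> after the import from <code>functools</code>. The input lists are arbitrary sample values.

```python
from functools import reduce
import operator

# Fold a list into a single value, starting from an initial value of 0.
total = reduce(operator.add, [1, 2, 3, 4], 0)

# Without an initial value, the first item seeds the accumulator.
product = reduce(lambda a, b: a * b, [1, 2, 3, 4])
```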
|
||||
<h2 id=apply><code>apply()</code> global function</h2>
|
||||
<p>Python 2 had a global function called <code>apply()</code>, which took a function <var>f</var> and a list <code>[a, b, c]</code> and returned <code>f(a, b, c)</code>. In Python 3, the <code>apply()</code> function no longer exists. Instead, there is a new function calling syntax that allows you to pass a list and have Python apply the list as the function's arguments.
|
||||
<p class=s><a href=#skipcompareapply>skip over this table</a>
|
||||
<table id=compareapply>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -600,7 +577,7 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
<td><code>apply(aModule.a_function, a_list_of_args)</code></td>
|
||||
<td><code>aModule.a_function(*a_list_of_args)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareapply>
|
||||
<ol>
|
||||
<li>In the simplest form, you can call a function with a list of arguments (an actual list like <code>[a, b, c]</code>) by prepending the list with an asterisk (<code>*</code>). This is exactly equivalent to the old <code>apply()</code> function in Python 2.
|
||||
<li>In Python 2, the <code>apply()</code> function could actually take three parameters: a function, a list of arguments, and a dictionary of named arguments. In Python 3, you can accomplish the same thing by prepending the list of arguments with an asterisk (<code>*</code>) and the dictionary of named arguments with two asterisks (<code>**</code>).
|
||||
<li>The <code>+</code> operator, used here for list concatenation, takes precedence over the <code>*</code> operator, so there is no need for extra parentheses around <code>a_list_of_args + z</code>.
|
||||
@@ -608,8 +585,7 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
</ol>
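<p>The three calling forms described above can be sketched as follows. The <code>describe()</code> function and its parameters are hypothetical and exist only to make the argument passing visible.

```python
def describe(a, b, c=0, sep="-"):
    """Hypothetical function that joins its arguments into one string."""
    return sep.join(str(value) for value in (a, b, c))


a_list_of_args = [1, 2]
a_dictionary_of_named_args = {"c": 3, "sep": "+"}

# Python 2: apply(describe, a_list_of_args)
positional_only = describe(*a_list_of_args)

# Python 2: apply(describe, a_list_of_args, a_dictionary_of_named_args)
with_named = describe(*a_list_of_args, **a_dictionary_of_named_args)

# The concatenation happens first, then the combined list is unpacked.
concatenated = describe(*a_list_of_args + [9])
```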
|
||||
<h2 id=intern><code>intern()</code> global function</h2>
|
||||
<p>In Python 2, you could call the <code>intern()</code> function on a string to intern it as a performance optimization. In Python 3, the <code>intern()</code> function has been moved to the <code>sys</code> module.
|
||||
<p class=s><a href=#skipcompareintern>skip over this table</a>
|
||||
<table id=compareintern>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -618,11 +594,9 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
<td><code>intern(aString)</code></td>
|
||||
<td><code>sys.intern(aString)</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcompareintern>
|
||||
<h2 id=exec><code>exec</code> statement</h2>
|
||||
<p>Just as <a href=#print>the <code>print</code> statement</a> became a function in Python 3, so too has the <code>exec</code> statement. The <code>exec()</code> function takes a string which contains arbitrary Python code and executes it as if it were just another statement or expression.
|
||||
<p class=s><a href=#skipcompareexec>skip over this table</a>
|
||||
<table id=compareexec>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -637,15 +611,14 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
<td><code>exec codeString in a_global_namespace, a_local_namespace</code></td>
|
||||
<td><code>exec(codeString, a_global_namespace, a_local_namespace)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareexec>
|
||||
<ol>
|
||||
<li>In the simplest form, the <code>2to3</code> script simply encloses the code-as-a-string in parentheses, since <code>exec()</code> is now a function instead of a statement.
|
||||
<li>The old <code>exec</code> statement could take a namespace, a private environment of globals in which the code-as-a-string would be executed. Python 3 can also do this; just pass the namespace as the second argument to the <code>exec()</code> function.
|
||||
<li>Even fancier, the old <code>exec</code> statement could also take a local namespace (like the variables defined within a function). In Python 3, the <code>exec()</code> function can do that too.
|
||||
</ol>
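<p>A small sketch of <code>exec()</code> with explicit namespaces. The code strings and dictionary names here are illustrative only.

```python
# One namespace: names assigned by the code land in the dictionary.
a_global_namespace = {}
exec("x = 6 * 7", a_global_namespace)
result = a_global_namespace["x"]

# Separate global and local namespaces: the code reads "base" from the
# globals, and the new name "y" is stored in the locals.
globals_dict = {"base": 10}
locals_dict = {}
exec("y = base + 1", globals_dict, locals_dict)
```

<p>Passing explicit dictionaries keeps the executed code from touching the caller's own variables.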
|
||||
<h2 id=execfile><code>execfile</code> statement (3.1+)</h2>
|
||||
<p>Like the old <a href=#exec><code>exec</code> statement</a>, the old <code>execfile()</code> function executed Python code. Where <code>exec</code> took a string, <code>execfile()</code> took a filename. In Python 3, <code>execfile()</code> has been eliminated. If you really need to take a file of Python code and execute it (but you're not willing to simply import it), you can accomplish the same thing by opening the file, reading its contents, calling the global <code>compile()</code> function to force the Python interpreter to compile the code, and then calling the new <code>exec()</code> function.
|
||||
<p class=s><a href=#skipcompareexecfile>skip over this table</a>
|
||||
<table id=compareexecfile>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -654,13 +627,12 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
<td><code>execfile("a_filename")</code></td>
|
||||
<td><code>exec(compile(open("a_filename").read(), "a_filename", "exec"))</code></td></tr>
|
||||
</table>
|
||||
<blockquote id=skipcompareexecfile class=note>
|
||||
<blockquote>
|
||||
<p><span>☞</span>The version of <code>2to3</code> that shipped with Python 3.0 would not fix the <code>execfile</code> statement automatically. The fix first appeared in the <code>2to3</code> script that shipped with Python 3.1.
|
||||
</blockquote>
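<p>The open, read, compile, and execute steps can be sketched as below. This is one possible arrangement, not the only one: the temporary file, the <code>namespace</code> dictionary, and the <code>greeting</code> variable are all assumptions made so the example can run on its own.

```python
import os
import tempfile

# Create a throwaway file of Python code so the example is self-contained.
source = "greeting = 'hello from a file'\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(source)
    a_filename = handle.name

try:
    namespace = {}
    with open(a_filename) as a_file:
        # Passing the filename to compile() makes tracebacks point at the file.
        code = compile(a_file.read(), a_filename, "exec")
    exec(code, namespace)
finally:
    os.remove(a_filename)
```

<p>Using a <code>with</code> block closes the file promptly, which the one-line form in the table leaves to the garbage collector.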
|
||||
<h2 id=repr><code>repr</code> literals (backticks)</h2>
|
||||
<p>In Python 2, there was a special syntax of wrapping any object in backticks (like <code>`x`</code>) to get a representation of the object. In Python 3, this capability still exists, but you can no longer use backticks to get it. Instead, use the global <code>repr()</code> function.
|
||||
<p class=s><a href=#skipcomparerepr>skip over this table</a>
|
||||
<table id=comparerepr>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -672,14 +644,13 @@ reduce(a, b, c)</code></pre></td></tr>
|
||||
<td><code>`"PapayaWhip" + `2``</code></td>
|
||||
<td><code>repr("PapayaWhip" + repr(2))</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparerepr>
|
||||
<ol>
|
||||
<li>Remember, <var>x</var> can be anything — a class, a function, a module, a primitive data type, etc. The <code>repr()</code> function works on everything.
|
||||
<li>In Python 2, backticks could be nested, leading to this sort of confusing (but valid) expression. The <code>2to3</code> tool is smart enough to convert this into nested calls to <code>repr()</code>.
|
||||
</ol>
|
||||
<h2 id=except><code>try...except</code> statement</h2>
|
||||
<p>The syntax for catching exceptions has changed slightly between Python 2 and Python 3.
|
||||
<p class=s><a href=#skipcompareexcept>skip over this table</a>
|
||||
<table id=compareexcept>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -715,7 +686,7 @@ except:
|
||||
pass</code></pre></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareexcept>
|
||||
<ol>
|
||||
<li>Instead of a comma after the exception type, Python 3 uses a new keyword, <code>as</code>.
|
||||
<li>The <code>as</code> keyword also works for catching multiple types of exceptions at once.
|
||||
<li>If you catch an exception but don't actually care about accessing the exception object itself, the syntax is identical between Python 2 and Python 3.
|
||||
@@ -726,8 +697,7 @@ except:
|
||||
</blockquote>
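<p>A brief sketch of the <code>as</code> syntax, catching two exception types at once. The <code>safe_divide()</code> function is hypothetical.

```python
def safe_divide(a, b):
    """Hypothetical helper: divide, or report which exception occurred."""
    try:
        return a / b
    except (ZeroDivisionError, TypeError) as e:
        # "as e" binds the exception object; Python 2 used a comma here.
        return "error: " + type(e).__name__
```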
|
||||
<h2 id=raise><code>raise</code> statement</h2>
|
||||
<p>The syntax for raising your own exceptions has changed slightly between Python 2 and Python 3.
|
||||
<p class=s><a href=#skipcompareraise>skip over this table</a>
|
||||
<table id=compareraise>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -745,7 +715,7 @@ except:
|
||||
<td><code>raise "error message"</code></td>
|
||||
<td><i>unsupported</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareraise>
|
||||
<ol>
|
||||
<li>In the simplest form, raising an exception without a custom error message, the syntax is unchanged.
|
||||
<li>The change becomes noticeable when you want to raise an exception with a custom error message. Python 2 separated the exception class and the message with a comma; Python 3 passes the error message as a parameter.
|
||||
<li>Python 2 supported a more complex syntax to raise an exception with a custom traceback (stack trace). You can do this in Python 3 as well, but the syntax is quite different.
|
||||
@@ -753,8 +723,7 @@ except:
|
||||
</ol>
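<p>As an illustration of the Python 3 forms, the sketch below raises an exception with a message and then raises a second one that reuses the first traceback through <code>with_traceback()</code>. <code>MyError</code> and <code>fail()</code> are made-up names.

```python
class MyError(Exception):
    """Hypothetical exception class for this example."""


def fail():
    # The message is passed as an argument to the exception class.
    raise MyError("error message")


try:
    fail()
except MyError as e:
    message = str(e)
    saved_traceback = e.__traceback__

try:
    # Python 2 spelled this: raise MyError, "re-raised", saved_traceback
    raise MyError("re-raised").with_traceback(saved_traceback)
except MyError as e2:
    second_message = str(e2)
    has_traceback = e2.__traceback__ is not None
```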
|
||||
<h2 id=throw><code>throw</code> method on generators</h2>
|
||||
<p>In Python 2, generators have a <code>throw()</code> method. Calling <code>a_generator.throw()</code> raises an exception at the point where the generator was paused, then returns the next value yielded by the generator function. In Python 3, this functionality is still available, but the syntax is slightly different.
|
||||
<p class=s><a href=#skipcomparethrow>skip over this table</a>
|
||||
<table id=comparethrow>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -769,15 +738,14 @@ except:
|
||||
<td><code>a_generator.throw("error message")</code></td>
|
||||
<td><i>unsupported</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparethrow>
|
||||
<ol>
|
||||
<li>In the simplest form, a generator throws an exception without a custom error message. In this case, the syntax has not changed between Python 2 and Python 3.
|
||||
<li>If the generator throws an exception <em>with</em> a custom error message, you need to pass the error string to the exception when you create it.
|
||||
<li>Python 2 also supported throwing an exception with <em>only</em> a custom error message. Python 3 does not support this, and the <code>2to3</code> script will display a warning telling you that you will need to fix this code manually.
|
||||
</ol>
|
||||
<h2 id=xrange><code>xrange()</code> global function</h2>
|
||||
<p>In Python 2, there were two ways to get a range of numbers: <code>range()</code>, which returned a list, and <code>xrange()</code>, which returned an iterator. In Python 3, <code>range()</code> returns an iterator, and <code>xrange()</code> doesn't exist.
|
||||
<p class=s><a href=#skipcomparexrange>skip over this table</a>
|
||||
<table id=comparexrange>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -798,7 +766,7 @@ except:
|
||||
<td><code>sum(range(10))</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparexrange>
|
||||
<ol>
|
||||
<li>In the simplest case, the <code>2to3</code> script will simply convert <code>xrange()</code> to <code>range()</code>.
|
||||
<li>If your Python 2 code used <code>range()</code>, the <code>2to3</code> script does not know whether you needed a list, or whether an iterator would do. It errs on the side of caution and coerces the return value into a list by calling the <code>list()</code> function.
|
||||
<li>If the <code>xrange()</code> function was inside a list comprehension, there is no need to coerce the result to a list, since the list comprehension will work just fine with an iterator.
|
||||
@@ -807,8 +775,7 @@ except:
|
||||
</ol>
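<p>A quick sketch of when the <code>list()</code> wrapper matters and when it does not, using arbitrary small ranges.

```python
lazy = range(10)              # not a list in Python 3
as_list = list(range(3))      # wrap in list() when a real list is required

# Contexts that simply iterate need no conversion.
total = sum(range(10))
doubled = [i * 2 for i in range(3)]
```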
|
||||
<h2 id=raw_input><code>raw_input()</code> and <code>input()</code> global functions</h2>
|
||||
<p>Python 2 had two global functions for asking the user for input on the command line. The first, called <code>input()</code>, expected the user to enter a Python expression (and returned the result). The second, called <code>raw_input()</code>, just returned whatever the user typed. This was wildly confusing for beginners and widely regarded as a “wart” in the language. Python 3 excises this wart by renaming <code>raw_input()</code> to <code>input()</code>, so it works the way everyone naively expects it to work.
|
||||
<p class=s><a href=#skipcompareraw_input>skip over this table</a>
|
||||
<table id=compareraw_input>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -823,15 +790,14 @@ except:
|
||||
<td><code>input()</code></td>
|
||||
<td><code>eval(input())</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareraw_input>
|
||||
<ol>
|
||||
<li>In the simplest form, <code>raw_input()</code> becomes <code>input()</code>.
|
||||
<li>In Python 2, the <code>raw_input()</code> function could take a prompt as a parameter. This has been retained in Python 3.
|
||||
<li>If you actually need to ask the user for a Python expression to evaluate, use the <code>input()</code> function and pass the result to <code>eval()</code>.
|
||||
</ol>
|
||||
<h2 id=funcattrs><code>func_*</code> function attributes</h2>
|
||||
<p>In Python 2, code within functions can access special attributes about the function itself. In Python 3, these special function attributes have been renamed for consistency with other attributes.
|
||||
<p class=s><a href=#skipcomparefuncattrs>skip over this table</a>
|
||||
<table id=comparefuncattrs>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -858,7 +824,7 @@ except:
|
||||
<td><code>a_function.func_code</code></td>
|
||||
<td><code>a_function.__code__</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparefuncattrs>
|
||||
<ol>
|
||||
<li>The <code>__name__</code> attribute (previously <code>func_name</code>) contains the function's name.
|
||||
<li>The <code>__doc__</code> attribute (previously <code>func_doc</code>) contains the <i>docstring</i> that you defined in the function's source code.
|
||||
<li>The <code>__defaults__</code> attribute (previously <code>func_defaults</code>) is a tuple containing default argument values for those arguments that have default values.
|
||||
@@ -869,8 +835,7 @@ except:
|
||||
</ol>
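<p>The renamed attributes can be inspected directly. The <code>greet()</code> function below is a hypothetical example; only a few of the attributes are shown.

```python
def greet(name, greeting="hello"):
    """Return a greeting."""
    return greeting + ", " + name


function_name = greet.__name__                 # was func_name
docstring = greet.__doc__                      # was func_doc
defaults = greet.__defaults__                  # was func_defaults
argument_count = greet.__code__.co_argcount    # __code__ was func_code
```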
|
||||
<h2 id=xreadlines><code>xreadlines()</code> I/O method</h2>
|
||||
<p>In Python 2, file objects had an <code>xreadlines()</code> method which returned an iterator that would read the file one line at a time. This was useful in <code>for</code> loops, among other places. In fact, it was so useful, later versions of Python 2 added the capability to file objects themselves.
|
||||
<p class=s><a href=#skipcomparexreadlines>skip over this table</a>
|
||||
<table id=comparexreadlines>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -882,15 +847,14 @@ except:
|
||||
<td><code>for line in a_file.xreadlines(5):</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparexreadlines>
|
||||
<ol>
|
||||
<li>If you used to call <code>xreadlines()</code> with no arguments, <code>2to3</code> will convert it to just the file object. In Python 3, this will accomplish the same thing: read the file one line at a time and execute the body of the <code>for</code> loop.
|
||||
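<li>If you used to call <code>xreadlines()</code> with an argument (the number of lines to read at a time), <code>2to3</code> will not change it. File objects in Python 3 have no <code>xreadlines()</code> method, so the call will raise an <code>AttributeError</code> until you rewrite it by hand.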
|
||||
</ol>
|
||||
<p class=c><span style="font-size:56px;line-height:0.88">☃</span>
|
||||
<h2 id=tuple_params><code>lambda</code> functions with multiple parameters</h2>
|
||||
<p>In Python 2, you could define anonymous <code>lambda</code> functions which took multiple parameters by defining the function as taking a tuple with a specific number of items. In effect, Python 2 would “unpack” the tuple into named arguments, which you could then reference (by name) within the <code>lambda</code> function. In Python 3, you can still pass a tuple to a <code>lambda</code> function, but the Python interpreter will not unpack the tuple into named arguments. Instead, you will need to reference each argument by its positional index.
|
||||
<p class=s><a href=#skipcomparetuple_params>skip over this table</a>
|
||||
<table id=comparetuple_params>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -905,15 +869,14 @@ except:
|
||||
<td><code>lambda (x, (y, z)): x + y + z</code></td>
|
||||
<td><code>lambda x_y_z: x_y_z[0] + x_y_z[1][0] + x_y_z[1][1]</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparetuple_params>
|
||||
<ol>
|
||||
<li>If you had defined a <code>lambda</code> function that took a tuple of one item, in Python 3 that would become a <code>lambda</code> with references to <var>x1[0]</var>. The name <var>x1</var> is autogenerated by the <code>2to3</code> script, based on the named arguments in the original tuple.
|
||||
<li>A <code>lambda</code> function with a two-item tuple <var>(x, y)</var> gets converted to <var>x_y</var> with positional arguments <var>x_y[0]</var> and <var>x_y[1]</var>.
|
||||
<li>The <code>2to3</code> script can even handle <code>lambda</code> functions with nested tuples of named arguments. The resulting Python 3 code is a bit unreadable, but it works the same as the old code did in Python 2.
|
||||
</ol>
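<p>The converted forms look like this in practice. The last function is not something <code>2to3</code> generates; it is a hand-written alternative, shown as one way to recover readable names by unpacking inside the body.

```python
# The shape of what 2to3 produces for: lambda (x, y): x + y
add_pair = lambda x_y: x_y[0] + x_y[1]

# The shape of what 2to3 produces for: lambda (x, (y, z)): x + y + z
add_nested = lambda x_y_z: x_y_z[0] + x_y_z[1][0] + x_y_z[1][1]


def add_nested_readable(triple):
    """Hand-written alternative: unpack the tuple in the function body."""
    x, (y, z) = triple
    return x + y + z
```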
|
||||
<h2 id=methodattrs>Special method attributes</h2>
|
||||
<p>In Python 2, class methods can reference the class object they are defined in, as well as the method object itself. <code>im_self</code> is the class instance object; <code>im_func</code> is the function object; <code>im_class</code> is the class of <code>im_self</code> (for bound methods) or the class that asked for the method (for unbound methods). In Python 3, these special method attributes have been renamed to follow the naming conventions of other attributes.
|
||||
<p class=s><a href=#skipcomparemethodattrs>skip over this table</a>
|
||||
<table id=comparemethodattrs>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -928,11 +891,9 @@ except:
|
||||
<td><code>aClassInstance.aClassMethod.im_class</code></td>
|
||||
<td><code>aClassInstance.aClassMethod.__self__.__class__</code></td></tr>
|
||||
</table>
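<p>A minimal sketch of the renamed attributes on a bound method. The <code>Greeter</code> class is hypothetical.

```python
class Greeter:
    """Hypothetical class with a single method."""

    def greet(self):
        return "hi"


instance = Greeter()
bound_method = instance.greet

the_instance = bound_method.__self__          # was im_self
the_function = bound_method.__func__          # was im_func
the_class = bound_method.__self__.__class__   # was im_class
```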
|
||||
<p id=skipcomparemethodattrs>
|
||||
<h2 id=nonzero><code>__nonzero__</code> special class attribute</h2>
|
||||
<p>In Python 2, you could build your own classes that could be used in a boolean context. For example, you could instantiate the class and then use the instance in an <code>if</code> statement. To do this, you defined a special <code>__nonzero__()</code> method which returned <code>True</code> or <code>False</code>, and it was called whenever the instance was used in a boolean context. In Python 3, you can still do this, but the name of the method has changed to <code>__bool__()</code>.
|
||||
<p class=s><a href=#skipcomparenonzero>skip over this table</a>
|
||||
<table id=comparenonzero>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -950,14 +911,13 @@ except:
|
||||
pass</code></pre></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparenonzero>
|
||||
<ol>
|
||||
<li>Instead of <code>__nonzero__()</code>, Python 3 calls the <code>__bool__()</code> method when evaluating an instance in a boolean context.
|
||||
<li>However, if you have a <code>__nonzero__()</code> method that takes arguments, the <code>2to3</code> tool will assume that you were using it for some other purpose, and it will not make any changes.
|
||||
</ol>
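<p>For example, a container class might define <code>__bool__()</code> so that an empty instance is false. The <code>Basket</code> class is invented for this sketch.

```python
class Basket:
    """Hypothetical container that is false when empty."""

    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):
        # Called by bool(), if, while, and the boolean operators.
        return len(self.items) > 0


empty_is_true = bool(Basket([]))
full_is_true = bool(Basket(["apple"]))
```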
|
||||
<h2 id=numliterals>Octal literals</h2>
|
||||
<p>The syntax for defining base 8 (octal) numbers has changed slightly between Python 2 and Python 3.
|
||||
<p class=s><a href=#skipcomparenumliterals>skip over this table</a>
|
||||
<table id=comparenumliterals>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -966,11 +926,9 @@ except:
|
||||
<td><code>x = 0755</code></td>
|
||||
<td><code>x = 0o755</code></td></tr>
|
||||
</table>
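<p>A short sketch showing that the new prefix denotes the same value, using the permission-style number from the table.

```python
x = 0o755                      # Python 3 octal literal; 0755 is a syntax error

decimal_value = x              # 7*64 + 5*8 + 5
round_trip = oct(x)            # oct() uses the same 0o prefix
parsed = int("755", 8)         # parsing an octal string explicitly
```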
|
||||
<p id=skipcomparenumliterals>
|
||||
<h2 id=renames><code>sys.maxint</code></h2>
|
||||
<p>Due to the <a href=#long>integration of the <code>long</code> and <code>int</code> types</a>, the <code>sys.maxint</code> constant is no longer accurate. Because the value may still be useful in determining platform-specific capabilities, it has been retained but renamed as <code>sys.maxsize</code>.
|
||||
<p class=s><a href=#skipcomparerenames>skip over this table</a>
|
||||
<table id=comparerenames>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -982,14 +940,13 @@ except:
|
||||
<td><code>a_function(sys.maxint)</code></td>
|
||||
<td><code>a_function(sys.maxsize)</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparerenames>
|
||||
<ol>
|
||||
<li><code>maxint</code> becomes <code>maxsize</code>.
|
||||
<li>Any usage of <code>sys.maxint</code> becomes <code>sys.maxsize</code>.
|
||||
</ol>
|
||||
<h2 id=callable><code>callable()</code> global function</h2>
|
||||
<p>In Python 2, you could check whether an object was callable (like a function) with the global <code>callable()</code> function. In Python 3.0 and 3.1, this global function was eliminated (it was restored in Python 3.2). To check whether an object is callable in those versions, check for the existence of the <code>__call__()</code> special method.
|
||||
<p class=s><a href=#skipcomparecallable>skip over this table</a>
|
||||
<table id=comparecallable>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -998,11 +955,9 @@ except:
|
||||
<td><code>callable(anything)</code></td>
|
||||
<td><code>hasattr(anything, "__call__")</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcomparecallable>
|
||||
<h2 id=zip><code>zip()</code> global function</h2>
|
||||
<p>In Python 2, the global <code>zip()</code> function took any number of sequences and returned a list of tuples. The first tuple contained the first item from each sequence; the second tuple contained the second item from each sequence; and so on. In Python 3, <code>zip()</code> returns an iterator instead of a list.
|
||||
<p class=s><a href=#skipcomparezip>skip over this table</a>
|
||||
<table id=comparezip>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1014,14 +969,13 @@ except:
|
||||
<td><code>d.join(zip(a, b, c))</code></td>
|
||||
<td><i>no change</i></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparezip>
|
||||
<ol>
|
||||
<li>In the simplest form, you can get the old behavior of the <code>zip()</code> function by wrapping the return value in a call to <code>list()</code>, which will run through the iterator that <code>zip()</code> returns and return a real list of the results.
|
||||
<li>In contexts that already iterate through all the items of a sequence (such as this call to the <code>join()</code> method), the iterator that <code>zip()</code> returns will work just fine. The <code>2to3</code> script is smart enough to detect these cases and make no change to your code.
|
||||
</ol>
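<p>A small sketch of both situations, with sample strings and numbers chosen only for illustration: one place where a real list is wanted, and one where the iterator is consumed directly.

```python
# Wrap in list() when the code needs an actual list of tuples.
pairs = list(zip("abc", [1, 2, 3]))

# A context that iterates over everything can use the iterator as-is.
joined = "".join(left + right for left, right in zip("abc", "xyz"))
```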
|
||||
<h2 id=standarderror><code>StandardError</code> exception</h2>
|
||||
<p>In Python 2, <code>StandardError</code> was the base class for all built-in exceptions other than <code>StopIteration</code>, <code>GeneratorExit</code>, <code>KeyboardInterrupt</code>, and <code>SystemExit</code>. In Python 3, <code>StandardError</code> has been eliminated; use <code>Exception</code> instead.
|
||||
<p class=s><a href=#skipcomparestandarderror>skip over this table</a>
|
||||
<table id=comparestandarderror>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1033,11 +987,9 @@ except:
|
||||
<td><code>x = StandardError(a, b, c)</code></td>
|
||||
<td><code>x = Exception(a, b, c)</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcomparestandarderror>
|
||||
<h2 id=types><code>types</code> module constants</h2>
|
||||
<p>The <code>types</code> module contains a variety of constants to help you determine the type of an object. In Python 2, it contained constants for all primitive types like <code>dict</code> and <code>int</code>. In Python 3, these constants have been eliminated; just use the primitive type name instead.
|
||||
<p class=s><a href=#skipcomparetypes>skip over this table</a>
|
||||
<table id=comparetypes>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1061,11 +1013,9 @@ except:
|
||||
<td><code>types.NoneType</code></td>
|
||||
<td><code>type(None)</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcomparetypes>
|
||||
<h2 id=isinstance><code>isinstance()</code> global function (3.1+)</h2>
|
||||
<p>The <code>isinstance()</code> function checks whether an object is an instance of a particular class or type. In Python 2, you could pass a tuple of types, and <code>isinstance()</code> would return <code>True</code> if the object was any of those types. In Python 3, you can still do this, but passing the same type twice is deprecated.
|
||||
<p class=s><a href=#skipcompareisinstance>skip over this table</a>
|
||||
<table id=compareisinstance>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1074,13 +1024,12 @@ except:
|
||||
<td><code>isinstance(x, (int, float, int))</code></td>
|
||||
<td><code>isinstance(x, (int, float))</code></td></tr>
|
||||
</table>
|
||||
<blockquote id=skipcompareisinstance class=note>
|
||||
<blockquote>
|
||||
<p><span>☞</span>The version of <code>2to3</code> that shipped with Python 3.0 would not fix these cases of <code>isinstance()</code> automatically. The fix first appeared in the <code>2to3</code> script that shipped with Python 3.1.
|
||||
</blockquote>
|
||||
<h2 id=basestring><code>basestring</code> datatype</h2>
|
||||
<p>Python 2 had two string types: Unicode and non-Unicode. But there was also another type, <code>basestring</code>. It was an abstract type, a superclass for both the <code>str</code> and <code>unicode</code> types. It couldn't be called or instantiated directly, but you could pass it to the global <code>isinstance()</code> function to check whether an object was either a Unicode or non-Unicode string. In Python 3, there is only one string type, so <code>basestring</code> has no reason to exist.
|
||||
<p class=s><a href=#skipcomparebasestring>skip over this table</a>
|
||||
<table id=comparebasestring>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1089,10 +1038,9 @@ except:
|
||||
<td><code>isinstance(x, basestring)</code></td>
|
||||
<td><code>isinstance(x, str)</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcomparebasestring>
|
||||
<h2 id=itertools><code>itertools</code> module</h2>
|
||||
<p>Python 2.3 introduced the <code>itertools</code> module, which defined variants of the global <code>zip()</code>, <code>map()</code>, and <code>filter()</code> functions that returned iterators instead of lists. In Python 3, those global functions return iterators, so those functions in the <code>itertools</code> module have been eliminated.
|
||||
<table id=compareitertools>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1110,7 +1058,7 @@ except:
|
||||
<td><code>from itertools import imap, izip, foo</code></td>
|
||||
<td><code>from itertools import foo</code></td></tr>
|
||||
</table>
|
||||
<ol id=skipcompareitertools>
|
||||
<ol>
|
||||
<li>Instead of <code>itertools.izip()</code>, just use the global <code>zip()</code> function.
|
||||
<li>Instead of <code>itertools.imap()</code>, just use <code>map()</code>.
|
||||
<li><code>itertools.ifilter()</code> becomes <code>filter()</code>.
|
||||
@@ -1118,8 +1066,7 @@ except:
|
||||
</ol>
|
||||
<h2 id=sys_exc><code>sys.exc_type</code>, <code>sys.exc_value</code>, <code>sys.exc_traceback</code></h2>
|
||||
<p>Python 2 had three variables in the <code>sys</code> module that you could access while an exception was being handled: <code>sys.exc_type</code>, <code>sys.exc_value</code>, <code>sys.exc_traceback</code>. (Actually, these date all the way back to Python 1.) Ever since Python 1.5, these variables have been deprecated in favor of <code>sys.exc_info()</code>, a function that returns a tuple containing all three values. In Python 3, these individual variables have finally gone away; you must use <code>sys.exc_info()</code>.
|
||||
<p class=s><a href=#skipcomparesys_exc>skip over this table</a>
|
||||
<table id=comparesys_exc>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1134,11 +1081,9 @@ except:
|
||||
<td><code>sys.exc_traceback</code></td>
|
||||
<td><code>sys.exc_info()[2]</code></td></tr>
|
||||
</table>
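<p>A short sketch of the Python 3 replacement, assuming a deliberately triggered <code>ZeroDivisionError</code> as the example exception. The variable names are illustrative only.

```python
import sys

try:
    1 / 0
except ZeroDivisionError:
    # sys.exc_info() returns a (type, value, traceback) tuple
    # describing the exception currently being handled.
    exc_type, exc_value, exc_traceback = sys.exc_info()

type_name = exc_type.__name__           # was sys.exc_type
message = str(exc_value)                # was sys.exc_value
line_number = exc_traceback.tb_lineno   # was sys.exc_traceback

print(type_name, message, line_number)
```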
|
||||
<p id=skipcomparesys_exc>
|
||||
<h2 id=paren>List comprehensions over tuples</h2>
|
||||
<p>In Python 2, if you wanted to code a list comprehension that iterated over a tuple, you did not need to put parentheses around the tuple values. In Python 3, explicit parentheses are required.
|
||||
<p class=s><a href=#skipcompareparen>skip over this table</a>
|
||||
<table id=compareparen>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1147,11 +1092,9 @@ except:
|
||||
<td><code>[i for i in 1, 2]</code></td>
|
||||
<td><code>[i for i in (1, 2)]</code></td></tr>
|
||||
</table>
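<p>To see the difference without crashing a script, one option is to hand the old spelling to <code>compile()</code> and catch the error. This is only a sketch, assuming Python 3; the flag name is made up for illustration.

```python
# The Python 3 spelling: the tuple is explicitly parenthesized.
squares = [i * i for i in (1, 2, 3)]

# The Python 2 spelling is rejected by the Python 3 parser.
try:
    compile("[i for i in 1, 2]", "<example>", "eval")
    old_form_accepted = True
except SyntaxError:
    old_form_accepted = False

print(squares, old_form_accepted)
```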
|
||||
<p id=skipcompareparen>
|
||||
<h2 id=getcwdu><code>os.getcwdu()</code> function</h2>
|
||||
<p>Python 2 had a function named <code>os.getcwd()</code>, which returned the current working directory as a (non-Unicode) string. Because modern file systems can handle directory names in any character encoding, Python 2.3 introduced <code>os.getcwdu()</code>. The <code>os.getcwdu()</code> function returned the current working directory as a Unicode string. In Python 3, there is only one string type (Unicode), so <code>os.getcwd()</code> is all you need.
|
||||
<p class=s><a href=#skipcomparegetcwdu>skip over this table</a>
|
||||
<table id=comparegetcwdu>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1160,11 +1103,9 @@ except:
|
||||
<td><code>os.getcwdu()</code></td>
|
||||
<td><code>os.getcwd()</code></td></tr>
|
||||
</table>
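<p>A quick check of this in Python 3. The sketch also calls <code>os.getcwdb()</code>, which is not mentioned above; it is the standard library's bytes-returning counterpart, shown here only for contrast.

```python
import os

cwd = os.getcwd()                       # always a (Unicode) str in Python 3
has_getcwdu = hasattr(os, "getcwdu")    # the Python 2 name is gone
cwd_bytes = os.getcwdb()                # bytes version, if you need raw bytes

print(type(cwd).__name__, has_getcwdu, type(cwd_bytes).__name__)
```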
|
||||
<p id=skipcomparegetcwdu>
|
||||
<h2 id=metaclass>Metaclasses</h2>
|
||||
<p>In Python 2, you declared a metaclass by defining a special class-level <code>__metaclass__</code> attribute. In Python 3, that attribute no longer has any effect; you pass a <code>metaclass</code> keyword argument in the class declaration instead.
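<p>A small sketch of the Python 3 behavior. <code>PapayaMeta</code> borrows its name from the table in this section, but its body (tagging each class with a <code>flavor</code> attribute) is invented purely for illustration.

```python
class PapayaMeta(type):
    """Hypothetical metaclass that tags every class it creates."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.flavor = "papaya"
        return cls


# Python 3 syntax: the metaclass is a keyword argument.
class C(metaclass=PapayaMeta):
    pass


# Python 2 syntax: in Python 3 this is just an ordinary class attribute,
# so the class is still created by the default metaclass, type.
class Old:
    __metaclass__ = PapayaMeta


print(type(C).__name__, type(Old).__name__)
```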
|
||||
<p class=s><a href=#skipcomparemetaclass>skip over this table</a>
|
||||
<table id=comparemetaclass>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Python 2</th>
|
||||
<th>Python 3</th>
|
||||
@@ -1184,7 +1125,7 @@ except:
|
||||
<td><pre><code>class C(Whipper, Beater, metaclass=PapayaMeta):
|
||||
pass</code></pre></td></tr>
|
||||
</table>
|
||||
<ol id=skipcomparemetaclass>
|
||||
<ol>
|
||||
<li>Declaring the metaclass with the <code>metaclass</code> argument in the class declaration is the Python 3 syntax (Python 2 did not accept it), so <code>2to3</code> leaves such a declaration unchanged.
|
||||
<li>Declaring the metaclass in a class attribute worked in Python 2, but doesn't work in Python 3.
|
||||
<li>The <code>2to3</code> script is smart enough to construct a valid class declaration, even if the class is inherited from one or more base classes.
|
||||
@@ -1196,8 +1137,7 @@ except:
|
||||
<blockquote class=note>
|
||||
<p><span>☞</span>The <code>2to3</code> script will not fix <code>set()</code> literals by default. To enable this fix, specify <kbd>-f set_literal</kbd> on the command line when you call <code>2to3</code>.
|
||||
</blockquote>
|
||||
<p class=s><a href=#skipcompareset_literal>skip over this table</a>
|
||||
<table id=compareset_literal>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Before</th>
|
||||
<th>After</th>
|
||||
@@ -1212,14 +1152,12 @@ except:
|
||||
<td><code>set([i for i in a_sequence])</code></td>
|
||||
<td><code>{i for i in a_sequence}</code></td></tr>
|
||||
</table>
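<p>The before and after forms build equal sets, which a few lines of Python 3 can confirm. The sample data is made up for illustration.

```python
a_sequence = [3, 1, 3, 2, 1]

from_call = set([i for i in a_sequence])      # the "before" spelling
from_comprehension = {i for i in a_sequence}  # the "after" spelling
literal = {1, 2, 3}                           # a set literal

# There is no literal for the empty set: {} is an empty dictionary.
empty = set()

print(from_call == from_comprehension == literal, type({}).__name__)
```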
|
||||
<p id=skipcompareset_literal>
|
||||
<h3 id=buffer><code>buffer()</code> global function (explicit)</h3>
|
||||
<p>Python objects implemented in C can export a “buffer interface,” which is a block of memory that is directly readable and writeable without copying. (That is exactly as powerful and scary as it sounds.) In Python 3, <code>buffer()</code> has been renamed to <code>memoryview()</code>. (It's a little more complicated than that, but you can almost certainly ignore the differences.)
|
||||
<blockquote class=note>
|
||||
<p><span>☞</span>The <code>2to3</code> script will not fix the <code>buffer()</code> function by default. To enable this fix, specify <kbd>-f buffer</kbd> on the command line when you call <code>2to3</code>.
|
||||
</blockquote>
|
||||
<p class=s><a href=#skipcomparebuffer>skip over this table</a>
|
||||
<table id=comparebuffer>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Before</th>
|
||||
<th>After</th>
|
||||
@@ -1228,14 +1166,12 @@ except:
|
||||
<td><code>x = buffer(y)</code></td>
|
||||
<td><code>x = memoryview(y)</code></td></tr>
|
||||
</table>
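<p>A minimal sketch of what "without copying" means, assuming a <code>bytearray</code> as the underlying object. Writing through the view changes the original.

```python
y = bytearray(b"hello")
x = memoryview(y)

# Assigning through the view modifies y in place; nothing is copied.
x[0] = ord("j")

# Slicing a memoryview gives another view; bytes() makes a real copy.
first_two = bytes(x[:2])

print(y, first_two)
```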
|
||||
<p id=skipcomparebuffer>
|
||||
<h3 id=wscomma>Whitespace around commas (explicit)</h3>
|
||||
<p>Despite being draconian about whitespace for indenting and outdenting, Python is actually quite liberal about whitespace in other areas. Within lists, tuples, sets, and dictionaries, whitespace can appear before and after commas with no ill effects. However, the Python style guide states that commas should be preceded by zero spaces and followed by one. Although this is purely an aesthetic issue (the code works either way, in both Python 2 and Python 3), the <code>2to3</code> script can optionally fix this for you.
|
||||
<blockquote class=note>
|
||||
<p><span>☞</span>The <code>2to3</code> script will not fix whitespace around commas by default. To enable this fix, specify <kbd>-f ws_comma</kbd> on the command line when you call <code>2to3</code>.
|
||||
</blockquote>
|
||||
<p class=s><a href=#skipcomparewscomma>skip over this table</a>
|
||||
<table id=comparewscomma>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Before</th>
|
||||
<th>After</th>
|
||||
@@ -1247,14 +1183,12 @@ except:
|
||||
<td><code>{a :b}</code></td>
|
||||
<td><code>{a: b}</code></td></tr>
|
||||
</table>
|
||||
<p id=skipcomparewscomma>
|
||||
<h3 id=idioms>Common idioms (explicit)</h3>
|
||||
<p>There were a number of common idioms built up in the Python community. Some, like the <code>while 1:</code> loop, date back to Python 1. (Python didn't have a true boolean type until version 2.3, so developers used <code>1</code> and <code>0</code> instead.) Modern Python programmers should train their brains to use modern versions of these idioms instead.
|
||||
<blockquote class=note>
|
||||
<p><span>☞</span>The <code>2to3</code> script will not fix common idioms by default. To enable this fix, specify <kbd>-f idioms</kbd> on the command line when you call <code>2to3</code>.
|
||||
</blockquote>
|
||||
<p class=s><a href=#skipcompareidioms>skip over this table</a>
|
||||
<table id=compareidioms>
|
||||
<table>
|
||||
<tr><th>Notes</th>
|
||||
<th>Before</th>
|
||||
<th>After</th>
|
||||
@@ -1277,7 +1211,6 @@ do_stuff(a_list)</code></pre></td>
|
||||
<td><pre><code>a_list = sorted(a_sequence)
|
||||
do_stuff(a_list)</code></pre></td></tr>
|
||||
</table>
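<p>Two of the modern idioms, sketched in Python 3 with made-up data: <code>while True:</code> in place of <code>while 1:</code>, and <code>sorted()</code> in place of copying a list and sorting it in place.

```python
a_sequence = [3, 1, 2]

# sorted() returns a new list and leaves the original untouched.
a_list = sorted(a_sequence)

# while True: says what it means; break ends the loop.
count = 0
while True:
    count += 1
    if count >= 3:
        break

print(a_list, a_sequence, count)
```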
|
||||
<p id=skipcompareidioms>
|
||||
<p>FIXME: once the rest of the book is written, this appendix should contain copious links back to any chapter or section that touches on these features.
|
||||
<p class=c>© 2001–9 <a href=about.html><span>ℳ</span>ark Pilgrim</a>
|
||||
<script src=jquery.js></script>
|
||||
|
||||
@@ -9,7 +9,6 @@
|
||||
body{counter-reset:h1 4}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=root value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#regular-expressions>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Regular expressions</h1>
|
||||
|
||||
@@ -9,7 +9,6 @@
|
||||
body{counter-reset:h1 3}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=sa value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#strings>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Strings</h1>
|
||||
|
||||
@@ -9,7 +9,6 @@
|
||||
body{counter-reset:h1 7}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=root value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#unit-testing>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Unit testing</h1>
|
||||
|
||||
@@ -10,7 +10,6 @@ body{counter-reset:h1 1}
|
||||
th{font-family:inherit !important}
|
||||
</style>
|
||||
</head>
|
||||
<p class=s><a href=#divingin>skip to main content</a>
|
||||
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8> <input name=q size=31> <input type=submit name=sa value=Search></div></form>
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html#your-first-python-program>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>Your first Python program</h1>
|
||||
@@ -41,7 +40,6 @@ th{font-family:inherit !important}
|
||||
<h2 id=divingin>Diving in</h2>
|
||||
<p class=f>Books about programming usually start with a bunch of boring chapters about fundamentals and eventually work up to building something useful. Let's skip all that. Here is a complete, working Python program. It probably makes absolutely no sense to you. Don't worry about that, because you're going to dissect it line by line. But read through it first and see what, if anything, you can make of it.
|
||||
<p id=noscript>[The code examples will be easier to follow if you enable Javascript, but whatever.]
|
||||
<p class=s><a href=#skip-humansize-py>skip over this code listing</a>
|
||||
<p class=download>[<a href=humansize.py>download <code>humansize.py</code></a>]
|
||||
<pre><code>SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
|
||||
1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
|
||||
@@ -71,8 +69,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
if __name__ == "__main__":
|
||||
print(approximate_size(1000000000000, False))
|
||||
print(approximate_size(1000000000000))</code></pre>
|
||||
<p id=skip-humansize-py>Now let's run this program on the command line. On Windows, it will look something like this:
|
||||
<p class=s><a href=#skip-humansize-screen>skip over this command output listing</a>
|
||||
<p>Now let's run this program on the command line. On Windows, it will look something like this:
|
||||
<pre class=screen>
|
||||
<samp class=p>c:\home\diveintopython3> </samp><kbd>c:\python30\python.exe humansize.py</kbd>
|
||||
<samp>1.0 TB
|
||||
@@ -82,7 +79,7 @@ if __name__ == "__main__":
|
||||
<samp class=p>you@localhost:~$ </samp><kbd>python3 humansize.py</kbd>
|
||||
<samp>1.0 TB
|
||||
931.3 GiB</samp></pre>
|
||||
<p id=skip-humansize-screen>FIXME: this would be a good place to explain what the program, you know, actually does.
|
||||
<p>FIXME: this would be a good place to explain what the program, you know, actually does.
|
||||
<h2 id=declaringfunctions>Declaring functions</h2>
|
||||
<p>Python has functions like most other languages, but it does not have separate header files like <abbr>C++</abbr> or <code>interface</code>/<code>implementation</code> sections like Pascal. When you need a function, just declare it, like this:
|
||||
<pre><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):</code></pre>
|
||||
@@ -122,7 +119,6 @@ if __name__ == "__main__":
|
||||
<p>I won't bore you with a long finger-wagging speech about the importance of documenting your code. Just know that code is written once but read many times, and the most important audience for your code is yourself, six months after writing it (i.e. after you've forgotten everything but need to fix something). Python makes it easy to write readable code, so take advantage of it. You'll thank me in six months.
|
||||
<h3 id=docstrings>Documentation strings</h3>
|
||||
<p>You can document a Python function by giving it a documentation string (<code>docstring</code> for short). In this program, the <code>approximate_size</code> function has a <code>docstring</code>:
|
||||
<p class=s><a href=#skip-approximate-size>skip over this code listing</a>
|
||||
<pre><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
"""Convert a file size to human-readable form.
|
||||
|
||||
@@ -134,7 +130,7 @@ if __name__ == "__main__":
|
||||
Returns: string
|
||||
|
||||
"""</code></pre>
|
||||
<p id=skip-approximate-size>Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including carriage returns, leading white space, and other quote characters. You can use them anywhere, but you'll see them most often used when defining a <code>docstring</code>.
|
||||
<p>Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including carriage returns, leading white space, and other quote characters. You can use them anywhere, but you'll see them most often used when defining a <code>docstring</code>.
|
||||
<blockquote class="note compare perl5">
|
||||
<p><span>☞</span>Triple quotes are also an easy way to define a string with both single and double quotes, like <code>qq/.../</code> in Perl 5.
|
||||
</blockquote>
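<p>A tiny illustration with a made-up string: the line break, the leading spaces, and both kinds of quote characters all end up inside the one string.

```python
text = """She said, "It's fine."
    Second line, indented."""

lines = text.split("\n")

print(len(lines))
print(repr(lines[1]))
```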
|
||||
@@ -149,7 +145,6 @@ if __name__ == "__main__":
|
||||
<h2 id=everythingisanobject>Everything is an object</h2>
|
||||
<p>In case you missed it, I just said that Python functions have attributes, and that those attributes are available at runtime. A function, like everything else in Python, is an object.
|
||||
<p>Run the interactive Python shell and follow along:
|
||||
<p class=s><a href=#skip-everything-is-an-object-screen>skip over this interpreter listing</a>
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import humansize</kbd> <span>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>print(humansize.approximate_size(4096, True))</kbd> <span>②</span></a>
|
||||
@@ -165,7 +160,7 @@ if __name__ == "__main__":
|
||||
Returns: string
|
||||
|
||||
</samp></pre>
|
||||
<ol id=skip-everything-is-an-object-screen>
|
||||
<ol>
|
||||
<li>The first line imports the <code>humansize</code> program as a module: a chunk of code that you can use interactively, or from a larger Python program. (You'll see examples of multi-module Python programs in [FIXME xref].) Once you import a module, you can reference any of its public functions, classes, or attributes. Modules can do this to access functionality in other modules, and you can do it in the Python interactive shell too. This is an important concept, and you'll see a lot more of it throughout this book.
|
||||
<li>When you want to use functions defined in imported modules, you need to include the module name. So you can't just say <code>approximate_size</code>; it must be <code>humansize.approximate_size</code>. If you've used classes in Java, this should feel vaguely familiar.
|
||||
<li>Instead of calling the function as you would expect to, you asked for one of the function's attributes, <code>__doc__</code>.
|
||||
@@ -175,7 +170,6 @@ if __name__ == "__main__":
|
||||
</blockquote>
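<p>The same idea works for any function, not just the ones in <code>humansize</code>. In this sketch, <code>greet</code> is a hypothetical function invented for illustration; it shows that a function carries attributes and can be passed around like any other object.

```python
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


doc = greet.__doc__           # the docstring is an attribute of the function
function_name = greet.__name__

alias = greet                 # a function can be assigned like any object
result = alias("world")

print(doc, function_name, result)
```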
|
||||
<h3 id=importsearchpath>The <code>import</code> search path</h3>
|
||||
<p>Before this goes any further, I want to briefly mention the library search path. Python looks in several places when you try to import a module. Specifically, it looks in all the directories defined in <code>sys.path</code>. This is just a list, and you can easily view it or modify it with standard list methods. (You'll learn more about lists later in this chapter.)
|
||||
<p class=s><a href=#skip-import-search-path-screen>skip over this interpreter listing</a>
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import sys</kbd> <span>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>sys.path</kbd> <span>②</span></a>
|
||||
@@ -183,7 +177,7 @@ if __name__ == "__main__":
|
||||
<a><samp class=p>>>> </samp><kbd>sys</kbd> <span>③</span></a>
|
||||
<samp><module 'sys' (built-in)></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>sys.path.append('/my/new/path')</kbd> <span>④</span></a></pre>
|
||||
<ol id=skip-import-search-path-screen>
|
||||
<ol>
|
||||
<li>Importing the <code>sys</code> module makes all of its functions and attributes available.
|
||||
<li><code>sys.path</code> is a list of directory names that constitute the current search path. (Yours will look different, depending on your operating system, what version of Python you're running, and where it was originally installed.) Python will look through these directories (in this order) for a <code>.py</code> file whose name matches what you're trying to import.
|
||||
<li>Actually, I lied; the truth is more complicated than that, because not all modules are stored as <code>.py</code> files. Some, like the <code>sys</code> module, are "built-in modules"; they are actually baked right into Python itself. Built-in modules behave just like regular modules, but their Python source code is not available, because they are not written in Python! (The <code>sys</code> module is written in <abbr>C</abbr>.)
|
||||
@@ -195,7 +189,6 @@ if __name__ == "__main__":
|
||||
<p>This is so important that I'm going to repeat it in case you missed it the first few times: <em>everything in Python is an object</em>. Strings are objects. Lists are objects. Functions are objects. Even modules are objects.
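<p>The search path and the module itself can both be poked at like ordinary objects. A short sketch, reusing the placeholder directory <code>/my/new/path</code> from the listing above and removing it again afterwards:

```python
import sys

# sys.path is a plain list, so the usual list methods apply.
search_path_is_list = isinstance(sys.path, list)

sys.path.append('/my/new/path')
added = sys.path[-1]
sys.path.remove('/my/new/path')   # clean up after the demonstration

# The module is an object too, with a type of its own.
module_type_name = type(sys).__name__

print(search_path_is_list, added, module_type_name)
```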
|
||||
<h2 id=indentingcode>Indenting code</h2>
|
||||
<p>Python functions have no explicit <code>begin</code> or <code>end</code>, and no curly braces to mark where the function code starts and stops. The only delimiter is a colon (<code>:</code>) and the indentation of the code itself.
|
||||
<p class=s><a href=#skip-indenting-code>skip over this code listing</a>
|
||||
<pre><code>
|
||||
<a>def approximate_size(size, a_kilobyte_is_1024_bytes=True): <span>①</span></a>
|
||||
<a> if size < 0: <span>②</span></a>
|
||||
@@ -208,7 +201,7 @@ if __name__ == "__main__":
|
||||
return "{0:.1f} {1}".format(size, suffix)
|
||||
|
||||
raise ValueError('number too large')</code></pre>
|
||||
<ol id=skip-indenting-code>
|
||||
<ol>
|
||||
<li>Code blocks are defined by their indentation. By "code block," I mean functions, <code>if</code> statements, <code>for</code> loops, <code>while</code> loops, and so forth. Indenting starts a block and unindenting ends it. There are no explicit braces, brackets, or keywords. This means that whitespace is significant, and must be consistent. In this example, the function code is indented four spaces. It doesn't need to be four spaces; it just needs to be consistent. The first line that is not indented marks the end of the function.
|
||||
<li>In Python, an <code>if</code> statement is followed by a code block. If the <code>if</code> expression evaluates to true, the indented block is executed; otherwise, execution falls to the <code>else</code> block (if any). (Note the lack of parentheses around the expression.)
|
||||
<li>This line is inside the <code>if</code> code block. This <code>raise</code> statement will raise an exception (of type <code>ValueError</code>), but only if <code>size < 0</code>.
|
||||
@@ -221,22 +214,19 @@ if __name__ == "__main__":
|
||||
</blockquote>
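<p>To see the indentation rules at work in something runnable, the pieces of <code>humansize.py</code> shown in this chapter can be assembled into the sketch below. The lines not visible above (choosing the multiple, the <code>for</code> loop, and the text of the first error message) are assumptions filled in for illustration, not necessarily the original code.

```python
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}


def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    # Everything indented under "def" belongs to the function.
    if size < 0:
        # Indented one level further: runs only when size < 0.
        raise ValueError('number must be non-negative')  # assumed message

    # Assumed: pick the divisor that matches the chosen suffix list.
    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000

    # Assumed loop: divide until the value fits under one multiple.
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return "{0:.1f} {1}".format(size, suffix)

    # Back at function level: reached only if no suffix was large enough.
    raise ValueError('number too large')


print(approximate_size(1000000000000, False))
print(approximate_size(1000000000000))
```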
|
||||
<h2 id=runningscripts>Running scripts</h2>
|
||||
<p>Python modules are objects and have several useful attributes. You can use this to easily test your modules as you write them, by including a special block of code that executes when you run the Python file on the command line. Take the last few lines of <code>humansize.py</code>:
|
||||
<p class=s><a href=#skip-running-scripts>skip over this code listing</a>
|
||||
<pre><code>
|
||||
if __name__ == "__main__":
|
||||
print(approximate_size(1000000000000, False))
|
||||
print(approximate_size(1000000000000))</code></pre>
|
||||
<blockquote class="note compare clang" id=skip-running-scripts>
|
||||
<blockquote class="note compare clang">
|
||||
<p><span>☞</span>Like <abbr>C</abbr>, Python uses <code>==</code> for comparison and <code>=</code> for assignment. Unlike <abbr>C</abbr>, Python does not support in-line assignment, so there's no chance of accidentally assigning the value you thought you were comparing.
|
||||
</blockquote>
|
||||
<p>So what makes this <code>if</code> statement special? Well, modules are objects, and all modules have a built-in attribute <code>__name__</code>. A module's <code>__name__</code> depends on how you're using the module. If you <code>import</code> the module, then <code>__name__</code> is the module's filename, without a directory path or file extension.
|
||||
<p class=s><a href=#skip-import-humansize>skip over this interpreter listing</a>
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import humansize</kbd>
|
||||
<samp class=p>>>> </samp><kbd>humansize.__name__</kbd>
|
||||
<samp>'humansize'</samp></pre>
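<p>Here is one way to watch <code>__name__</code> change with how a file is used, without leaving a single script. This is only a sketch: <code>demo_module.py</code> is a throwaway file invented for the demonstration, and <code>importlib</code> and <code>runpy</code> stand in for the <code>import</code> statement and the command line respectively.

```python
import importlib.util
import os
import runpy
import tempfile

# A one-line module that records whatever __name__ it was given.
source = "WHO_AM_I = __name__\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_module.py")
    with open(path, "w") as f:
        f.write(source)

    # Case 1: load the file as a module. We pass the name that a plain
    # "import demo_module" would use: the filename without its extension.
    spec = importlib.util.spec_from_file_location("demo_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_name = module.WHO_AM_I

    # Case 2: execute the same file the way the command line would.
    script_globals = runpy.run_path(path, run_name="__main__")
    script_name = script_globals["WHO_AM_I"]

print(imported_name)
print(script_name)
```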
|
||||
<p id=skip-import-humansize>But you can also run the module directly as a standalone program, in which case <code>__name__</code> will be a special default value, <code>__main__</code>. Python will evaluate this <code>if</code> statement, find a true expression, and execute the <code>if</code> code block. In this case, it prints two values.
|
||||
<p class=s><a href=#furtherreading>skip over this command output listing</a>
|
||||
<p>But you can also run the module directly as a standalone program, in which case <code>__name__</code> will be a special default value, <code>__main__</code>. Python will evaluate this <code>if</code> statement, find a true expression, and execute the <code>if</code> code block. In this case, it prints two values.
|
||||
<pre class=screen>
|
||||
<samp class=p>c:\home\diveintopython3> </samp><kbd>c:\python30\python.exe humansize.py</kbd>
|
||||
<samp>1.0 TB
|
||||
|
||||