you wouldn't believe me if I told you

This commit is contained in:
Mark Pilgrim
2009-06-05 23:39:50 -04:00
parent cbdb346531
commit 654b102d74
64 changed files with 724 additions and 764 deletions
+1 -1
@@ -4,7 +4,7 @@
<title>About the book - Dive Into Python 3</title>
<link rel=stylesheet href=dip3.css>
<style>
h1:before{content:""}
h1:before{content:''}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
+1 -1
@@ -163,7 +163,7 @@ class OrderedDict(dict, collections.MutableMapping):
<h2 id=implementing-fractions>Implementing Fractions</h2>
<p class=nav><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+5 -4
@@ -391,7 +391,7 @@ for guess in itertools.permutations(digits, len(characters)):
<p>Python strings have many methods. You learned about some of those methods in <a href=strings.html>the Strings chapter</a>: <code>lower()</code>, <code>count()</code>, and <code>format()</code>. Now I want to introduce you to a powerful but little-known string manipulation technique: the <code>translate()</code> method.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>translation_table = {ord("A"): ord("O")}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>translation_table = {ord('A'): ord('O')}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>translation_table</kbd> <span>&#x2461;</span></a>
<samp>{65: 79}</samp>
<a><samp class=p>>>> </samp><kbd>'MARK'.translate(translation_table)</kbd> <span>&#x2462;</span></a>
@@ -414,7 +414,7 @@ for guess in itertools.permutations(digits, len(characters)):
<a><samp class=p>>>> </samp><kbd>translation_table = dict(zip(characters, guess))</kbd> <span>&#x2462;</span></a>
<samp class=p>>>> </samp><kbd>translation_table</kbd>
<samp>{68: 55, 69: 53, 77: 49, 78: 54, 79: 48, 82: 56, 83: 57, 89: 50}</samp>
<a><samp class=p>>>> </samp><kbd>"SEND + MORE == MONEY".translate(translation_table)</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>'SEND + MORE == MONEY'.translate(translation_table)</kbd> <span>&#x2463;</span></a>
<samp>'9567 + 1085 == 10652'</samp></pre>
<ol>
<li>Using a <a href=#generator-expressions>generator expression</a>, we quickly compute the byte values for each character in a string. <var>characters</var> is an example of the value of <var>sorted_characters</var> in the <code>alphametics.solve()</code> function.
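A minimal sketch of the `translate()` technique shown in this hunk, for readers who want to try it outside the book. The variable names mirror the listing. The digit string `'75160892'` is an assumption: it is the one guess that reproduces the table printed above, not something the solver knows in advance.

```python
# str.translate() takes a dict that maps code points to code points.
translation_table = {ord('A'): ord('O')}
print(translation_table)
print('MARK'.translate(translation_table))

# Pair the sorted unique letters with one candidate digit assignment.
# '75160892' is chosen here only because it matches the table in the listing.
characters = tuple(ord(c) for c in 'DEMNORSY')
guess = tuple(ord(c) for c in '75160892')
table = dict(zip(characters, guess))

# Characters missing from the table (spaces, '+', '=') pass through unchanged.
solved = 'SEND + MORE == MONEY'.translate(table)
print(solved)
```

Because the table is keyed by integers rather than one-character strings, both sides of each pair go through `ord()` first.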
@@ -490,7 +490,7 @@ for guess in itertools.permutations(digits, len(characters)):
<li>Don&#8217;t do this either.
</ol>
<p class=c style="font-size:1000%;font-weight:bold;line-height:1;margin:0.7em 0">eval() is EVIL
<p class=c style='font-size:1000%;font-weight:bold;line-height:1;margin:0.7em 0'>eval() is EVIL
<p>Well, the evil part is evaluating arbitrary expressions from untrusted sources. You should only use <code>eval()</code> on trusted input. Of course, the trick is figuring out what&#8217;s &#8220;trusted.&#8221; But here&#8217;s something I know for certain: you should <b>NOT</b> take this alphametics solver and put it on the internet as a fun little web service. Don&#8217;t make the mistake of thinking, &#8220;Gosh, the function does a lot of string manipulation before getting a string to evaluate; <em>I can&#8217;t imagine</em> how someone could exploit that.&#8221; Someone <b>WILL</b> figure out how to sneak nasty executable code past all that string manipulation (<a href=http://www.matasano.com/log/1032/this-new-vulnerability-dowds-inhuman-flash-exploit/>stranger things have happened</a>), and then you can kiss your server goodbye.
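As a rough illustration of the warning above, the sketch below calls `eval()` with an empty `__builtins__` mapping. The helper name `try_eval` and the `restricted_globals` dictionary are hypothetical, introduced only for this example. Removing the built-ins makes an obvious attack such as `__import__` fail, but this narrows the attack surface only; it should not be read as a sandbox.

```python
# Hypothetical illustration: evaluate expressions with no built-in names.
restricted_globals = {"__builtins__": {}}

def try_eval(expression):
    """Return the result of eval(), or the name of the exception it raised."""
    try:
        return eval(expression, restricted_globals, {})
    except Exception as exc:
        return type(exc).__name__

# Plain arithmetic still works without any built-ins.
arithmetic = try_eval("9567 + 1085 == 10652")

# __import__ is a built-in, so looking it up now fails.
blocked = try_eval("__import__('os')")

print(arithmetic)
print(blocked)
```

Even with this restriction in place, the advice in the paragraph stands: only evaluate input you trust.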
@@ -591,8 +591,9 @@ NameError: name '__import__' is not defined</samp></pre>
<p>Many, many thanks to Raymond Hettinger for agreeing to relicense his code so I could port it to Python 3 and use it as the basis for this chapter.
<p class=nav><a rel=prev href=iterators.html title="back to &#8220;Iterators&#8221;"><span>&#x261C;</span></a> <a rel=next href=unit-testing.html title="onward to &#8220;Unit Testing&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=iterators.html rel=prev title='back to &#8220;Iterators&#8221;'><span>&#x261C;</span></a> <a href=unit-testing.html rel=next title='onward to &#8220;Unit Testing&#8221;'><span>&#x261E;</span></a>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+7 -7
@@ -50,19 +50,19 @@ del{background:#f87}
<p>The main entry point for the detection algorithm is <code>universaldetector.py</code>, which has one class, <code>UniversalDetector</code>. (You might think the main entry point is the <code>detect</code> function in <code>chardet/__init__.py</code>, but that&#8217;s really just a convenience function that creates a <code>UniversalDetector</code> object, calls it, and returns its result.)
<p>There are 5 categories of encodings that <code>UniversalDetector</code> handles:
<ol>
<li><code>UTF-n</code> with a <abbr title="Byte Order Mark">BOM</abbr>. This includes <code>UTF-8</code>, both <abbr title="Big Endian">BE</abbr> and <abbr title="Little Endian">LE</abbr> variants of <code>UTF-16</code>, and all 4 byte-order variants of <code>UTF-32</code>.
<li><code>UTF-n</code> with a Byte Order Mark (<abbr>BOM</abbr>). This includes <code>UTF-8</code>, both Big-Endian and Little-Endian variants of <code>UTF-16</code>, and all 4 byte-order variants of <code>UTF-32</code>.
<li>Escaped encodings, which are entirely 7-bit <abbr>ASCII</abbr> compatible, where non-<abbr>ASCII</abbr> characters start with an escape sequence. Examples: <code>ISO-2022-JP</code> (Japanese) and <code>HZ-GB-2312</code> (Chinese).
<li>Multi-byte encodings, where each character is represented by a variable number of bytes. Examples: <code>Big5</code> (Chinese), <code>SHIFT_JIS</code> (Japanese), <code>EUC-KR</code> (Korean), and <code>UTF-8</code> without a <abbr title="Byte Order Mark">BOM</abbr>.
<li>Multi-byte encodings, where each character is represented by a variable number of bytes. Examples: <code>Big5</code> (Chinese), <code>SHIFT_JIS</code> (Japanese), <code>EUC-KR</code> (Korean), and <code>UTF-8</code> without a <abbr>BOM</abbr>.
<li>Single-byte encodings, where each character is represented by one byte. Examples: <code>KOI8-R</code> (Russian), <code>windows-1255</code> (Hebrew), and <code>TIS-620</code> (Thai).
<li><code>windows-1252</code>, which is used primarily on Microsoft Windows by middle managers who wouldn&#8217;t know a character encoding from a hole in the ground.
</ol>
<h3 id=how.bom><code>UTF-n</code> With A <abbr title="Byte Order Mark">BOM</abbr></h3>
<p>If the text starts with a <abbr title="Byte Order Mark">BOM</abbr>, we can reasonably assume that the text is encoded in <code>UTF-8</code>, <code>UTF-16</code>, or <code>UTF-32</code>. (The <abbr title="Byte Order Mark">BOM</abbr> will tell us exactly which one; that&#8217;s what it&#8217;s for.) This is handled inline in <code>UniversalDetector</code>, which returns the result immediately without any further processing.
<h3 id=how.bom><code>UTF-n</code> With A <abbr>BOM</abbr></h3>
<p>If the text starts with a <abbr>BOM</abbr>, we can reasonably assume that the text is encoded in <code>UTF-8</code>, <code>UTF-16</code>, or <code>UTF-32</code>. (The <abbr>BOM</abbr> will tell us exactly which one; that&#8217;s what it&#8217;s for.) This is handled inline in <code>UniversalDetector</code>, which returns the result immediately without any further processing.
<h3 id=how.esc>Escaped Encodings</h3>
<p>If the text contains a recognizable escape sequence that might indicate an escaped encoding, <code>UniversalDetector</code> creates an <code>EscCharSetProber</code> (defined in <code>escprober.py</code>) and feeds it the text.
<p><code>EscCharSetProber</code> creates a series of state machines, based on models of <code>HZ-GB-2312</code>, <code>ISO-2022-CN</code>, <code>ISO-2022-JP</code>, and <code>ISO-2022-KR</code> (defined in <code>escsm.py</code>). <code>EscCharSetProber</code> feeds the text to each of these state machines, one byte at a time. If any state machine ends up uniquely identifying the encoding, <code>EscCharSetProber</code> immediately returns the positive result to <code>UniversalDetector</code>, which returns it to the caller. If any state machine hits an illegal sequence, it is dropped and processing continues with the other state machines.
<h3 id=how.mb>Multi-Byte Encodings</h3>
<p>Assuming no <abbr title="Byte Order Mark">BOM</abbr>, <code>UniversalDetector</code> checks whether the text contains any high-bit characters. If so, it creates a series of &#8220;probers&#8221; for detecting multi-byte encodings, single-byte encodings, and as a last resort, <code>windows-1252</code>.
<p>Assuming no <abbr>BOM</abbr>, <code>UniversalDetector</code> checks whether the text contains any high-bit characters. If so, it creates a series of &#8220;probers&#8221; for detecting multi-byte encodings, single-byte encodings, and as a last resort, <code>windows-1252</code>.
<p>The multi-byte encoding prober, <code>MBCSGroupProber</code> (defined in <code>mbcsgroupprober.py</code>), is really just a shell that manages a group of other probers, one for each multi-byte encoding: <code>Big5</code>, <code>GB2312</code>, <code>EUC-TW</code>, <code>EUC-KR</code>, <code>EUC-JP</code>, <code>SHIFT_JIS</code>, and <code>UTF-8</code>. <code>MBCSGroupProber</code> feeds the text to each of these encoding-specific probers and checks the results. If a prober reports that it has found an illegal byte sequence, it is dropped from further processing (so that, for instance, any subsequent calls to <code>UniversalDetector</code>.<code>feed()</code> will skip that prober). If a prober reports that it is reasonably confident that it has detected the encoding, <code>MBCSGroupProber</code> reports this positive result to <code>UniversalDetector</code>, which reports the result to the caller.
<p>Most of the multi-byte encoding probers are inherited from <code>MultiByteCharSetProber</code> (defined in <code>mbcharsetprober.py</code>), and simply hook up the appropriate state machine and distribution analyzer and let <code>MultiByteCharSetProber</code> do the rest of the work. <code>MultiByteCharSetProber</code> runs the text through the encoding-specific state machine, one byte at a time, to look for byte sequences that would indicate a conclusive positive or negative result. At the same time, <code>MultiByteCharSetProber</code> feeds the text to an encoding-specific distribution analyzer.
<p>The distribution analyzers (each defined in <code>chardistribution.py</code>) use language-specific models of which characters are used most frequently. Once <code>MultiByteCharSetProber</code> has fed enough text to the distribution analyzer, it calculates a confidence rating based on the number of frequently-used characters, the total number of characters, and a language-specific distribution ratio. If the confidence is high enough, <code>MultiByteCharSetProber</code> returns the result to <code>MBCSGroupProber</code>, which returns it to <code>UniversalDetector</code>, which returns it to the caller.
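The first category above, `UTF-n` with a BOM, is simple enough to sketch with the standard library alone. This is not the code from `universaldetector.py`: the function name `sniff_bom` and the returned labels are assumptions made for illustration, and only the two standard UTF-32 byte orders are covered rather than all four mentioned in the text.

```python
import codecs

# Check the 4-byte marks first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE), so order matters.
_BOMS = (
    (codecs.BOM_UTF32_LE, 'UTF-32LE'),
    (codecs.BOM_UTF32_BE, 'UTF-32BE'),
    (codecs.BOM_UTF8, 'UTF-8'),
    (codecs.BOM_UTF16_LE, 'UTF-16LE'),
    (codecs.BOM_UTF16_BE, 'UTF-16BE'),
)

def sniff_bom(data):
    """Return a label for the BOM at the start of data, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(codecs.BOM_UTF8 + 'hello'.encode('utf-8')))
print(sniff_bom(b'no byte order mark here'))
```

When `sniff_bom` returns `None`, a detector built along the lines described above would fall through to the escaped, multi-byte, and single-byte checks.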
@@ -1126,7 +1126,7 @@ tests\Big5\0804.blogspot.com.xml</samp>
File "C:\home\chardet\chardet\latin1prober.py", line 126, in get_confidence
total = reduce(operator.add, self._mFreqCounter)
NameError: global name 'reduce' is not defined</samp></pre>
<p>According to the official <a href=http://docs.python.org/3.0/whatsnew/3.0.html#builtins>What&#8217;s New In Python 3.0</a> guide, the <code>reduce()</code> function has been moved out of the global namespace and into the <code>functools</code> module. Quoting the guide: &#8220;Use <code>functools.reduce()</code> if you really need it; however, 99 percent of the time an explicit <code>for</code> loop is more readable.&#8221; You can read more about the decision from Guido van Rossum&#8217;s weblog: <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=98196">The fate of reduce() in Python 3000</a>.
<p>According to the official <a href=http://docs.python.org/3.0/whatsnew/3.0.html#builtins>What&#8217;s New In Python 3.0</a> guide, the <code>reduce()</code> function has been moved out of the global namespace and into the <code>functools</code> module. Quoting the guide: &#8220;Use <code>functools.reduce()</code> if you really need it; however, 99 percent of the time an explicit <code>for</code> loop is more readable.&#8221; You can read more about the decision from Guido van Rossum&#8217;s weblog: <a href='http://www.artima.com/weblogs/viewpost.jsp?thread=98196'>The fate of reduce() in Python 3000</a>.
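To make the quoted advice concrete, here is a small sketch comparing the three ways of totalling a list of counts in Python 3. The list `freq_counter` is a hypothetical stand-in for `self._mFreqCounter` from the traceback; its values are made up.

```python
import functools
import operator

# Hypothetical stand-in for self._mFreqCounter; the values are invented.
freq_counter = [3, 0, 5, 2]

# Python 3: reduce() is no longer a built-in, so import it from functools.
total_reduce = functools.reduce(operator.add, freq_counter)

# The explicit loop that the "What's New" guide recommends.
total_loop = 0
for count in freq_counter:
    total_loop += count

# For plain addition, the built-in sum() says the same thing more directly.
total_sum = sum(freq_counter)

print(total_reduce, total_loop, total_sum)
```

All three produce the same total; which one to use when porting is a readability choice.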
<pre><code>def get_confidence(self):
if self.get_state() == constants.eNotMe:
return 0.01
@@ -1192,7 +1192,7 @@ tests\EUC-JP\arclamp.jp.xml EUC-JP with confide
<li>Test cases are essential. Don&#8217;t port anything without them. Don&#8217;t even try. The <em>only</em> reason I have any confidence at all that <code>chardet</code> works in Python 3 is because I had a test suite that exercised every line of code in the entire library. I <em>never</em> would have found half of these problems with manual spot-checking.
</ol>
<p class=nav><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next href=where-to-go-from-here.html title="onward to &#8220;Where To Go From Here&#8221;"><span>&#x261E;</span></a>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next href=where-to-go-from-here.html title='onward to &#8220;Where To Go From Here&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+13 -13
@@ -28,21 +28,21 @@ POSSIBILITY OF SUCH DAMAGE.
Classname Legend
.w = "widgets" = wrapper block for hide/open/download links dynamically inserted into code listings
.b = "block" = internal block dynamically inserted into code listings
.d = "download" = download link for code listings
.p = "prompt" = command-line or interactive shell prompt within code listings
.q = "quote" = quote at beginning of each chapter
.f = "fancy" = first paragraph of each chapter (gets a fancy drop-cap)
.c = "centered" = centered footer text (also clears floats)
.a = "asterism" = section break
.w = "widgets" = wrapper block for hide/open/download links dynamically inserted into code listings
.b = "block" = internal block dynamically inserted into code listings
.d = "download" = download link for code listings
.p = "prompt" = command-line or interactive shell prompt within code listings
.q = "quote" = quote at beginning of each chapter
.f = "fancy" = first paragraph of each chapter (gets a fancy drop-cap)
.c = "centered" = centered footer text (also clears floats)
.a = "asterism" = section break
.v = "navigation" = prev/next navigation links (not breadcrumbs)
.nm = "no mobile" = hide this section on mobile devices
.nd = "no decoration" = hide the widgets on this code block
.note = "note/caution/important" = indented block for tips/gotchas/language comparisons
.baa = "best available ampersand" = wrapper block for ampersands
.nav = "navigation" = prev/next navigation links (not breadcrumbs)
Acknowledgements & Inspirations
@@ -273,18 +273,18 @@ aside {
}
/* previous/next navigation links */
.nav a {
.v a {
text-decoration: none;
border: 0;
display: block;
}
.nav a:first-child {
.v a:first-child {
float: left;
}
.nav a:last-child {
.v a:last-child {
float: right;
}
.nav span {
.v span {
font-size: 1000%;
line-height: 1;
margin: 0;
+2 -2
@@ -1,8 +1,8 @@
"""Find solutions to alphametic equations.
'''Find solutions to alphametic equations.
>>> alphametics.solve('SEND + MORE == MONEY')
'9567 + 1085 == 10652'
"""
'''
import re
import itertools
+15 -15
@@ -3,34 +3,34 @@ import unittest
class KnownValues(unittest.TestCase):
def test_out(self):
"""TO + GO == OUT"""
self.assertEqual(solve("TO + GO == OUT"), "21 + 81 == 102")
'''TO + GO == OUT'''
self.assertEqual(solve('TO + GO == OUT'), '21 + 81 == 102')
def test_too(self):
"""I + DID == TOO"""
self.assertEqual(solve("I + DID == TOO"), "9 + 191 == 200")
'''I + DID == TOO'''
self.assertEqual(solve('I + DID == TOO'), '9 + 191 == 200')
def test_mom(self):
"""AS + A == MOM"""
self.assertEqual(solve("AS + A == MOM"), "92 + 9 == 101")
'''AS + A == MOM'''
self.assertEqual(solve('AS + A == MOM'), '92 + 9 == 101')
def test_best(self):
"""HES + THE == BEST"""
self.assertEqual(solve("HES + THE == BEST"), "426 + 842 == 1268")
'''HES + THE == BEST'''
self.assertEqual(solve('HES + THE == BEST'), '426 + 842 == 1268')
def test_late(self):
"""NO + NO + TOO == LATE"""
self.assertEqual(solve("NO + NO + TOO == LATE"), "74 + 74 + 944 == 1092")
'''NO + NO + TOO == LATE'''
self.assertEqual(solve('NO + NO + TOO == LATE'), '74 + 74 + 944 == 1092')
def test_onze(self):
"""UN + UN + NEUF == ONZE"""
self.assertEqual(solve("UN + UN + NEUF == ONZE"), "81 + 81 + 1987 == 2149")
'''UN + UN + NEUF == ONZE'''
self.assertEqual(solve('UN + UN + NEUF == ONZE'), '81 + 81 + 1987 == 2149')
def test_deux(self):
"""UN + DEUX + DEUX + DEUX + DEUX == NEUF"""
self.assertEqual(solve("UN + DEUX + DEUX + DEUX + DEUX == NEUF"), "25 + 1326 + 1326 + 1326 + 1326 == 5329")
'''UN + DEUX + DEUX + DEUX + DEUX == NEUF'''
self.assertEqual(solve('UN + DEUX + DEUX + DEUX + DEUX == NEUF'), '25 + 1326 + 1326 + 1326 + 1326 == 5329')
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+24 -24
@@ -1,25 +1,25 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
<title>dive into &hellip;</title>
<subtitle>currently between addictions</subtitle>
<id>tag:diveintomark.org,2001-07-29:/</id>
<updated>2009-03-27T21:56:07Z</updated>
<link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
<link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
<entry>
<author>
<name>Mark</name>
<uri>http://diveintomark.org/</uri>
</author>
<title>Dive into history, 2009 edition</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
<id>tag:diveintomark.org,2009-03-27:/archives/20090327172042</id>
<updated>2009-03-27T21:56:07Z</updated>
<published>2009-03-27T17:20:42Z</published>
<category scheme="http://diveintomark.org" term="diveintopython"/>
<category scheme="http://diveintomark.org" term="docbook"/>
<category scheme="http://diveintomark.org" term="html"/>
<summary type="html">Putting an entire chapter on one page sounds
<category scheme='http://diveintomark.org' term='diveintopython'/>
<category scheme='http://diveintomark.org' term='docbook'/>
<category scheme='http://diveintomark.org' term='html'/>
<summary type='html'>Putting an entire chapter on one page sounds
bloated, but consider this &amp;mdash; my longest chapter so far
would be 75 printed pages, and it loads in under 5 seconds&amp;hellip;
On dialup.</summary>
@@ -30,13 +30,13 @@
<uri>http://diveintomark.org/</uri>
</author>
<title>Accessibility is a harsh mistress</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress'/>
<id>tag:diveintomark.org,2009-03-21:/archives/20090321200928</id>
<updated>2009-03-22T01:05:37Z</updated>
<published>2009-03-21T20:09:28Z</published>
<category scheme="http://diveintomark.org" term="accessibility"/>
<summary type="html">The accessibility orthodoxy does not permit people to
<category scheme='http://diveintomark.org' term='accessibility'/>
<summary type='html'>The accessibility orthodoxy does not permit people to
question the value of features that are rarely useful and rarely used.</summary>
</entry>
<entry>
@@ -44,20 +44,20 @@
<name>Mark</name>
</author>
<title>A gentle introduction to video encoding, part 1: container formats</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats'/>
<id>tag:diveintomark.org,2008-12-18:/archives/20081218155422</id>
<updated>2009-01-11T19:39:22Z</updated>
<published>2008-12-18T15:54:22Z</published>
<category scheme="http://diveintomark.org" term="asf"/>
<category scheme="http://diveintomark.org" term="avi"/>
<category scheme="http://diveintomark.org" term="encoding"/>
<category scheme="http://diveintomark.org" term="flv"/>
<category scheme="http://diveintomark.org" term="GIVE"/>
<category scheme="http://diveintomark.org" term="mp4"/>
<category scheme="http://diveintomark.org" term="ogg"/>
<category scheme="http://diveintomark.org" term="video"/>
<summary type="html">These notes will eventually become part of a
<category scheme='http://diveintomark.org' term='asf'/>
<category scheme='http://diveintomark.org' term='avi'/>
<category scheme='http://diveintomark.org' term='encoding'/>
<category scheme='http://diveintomark.org' term='flv'/>
<category scheme='http://diveintomark.org' term='GIVE'/>
<category scheme='http://diveintomark.org' term='mp4'/>
<category scheme='http://diveintomark.org' term='ogg'/>
<category scheme='http://diveintomark.org' term='video'/>
<summary type='html'>These notes will eventually become part of a
tech talk on video encoding.</summary>
</entry>
</feed>
+24 -24
@@ -1,25 +1,25 @@
<?xml version="1.0" encoding="utf-8"?>
<ns0:feed xmlns:ns0="http://www.w3.org/2005/Atom" xml:lang="en">
<?xml version='1.0' encoding='utf-8'?>
<ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'>
<ns0:title>dive into mark</ns0:title>
<ns0:subtitle>currently between addictions</ns0:subtitle>
<ns0:id>tag:diveintomark.org,2001-07-29:/</ns0:id>
<ns0:updated>2009-03-27T21:56:07Z</ns0:updated>
<ns0:link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
<ns0:link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
<ns0:entry>
<ns0:author>
<ns0:name>Mark</ns0:name>
<ns0:uri>http://diveintomark.org/</ns0:uri>
</ns0:author>
<ns0:title>Dive into history, 2009 edition</ns0:title>
<ns0:link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition"/>
<ns0:link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
<ns0:id>tag:diveintomark.org,2009-03-27:/archives/20090327172042</ns0:id>
<ns0:updated>2009-03-27T21:56:07Z</ns0:updated>
<ns0:published>2009-03-27T17:20:42Z</ns0:published>
<ns0:category scheme="http://diveintomark.org" term="diveintopython"/>
<ns0:category scheme="http://diveintomark.org" term="docbook"/>
<ns0:category scheme="http://diveintomark.org" term="html"/>
<ns0:summary type="html">Putting an entire chapter on one page sounds
<ns0:category scheme='http://diveintomark.org' term='diveintopython'/>
<ns0:category scheme='http://diveintomark.org' term='docbook'/>
<ns0:category scheme='http://diveintomark.org' term='html'/>
<ns0:summary type='html'>Putting an entire chapter on one page sounds
bloated, but consider this &amp;mdash; my longest chapter so far
would be 75 printed pages, and it loads in under 5 seconds&amp;hellip;
On dialup.</ns0:summary>
@@ -30,13 +30,13 @@
<ns0:uri>http://diveintomark.org/</ns0:uri>
</ns0:author>
<ns0:title>Accessibility is a harsh mistress</ns0:title>
<ns0:link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress"/>
<ns0:link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress'/>
<ns0:id>tag:diveintomark.org,2009-03-21:/archives/20090321200928</ns0:id>
<ns0:updated>2009-03-22T01:05:37Z</ns0:updated>
<ns0:published>2009-03-21T20:09:28Z</ns0:published>
<ns0:category scheme="http://diveintomark.org" term="accessibility"/>
<ns0:summary type="html">The accessibility orthodoxy does not permit people to
<ns0:category scheme='http://diveintomark.org' term='accessibility'/>
<ns0:summary type='html'>The accessibility orthodoxy does not permit people to
question the value of features that are rarely useful and rarely used.</ns0:summary>
</ns0:entry>
<ns0:entry>
@@ -44,20 +44,20 @@
<ns0:name>Mark</ns0:name>
</ns0:author>
<ns0:title>A gentle introduction to video encoding, part 1: container formats</ns0:title>
<ns0:link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats"/>
<ns0:link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats'/>
<ns0:id>tag:diveintomark.org,2008-12-18:/archives/20081218155422</ns0:id>
<ns0:updated>2009-01-11T19:39:22Z</ns0:updated>
<ns0:published>2008-12-18T15:54:22Z</ns0:published>
<ns0:category scheme="http://diveintomark.org" term="asf"/>
<ns0:category scheme="http://diveintomark.org" term="avi"/>
<ns0:category scheme="http://diveintomark.org" term="encoding"/>
<ns0:category scheme="http://diveintomark.org" term="flv"/>
<ns0:category scheme="http://diveintomark.org" term="GIVE"/>
<ns0:category scheme="http://diveintomark.org" term="mp4"/>
<ns0:category scheme="http://diveintomark.org" term="ogg"/>
<ns0:category scheme="http://diveintomark.org" term="video"/>
<ns0:summary type="html">These notes will eventually become part of a
<ns0:category scheme='http://diveintomark.org' term='asf'/>
<ns0:category scheme='http://diveintomark.org' term='avi'/>
<ns0:category scheme='http://diveintomark.org' term='encoding'/>
<ns0:category scheme='http://diveintomark.org' term='flv'/>
<ns0:category scheme='http://diveintomark.org' term='GIVE'/>
<ns0:category scheme='http://diveintomark.org' term='mp4'/>
<ns0:category scheme='http://diveintomark.org' term='ogg'/>
<ns0:category scheme='http://diveintomark.org' term='video'/>
<ns0:summary type='html'>These notes will eventually become part of a
tech talk on video encoding.</ns0:summary>
</ns0:entry>
</ns0:feed>
+24 -24
@@ -1,25 +1,25 @@
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
<title>dive into mark</title>
<subtitle>currently between addictions</subtitle>
<id>tag:diveintomark.org,2001-07-29:/</id>
<updated>2009-03-27T21:56:07Z</updated>
<link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
<link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
<entry>
<author>
<name>Mark</name>
<uri>http://diveintomark.org/</uri>
</author>
<title>Dive into history, 2009 edition</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
<id>tag:diveintomark.org,2009-03-27:/archives/20090327172042</id>
<updated>2009-03-27T21:56:07Z</updated>
<published>2009-03-27T17:20:42Z</published>
<category scheme="http://diveintomark.org" term="diveintopython"/>
<category scheme="http://diveintomark.org" term="docbook"/>
<category scheme="http://diveintomark.org" term="html"/>
<summary type="html">Putting an entire chapter on one page sounds
<category scheme='http://diveintomark.org' term='diveintopython'/>
<category scheme='http://diveintomark.org' term='docbook'/>
<category scheme='http://diveintomark.org' term='html'/>
<summary type='html'>Putting an entire chapter on one page sounds
bloated, but consider this &amp;mdash; my longest chapter so far
would be 75 printed pages, and it loads in under 5 seconds&amp;hellip;
On dialup.</summary>
@@ -30,13 +30,13 @@
<uri>http://diveintomark.org/</uri>
</author>
<title>Accessibility is a harsh mistress</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress'/>
<id>tag:diveintomark.org,2009-03-21:/archives/20090321200928</id>
<updated>2009-03-22T01:05:37Z</updated>
<published>2009-03-21T20:09:28Z</published>
<category scheme="http://diveintomark.org" term="accessibility"/>
<summary type="html">The accessibility orthodoxy does not permit people to
<category scheme='http://diveintomark.org' term='accessibility'/>
<summary type='html'>The accessibility orthodoxy does not permit people to
question the value of features that are rarely useful and rarely used.</summary>
</entry>
<entry>
@@ -44,20 +44,20 @@
<name>Mark</name>
</author>
<title>A gentle introduction to video encoding, part 1: container formats</title>
<link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats"/>
<link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats'/>
<id>tag:diveintomark.org,2008-12-18:/archives/20081218155422</id>
<updated>2009-01-11T19:39:22Z</updated>
<published>2008-12-18T15:54:22Z</published>
<category scheme="http://diveintomark.org" term="asf"/>
<category scheme="http://diveintomark.org" term="avi"/>
<category scheme="http://diveintomark.org" term="encoding"/>
<category scheme="http://diveintomark.org" term="flv"/>
<category scheme="http://diveintomark.org" term="GIVE"/>
<category scheme="http://diveintomark.org" term="mp4"/>
<category scheme="http://diveintomark.org" term="ogg"/>
<category scheme="http://diveintomark.org" term="video"/>
<summary type="html">These notes will eventually become part of a
<category scheme='http://diveintomark.org' term='asf'/>
<category scheme='http://diveintomark.org' term='avi'/>
<category scheme='http://diveintomark.org' term='encoding'/>
<category scheme='http://diveintomark.org' term='flv'/>
<category scheme='http://diveintomark.org' term='GIVE'/>
<category scheme='http://diveintomark.org' term='mp4'/>
<category scheme='http://diveintomark.org' term='ogg'/>
<category scheme='http://diveintomark.org' term='video'/>
<summary type='html'>These notes will eventually become part of a
tech talk on video encoding.</summary>
</entry>
</feed>
+1 -1
@@ -1,4 +1,4 @@
"""Fibonacci generator"""
'''Fibonacci generator'''
def fib(max):
a, b = 0, 1
+2 -2
@@ -1,7 +1,7 @@
"""Fibonacci iterator"""
'''Fibonacci iterator'''
class Fib:
"""iterator that yields numbers in the Fibonacci sequence"""
'''iterator that yields numbers in the Fibonacci sequence'''
def __init__(self, max):
self.max = max
+6 -6
@@ -1,4 +1,4 @@
"""Convert file sizes to human-readable form.
'''Convert file sizes to human-readable form.
Available functions:
approximate_size(size, a_kilobyte_is_1024_bytes)
@@ -10,13 +10,13 @@ Examples:
>>> approximate_size(1000, False)
'1.0 KB'
"""
'''
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
"""Convert a file size to human-readable form.
'''Convert a file size to human-readable form.
Keyword arguments:
size -- file size in bytes
@@ -25,7 +25,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
Returns: string
"""
'''
if size < 0:
raise ValueError('number must be non-negative')
@@ -33,11 +33,11 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
for suffix in SUFFIXES[multiple]:
size /= multiple
if size < multiple:
return "{0:.1f} {1}".format(size, suffix)
return '{0:.1f} {1}'.format(size, suffix)
raise ValueError('number too large')
if __name__ == "__main__":
if __name__ == '__main__':
print(approximate_size(1000000000000, False))
print(approximate_size(1000000000000))
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 1)
'''Pluralize English nouns (stage 1)
Command line usage:
$ python3 plural.py noun
nouns
"""
'''
import re
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 2)
'''Pluralize English nouns (stage 2)
Command line usage:
$ python plural2.py noun
nouns
"""
'''
import re
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 3)
'''Pluralize English nouns (stage 3)
Command line usage:
$ python plural3.py noun
nouns
"""
'''
import re
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 4)
'''Pluralize English nouns (stage 4)
Command line usage:
$ python plural4.py noun
nouns
"""
'''
import re
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 5)
'''Pluralize English nouns (stage 5)
Command line usage:
$ python plural5.py noun
nouns
"""
'''
import re
+2 -2
@@ -1,9 +1,9 @@
"""Pluralize English nouns (stage 6)
'''Pluralize English nouns (stage 6)
Command line usage:
$ python plural6.py noun
nouns
"""
'''
import re
+6 -6
@@ -1,11 +1,11 @@
"""Unit test for plural1.py"""
'''Unit test for plural1.py'''
import plural1
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural1.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural1.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural1.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -54,7 +54,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural1.plural(singular), plural)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+6 -6
@@ -1,11 +1,11 @@
"""Unit test for plural2.py"""
'''Unit test for plural2.py'''
import plural2
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural2.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural2.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural2.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -54,7 +54,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural2.plural(singular), plural)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+6 -6
@@ -1,11 +1,11 @@
"""Unit test for plural1.py"""
'''Unit test for plural1.py'''
import plural3
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural3.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural3.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural3.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -54,7 +54,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural3.plural(singular), plural)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+6 -6
@@ -1,11 +1,11 @@
"""Unit test for plural1.py"""
'''Unit test for plural1.py'''
import plural4
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural4.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural4.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural4.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -54,7 +54,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural4.plural(singular), plural)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+6 -9
@@ -1,11 +1,11 @@
"""Unit test for plural5.py"""
'''Unit test for plural5.py'''
import plural5
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural5.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural5.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural5.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -54,10 +54,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural5.plural(singular), plural)
if __name__ == "__main__":
unittest.main()
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+17 -20
@@ -1,11 +1,11 @@
"""Unit test for plural6.py"""
'''Unit test for plural6.py'''
import plural6
import unittest
class KnownValues(unittest.TestCase):
def test_sxz(self):
"words ending in S, X, and Z"
'words ending in S, X, and Z'
nouns = {
'bass': 'basses',
'bus': 'buses',
@@ -21,7 +21,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_h(self):
"words ending in H"
'words ending in H'
nouns = {
'coach': 'coaches',
'glitch': 'glitches',
@@ -34,7 +34,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_y(self):
"words ending in Y"
'words ending in Y'
nouns = {
'utility': 'utilities',
'vacancy': 'vacancies',
@@ -45,7 +45,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_ouce(self):
"words ending in OUSE"
'words ending in OUSE'
nouns = {
'mouse': 'mice',
'louse': 'lice'
@@ -54,7 +54,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_child(self):
"special case: child"
'special case: child'
nouns = {
'child': 'children'
}
@@ -62,7 +62,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_oot(self):
"special case: foot"
'special case: foot'
nouns = {
'foot': 'feet'
}
@@ -70,7 +70,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_ooth(self):
"words ending in OOTH"
'words ending in OOTH'
nouns = {
'booth': 'booths',
'tooth': 'teeth'
@@ -79,7 +79,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_f_ves(self):
"words ending in F that become VES"
'words ending in F that become VES'
nouns = {
'leaf': 'leaves',
'loaf': 'loaves'
@@ -88,7 +88,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_sis(self):
"words ending in SIS"
'words ending in SIS'
nouns = {
'thesis': 'theses'
}
@@ -96,7 +96,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_man(self):
"words ending in MAN"
'words ending in MAN'
nouns = {
'man': 'men',
'mailman': 'mailmen',
@@ -107,7 +107,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_ife(self):
"words ending in IFE"
'words ending in IFE'
nouns = {
'knife': 'knives',
'wife': 'wives',
@@ -117,7 +117,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_eau(self):
"words ending in EAU"
'words ending in EAU'
nouns = {
'tableau': 'tableaux'
}
@@ -125,7 +125,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_elf(self):
"words ending in ELF"
'words ending in ELF'
nouns = {
'elf': 'elves',
'shelf': 'shelves',
@@ -136,7 +136,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_same(self):
"words that are their own plural"
'words that are their own plural'
nouns = {
'sheep': 'sheep',
'deer': 'deer',
@@ -150,7 +150,7 @@ class KnownValues(unittest.TestCase):
self.assertEqual(plural6.plural(singular), plural)
def test_default(self):
"unexceptional words"
'unexceptional words'
nouns = {
'papaya': 'papayas',
'whip': 'whips',
@@ -159,10 +159,7 @@ class KnownValues(unittest.TestCase):
for singular, plural in nouns.items():
self.assertEqual(plural6.plural(singular), plural)
if __name__ == "__main__":
unittest.main()
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+5 -5
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
roman_numeral_map = (('M', 1000),
('CM', 900),
@@ -20,8 +20,8 @@ roman_numeral_map = (('M', 1000),
('I', 1))
def to_roman(n):
"""convert integer to Roman numeral"""
result = ""
'''convert integer to Roman numeral'''
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
+11 -11
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
class OutOfRangeError(ValueError): pass
class NotIntegerError(ValueError): pass
@@ -27,26 +27,26 @@ to_roman_table = [ None ]
from_roman_table = {}
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 5000):
raise OutOfRangeError("number out of range (must be 1..4999)")
raise OutOfRangeError('number out of range (must be 1..4999)')
if int(n) != n:
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
return to_roman_table[n]
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not s:
raise InvalidRomanNumeralError("Input can not be blank")
raise InvalidRomanNumeralError('Input can not be blank')
if s not in from_roman_table:
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
return from_roman_table[s]
def build_lookup_tables():
def to_roman(n):
result = ""
result = ''
for numeral, integer in roman_numeral_map:
if n >= integer:
result = numeral
+6 -6
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
class OutOfRangeError(ValueError):
pass
@@ -22,11 +22,11 @@ roman_numeral_map = (('M', 1000),
('I', 1))
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if n > 3999:
raise OutOfRangeError("number out of range (must be less than 3999)")
raise OutOfRangeError('number out of range (must be less than 3999)')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
+6 -6
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
class OutOfRangeError(ValueError): pass
roman_numeral_map = (('M', 1000),
@@ -21,11 +21,11 @@ roman_numeral_map = (('M', 1000),
('I', 1))
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
+7 -7
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
class OutOfRangeError(ValueError): pass
class NotIntegerError(ValueError): pass
@@ -22,13 +22,13 @@ roman_numeral_map = (('M', 1000),
('I', 1))
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
+8 -8
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
class OutOfRangeError(ValueError): pass
class NotIntegerError(ValueError): pass
@@ -22,13 +22,13 @@ roman_numeral_map = (('M', 1000),
('I', 1))
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -36,7 +36,7 @@ def to_roman(n):
return result
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
result = 0
index = 0
for numeral, integer in roman_numeral_map:
+11 -11
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import re
class OutOfRangeError(ValueError): pass
@@ -24,7 +24,7 @@ roman_numeral_map = (('M', 1000),
('IV', 4),
('I', 1))
roman_numeral_pattern = re.compile("""
roman_numeral_pattern = re.compile('''
^ # beginning of string
M{0,3} # thousands - 0 to 3 M's
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -34,16 +34,16 @@ roman_numeral_pattern = re.compile("""
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
""", re.VERBOSE)
''', re.VERBOSE)
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -51,9 +51,9 @@ def to_roman(n):
return result
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not roman_numeral_pattern.search(s):
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
result = 0
index = 0
+12 -12
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import re
class OutOfRangeError(ValueError): pass
@@ -24,7 +24,7 @@ roman_numeral_map = (('M', 1000),
('IV', 4),
('I', 1))
roman_numeral_pattern = re.compile("""
roman_numeral_pattern = re.compile('''
^ # beginning of string
M{0,3} # thousands - 0 to 3 M's
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -34,16 +34,16 @@ roman_numeral_pattern = re.compile("""
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
""", re.VERBOSE)
''', re.VERBOSE)
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -51,11 +51,11 @@ def to_roman(n):
return result
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not roman_numeral_pattern.search(s):
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
result = 0
index = 0
+13 -13
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import re
class OutOfRangeError(ValueError): pass
@@ -24,7 +24,7 @@ roman_numeral_map = (('M', 1000),
('IV', 4),
('I', 1))
roman_numeral_pattern = re.compile("""
roman_numeral_pattern = re.compile('''
^ # beginning of string
M{0,3} # thousands - 0 to 3 M's
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -34,16 +34,16 @@ roman_numeral_pattern = re.compile("""
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
""", re.VERBOSE)
''', re.VERBOSE)
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -51,13 +51,13 @@ def to_roman(n):
return result
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not s:
raise InvalidRomanNumeralError("Input can not be blank")
raise InvalidRomanNumeralError('Input can not be blank')
if not roman_numeral_pattern.search(s):
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
result = 0
index = 0
+13 -13
@@ -1,9 +1,9 @@
"""Convert to and from Roman numerals
'''Convert to and from Roman numerals
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import re
class OutOfRangeError(ValueError): pass
@@ -24,7 +24,7 @@ roman_numeral_map = (('M', 1000),
('IV', 4),
('I', 1))
roman_numeral_pattern = re.compile("""
roman_numeral_pattern = re.compile('''
^ # beginning of string
M{0,4} # thousands - 0 to 4 M's
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -34,16 +34,16 @@ roman_numeral_pattern = re.compile("""
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
""", re.VERBOSE)
''', re.VERBOSE)
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 5000):
raise OutOfRangeError("number out of range (must be 0..4999)")
raise OutOfRangeError('number out of range (must be 0..4999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -51,13 +51,13 @@ def to_roman(n):
return result
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not s:
raise InvalidRomanNumeralError("Input can not be blank")
raise InvalidRomanNumeralError('Input can not be blank')
if not roman_numeral_pattern.search(s):
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
result = 0
index = 0
+5 -5
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman1
import unittest
@@ -67,12 +67,12 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman1.to_roman(integer)
self.assertEqual(numeral, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+17 -17
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman10
import unittest
@@ -71,68 +71,68 @@ class KnownValues(unittest.TestCase):
(4999, 'MMMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman10.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman10.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman10.OutOfRangeError, roman10.to_roman, 5000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman10.OutOfRangeError, roman10.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman10.OutOfRangeError, roman10.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman10.NotIntegerError, roman10.to_roman, 0.5)
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
for s in ('MMMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'):
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, s)
def test_repeated_pairs(self):
"""from_roman should fail with repeated pairs of numerals"""
'''from_roman should fail with repeated pairs of numerals'''
for s in ('CMCM', 'CDCD', 'XCXC', 'XLXL', 'IXIX', 'IVIV'):
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, s)
def test_malformed_antecedents(self):
"""from_roman should fail with malformed antecedents"""
'''from_roman should fail with malformed antecedents'''
for s in ('IIMXCC', 'VX', 'DCM', 'CMM', 'IXIV',
'MCMC', 'XCX', 'IVI', 'LM', 'LD', 'LC'):
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, s)
def test_blank(self):
"""from_roman should fail with blank string"""
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, "")
'''from_roman should fail with blank string'''
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, '')
def test_non_string(self):
"""from_roman should fail with non-string input"""
'''from_roman should fail with non-string input'''
self.assertRaises(roman10.InvalidRomanNumeralError, roman10.from_roman, 1)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 5000):
numeral = roman10.to_roman(integer)
result = roman10.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+6 -6
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman2
import unittest
@@ -67,17 +67,17 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman2.to_roman(integer)
self.assertEqual(numeral, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman2.OutOfRangeError, roman2.to_roman, 4000)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+8 -8
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman3
import unittest
@@ -67,25 +67,25 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman3.to_roman(integer)
self.assertEqual(numeral, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, -1)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+9 -9
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman4
import unittest
@@ -67,29 +67,29 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman4.to_roman(integer)
self.assertEqual(numeral, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman4.OutOfRangeError, roman4.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman4.OutOfRangeError, roman4.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman4.OutOfRangeError, roman4.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman4.NotIntegerError, roman4.to_roman, 0.5)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+11 -11
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman5
import unittest
@@ -67,43 +67,43 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman5.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman5.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman5.OutOfRangeError, roman5.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman5.OutOfRangeError, roman5.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman5.OutOfRangeError, roman5.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman5.NotIntegerError, roman5.to_roman, 0.5)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 4000):
numeral = roman5.to_roman(integer)
result = roman5.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+14 -14
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman6
import unittest
@@ -67,60 +67,60 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman6.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman6.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman6.OutOfRangeError, roman6.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman6.OutOfRangeError, roman6.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman6.OutOfRangeError, roman6.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman6.NotIntegerError, roman6.to_roman, 0.5)
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
for s in ('MMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'):
self.assertRaises(roman6.InvalidRomanNumeralError, roman6.from_roman, s)
def test_repeated_pairs(self):
"""from_roman should fail with repeated pairs of numerals"""
'''from_roman should fail with repeated pairs of numerals'''
for s in ('CMCM', 'CDCD', 'XCXC', 'XLXL', 'IXIX', 'IVIV'):
self.assertRaises(roman6.InvalidRomanNumeralError, roman6.from_roman, s)
def test_malformed_antecedents(self):
"""from_roman should fail with malformed antecedents"""
'''from_roman should fail with malformed antecedents'''
for s in ('IIMXCC', 'VX', 'DCM', 'CMM', 'IXIV',
'MCMC', 'XCX', 'IVI', 'LM', 'LD', 'LC'):
self.assertRaises(roman6.InvalidRomanNumeralError, roman6.from_roman, s)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 4000):
numeral = roman6.to_roman(integer)
result = roman6.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+15 -15
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman7
import unittest
@@ -67,64 +67,64 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman7.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman7.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman7.OutOfRangeError, roman7.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman7.OutOfRangeError, roman7.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman7.OutOfRangeError, roman7.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman7.NotIntegerError, roman7.to_roman, 0.5)
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
for s in ('MMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'):
self.assertRaises(roman7.InvalidRomanNumeralError, roman7.from_roman, s)
def test_repeated_pairs(self):
"""from_roman should fail with repeated pairs of numerals"""
'''from_roman should fail with repeated pairs of numerals'''
for s in ('CMCM', 'CDCD', 'XCXC', 'XLXL', 'IXIX', 'IVIV'):
self.assertRaises(roman7.InvalidRomanNumeralError, roman7.from_roman, s)
def test_malformed_antecedents(self):
"""from_roman should fail with malformed antecedents"""
'''from_roman should fail with malformed antecedents'''
for s in ('IIMXCC', 'VX', 'DCM', 'CMM', 'IXIV',
'MCMC', 'XCX', 'IVI', 'LM', 'LD', 'LC'):
self.assertRaises(roman7.InvalidRomanNumeralError, roman7.from_roman, s)
def test_non_string(self):
"""from_roman should fail with non-string input"""
'''from_roman should fail with non-string input'''
self.assertRaises(roman7.InvalidRomanNumeralError, roman7.from_roman, 1)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 4000):
numeral = roman7.to_roman(integer)
result = roman7.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+17 -17
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman8
import unittest
@@ -67,68 +67,68 @@ class KnownValues(unittest.TestCase):
(3999, 'MMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman8.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman8.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman8.OutOfRangeError, roman8.to_roman, 4000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman8.OutOfRangeError, roman8.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman8.OutOfRangeError, roman8.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman8.NotIntegerError, roman8.to_roman, 0.5)
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
for s in ('MMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'):
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, s)
def test_repeated_pairs(self):
"""from_roman should fail with repeated pairs of numerals"""
'''from_roman should fail with repeated pairs of numerals'''
for s in ('CMCM', 'CDCD', 'XCXC', 'XLXL', 'IXIX', 'IVIV'):
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, s)
def test_malformed_antecedents(self):
"""from_roman should fail with malformed antecedents"""
'''from_roman should fail with malformed antecedents'''
for s in ('IIMXCC', 'VX', 'DCM', 'CMM', 'IXIV',
'MCMC', 'XCX', 'IVI', 'LM', 'LD', 'LC'):
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, s)
def test_blank(self):
"""from_roman should fail with blank string"""
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, "")
'''from_roman should fail with blank string'''
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, '')
def test_non_string(self):
"""from_roman should fail with non-string input"""
'''from_roman should fail with non-string input'''
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, 1)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 4000):
numeral = roman8.to_roman(integer)
result = roman8.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+17 -17
@@ -1,9 +1,9 @@
"""Unit test for roman1.py
'''Unit test for roman1.py
This program is part of "Dive Into Python 3", a free Python book for
This program is part of 'Dive Into Python 3', a free Python book for
experienced programmers. Visit http://diveintopython3.org/ for the
latest version.
"""
'''
import roman9
import unittest
@@ -71,68 +71,68 @@ class KnownValues(unittest.TestCase):
(4999, 'MMMMCMXCIX'))
def test_to_roman_known_values(self):
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman9.to_roman(integer)
self.assertEqual(numeral, result)
def test_from_roman_known_values(self):
"""from_roman should give known result with known input"""
'''from_roman should give known result with known input'''
for integer, numeral in self.known_values:
result = roman9.from_roman(numeral)
self.assertEqual(integer, result)
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
self.assertRaises(roman9.OutOfRangeError, roman9.to_roman, 5000)
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
self.assertRaises(roman9.OutOfRangeError, roman9.to_roman, 0)
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
self.assertRaises(roman9.OutOfRangeError, roman9.to_roman, -1)
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
self.assertRaises(roman9.NotIntegerError, roman9.to_roman, 0.5)
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
for s in ('MMMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'):
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, s)
def test_repeated_pairs(self):
"""from_roman should fail with repeated pairs of numerals"""
'''from_roman should fail with repeated pairs of numerals'''
for s in ('CMCM', 'CDCD', 'XCXC', 'XLXL', 'IXIX', 'IVIV'):
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, s)
def test_malformed_antecedents(self):
"""from_roman should fail with malformed antecedents"""
'''from_roman should fail with malformed antecedents'''
for s in ('IIMXCC', 'VX', 'DCM', 'CMM', 'IXIV',
'MCMC', 'XCX', 'IVI', 'LM', 'LD', 'LC'):
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, s)
def test_blank(self):
"""from_roman should fail with blank string"""
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, "")
'''from_roman should fail with blank string'''
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, '')
def test_non_string(self):
"""from_roman should fail with non-string input"""
'''from_roman should fail with non-string input'''
self.assertRaises(roman9.InvalidRomanNumeralError, roman9.from_roman, 1)
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
for integer in range(1, 5000):
numeral = roman9.to_roman(integer)
result = roman9.from_roman(numeral)
self.assertEqual(integer, result)
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2009, Mark Pilgrim, All rights reserved.
+1 -1
@@ -26,7 +26,7 @@ body{counter-reset:h1 12}
OK, so a string is a sequence of Unicode characters. But a file on disk is not a sequence of Unicode characters; a file on disk is a sequence of bytes. So if you read a &#8220;text file&#8221; from disk, how does Python convert that sequence of bytes into a sequence of characters? The answer is that it decodes the bytes according to a specific character encoding algorithm, and returns a sequence of Unicode characters, otherwise known as a string.
-->
<p class=nav><a rel=prev href=advanced-classes.html title="back to &#8220;Advanced Classes&#8221;"><span>&#x261C;</span></a> <a rel=next href=xml.html title="onward to &#8220;XML&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=advanced-classes.html rel=prev title='back to &#8220;Advanced Classes&#8221;'><span>&#x261C;</span></a> <a href=xml.html rel=next title='onward to &#8220;XML&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
+1 -1
@@ -406,7 +406,7 @@ def plural(noun):
<li><a href=http://www.python.org/dev/peps/pep-0255/>PEP 255: Simple Generators</a>
</ul>
<p class=nav><a rel=prev href=regular-expressions.html title="back to &#8220;Regular Expressions&#8221;"><span>&#x261C;</span></a> <a rel=next href=iterators.html title="onward to &#8220;Iterators&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=regular-expressions.html rel=prev title='back to &#8220;Regular Expressions&#8221;'><span>&#x261C;</span></a> <a href=iterators.html rel=next title='onward to &#8220;Iterators&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
+13 -12
@@ -30,7 +30,7 @@ mark{display:inline}
<li><a href=http://code.google.com/apis/gdata/>Google Data <abbr>API</abbr>s</a> allow you to interact with a wide variety of Google services, including <a href=http://www.blogger.com/>Blogger</a> and <a href=http://www.youtube.com/>YouTube</a>.
<li><a href=http://www.flickr.com/services/api/>Flickr Services</a> allow you to upload and download photos from <a href=http://www.flickr.com/>Flickr</a>.
<li><a href=http://apiwiki.twitter.com/>Twitter <abbr>API</abbr></a> allows you to publish status updates on <a href=http://twitter.com/>Twitter</a>.
<li><a href="http://www.programmableweb.com/apis/directory/1?sort=mashups">&hellip;and many more</a>
<li><a href='http://www.programmableweb.com/apis/directory/1?sort=mashups'>&hellip;and many more</a>
</ul>
<p>Python 3 comes with two different libraries for interacting with <abbr>HTTP</abbr> web services:
@@ -153,7 +153,7 @@ Cache-Control: max-age=31536000, public</samp></pre>
<h3 id=compression>Compression</h3>
<p>When you talk about <abbr>HTTP</abbr> web services, you&#8217;re almost always talking about moving text-based data back and forth over the wire. Maybe it&#8217;s <abbr>XML</abbr>, maybe it&#8217;s <abbr>JSON</abbr>, maybe it&#8217;s just <a href=strings.html#boring-stuff title="there ain&#8217;t no such thing as plain text">plain text</a>. Regardless of the format, text compresses well. The example feed in <a href=xml.html>the XML chapter</a> is 3070 bytes uncompressed, but would be 941 bytes after gzip compression. That&#8217;s just 31% of the original size!
<p>When you talk about <abbr>HTTP</abbr> web services, you&#8217;re almost always talking about moving text-based data back and forth over the wire. Maybe it&#8217;s <abbr>XML</abbr>, maybe it&#8217;s <abbr>JSON</abbr>, maybe it&#8217;s just <a href=strings.html#boring-stuff title='there ain&#8217;t no such thing as plain text'>plain text</a>. Regardless of the format, text compresses well. The example feed in <a href=xml.html>the XML chapter</a> is 3070 bytes uncompressed, but would be 941 bytes after gzip compression. That&#8217;s just 31% of the original size!
<p><abbr>HTTP</abbr> supports several compression algorithms. The two most common types are <a href=http://www.ietf.org/rfc/rfc1952.txt>gzip</a> and <a href=http://www.ietf.org/rfc/rfc1951.txt>deflate</a>. When you request a resource over <abbr>HTTP</abbr>, you can ask the server to send it in compressed format. You include an <code>Accept-encoding</code> header in your request that lists which compression algorithms you support. If the server supports any of the same algorithms, it will send you back compressed data (with a <code>Content-encoding</code> header that tells you which algorithm it used). Then it&#8217;s up to you to decompress the data.
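<p>As a rough sketch of the &#8220;it&#8217;s up to you to decompress the data&#8221; step, here is how a client might pick a decoder from the <code>Content-encoding</code> value. The <code>decode_body()</code> helper and the sample payload are hypothetical, made up for this illustration; no network request is involved, and only gzip is handled.

```python
import gzip

def decode_body(body, content_encoding):
    '''Return the decompressed body for a given Content-encoding value.

    Hypothetical helper for illustration; handles only gzip.
    '''
    if content_encoding == 'gzip':
        return gzip.decompress(body)
    # no (or unrecognized) encoding: hand the bytes back untouched
    return body

# made-up payload standing in for a real feed
original = ('<entry><title>dive into mark</title></entry>\n' * 50).encode('utf-8')

# what a server might send when the request said "Accept-encoding: gzip"
compressed = gzip.compress(original)

print(len(original), len(compressed))
print(decode_body(compressed, 'gzip') == original)
```

<p>Repetitive text like this compresses far better than a real feed would, so don&#8217;t read the size ratio here as typical.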
@@ -190,13 +190,13 @@ Cache-Control: max-age=31536000, public</samp></pre>
<samp class=p>>>> </samp><kbd>import urllib.request</kbd>
<a><samp class=p>>>> </samp><kbd>data = urllib.request.urlopen('http://diveintopython3.org/examples/feed.xml').read()</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>print(data)</kbd>
<samp>&lt;?xml version="1.0" encoding="utf-8"?>
&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<samp>&lt;?xml version='1.0' encoding='utf-8'?>
&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
&lt;title>dive into mark&lt;/title>
&lt;subtitle>currently between addictions&lt;/subtitle>
&lt;id>tag:diveintomark.org,2001-07-29:/&lt;/id>
&lt;updated>2009-03-27T21:56:07Z&lt;/updated>
&lt;link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
&lt;link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
&hellip;
</samp></pre>
<ol>
@@ -320,7 +320,7 @@ Content-Type: application/xml</samp>
<samp class=p>>>> </samp><kbd>response.status</kbd>
<samp>200</samp>
<samp class=p>>>> </samp><kbd>content[:52]</kbd>
<samp>b'&lt;?xml version="1.0" encoding="utf-8"?>\r\n&lt;feed xmlns='</samp>
<samp>b"&lt;?xml version='1.0' encoding='utf-8'?>\r\n&lt;feed xmlns="</samp>
<samp class=p>>>> </samp><kbd>len(content)</kbd>
<samp>3070</samp></pre>
<ol>
@@ -337,7 +337,7 @@ Content-Type: application/xml</samp>
<samp class=p>>>> </samp><kbd>response2.status</kbd>
<samp>200</samp>
<samp class=p>>>> </samp><kbd>content2[:52]</kbd>
<samp>b'&lt;?xml version="1.0" encoding="utf-8"?>\r\n&lt;feed xmlns='</samp>
<samp>b"&lt;?xml version='1.0' encoding='utf-8'?>\r\n&lt;feed xmlns="</samp>
<samp class=p>>>> </samp><kbd>len(content2)</kbd>
<samp>3070</samp></pre>
<ol>
@@ -551,9 +551,9 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
<samp class=p>>>> </samp><kbd>import httplib2</kbd>
<samp class=p>>>> </samp><kbd>from urllib.parse import urlencode</kbd>
<samp class=p>>>> </samp><kbd>h = httplib2.Http('.cache')</kbd>
<samp class=p>>>> </samp><kbd>data = {"status": "Test update from Python 3"}</kbd>
<samp class=p>>>> </samp><kbd>h.add_credentials("diveintomark", "<var>MY_SECRET_PASSWORD</var>")</kbd>
<samp class=p>>>> </samp><kbd>resp, content = h.request("http://twitter.com/statuses/update.xml", "POST", urlencode(data))</kbd>
<samp class=p>>>> </samp><kbd>data = {'status': 'Test update from Python 3'}</kbd>
<samp class=p>>>> </samp><kbd>h.add_credentials('diveintomark', '<var>MY_SECRET_PASSWORD</var>')</kbd>
<samp class=p>>>> </samp><kbd>resp, content = h.request('http://twitter.com/statuses/update.xml', 'POST', urlencode(data))</kbd>
<samp class=p>>>> </samp><kbd>resp.status</kbd>
<samp>200</samp>
<samp class=p>>>> </samp><kbd>from xml.etree import ElementTree as etree</kbd>
@@ -608,9 +608,9 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
<pre class=screen>
# continued from the previous example
<samp class=p>>>> </samp><kbd>tree.findtext("id")</kbd>
<samp class=p>>>> </samp><kbd>tree.findtext('id')</kbd>
<samp>'1973974228'</samp>
<samp class=p>>>> </samp><kbd>resp, delete_content = h.request("http://twitter.com/statuses/destroy/{0}.xml".format(tree.findtext("id")), "DELETE")</kbd>
<samp class=p>>>> </samp><kbd>resp, delete_content = h.request('http://twitter.com/statuses/destroy/{0}.xml'.format(tree.findtext('id')), 'DELETE')</kbd>
<samp class=p>>>> </samp><kbd>resp.status</kbd>
<samp>200</samp></pre>
@@ -627,6 +627,7 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
<li><a href=http://code.google.com/p/doctype/wiki/ArticleHttpCaching>How to control caching with <abbr>HTTP</abbr> headers</a> on Google Doctype
</ul>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+6 -6
@@ -5,10 +5,10 @@
<link rel=alternate type=application/atom+xml href=http://hg.diveintopython3.org/atom-log>
<link rel=stylesheet href=dip3.css>
<style>
h1:before{content:""}
h1:before{content:''}
#a,#b{list-style:none;margin:0 0 0 -1.7em}
#a:before{content:"A. \00a0 \00a0"}
#b:before{content:"B. \00a0 \00a0"}
#a:before{content:'A. \00a0 \00a0'}
#b:before{content:'B. \00a0 \00a0'}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
@@ -16,7 +16,7 @@ h1:before{content:""}
</head>
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8><input name=q size=25>&nbsp;<input type=submit name=sa value=Search></div></form>
<p>You are here:&nbsp;&nbsp;<span title="Ce n'est pas un point" style="cursor:default">&bull;</span>
<p>You are here:&nbsp;&nbsp;<span title="Ce n'est pas un point" style='cursor:default'>&bull;</span>
<h1>Dive Into Python 3</h1>
@@ -51,12 +51,12 @@ h1:before{content:""}
<li id=b><a href=special-method-names.html>Special Method Names</a>
</ol>
<p>There is a <a href=http://hg.diveintopython3.org/>changelog</a>, a <a type=application/atom+xml href=http://hg.diveintopython3.org/atom-log>feed</a>, and <a href="http://www.reddit.com/search?q=%22Dive+Into+Python+3%22&amp;sort=new">discussion on Reddit</a>. During development, you can download the book by cloning the Mercurial repository:
<p>There is a <a href=http://hg.diveintopython3.org/>changelog</a>, a <a type=application/atom+xml href=http://hg.diveintopython3.org/atom-log>feed</a>, and <a href='http://www.reddit.com/search?q=%22Dive+Into+Python+3%22&amp;sort=new'>discussion on Reddit</a>. During development, you can download the book by cloning the Mercurial repository:
<pre><samp class=p>you@localhost:~$ </samp><kbd>hg clone http://hg.diveintopython3.org/ diveintopython3</kbd></pre>
<p>The final version will be downloadable as <abbr>HTML</abbr> and <abbr>PDF</abbr>.
<p class="c nm">This site is optimized for Lynx just because fuck you.<br>I&#8217;m told it also looks good in graphical browsers.
<p class='c nm'>This site is optimized for Lynx just because fuck you.<br>I&#8217;m told it also looks good in graphical browsers.
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
+10 -11
@@ -6,7 +6,7 @@
<link rel=stylesheet href=dip3.css>
<style>
body{counter-reset:h1 -1}
h1:before{counter-increment:h1;content:""}
h1:before{counter-increment:h1;content:''}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
@@ -59,16 +59,15 @@ h1:before{counter-increment:h1;content:""}
<h2 id=furtherreading>Further Reading</h2>
<ul>
<li><a href="http://webtypography.net/toc/">The Elements of Typographic Style Applied to the Web</a>
<li><a href="http://www.alistapart.com/articles/settingtypeontheweb">Setting Type on the Web to a Baseline Grid</a>
<li><a href="http://24ways.org/2006/compose-to-a-vertical-rhythm">Compose to a Vertical Rhythm</a>
<li><a href="http://simplebits.com/notebook/2008/08/14/ampersands.html">Use the Best Available Ampersand</a>
<li><a href="http://alanwood.net/unicode/">Unicode Support in HTML, Fonts, and Web Browsers</a>
<li><a href="http://developer.yahoo.com/yslow/">YSlow</a> for <a href="http://getfirebug.com/">Firebug</a>
<li><a href="http://developer.yahoo.com/performance/rules.html">Best Practices for Speeding Up Your Web Site</a>
<li><a href="http://stevesouders.com/hpws/rules.php">14 Rules for Faster-Loading Web Sites</a>
<li><a href="http://developer.yahoo.com/yui/compressor/">YUI Compressor</a>
<li><a href="http://code.google.com/apis/ajaxlibs/">Google AJAX Libraries API</a>
<li><a href='http://webtypography.net/toc/'>The Elements of Typographic Style Applied to the Web</a>
<li><a href='http://www.alistapart.com/articles/settingtypeontheweb'>Setting Type on the Web to a Baseline Grid</a>
<li><a href='http://24ways.org/2006/compose-to-a-vertical-rhythm'>Compose to a Vertical Rhythm</a>
<li><a href='http://simplebits.com/notebook/2008/08/14/ampersands.html'>Use the Best Available Ampersand</a>
<li><a href='http://alanwood.net/unicode/'>Unicode Support in HTML, Fonts, and Web Browsers</a>
<li><a href='http://developer.yahoo.com/yslow/'>YSlow</a> for <a href='http://getfirebug.com/'>Firebug</a>
<li><a href='http://developer.yahoo.com/performance/rules.html'>Best Practices for Speeding Up Your Web Site</a>
<li><a href='http://stevesouders.com/hpws/rules.php'>14 Rules for Faster-Loading Web Sites</a>
<li><a href='http://developer.yahoo.com/yui/compressor/'>YUI Compressor</a>
</ul>
-->
+5 -5
@@ -26,7 +26,7 @@ body{counter-reset:h1 6}
<p class=d>[<a href=examples/fibonacci2.py>download <code>fibonacci2.py</code></a>]
<pre><code>class Fib:
"""iterator that yields numbers in the Fibonacci sequence"""
'''iterator that yields numbers in the Fibonacci sequence'''
def __init__(self, max):
self.max = max
@@ -67,7 +67,7 @@ class PapayaWhip: <span>&#x2460;</span>
<p>This <code>PapayaWhip</code> class doesn&#8217;t define any methods or attributes, but syntactically, there needs to be something in the definition, thus the <code>pass</code> statement. This is a Python reserved word that just means &#8220;move along, nothing to see here&#8221;. It&#8217;s a statement that does nothing, and it&#8217;s a good placeholder when you&#8217;re stubbing out functions or classes.
<blockquote class="note compare java">
<blockquote class='note compare java'>
<p><span>&#x261E;</span>The <code>pass</code> statement in Python is like an empty set of curly braces (<code>{}</code>) in Java or C.
</blockquote>
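<p>A minimal sketch of <code>pass</code> as a placeholder, both in a class and in a function. The function name <code>not_written_yet()</code> is invented for this example.

```python
class PapayaWhip:
    pass                     # class body that does nothing

def not_written_yet():
    pass                     # stubbed-out function body

whip = PapayaWhip()          # the empty class can still be instantiated
result = not_written_yet()   # a function whose body is only pass returns None

print(type(whip).__name__)
print(result)
```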
@@ -79,7 +79,7 @@ class PapayaWhip: <span>&#x2460;</span>
<pre><code>
class Fib:
<a> """iterator that yields numbers in the Fibonacci sequence""" <span>&#x2460;</span></a>
<a> '''iterator that yields numbers in the Fibonacci sequence''' <span>&#x2460;</span></a>
<a> def __init__(self, max): <span>&#x2461;</span></a></code></pre>
<ol>
@@ -112,7 +112,7 @@ class Fib:
<li>You can access the instance&#8217;s <code>docstring</code> just as with a function or a module. All instances of a class share the same <code>docstring</code>.
</ol>
<blockquote class="note compare java">
<blockquote class='note compare java'>
<p><span>&#x261E;</span>In Python, simply call a class as if it were a function to create a new instance of the class. There is no explicit <code>new</code> operator as there is in <abbr>C++</abbr> or Java.
</blockquote>
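<p>To make the last two points concrete, here is a short sketch using a cut-down <code>Fib</code> class (only the <code>docstring</code> and <code>__init__()</code> from the listing above; the iterator methods are left out, and the variable names are arbitrary).

```python
class Fib:
    '''iterator that yields numbers in the Fibonacci sequence'''

    def __init__(self, max):
        self.max = max

# calling the class like a function creates an instance; there is no "new"
fib = Fib(100)
other = Fib(5)

# each instance has its own attributes...
print(fib.max, other.max)

# ...but the docstring lives on the class and is shared by every instance
print(fib.__doc__ is other.__doc__ is Fib.__doc__)
```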
@@ -374,7 +374,7 @@ rules = LazyRules()</code></pre>
<li><a href=http://www.python.org/dev/peps/pep-0255/>PEP 255: Simple Generators</a>
</ul>
<p class=nav><a rel=prev href=generators.html title="back to &#8220;Generators&#8221;"><span>&#x261C;</span></a> <a rel=next href=advanced-iterators.html title="onward to &#8220;Advanced Iterators&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=generators.html rel=prev title='back to &#8220;Generators&#8221;'><span>&#x261C;</span></a> <a href=advanced-iterators.html rel=next title='onward to &#8220;Advanced Iterators&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
+25 -25
@@ -93,7 +93,7 @@ body{counter-reset:h1 2}
<li>Floating point numbers are accurate to about 15 significant decimal digits.
<li>Integers can be arbitrarily large.
</ol>
<blockquote class="note compare python2">
<blockquote class='note compare python2'>
<p><span>&#x261E;</span>Python 2 had separate types for <code>int</code> and <code>long</code>. The <code>int</code> datatype was limited by <code>sys.maxint</code>, which varied by platform but was usually <code>2<sup>31</sup>-1</code> on 32-bit platforms. Python 3 has just one integer type, which behaves mostly like the old <code>long</code> type from Python 2. See <a href=http://www.python.org/dev/peps/pep-0237><abbr>PEP</abbr> 237</a> for details.
</blockquote>
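<p>A minimal sketch of what &#8220;arbitrarily large&#8221; means in practice. The variable names are illustrative and not taken from the book.

```python
# Python 3 has a single int type with no fixed upper bound.
big = 2 ** 64          # already past the range of a 64-bit machine word
bigger = big * big     # still an exact int; nothing overflows

print(type(big).__name__)   # int
print(bigger)               # 340282366920938463463374607431768211456
```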
<h3 id=common-numerical-operations>Common Numerical Operations</h3>
@@ -120,7 +120,7 @@ body{counter-reset:h1 2}
<li>The <code>**</code> operator means &#8220;raised to the power of.&#8221; <code>11<sup>2</sup></code> is <code>121</code>.
<li>The <code>%</code> operator gives the remainder after performing integer division. <code>11</code> divided by <code>2</code> is <code>5</code> with a remainder of <code>1</code>, so the result here is <code>1</code>.
</ol>
<blockquote class="note compare python2">
<blockquote class='note compare python2'>
<p><span>&#x261E;</span>In Python 2, the <code>/</code> operator usually meant integer division, but you could make it behave like floating point division by including a special directive in your code. In Python 3, the <code>/</code> operator always means floating point division. See <a href=http://www.python.org/dev/peps/pep-0238/><abbr>PEP</abbr> 238</a> for details.
</blockquote>
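<p>As a quick illustration of the note above, this sketch contrasts the two division operators in Python 3. The variable names are made up for the example.

```python
true_quotient = 11 / 2     # always a float in Python 3
floor_quotient = 11 // 2   # floor division, an int for int operands
negative_floor = -11 // 2  # floors toward negative infinity

print(true_quotient)   # 5.5
print(floor_quotient)  # 5
print(negative_floor)  # -6
```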
<h3 id=fractions>Fractions</h3>
@@ -161,9 +161,9 @@ body{counter-reset:h1 2}
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd> <span>&#x2460;</span></a>
<samp class=p>... </samp><kbd> if anything:</kbd>
<samp class=p>... </samp><kbd> print("yes, it's true")</kbd>
<samp class=p>... </samp><kbd> print('yes, it\'s true')</kbd>
<samp class=p>... </samp><kbd> else:</kbd>
<samp class=p>... </samp><kbd> print("no, it's false")</kbd>
<samp class=p>... </samp><kbd> print('no, it\'s false')</kbd>
<samp class=p>...</samp>
<a><samp class=p>>>> </samp><kbd>is_it_true(1)</kbd> <span>&#x2461;</span></a>
<samp>yes, it's true</samp>
@@ -190,10 +190,10 @@ body{counter-reset:h1 2}
<h2 id=lists>Lists</h2>
<p>Lists are Python&#8217;s workhorse datatype. When I say &#8220;list,&#8221; you might be thinking &#8220;array whose size I have to declare in advance, that can only contain items of the same type, <i class=baa>&amp;</i>c.&#8221; Don&#8217;t think that. Lists are much cooler than that.
<blockquote class="note compare perl5">
<blockquote class='note compare perl5'>
<p><span>&#x261E;</span>A list in Python is like an array in Perl 5. In Perl 5, variables that store arrays always start with the <code>@</code> character; in Python, variables can be named anything, and Python keeps track of the datatype internally.
</blockquote>
<blockquote class="note compare java">
<blockquote class='note compare java'>
<p><span>&#x261E;</span>A list in Python is much more than an array in Java (although it can be used as one if that&#8217;s really all you want out of life). A better analogy would be to the <code>ArrayList</code> class, which can hold arbitrary objects and can expand dynamically as new items are added.
</blockquote>
<h3 id=creatinglists>Creating A List</h3>
@@ -316,9 +316,9 @@ ValueError: list.index(x): x not in list</samp></pre>
<pre class=screen>
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
<samp class=p>... </samp><kbd> if anything:</kbd>
<samp class=p>... </samp><kbd> print("yes, it's true")</kbd>
<samp class=p>... </samp><kbd> print('yes, it\'s true')</kbd>
<samp class=p>... </samp><kbd> else:</kbd>
<samp class=p>... </samp><kbd> print("no, it's false")</kbd>
<samp class=p>... </samp><kbd> print('no, it\'s false')</kbd>
<samp class=p>...</samp>
<a><samp class=p>>>> </samp><kbd>is_it_true([])</kbd> <span>&#x2461;</span></a>
<samp>no, it's false</samp>
@@ -341,44 +341,44 @@ ValueError: list.index(x): x not in list</samp></pre>
<h2 id=dictionaries>Dictionaries</h2>
<p>One of Python&#8217;s most important datatypes is the dictionary, which defines one-to-one relationships between keys and values.
<blockquote class="note compare perl5">
<blockquote class='note compare perl5'>
<p><span>&#x261E;</span>A dictionary in Python is like a hash in Perl 5. In Perl 5, variables that store hashes always start with a <code>%</code> character. In Python, variables can be named anything, and Python keeps track of the datatype internally.
</blockquote>
<h3 id=creating-dictionaries>Creating A Dictionary</h3>
<p>Creating a dictionary is easy. The syntax is similar to <a href=#sets>sets</a>, but instead of values, you have key-value pairs. Once you have a dictionary, you can look up values by their key.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>a_dict = {"server":"db.diveintopython3.org", "database":"mysql"}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict = {'server':'db.diveintopython3.org', 'database':'mysql'}</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
<a><samp class=p>>>> </samp><kbd>a_dict["server"]</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['server']</kbd> <span>&#x2461;</span></a>
'db.diveintopython3.org'
<a><samp class=p>>>> </samp><kbd>a_dict["database"]</kbd> <span>&#x2462;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['database']</kbd> <span>&#x2462;</span></a>
'mysql'
<a><samp class=p>>>> </samp><kbd>a_dict["db.diveintopython3.org"]</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['db.diveintopython3.org']</kbd> <span>&#x2463;</span></a>
<samp class=traceback>Traceback (most recent call last):
File "&lt;stdin>", line 1, in &lt;module>
KeyError: 'db.diveintopython3.org'</samp></pre>
<ol>
<li>First, you create a new dictionary with two items and assign it to the variable <var>a_dict</var>. Each item is a key-value pair, and the whole set of items is enclosed in curly braces.
<li><code>'server'</code> is a key, and its associated value, referenced by <code>a_dict["server"]</code>, is <code>'db.diveintopython3.org'</code>.
<li><code>'database'</code> is a key, and its associated value, referenced by <code>a_dict["database"]</code>, is <code>'mysql'</code>.
<li>You can get values by key, but you can&#8217;t get keys by value. So <code>a_dict["server"]</code> is <code>'db.diveintopython3.org'</code>, but <code>a_dict["db.diveintopython3.org"]</code> raises an exception, because <code>'db.diveintopython3.org'</code> is not a key.
<li><code>'server'</code> is a key, and its associated value, referenced by <code>a_dict['server']</code>, is <code>'db.diveintopython3.org'</code>.
<li><code>'database'</code> is a key, and its associated value, referenced by <code>a_dict['database']</code>, is <code>'mysql'</code>.
<li>You can get values by key, but you can&#8217;t get keys by value. So <code>a_dict['server']</code> is <code>'db.diveintopython3.org'</code>, but <code>a_dict['db.diveintopython3.org']</code> raises an exception, because <code>'db.diveintopython3.org'</code> is not a key.
</ol>
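<p>Because a dictionary only indexes by key, finding keys by value means scanning the items yourself. The following is a minimal sketch, not something from the book; <code>keys_for_value</code> is a hypothetical helper name.

```python
a_dict = {'server': 'db.diveintopython3.org', 'database': 'mysql'}

def keys_for_value(d, wanted):
    '''Return every key in d whose value equals wanted (hypothetical helper).'''
    return [key for key, value in d.items() if value == wanted]

print(keys_for_value(a_dict, 'mysql'))        # ['database']
print(a_dict.get('db.diveintopython3.org'))   # None, instead of a KeyError
```

<p>The <code>get()</code> method is a standard way to probe for a key that may be missing without raising an exception.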
<h3 id=modifying-dictionaries>Modifying A Dictionary</h3>
<p>Dictionaries do not have any predefined size limit. You can add new key-value pairs to a dictionary at any time, or you can modify the value of an existing key. Continuing from the previous example:
<pre class=screen>
<samp class=p>>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
<a><samp class=p>>>> </samp><kbd>a_dict["database"] = "blog"</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['database'] = 'blog'</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog'}</samp>
<a><samp class=p>>>> </samp><kbd>a_dict["user"] = "mark"</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['user'] = 'mark'</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict</kbd> <span>&#x2462;</span></a>
<samp>{'server': 'db.diveintopython3.org', 'user': 'mark', 'database': 'blog'}</samp>
<a><samp class=p>>>> </samp><kbd>a_dict["user"] = "dora"</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['user'] = 'dora'</kbd> <span>&#x2463;</span></a>
<samp class=p>>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp>
<a><samp class=p>>>> </samp><kbd>a_dict["User"] = "mark"</kbd> <span>&#x2464;</span></a>
<a><samp class=p>>>> </samp><kbd>a_dict['User'] = 'mark'</kbd> <span>&#x2464;</span></a>
<samp class=p>>>> </samp><kbd>a_dict</kbd>
<samp>{'User': 'mark', 'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp></pre>
<ol>
@@ -417,9 +417,9 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
<pre class=screen>
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
<samp class=p>... </samp><kbd> if anything:</kbd>
<samp class=p>... </samp><kbd> print("yes, it's true")</kbd>
<samp class=p>... </samp><kbd> print('yes, it\'s true')</kbd>
<samp class=p>... </samp><kbd> else:</kbd>
<samp class=p>... </samp><kbd> print("no, it's false")</kbd>
<samp class=p>... </samp><kbd> print('no, it\'s false')</kbd>
<samp class=p>...</samp>
<a><samp class=p>>>> </samp><kbd>is_it_true({})</kbd> <span>&#x2460;</span></a>
<samp>no, it's false</samp>
@@ -457,9 +457,9 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
<pre class=screen>
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
<samp class=p>... </samp><kbd> if anything:</kbd>
<samp class=p>... </samp><kbd> print("yes, it's true")</kbd>
<samp class=p>... </samp><kbd> print('yes, it\'s true')</kbd>
<samp class=p>... </samp><kbd> else:</kbd>
<samp class=p>... </samp><kbd> print("no, it's false")</kbd>
<samp class=p>... </samp><kbd> print('no, it\'s false')</kbd>
<samp class=p>...</samp>
<samp class=p>>>> </samp><kbd>is_it_true(None)</kbd>
<samp>no, it's false</samp>
@@ -474,7 +474,7 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
<li><a href=http://www.python.org/dev/peps/pep-0237/><abbr>PEP</abbr> 237: Unifying Long Integers and Integers</a>
<li><a href=http://www.python.org/dev/peps/pep-0238/><abbr>PEP</abbr> 238: Changing the Division Operator</a>
</ul>
<p class=nav><a rel=prev href=your-first-python-program.html title="back to &#8220;Your First Python Program&#8221;"><span>&#x261C;</span></a> <a rel=next href=strings.html title="onward to &#8220;Strings&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=your-first-python-program.html rel=prev title='back to &#8220;Your First Python Program&#8221;'><span>&#x261C;</span></a> <a href=strings.html rel=next title='onward to &#8220;Strings&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+28 -28
@@ -5,9 +5,9 @@
<!--[if IE]><script src=j/html5.js></script><![endif]-->
<link rel=stylesheet href=dip3.css>
<style>
h1:before{counter-increment:h1;content:"Appendix A. "}
h2:before{counter-increment:h2;content:"A." counter(h2) ". "}
h3:before{counter-increment:h3;content:"A." counter(h2) "." counter(h3) ". "}
h1:before{counter-increment:h1;content:'Appendix A. '}
h2:before{counter-increment:h2;content:'A.' counter(h2) '. '}
h3:before{counter-increment:h3;content:'A.' counter(h2) '.' counter(h3) '. '}
tr + tr th:first-child{font:medium 'Arial Unicode MS',FreeSerif,OpenSymbol,'DejaVu Sans',sans-serif}
table{width:100%;border-collapse:collapse}
th,td{width:45%;padding:0 0.5em;border:1px solid #bbb}
@@ -19,9 +19,9 @@ td pre{padding:0;border:0}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
<meta name=viewport content='initial-scale=1.0'>
<meta content='initial-scale=1.0' name=viewport>
</head>
<form action=http://www.google.com/cse><div><input type=hidden name=cx value=014021643941856155761:l5eihuescdw><input type=hidden name=ie value=UTF-8>&nbsp;<input name=q size=25>&nbsp;<input type=submit name=sa value=Search></div></form>
<form action=http://www.google.com/cse><div><input name=cx type=hidden value=014021643941856155761:l5eihuescdw><input name=ie type=hidden value=UTF-8>&nbsp;<input name=q size=25>&nbsp;<input name=sa type=submit value=Search></div></form>
<p>You are here: <a href=index.html>Home</a> <span>&#8227;</span> <a href=table-of-contents.html#porting-code-to-python-3-with-2to3>Dive Into Python 3</a> <span>&#8227;</span>
<p id=level>Difficulty level: <span title=pro>&#x2666;&#x2666;&#x2666;&#x2666;&#x2666;</span>
<h1>Porting Code to Python 3 with <code>2to3</code></h1>
@@ -67,11 +67,11 @@ td pre{padding:0;border:0}
<th>Python 2
<th>Python 3
<tr><th>&#x2460;
<td><code>u"PapayaWhip"</code>
<td><code>"PapayaWhip"</code>
<td><code>u'PapayaWhip'</code>
<td><code>'PapayaWhip'</code>
<tr><th>&#x2461;
<td><code>ur"PapayaWhip\foo"</code>
<td><code>r"PapayaWhip\foo"</code>
<td><code>ur'PapayaWhip\foo'</code>
<td><code>r'PapayaWhip\foo'</code>
</table>
<ol>
<li>Unicode string literals are simply converted into string literals, which, in Python 3, are always Unicode.
@@ -141,8 +141,8 @@ td pre{padding:0;border:0}
<th>Python 2
<th>Python 3
<tr><th>&#x2460;
<td><code>a_dictionary.has_key("PapayaWhip")</code>
<td><code>"PapayaWhip" in a_dictionary</code>
<td><code>a_dictionary.has_key('PapayaWhip')</code>
<td><code>'PapayaWhip' in a_dictionary</code>
<tr><th>&#x2461;</th >
<td><code>a_dictionary.has_key(x) or a_dictionary.has_key(y)</code>
<td><code>x in a_dictionary or y in a_dictionary</code>
@@ -547,8 +547,8 @@ reduce(a, b, c)</code></pre>
<th>Python 2
<th>Python 3
<tr><th>
<td><code>execfile("a_filename")</code>
<td><code>exec(compile(open("a_filename").read(), "a_filename", "exec"))</code>
<td><code>execfile('a_filename')</code>
<td><code>exec(compile(open('a_filename').read(), 'a_filename', 'exec'))</code>
</table>
<blockquote class=note>
<p><span>&#x261E;</span>The version of <code>2to3</code> that shipped with Python 3.0 would not fix the <code>execfile</code> statement automatically. The fix first appeared in the <code>2to3</code> script that shipped with Python 3.1.
@@ -563,8 +563,8 @@ reduce(a, b, c)</code></pre>
<td><code>`x`</code>
<td><code>repr(x)</code>
<tr><th>&#x2461;
<td><code>`"PapayaWhip" + `2``</code>
<td><code>repr("PapayaWhip" + repr(2))</code>
<td><code>`'PapayaWhip' + `2``</code>
<td><code>repr('PapayaWhip' + repr(2))</code>
</table>
<ol>
<li>Remember, <var>x</var> can be anything &mdash; a class, a function, a module, a primitive data type, etc. The <code>repr()</code> function works on everything.
@@ -626,13 +626,13 @@ except:
<td><code>raise MyException</code>
<td><i>unchanged</i>
<tr><th>&#x2461;
<td><code>raise MyException, "error message"</code>
<td><code>raise MyException("error message")</code>
<td><code>raise MyException, 'error message'</code>
<td><code>raise MyException('error message')</code>
<tr><th>&#x2462;
<td><code>raise MyException, "error message", a_traceback</code>
<td><code>raise MyException("error message").with_traceback(a_traceback)</code>
<td><code>raise MyException, 'error message', a_traceback</code>
<td><code>raise MyException('error message').with_traceback(a_traceback)</code>
<tr><th>&#x2463;
<td><code>raise "error message"</code>
<td><code>raise 'error message'</code>
<td><i>unsupported</i>
</table>
<ol>
@@ -651,10 +651,10 @@ except:
<td><code>a_generator.throw(MyException)</code>
<td><i>no change</i>
<tr><th>&#x2461;
<td><code>a_generator.throw(MyException, "error message")</code>
<td><code>a_generator.throw(MyException("error message"))</code>
<td><code>a_generator.throw(MyException, 'error message')</code>
<td><code>a_generator.throw(MyException('error message'))</code>
<tr><th>&#x2462;
<td><code>a_generator.throw("error message")</code>
<td><code>a_generator.throw('error message')</code>
<td><i>unsupported</i>
</table>
<ol>
@@ -701,8 +701,8 @@ except:
<td><code>raw_input()</code>
<td><code>input()</code>
<tr><th>&#x2461;
<td><code>raw_input("prompt")</code>
<td><code>input("prompt")</code>
<td><code>raw_input('prompt')</code>
<td><code>input('prompt')</code>
<tr><th>&#x2462;
<td><code>input()</code>
<td><code>eval(input())</code>
@@ -766,7 +766,7 @@ except:
<li>If you used to call <code>xreadlines()</code> with no arguments, <code>2to3</code> will convert it to just the file object. In Python 3, this will accomplish the same thing: read the file one line at a time and execute the body of the <code>for</code> loop.
<li>If you used to call <code>xreadlines()</code> with an argument (the number of lines to read at a time), keep doing that. It still works in Python 3, and <code>2to3</code> will not change it.
</ol>
<p class=c><span style="font-size:56px;line-height:0.88">&#x2603;</span>
<p class=c><span style='font-size:56px;line-height:0.88'>&#x2603;</span>
<h2 id=tuple_params><code>lambda</code> functions that take a tuple instead of multiple parameters</h2>
<p>In Python 2, you could define anonymous <code>lambda</code> functions which took multiple parameters by defining the function as taking a tuple with a specific number of items. In effect, Python 2 would &#8220;unpack&#8221; the tuple into named arguments, which you could then reference (by name) within the <code>lambda</code> function. In Python 3, you can still pass a tuple to a <code>lambda</code> function, but the Python interpreter will not unpack the tuple into named arguments. Instead, you will need to reference each argument by its positional index.
<table>
@@ -866,7 +866,7 @@ except:
<th>Python 3
<tr><th>
<td><code>callable(anything)</code>
<td><code>hasattr(anything, "__call__")</code>
<td><code>hasattr(anything, '__call__')</code>
</table>
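<p>To see what the replacement expression in the table does, here is a small sketch that applies it to a few objects. <code>Greeter</code> and <code>is_callable</code> are hypothetical names used only for illustration.

```python
class Greeter:
    '''hypothetical class whose instances can be called like functions'''
    def __call__(self, name):
        return 'hello, ' + name

def is_callable(anything):
    # the same expression 2to3 substitutes for callable(anything)
    return hasattr(anything, '__call__')

print(is_callable(len))        # True
print(is_callable(Greeter()))  # True
print(is_callable(42))         # False
```

<p>Note that <code>callable()</code> was restored as a built-in function in Python 3.2, so on later versions the original spelling works again.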
<h2 id=zip><code>zip()</code> global function</h2>
<p>In Python 2, the global <code>zip()</code> function took any number of sequences and returned a list of tuples. The first tuple contained the first item from each sequence; the second tuple contained the second item from each sequence; and so on. In Python 3, <code>zip()</code> returns an iterator instead of a list.
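<p>The practical difference can be sketched as follows. The list names are illustrative. The key point is that the iterator can be consumed only once.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

pairs = zip(letters, numbers)   # a lazy iterator, not a list
as_list = list(pairs)           # consume it to get the Python 2 style result
leftover = list(pairs)          # the iterator is now exhausted

print(as_list)    # [('a', 1), ('b', 2), ('c', 3)]
print(leftover)   # []
```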
@@ -1119,7 +1119,7 @@ do_stuff(a_list)</code></pre>
</table>
<p>FIXME: once the rest of the book is written, this appendix should contain copious links back to any chapter or section that touches on these features.
<p class=nav><a rel=prev href=where-to-go-from-here.html title="back to &#8220;Where To Go From Here&#8221;"><span>&#x261C;</span></a> <a rel=next href=special-method-names.html title="onward to &#8220;Special Method Names&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=where-to-go-from-here.html rel=prev title='back to &#8220;Where To Go From Here&#8221;'><span>&#x261C;</span></a> <a href=special-method-names.html rel=next title='onward to &#8220;Special Method Names&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+1 -1
@@ -18,7 +18,7 @@ for f in *.html; do
done
# build sitemap
ls build/*.html | sed -e "s|build/|http://diveintopython3.org/|g" > build/sitemap.txt
ls build/*.html | sed -e "s|build/|http://diveintopython3.org/|g" -e "s|/index.html|/|g" > build/sitemap.txt
echo "adding evil tracking code"
+35 -34
@@ -23,7 +23,7 @@ body{counter-reset:h1 10}
<p class=f>Despite your best efforts to write comprehensive unit tests, bugs happen. What do I mean by &#8220;bug&#8221;? A bug is a test case you haven&#8217;t written yet.
<pre class=screen><samp class=p>>>> </samp><kbd>import roman7</kbd>
<a><samp class=p>>>> </samp><kbd>roman7.from_roman("")</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>roman7.from_roman('')</kbd> <span>&#x2460;</span></a>
<samp>0</samp></pre>
<ol>
<li>Remember in the [FIXME-xref] previous section when you kept seeing that an empty string would match the regular expression you were using to check for valid Roman numerals? Well, it turns out that this is still true for the final version of the regular expression. And that&#8217;s a bug; you want an empty string to raise an <code>InvalidRomanNumeralError</code> exception just like any other sequence of characters that don&#8217;t represent a valid Roman numeral.
@@ -36,8 +36,8 @@ body{counter-reset:h1 10}
.
.
def testBlank(self):
"""from_roman should fail with blank string"""
<a> self.assertRaises(roman6.InvalidRomanNumeralError, roman6.from_roman, "") <span>&#x2460;</span></a></code></pre>
'''from_roman should fail with blank string'''
<a> self.assertRaises(roman6.InvalidRomanNumeralError, roman6.from_roman, '') <span>&#x2460;</span></a></code></pre>
<ol>
<li>Pretty simple stuff here. Call <code>from_roman()</code> with an empty string and make sure it raises an <code>InvalidRomanNumeralError</code> exception. The hard part was finding the bug; now that you know about it, testing for it is the easy part.
</ol>
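<p>The mechanics of <code>assertRaises</code> can be tried in isolation. This is a self-contained sketch under stated assumptions: <code>from_roman</code> here is a hypothetical stub that implements only a blank-input check, and the exception class is a stand-in for the one defined in the roman module.

```python
import unittest

class InvalidRomanNumeralError(ValueError):
    '''stand-in for the exception defined in the roman module'''

def from_roman(s):
    '''hypothetical stub: only the blank-input check is implemented'''
    if not s:
        raise InvalidRomanNumeralError('Input can not be blank')
    raise NotImplementedError('conversion omitted from this sketch')

class FromRomanBlankInput(unittest.TestCase):
    def test_blank(self):
        '''from_roman should fail with blank string'''
        # passes only if calling from_roman('') raises the named exception
        self.assertRaises(InvalidRomanNumeralError, from_roman, '')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(FromRomanBlankInput)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

<p>If the stub returned a value for the empty string instead of raising, the same test would be reported as a failure rather than a pass.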
@@ -62,7 +62,7 @@ FAIL: from_roman should fail with blank string
----------------------------------------------------------------------
Traceback (most recent call last):
File "romantest8.py", line 117, in test_blank
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, "")
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, '')
<mark>AssertionError: InvalidRomanNumeralError not raised by from_roman</mark>
----------------------------------------------------------------------
@@ -73,7 +73,7 @@ FAILED (failures=1)</samp></pre>
<p><em>Now</em> you can fix the bug.
<pre><code>def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
<a> if not s: <span>&#x2460;</span></a>
raise InvalidRomanNumeralError('Input can not be blank')
if not re.search(romanNumeralPattern, s):
@@ -137,7 +137,7 @@ class KnownValues(unittest.TestCase):
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
<a> self.assertRaises(roman8.OutOfRangeError, roman8.to_roman, 5000) <span>&#x2461;</span></a>
.
@@ -146,7 +146,7 @@ class ToRomanBadInput(unittest.TestCase):
class FromRomanBadInput(unittest.TestCase):
def test_too_many_repeated_numerals(self):
"""from_roman should fail with too many repeated numerals"""
'''from_roman should fail with too many repeated numerals'''
<a> for s in ('MMMMM', 'DD', 'CCCC', 'LL', 'XXXX', 'VV', 'IIII'): <span>&#x2462;</span></a>
self.assertRaises(roman8.InvalidRomanNumeralError, roman8.from_roman, s)
@@ -156,7 +156,7 @@ class FromRomanBadInput(unittest.TestCase):
class RoundtripCheck(unittest.TestCase):
def test_roundtrip(self):
"""from_roman(to_roman(n))==n for all n"""
'''from_roman(to_roman(n))==n for all n'''
<a> for integer in range(1, 5000): <span>&#x2463;</span></a>
numeral = roman8.to_roman(integer)
result = roman8.from_roman(numeral)
@@ -192,7 +192,7 @@ Traceback (most recent call last):
File "romantest9.py", line 82, in test_from_roman_known_values
result = roman9.from_roman(numeral)
File "C:\home\diveintopython3\examples\roman9.py", line 60, in from_roman
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
<mark>roman9.InvalidRomanNumeralError: Invalid Roman numeral: MMMM</mark>
======================================================================
@@ -202,7 +202,7 @@ Traceback (most recent call last):
File "romantest9.py", line 76, in test_to_roman_known_values
result = roman9.to_roman(integer)
File "C:\home\diveintopython3\examples\roman9.py", line 42, in to_roman
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
<mark>roman9.OutOfRangeError: number out of range (must be 0..3999)</mark>
======================================================================
@@ -212,7 +212,7 @@ Traceback (most recent call last):
File "romantest9.py", line 131, in testSanity
numeral = roman9.to_roman(integer)
File "C:\home\diveintopython3\examples\roman9.py", line 42, in to_roman
raise OutOfRangeError("number out of range (must be 0..3999)")
raise OutOfRangeError('number out of range (must be 0..3999)')
<mark>roman9.OutOfRangeError: number out of range (must be 0..3999)</mark>
----------------------------------------------------------------------
@@ -229,7 +229,7 @@ FAILED (errors=3)</samp></pre>
<p class=d>[<a href=examples/roman9.py>download <code>roman9.py</code></a>]
<pre><code>
roman_numeral_pattern = re.compile("""
roman_numeral_pattern = re.compile('''
^ # beginning of string
<a> M{0,4} # thousands - 0 to 4 M's <span>&#x2460;</span></a>
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -239,16 +239,16 @@ roman_numeral_pattern = re.compile("""
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
""", re.VERBOSE)
''', re.VERBOSE)
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
<a> if not (0 < n < 5000): <span>&#x2461;</span></a>
raise OutOfRangeError("number out of range (must be 1..4999)")
raise OutOfRangeError('number out of range (must be 1..4999)')
if not isinstance(n, int):
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -299,7 +299,7 @@ Ran 12 tests in 0.203s
<p>Refactoring is the process of taking working code and making it work better. Usually, &#8220;better&#8221; means &#8220;faster&#8221;, although it can also mean &#8220;using less memory&#8221;, or &#8220;using less disk space&#8221;, or simply &#8220;more elegantly&#8221;. Whatever it means to you, to your project, in your environment, refactoring is important to the long-term health of any program.
<p>Here, &#8220;better&#8221; means both &#8220;faster&#8221; and &#8220;easier to maintain.&#8221; Specifically, the <code>from_roman()</code> function is slower and more complex than I&#8217;d like, because of that big nasty regular expression that you use to validate Roman numerals. Now, you might think, "Sure, the regular expression is big and hairy, but how else am I supposed to validate that an arbitrary string is a valid Roman numeral?"
<p>Here, &#8220;better&#8221; means both &#8220;faster&#8221; and &#8220;easier to maintain.&#8221; Specifically, the <code>from_roman()</code> function is slower and more complex than I&#8217;d like, because of that big nasty regular expression that you use to validate Roman numerals. Now, you might think, &#8220;Sure, the regular expression is big and hairy, but how else am I supposed to validate that an arbitrary string is a valid Roman numeral?&#8221;
<p>Answer: there&#8217;s only 5000 of them; why don&#8217;t you just build a lookup table? This idea gets even better when you realize that <em>you don&#8217;t need to use regular expressions at all</em>. As you build the lookup table for converting integers to Roman numerals, you can build the reverse lookup table to convert Roman numerals to integers. By the time you need to check whether an arbitrary string is a valid Roman numeral, you will have collected all the valid Roman numerals. &#8220;Validating&#8221; is reduced to a single dictionary lookup.
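<p>One way to sketch the lookup-table idea is shown below. It is an illustration under assumptions, not the book&#8217;s implementation: it builds both tables with a straightforward greedy conversion, and the helper name <code>_convert</code> is hypothetical.

```python
roman_numeral_map = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))

def _convert(n):
    '''greedy integer-to-numeral conversion (hypothetical helper)'''
    result = ''
    for numeral, integer in roman_numeral_map:
        while n >= integer:
            result += numeral
            n -= integer
    return result

to_roman_table = [None]    # index 0 is unused; there is no Roman numeral for 0
from_roman_table = {}
for integer in range(1, 5000):
    numeral = _convert(integer)
    to_roman_table.append(numeral)
    from_roman_table[numeral] = integer

# Validation is now a single dictionary membership test.
print('MCMLXXII' in from_roman_table)   # True
print('IIII' in from_roman_table)       # False
```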
@@ -328,26 +328,26 @@ to_roman_table = [ None ]
from_roman_table = {}
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 5000):
raise OutOfRangeError("number out of range (must be 1..4999)")
raise OutOfRangeError('number out of range (must be 1..4999)')
if int(n) != n:
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
return to_roman_table[n]
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not s:
raise InvalidRomanNumeralError("Input can not be blank")
raise InvalidRomanNumeralError('Input can not be blank')
if s not in from_roman_table:
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
return from_roman_table[s]
def build_lookup_tables():
def to_roman(n):
result = ""
result = ''
for numeral, integer in roman_numeral_map:
if n >= integer:
result = numeral
@@ -379,7 +379,7 @@ from_roman_table = {}
.
def build_lookup_tables():
<a> def to_roman(n): <span>&#x2460;</span></a>
result = ""
result = ''
for numeral, integer in roman_numeral_map:
if n >= integer:
result = numeral
@@ -402,21 +402,21 @@ def build_lookup_tables():
<p>Once the lookup tables are built, the rest of the code is both easy and fast.
<pre><code>def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 5000):
raise OutOfRangeError("number out of range (must be 1..4999)")
raise OutOfRangeError('number out of range (must be 1..4999)')
if int(n) != n:
raise NotIntegerError("non-integers can not be converted")
raise NotIntegerError('non-integers can not be converted')
<a> return to_roman_table[n] <span>&#x2460;</span></a>
def from_roman(s):
"""convert Roman numeral to integer"""
'''convert Roman numeral to integer'''
if not isinstance(s, str):
raise InvalidRomanNumeralError("Input must be a string")
raise InvalidRomanNumeralError('Input must be a string')
if not s:
raise InvalidRomanNumeralError("Input can not be blank")
raise InvalidRomanNumeralError('Input can not be blank')
if s not in from_roman_table:
raise InvalidRomanNumeralError("Invalid Roman numeral: {0}".format(s))
raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
<a> return from_roman_table[s] <span>&#x2461;</span></a></code></pre>
<ol>
<li>After doing the same bounds checking as before, the <code>to_roman()</code> function simply finds the appropriate value in the lookup table and returns it.
@@ -473,6 +473,7 @@ OK</samp></pre>
<li>Refactoring mercilessly to improve performance, scalability, readability, maintainability, or whatever other -ility you&#8217;re lacking
</ul>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+4 -4
@@ -23,7 +23,7 @@ body{counter-reset:h1 4}
<p class=f>Every modern programming language has built-in functions for working with strings. In Python, strings have methods for searching and replacing: <code>index()</code>, <code>find()</code>, <code>split()</code>, <code>count()</code>, <code>replace()</code>, <i class=baa>&amp;</i>c. But these methods are limited to the simplest of cases. For example, the <code>index()</code> method looks for a single, hard-coded substring, and the search is always case-sensitive. To do case-insensitive searches of a string <var>s</var>, you must call <code>s.lower()</code> or <code>s.upper()</code> and make sure your search strings are the appropriate case to match. The <code>replace()</code> and <code>split()</code> methods have the same limitations.
<p>If your goal can be accomplished with string methods, you should use them. They&#8217;re fast and simple and easy to read, and there&#8217;s a lot to be said for fast, simple, readable code. But if you find yourself using a lot of different string functions with <code>if</code> statements to handle special cases, or if you&#8217;re chaining calls to <code>split()</code> and <code>join()</code> to slice-and-dice your strings, you may need to move up to regular expressions.
<p>Regular expressions are a powerful and (mostly) standardized way of searching, replacing, and parsing text with complex patterns of characters. Although the regular expression syntax is tight and unlike normal code, the result can end up being <em>more</em> readable than a hand-rolled solution that uses a long chain of string functions. There are even ways of embedding comments within regular expressions, so you can include fine-grained documentation within them.
<blockquote class="note compare perl5">
<blockquote class='note compare perl5'>
<p><span>&#x261E;</span>If you&#8217;ve used regular expressions in other languages (like Perl 5), Python&#8217;s syntax will be very familiar. Read the summary of the <a href=http://docs.python.org/dev/library/re.html#module-contents><code>re</code> module</a> to get an overview of the available functions and their arguments.
</blockquote>
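<p>To make the trade-off above concrete, here is a minimal sketch of a case-insensitive search done both ways. The sample string and variable names are invented for illustration; only <code>str.lower()</code>, <code>str.find()</code>, <code>re.search()</code>, and <code>re.IGNORECASE</code> come from Python itself.

```python
import re

text = 'Dive Into Python 3'

# String-method approach: normalize the case of the haystack yourself,
# and make sure the search string is already lowercase.
position_with_methods = text.lower().find('python')

# Regular-expression approach: ask the engine to ignore case instead.
match = re.search('python', text, re.IGNORECASE)
position_with_regex = match.start() if match else -1

print(position_with_methods, position_with_regex)
print(match.group())   # the text as it appears in the original string
```

<p>Both approaches find the same position. The regular expression version also hands back the matched text with its original capitalization, which the <code>lower()</code> version has thrown away.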
<p class=a>&#x2042;
@@ -257,7 +257,7 @@ body{counter-reset:h1 4}
</ul>
<p>This will be clearer with an example. Let&#8217;s revisit the compact regular expression you&#8217;ve been working with and rewrite it as a verbose regular expression.
<pre class=screen>
<samp class=p>>>> </samp><kbd>pattern = """
<samp class=p>>>> </samp><kbd>pattern = '''
^ # beginning of string
M{0,3} # thousands - 0 to 3 M's
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
@@ -267,7 +267,7 @@ body{counter-reset:h1 4}
(IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
# or 5-8 (V, followed by 0 to 3 I's)
$ # end of string
"""</kbd>
'''</kbd>
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M', re.VERBOSE)</kbd> <span>&#x2460;</span></a>
<samp>&lt;_sre.SRE_Match object at 0x008EEB48></samp>
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMLXXXIX', re.VERBOSE)</kbd> <span>&#x2461;</span></a>
@@ -433,7 +433,7 @@ body{counter-reset:h1 4}
<li><code>(x)</code> in general is a <em>remembered group</em>. You can get the value of what matched by using the <code>groups()</code> method of the object returned by <code>re.search</code>.
</ul>
<p>Regular expressions are extremely powerful, but they are not the correct solution for every problem. You should learn enough about them to know when they are appropriate, when they will solve your problems, and when they will cause more problems than they solve.
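<p>As a small recap, the sketch below combines two of the ideas from this chapter: a verbose pattern with inline comments, and remembered groups read back with <code>groups()</code>. It is a variation on the Roman numeral pattern, not the chapter&#8217;s exact listing; the extra parentheses around the thousands place are an addition so that every place value is remembered.

```python
import re

# Verbose pattern: whitespace and '#' comments are ignored by the engine
# only when re.VERBOSE is passed to the search function.
pattern = '''
    ^                   # beginning of string
    (M{0,3})            # thousands - 0 to 3 Ms
    (CM|CD|D?C{0,3})    # hundreds
    (XC|XL|L?X{0,3})    # tens
    (IX|IV|V?I{0,3})    # ones
    $                   # end of string
    '''

match = re.search(pattern, 'MCMLXXXIX', re.VERBOSE)
place_values = match.groups()
print(place_values)

# Without the flag, the whitespace and comments are treated as literal
# characters to match, so the same search finds nothing.
no_flag = re.search(pattern, 'MCMLXXXIX')
print(no_flag)
```

<p>Forgetting the <code>re.VERBOSE</code> flag is an easy mistake to make, because Python can not tell a verbose pattern from a compact one just by looking at it.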
<p class=nav><a rel=prev href=strings.html title="back to &#8220;Strings&#8221;"><span>&#x261C;</span></a> <a rel=next href=generators.html title="onward to &#8220;Generators&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=strings.html rel=prev title='back to &#8220;Strings&#8221;'><span>&#x261C;</span></a> <a href=generators.html rel=next title='onward to &#8220;Generators&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+35 -35
View File
@@ -5,9 +5,9 @@
<!--[if IE]><script src=j/html5.js></script><![endif]-->
<link rel=stylesheet href=dip3.css>
<style>
h1:before{counter-increment:h1;content:"Appendix B. "}
h2:before{counter-increment:h2;content:"B." counter(h2) ". "}
h3:before{counter-increment:h3;content:"B." counter(h2) "." counter(h3) ". "}
h1:before{counter-increment:h1;content:'Appendix B. '}
h2:before{counter-increment:h2;content:'B.' counter(h2) '. '}
h3:before{counter-increment:h3;content:'B.' counter(h2) '.' counter(h3) '. '}
tr + tr th:first-child{font:medium 'Arial Unicode MS',FreeSerif,OpenSymbol,'DejaVu Sans',sans-serif}
table{width:100%;border-collapse:collapse}
th,td{width:30%;padding:0 0.5em;border:1px solid #bbb}
@@ -111,19 +111,19 @@ td a:link, td a:visited{border:0}
<tr><th>&#x2460;
<td>to get a computed attribute (unconditionally)
<td><code>x.my_property</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getattribute__><code>x.__getattribute__(<var>"my_property"</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getattribute__><code>x.__getattribute__(<var>'my_property'</var>)</code></a>
<tr><th>&#x2461;
<td>to get a computed attribute (fallback)
<td><code>x.my_property</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getattr__><code>x.__getattr__(<var>"my_property"</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getattr__><code>x.__getattr__(<var>'my_property'</var>)</code></a>
<tr><th>&#x2462;
<td>to set an attribute
<td><code>x.my_property = value</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__setattr__><code>x.__setattr__(<var>"my_property"</var>, <var>value</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__setattr__><code>x.__setattr__(<var>'my_property'</var>, <var>value</var>)</code></a>
<tr><th>&#x2463;
<td>to delete an attribute
<td><code>del x.my_property</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__delattr__><code>x.__delattr__(<var>"my_property"</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__delattr__><code>x.__delattr__(<var>'my_property'</var>)</code></a>
<tr><th>&#x2464;
<td>to list all attributes and methods
<td><code>dir(x)</code>
@@ -131,7 +131,7 @@ td a:link, td a:visited{border:0}
</table>
<ol>
<li>If your class defines a <code>__getattribute__()</code> method, Python will call it on <em>every reference to any attribute or method name</em> (except special method names, since that would cause an unpleasant infinite loop).
<li>If your class defines a <code>__getattr__()</code> method, Python will call it only after looking for the attribute in all the normal places. If an instance <var>x</var> defines an attribute <var>color</var>, <code>x.color</code> will <em>not</em> call <code>x.__getattr__("color")</code>; it will simply return the already-defined value of <var>x.color</var>.
<li>If your class defines a <code>__getattr__()</code> method, Python will call it only after looking for the attribute in all the normal places. If an instance <var>x</var> defines an attribute <var>color</var>, <code>x.color</code> will <em>not</em> call <code>x.__getattr__('color')</code>; it will simply return the already-defined value of <var>x.color</var>.
<li>The <code>__setattr__()</code> method is called whenever you assign a value to an attribute.
<li>The <code>__delattr__()</code> method is called whenever you delete an attribute.
<li>The <code>__dir__()</code> method is useful if you define a <code>__getattr__()</code> or <code>__getattribute__()</code> method. Normally, calling <code>dir(x)</code> would only list the regular attributes and methods. If your <code>__getattr__()</code> method handles a <var>color</var> attribute dynamically, <code>dir(x)</code> would not list <var>color</var> as one of the available attributes. Overriding the <code>__dir__()</code> method allows you to list <var>color</var> as an available attribute, which is helpful for other people who wish to use your class without digging into its internals.
@@ -142,19 +142,19 @@ td a:link, td a:visited{border:0}
<pre class=screen>
<code>class Dynamo:
def __getattr__(self, key):
<a> if key == "color": <span>&#x2460;</span></a>
return "PapayaWhip"
<a> if key == 'color': <span>&#x2460;</span></a>
return 'PapayaWhip'
else:
<a> raise AttributeError <span>&#x2461;</span></a></code>
<samp class=p>>>> </samp><kbd>dyn = Dynamo()</kbd>
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span>&#x2462;</span></a>
<samp>'PapayaWhip'</samp>
<samp class=p>>>> </samp><kbd>dyn.color = "LemonChiffon"</kbd>
<samp class=p>>>> </samp><kbd>dyn.color = 'LemonChiffon'</kbd>
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span>&#x2463;</span></a>
<samp>'LemonChiffon'</samp></pre>
<ol>
<li>The attribute name is passed into the <code>__getattr__()</code> method as a string. If the name is <code>"color"</code>, the method returns a value. (In this case, it&#8217;s just a hard-coded string, but you would normally do some sort of computation and return the result.)
<li>The attribute name is passed into the <code>__getattr__()</code> method as a string. If the name is <code>'color'</code>, the method returns a value. (In this case, it&#8217;s just a hard-coded string, but you would normally do some sort of computation and return the result.)
<li>If the attribute name is unknown, the <code>__getattr__()</code> method needs to raise an <code>AttributeError</code> exception; otherwise your code will silently fail when accessing undefined attributes. (Technically, if the method doesn&#8217;t raise an exception or explicitly return a value, it returns <code>None</code>, the Python null value. This means that <em>all</em> attributes not explicitly defined will be <code>None</code>, which is almost certainly not what you want.)
<li>The <var>dyn</var> instance does not have an attribute named <var>color</var>, so the <code>__getattr__()</code> method is called to provide a computed value.
<li>After explicitly setting <var>dyn.color</var>, the <code>__getattr__()</code> method will no longer be called to provide a value for <var>dyn.color</var>, because <var>dyn.color</var> is already defined on the instance.
@@ -166,16 +166,16 @@ td a:link, td a:visited{border:0}
<code>class SuperDynamo:
def __getattribute__(self, key):
if key == 'color':
return "PapayaWhip"
return 'PapayaWhip'
else:
raise AttributeError</code>
<samp class=p>>>> </samp><kbd>dyn = SuperDynamo()</kbd>
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span>&#x2460;</span></a>
<samp>"PapayaWhip"</samp>
<samp class=p>>>> </samp><kbd>dyn.color = "LemonChiffon"</kbd>
<samp>'PapayaWhip'</samp>
<samp class=p>>>> </samp><kbd>dyn.color = 'LemonChiffon'</kbd>
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span>&#x2461;</span></a>
<samp>"PapayaWhip"</samp></pre>
<samp>'PapayaWhip'</samp></pre>
<ol>
<li>The <code>__getattribute__()</code> method is called to provide a value for <var>dyn.color</var>.
<li>Even after explicitly setting <var>dyn.color</var>, the <code>__getattribute__()</code> method <em>is still called</em> to provide a value for <var>dyn.color</var>. If present, the <code>__getattribute__()</code> method <em>is called unconditionally</em> for every attribute and method lookup, even for attributes that you explicitly set after creating an instance.
@@ -279,7 +279,7 @@ bytes = zef_file.read(12)
# A script which responds to http://example.com/search?q=cgi
import cgi
fs = cgi.FieldStorage()
<a>if "q" in fs: <span>&#x2460;</span></a>
<a>if 'q' in fs: <span>&#x2460;</span></a>
do_search()
# An excerpt from cgi.py that explains how that works
@@ -289,7 +289,7 @@ class FieldStorage:
.
<a> def __contains__(self, key): <span>&#x2461;</span></a>
if self.list is None:
raise TypeError("not indexable")
raise TypeError('not indexable')
<a> return any(item.name == key for item in self.list) <span>&#x2462;</span></a>
<a> def __len__(self): <span>&#x2463;</span></a>
@@ -297,7 +297,7 @@ class FieldStorage:
<ol>
<li>Once you create an instance of the <code>cgi.FieldStorage</code> class, you can use the &#8220;<code>in</code>&#8221; operator to check whether a particular parameter was included in the query string.
<li>The <code>__contains__()</code> method is the magic that makes this work.
<li>When you say <code>if "q" in fs</code>, Python looks for the <code>__contains__()</code> method on the <var>fs</var> object, which is defined in <code>cgi.py</code>. The value <code>"q"</code> is passed into the <code>__contains__()</code> method as the <var>key</var> argument.
<li>When you say <code>if 'q' in fs</code>, Python looks for the <code>__contains__()</code> method on the <var>fs</var> object, which is defined in <code>cgi.py</code>. The value <code>'q'</code> is passed into the <code>__contains__()</code> method as the <var>key</var> argument.
<li>The same <code>FieldStorage</code> class also supports returning its length, so you can say <code>len(<var>fs</var>)</code> and it will call the <code>__len__()</code> method on the <code>FieldStorage</code> class to return the number of query parameters that it identified.
</ol>
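<p>You can try the same two hooks without a web server. The class below is a hypothetical stand-in, not the real <code>cgi.FieldStorage</code>; the name <code>QueryParams</code> and its list of name-value pairs are assumptions made for this sketch, and its <code>__len__()</code> simply counts pairs.

```python
class QueryParams:
    '''Hypothetical stand-in for a parsed query string.'''

    def __init__(self, pairs):
        # pairs is a list of (name, value) tuples
        self.pairs = list(pairs)

    def __contains__(self, key):
        # Called for: key in params
        return any(name == key for name, value in self.pairs)

    def __len__(self):
        # Called for: len(params)
        return len(self.pairs)


params = QueryParams([('q', 'cgi'), ('page', '2')])
has_q = 'q' in params          # calls params.__contains__('q')
has_lang = 'lang' in params    # calls params.__contains__('lang')
count = len(params)            # calls params.__len__()
print(has_q, has_lang, count)
```

<p>Nothing here inherits from <code>list</code> or <code>set</code>. Defining the two special methods is enough for the <code>in</code> operator and the <code>len()</code> function to work.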
@@ -313,19 +313,19 @@ class FieldStorage:
<tr><th>
<td>to get a value by its key
<td><code>x[key]</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getitem__><code>x.__getitem__(<var>"key"</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__getitem__><code>x.__getitem__(<var>'key'</var>)</code></a>
<tr><th>
<td>to set a value by its key
<td><code>x[key] = value</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__setitem__><code>x.__setitem__(<var>"key"</var>, <var>value</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__setitem__><code>x.__setitem__(<var>'key'</var>, <var>value</var>)</code></a>
<tr><th>
<td>to delete a key-value pair
<td><code>del x[key]</code>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__delitem__><code>x.__delitem__(<var>"key"</var>)</code></a>
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__delitem__><code>x.__delitem__(<var>'key'</var>)</code></a>
<tr><th>
<td>to provide a default value for missing keys
<td><code>x[nonexistent_key]</code>
<td><a href=http://docs.python.org/3.0/library/collections.html#collections.defaultdict.__missing__><code>x.__missing__(<var>"nonexistent_key"</var>)</code></a>
<td><a href=http://docs.python.org/3.0/library/collections.html#collections.defaultdict.__missing__><code>x.__missing__(<var>'nonexistent_key'</var>)</code></a>
</table>
<p>The <a href=#acts-like-list-example><code>FieldStorage</code> class</a> from the <a href=http://docs.python.org/3.0/library/cgi.html><code>cgi</code> module</a> also defines these special methods, which means you can do things like this:
@@ -334,8 +334,8 @@ class FieldStorage:
# A script which responds to http://example.com/search?q=cgi
import cgi
fs = cgi.FieldStorage()
if "q" in fs:
<a> do_search(fs["q"]) <span>&#x2460;</span></a>
if 'q' in fs:
<a> do_search(fs['q']) <span>&#x2460;</span></a>
# An excerpt from cgi.py that shows how it works
class FieldStorage:
@@ -344,7 +344,7 @@ class FieldStorage:
.
<a> def __getitem__(self, key): <span>&#x2461;</span></a>
if self.list is None:
raise TypeError("not indexable")
raise TypeError('not indexable')
found = []
for item in self.list:
if item.name == key: found.append(item)
@@ -355,8 +355,8 @@ class FieldStorage:
else:
return found</code></pre>
<ol>
<li>The <var>fs</var> object is an instance of <code>cgi.FieldStorage</code>, but you can still evaluate expressions like <code>fs["q"]</code>.
<li><code>fs["q"]</code> invokes the <code>__getitem__()</code> method with the <var>key</var> parameter set to <code>"q"</code>. It then searches its internally maintained list of query parameters (<var>self.list</var>) for an item whose <code>.name</code> matches the given key.
<li>The <var>fs</var> object is an instance of <code>cgi.FieldStorage</code>, but you can still evaluate expressions like <code>fs['q']</code>.
<li><code>fs['q']</code> invokes the <code>__getitem__()</code> method with the <var>key</var> parameter set to <code>'q'</code>. It then searches its internally maintained list of query parameters (<var>self.list</var>) for an item whose <code>.name</code> matches the given key.
</ol>
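<p>Here is a minimal sketch of the dictionary-style hooks in a class of your own. Both classes are invented for illustration: <code>MultiValue</code> loosely imitates the lookup shown above (one match returns the value, several matches return a list), and <code>ZeroDefault</code> shows <code>__missing__()</code>, which Python only calls automatically on subclasses of <code>dict</code>.

```python
class MultiValue:
    '''Hypothetical container that maps a name to one or more values.'''

    def __init__(self, pairs):
        self.pairs = list(pairs)

    def __getitem__(self, key):
        # Called for: obj[key]
        found = [value for name, value in self.pairs if name == key]
        if not found:
            raise KeyError(key)
        if len(found) == 1:
            return found[0]
        return found


class ZeroDefault(dict):
    '''dict subclass that reports 0 for keys it does not have.'''

    def __missing__(self, key):
        # Called by dict.__getitem__() when the key is absent
        return 0


mv = MultiValue([('q', 'cgi'), ('tag', 'a'), ('tag', 'b')])
single = mv['q']       # one match: the bare value
several = mv['tag']    # several matches: a list of values

counts = ZeroDefault()
missing_value = counts['never-set']   # falls through to __missing__()
still_absent = 'never-set' in counts  # __missing__() did not store anything
print(single, several, missing_value, still_absent)
```

<p>Note that this <code>__missing__()</code> returns a default without storing it, so the key is still absent afterward. Whether to store the default is a design choice you make inside the method.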
<h2 id=acts-like-number>Classes That Act Like Numbers</h2>
@@ -742,19 +742,19 @@ class FieldStorage:
<pre><code># excerpt from io.py:
def _checkClosed(self, msg=None):
"""Internal: raise an ValueError if file is closed
"""
'''Internal: raise an ValueError if file is closed
'''
if self.closed:
raise ValueError("I/O operation on closed file."
raise ValueError('I/O operation on closed file.'
if msg is None else msg)
def __enter__(self) -> "IOBase":
"""Context management protocol. Returns self."""
def __enter__(self) -> 'IOBase':
'''Context management protocol. Returns self.'''
<a> self._checkClosed() <span>&#x2460;</span></a>
<a> return self <span>&#x2461;</span></a>
def __exit__(self, *args) -> None:
"""Context management protocol. Calls close()"""
'''Context management protocol. Calls close()'''
<a> self.close() <span>&#x2462;</span></a></code></pre>
<ol>
<li>The file object defines both an <code>__enter__()</code> and an <code>__exit__()</code> method. The <code>__enter__()</code> method checks that the file is open; if it&#8217;s not, the <code>_checkClosed()</code> method raises an exception.
@@ -817,7 +817,7 @@ def __exit__(self, *args) -> None:
<td><a href=http://docs.python.org/3.0/library/abc.html#abc.ABCMeta.__subclasshook__><code>MyABC.__subclasshook__(C)</code></a>
</table>
<p class=nav><a rel=prev href=porting-code-to-python-3-with-2to3.html title="back to &#8220;Porting code to Python 3 with 2to3&#8221;"><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=v><a href=porting-code-to-python-3-with-2to3.html rel=prev title='back to &#8220;Porting code to Python 3 with 2to3&#8221;'><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+22 -22
View File
@@ -55,7 +55,7 @@ My alphabet starts where your alphabet ends! <span>&#x275E;</span><br>&mdash; Dr
<p>Unicode is a system designed to represent <em>every</em> character from <em>every</em> language. Unicode represents each letter, character, or ideograph as a 4-byte number. Each number represents a unique character used in at least one of the world&#8217;s languages. (Not all the numbers are used, but more than 65535 of them are, so 2 bytes wouldn&#8217;t be sufficient.) Characters that are used in multiple languages generally have the same number, unless there is a good etymological reason not to. Regardless, there is exactly 1 number per character, and exactly 1 character per number. Every number always means just one thing; there are no &#8220;modes&#8221; to keep track of. <code>U+0041</code> is always <code>'A'</code>, even if your language doesn&#8217;t have an <code>'A'</code> in it.
<p>On the face of it, this seems like a great idea. One encoding to rule them all. Multiple languages per document. No more &#8220;mode switching&#8221; to switch between encodings mid-stream. But right away, the obvious question should leap out at you. Four bytes? For every single character<span title="interrobang!">&#8253;</span> That seems awfully wasteful, especially for languages like English and Spanish, which need less than one byte (256 numbers) to express every possible character. In fact, it&#8217;s wasteful even for ideograph-based languages (like Chinese), which never need more than two bytes per character.
<p>On the face of it, this seems like a great idea. One encoding to rule them all. Multiple languages per document. No more &#8220;mode switching&#8221; to switch between encodings mid-stream. But right away, the obvious question should leap out at you. Four bytes? For every single character<span title='interrobang!'>&#8253;</span> That seems awfully wasteful, especially for languages like English and Spanish, which need less than one byte (256 numbers) to express every possible character. In fact, it&#8217;s wasteful even for ideograph-based languages (like Chinese), which never need more than two bytes per character.
<p>There is a Unicode encoding that uses four bytes per character. It&#8217;s called UTF-32, because 32 bits = 4 bytes. UTF-32 is a straightforward encoding; it takes each Unicode character (a 4-byte number) and represents the character with that same number. This has some advantages, the most important being that you can find the <var>Nth</var> character of a string in constant time, because the <var>Nth</var> character starts at the <var>4&times;Nth</var> byte. It also has several disadvantages, the most obvious being that it takes four freaking bytes to store every freaking character.
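<p>The constant-time claim is easy to check from the interactive shell. This sketch assumes the <code>'utf-32-le'</code> codec (little-endian, no byte order mark) so that the arithmetic is not thrown off by a leading marker; the sample string is borrowed from later in this chapter.

```python
s = '深入 Python'

# Encode with a fixed byte order so that no byte order mark is prepended.
encoded = s.encode('utf-32-le')
bytes_per_char = len(encoded) // len(s)

# The Nth character (counting from zero) starts at byte 4 * N.
n = 1
chunk = encoded[4 * n:4 * n + 4]
nth_char = chunk.decode('utf-32-le')

print(len(s), len(encoded), bytes_per_char)
print(nth_char == s[n])
```

<p>Nine characters become thirty-six bytes, and slicing four bytes at offset <code>4 * n</code> recovers exactly the character at index <code>n</code>.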
@@ -69,7 +69,7 @@ My alphabet starts where your alphabet ends! <span>&#x275E;</span><br>&mdash; Dr
<p>Other people pondered these questions, and they came up with a solution:
<p class=c style="font-size:1000%;font-weight:bold;line-height:1;margin:0.7em 0">UTF-8
<p class=c style='font-size:1000%;font-weight:bold;line-height:1;margin:0.7em 0'>UTF-8
<p>UTF-8 is a <em>variable-length</em> encoding system for Unicode. That is, different characters take up a different number of bytes. For <abbr>ASCII</abbr> characters (A-Z, <i class=baa>&amp;</i>c.) UTF-8 uses just one byte per character. In fact, it uses the exact same bytes; the first 128 characters (0&ndash;127) in UTF-8 are indistinguishable from <abbr>ASCII</abbr>. &#8220;Extended Latin&#8221; characters like &ntilde; and &ouml; end up taking two bytes. (The bytes are not simply the Unicode code point like they would be in UTF-16; there is some serious bit-twiddling involved.) Chinese characters like &#x4E2D; end up taking three bytes. The rarely-used &#8220;astral plane&#8221; characters take four bytes.
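<p>You can measure those sizes yourself. The sketch below encodes one character from each group mentioned above and counts the resulting bytes; the specific characters are just convenient samples, written as escapes so the example does not depend on the encoding of the file it is saved in.

```python
samples = {
    'ascii': 'A',              # U+0041
    'latin': '\u00f1',         # n with tilde
    'chinese': '\u4e2d',       # the character shown in the text above
    'astral': '\U0001f40d',    # a character outside the 16-bit range
}

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = {name: len(ch.encode('utf-8')) for name, ch in samples.items()}
print(utf8_lengths)

# For ASCII characters, the UTF-8 bytes are the ASCII bytes.
same_as_ascii = 'A'.encode('utf-8') == 'A'.encode('ascii')
print(same_as_ascii)
```

<p>Every one of these is a single character as far as Python 3 strings are concerned; only the encoded byte counts differ.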
@@ -81,7 +81,7 @@ My alphabet starts where your alphabet ends! <span>&#x275E;</span><br>&mdash; Dr
<h2 id=divingin>Diving In</h2>
<p>In Python 3, all strings are sequences of Unicode characters. There is no such thing as a Python string encoded in UTF-8, or a Python string encoded as CP-1252. "Is this string UTF-8?" is an invalid question. UTF-8 is a way of encoding characters as a sequence of bytes. If you want to take a string and turn it into a sequence of bytes in a particular character encoding, Python 3 can help you with that. If you want to take a sequence of bytes and turn it into a string, Python 3 can help you with that too. Bytes are not characters; bytes are bytes. Characters are an abstraction. A string is a sequence of those abstractions.
<p>In Python 3, all strings are sequences of Unicode characters. There is no such thing as a Python string encoded in UTF-8, or a Python string encoded as CP-1252. &#8220;Is this string UTF-8?&#8221; is an invalid question. UTF-8 is a way of encoding characters as a sequence of bytes. If you want to take a string and turn it into a sequence of bytes in a particular character encoding, Python 3 can help you with that. If you want to take a sequence of bytes and turn it into a string, Python 3 can help you with that too. Bytes are not characters; bytes are bytes. Characters are an abstraction. A string is a sequence of those abstractions.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>s = '深入 Python'</kbd> <span>&#x2460;</span></a>
@@ -111,7 +111,7 @@ My alphabet starts where your alphabet ends! <span>&#x275E;</span><br>&mdash; Dr
1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<a> """Convert a file size to human-readable form. <span>&#x2461;</span></a>
<a> '''Convert a file size to human-readable form. <span>&#x2461;</span></a>
Keyword arguments:
size -- file size in bytes
@@ -120,7 +120,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
Returns: string
<a> """ <span>&#x2462;</span></a>
<a> ''' <span>&#x2462;</span></a>
if size &lt; 0:
<a> raise ValueError('number must be non-negative') <span>&#x2463;</span></a>
@@ -128,7 +128,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
for suffix in SUFFIXES[multiple]:
size /= multiple
if size &lt; multiple:
<a> return "{0:.1f} {1}".format(size, suffix) <span>&#x2464;</span></a>
<a> return '{0:.1f} {1}'.format(size, suffix) <span>&#x2464;</span></a>
raise ValueError('number too large')</code></pre>
<ol>
@@ -142,8 +142,8 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<p>Python 3 supports formatting values into strings. Although this can include very complicated expressions, the most basic usage is to insert a value into a string with a single placeholder.
<pre class=screen>
<samp class=p>>>> </samp><kbd>username = "mark"</kbd>
<a><samp class=p>>>> </samp><kbd>password = "PapayaWhip"</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>username = 'mark'</kbd>
<a><samp class=p>>>> </samp><kbd>password = 'PapayaWhip'</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>"{0}'s password is {1}".format(username, password)</kbd> <span>&#x2461;</span></a>
<samp>"mark's password is PapayaWhip"</samp></pre>
<ol>
@@ -160,7 +160,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<a><samp class=p>>>> </samp><kbd>si_suffixes = humansize.SUFFIXES[1000]</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>si_suffixes</kbd>
<samp>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</samp>
<a><samp class=p>>>> </samp><kbd>"1000{0[0]} = 1{0[1]}".format(si_suffixes)</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>'1000{0[0]} = 1{0[1]}'.format(si_suffixes)</kbd> <span>&#x2461;</span></a>
<samp>'1000KB = 1MB'</samp>
</pre>
<ol>
@@ -184,7 +184,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<pre class=screen>
<samp class=p>>>> </samp><kbd>import humansize</kbd>
<samp class=p>>>> </samp><kbd>import sys</kbd>
<samp class=p>>>> </samp><kbd>"1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}".format(sys)</kbd>
<samp class=p>>>> </samp><kbd>'1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(sys)</kbd>
<samp>'1MB = 1000KB'</samp></pre>
<p>Here&#8217;s how it works:
@@ -192,10 +192,10 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<ul>
<li>The <code>sys</code> module holds information about the currently running Python instance. Since you just imported it, you can pass the <code>sys</code> module itself as an argument to the <code>format()</code> method. So the replacement field <code>{0}</code> refers to the <code>sys</code> module.
<li><code>sys.modules</code> is a dictionary of all the modules that have been imported in this Python instance. The keys are the module names as strings; the values are the module objects themselves. So the replacement field <code>{0.modules}</code> refers to the dictionary of imported modules.
<li><code>sys.modules["humansize"]</code> is the <code>humansize</code> module which you just imported. The replacement field <code>{0.modules[humansize]}</code> refers to the <code>humansize</code> module. Note the slight difference in syntax here. In real Python code, the keys of the <code>sys.modules</code> dictionary are strings; to refer to them, you need to put quotes around the module name (<i>e.g.</i> <code>"humansize"</code>). But within a replacement field, you skip the quotes around the dictionary key name (<i>e.g.</i> <code>humansize</code>).
<li><code>sys.modules["humansize"].SUFFIXES</code> is the dictionary defined at the top of the <code>humansize</code> module. The replacement field <code>{0.modules[humansize].SUFFIXES}</code> refers to that dictionary.
<li><code>sys.modules["humansize"].SUFFIXES[1000]</code> is a list of <abbr>SI</abbr> suffixes: <code>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</code>. So the replacement field <code>{0.modules[humansize].SUFFIXES[1000]}</code> refers to that list.
<lI><code>sys.modules["humansize"].SUFFIXES[1000][0]</code> is the first item of the list of <abbr>SI</abbr> suffixes: <code>'KB'</code>. Therefore, the complete replacement field <code>{0.modules[humansize].SUFFIXES[1000][0]}</code> is replaced by the two-character string <code>KB</code>.
<li><code>sys.modules['humansize']</code> is the <code>humansize</code> module which you just imported. The replacement field <code>{0.modules[humansize]}</code> refers to the <code>humansize</code> module. Note the slight difference in syntax here. In real Python code, the keys of the <code>sys.modules</code> dictionary are strings; to refer to them, you need to put quotes around the module name (<i>e.g.</i> <code>'humansize'</code>). But within a replacement field, you skip the quotes around the dictionary key name (<i>e.g.</i> <code>humansize</code>).
<li><code>sys.modules['humansize'].SUFFIXES</code> is the dictionary defined at the top of the <code>humansize</code> module. The replacement field <code>{0.modules[humansize].SUFFIXES}</code> refers to that dictionary.
<li><code>sys.modules['humansize'].SUFFIXES[1000]</code> is a list of <abbr>SI</abbr> suffixes: <code>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</code>. So the replacement field <code>{0.modules[humansize].SUFFIXES[1000]}</code> refers to that list.
<lI><code>sys.modules['humansize'].SUFFIXES[1000][0]</code> is the first item of the list of <abbr>SI</abbr> suffixes: <code>'KB'</code>. Therefore, the complete replacement field <code>{0.modules[humansize].SUFFIXES[1000][0]}</code> is replaced by the two-character string <code>KB</code>.
</ul>
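<p>If you don&#8217;t have <code>humansize.py</code> handy, the same walk through attributes, dictionary keys, and list indexes can be sketched with stand-in objects. Everything named <code>registry</code> or <code>fake_module</code> below is hypothetical; <code>types.SimpleNamespace</code> is used only as a quick way to get objects with attributes.

```python
from types import SimpleNamespace

SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

# Stand-ins for the humansize module and for sys (illustration only).
fake_module = SimpleNamespace(SUFFIXES=SUFFIXES)
registry = SimpleNamespace(modules={'humansize': fake_module})

# .modules       -> attribute lookup
# [humansize]    -> dictionary lookup with the string key 'humansize' (no quotes)
# .SUFFIXES      -> attribute lookup
# [1000]         -> dictionary lookup; an all-digit key is converted to an integer
# [0]            -> list index
result = '1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(registry)
print(result)
```

<p>The all-digit rule is the reason <code>[1000]</code> finds the integer key <code>1000</code>, while <code>[humansize]</code> is looked up as a string.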
<h3 id=format-specifiers>Format Specifiers</h3>
@@ -203,18 +203,18 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<p>But wait! There&#8217;s more! Let&#8217;s take another look at that strange line of code from <code>humansize.py</code>:
<pre><code>if size &lt; multiple:
return "{0:.1f} {1}".format(size, suffix)</code></pre>
return '{0:.1f} {1}'.format(size, suffix)</code></pre>
<p><code>{1}</code> is replaced with the second argument passed to the <code>format()</code> method, which is <var>suffix</var>. But what is <code>{0:.1f}</code>? It&#8217;s two things: <code>{0}</code>, which you recognize, and <code>:.1f</code>, which you don&#8217;t. The second half (including and after the colon) defines the <i>format specifier</i>, which further refines how the replaced variable should be formatted.
<blockquote class="note compare clang">
<blockquote class='note compare clang'>
<p><span>&#x261E;</span>Format specifiers allow you to munge the replacement text in a variety of useful ways, like the <code>printf()</code> function in C. You can add zero- or space-padding, align strings, control decimal precision, and even convert numbers to hexadecimal.
</blockquote>
<p>Within a replacement field, a colon (<code>:</code>) marks the start of the format specifier. The format specifier &#8220;<code>.1</code>&#8221; means &#8220;round to the nearest tenth&#8221; (<i>i.e.</i> display only one digit after the decimal point). The format specifier &#8220;<code>f</code>&#8221; means &#8220;fixed-point number&#8221; (as opposed to exponential notation or some other decimal representation). Thus, given a <var>size</var> of <code>698.24</code> and <var>suffix</var> of <code>'GB'</code>, the formatted string would be <code>'698.2 GB'</code>, because <code>698.24</code> gets rounded to one decimal place, then the suffix is appended after the number.
<pre class=screen>
<samp class=p>>>> </samp><kbd>"{0:.1f} {1}".format(698.24, 'GB')</kbd>
<samp class=p>>>> </samp><kbd>'{0:.1f} {1}'.format(698.24, 'GB')</kbd>
<samp>'698.2 GB'</samp></pre>
<p>For all the gory details on format specifiers, consult the <a href=http://docs.python.org/3.0/library/string.html#format-specification-mini-language>Format Specification Mini-Language</a> in the official Python documentation.
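<p>Before you go read the gory details, here is a short sampler of the tricks mentioned in the note above: padding, alignment, precision, and hexadecimal conversion. The values are arbitrary and chosen only for illustration.

```python
# Width 8, two digits after the decimal point; numbers are right-aligned.
space_padded = '{0:8.2f}'.format(3.14159)

# Same, but a leading 0 in the specifier pads with zeros instead of spaces.
zero_padded = '{0:08.2f}'.format(3.14159)

# '<' left-aligns and '>' right-aligns within the given width.
left_aligned = '{0:<10}'.format('left')
right_aligned = '{0:>6}'.format('ab')

# 'x' converts an integer to lowercase hexadecimal.
as_hex = '{0:x}'.format(255)

print(repr(space_padded), repr(zero_padded))
print(repr(left_aligned), repr(right_aligned), repr(as_hex))
```

<p>Each specifier follows the same pattern as <code>:.1f</code>: everything after the colon describes how the value should look, not which value to use.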
@@ -226,10 +226,10 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
<p>Besides formatting, strings can do a number of other useful tricks.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>s = """Finished files are the re-</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>s = '''Finished files are the re-</kbd> <span>&#x2460;</span></a>
<samp class=p>... </samp><kbd>sult of years of scientif-</kbd>
<samp class=p>... </samp><kbd>ic study combined with the</kbd>
<samp class=p>... </samp><kbd>experience of years."""</kbd>
<samp class=p>... </samp><kbd>experience of years.'''</kbd>
<a><samp class=p>>>> </samp><kbd>s.splitlines()</kbd> <span>&#x2461;</span></a>
<samp>['Finished files are the re-',
'sult of years of scientif-',
@@ -240,7 +240,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
sult of years of scientif-
ic study combined with the
experience of years.</samp>
<a><samp class=p>>>> </samp><kbd>s.lower().count("f")</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>s.lower().count('f')</kbd> <span>&#x2463;</span></a>
<samp>6</samp></pre>
<ol>
<li>You can input multi-line strings in the Python interactive shell. Once you start a multi-line string with triple quotation marks, just hit <kbd>ENTER</kbd> and the interactive shell will prompt you to continue the string. Typing the closing triple quotation marks ends the string, and the next <kbd>ENTER</kbd> will execute the command (in this case, assigning the string to <var>s</var>).
@@ -381,7 +381,7 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp>
<p>Python 3 assumes that your source code &mdash; <i>i.e.</i> each <code>.py</code> file &mdash; is encoded in UTF-8.
<blockquote class="note compare python2">
<blockquote class='note compare python2'>
<p><span>&#x261E;</span>In Python 2, the default encoding for <code>.py</code> files was <abbr>ASCII</abbr>. In Python 3, <a href=http://www.python.org/dev/peps/pep-3120/>the default encoding is UTF-8</a>.
</blockquote>
@@ -432,7 +432,7 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp>
<li><a href=http://www.python.org/dev/peps/pep-3101/><abbr>PEP</abbr> 3101: Advanced String Formatting</a>
</ul>
<p class=nav><a rel=prev href=native-datatypes.html title="back to &#8220;Native Datatypes&#8221;"><span>&#x261C;</span></a> <a rel=next href=regular-expressions.html title="onward to &#8220;Regular Expressions&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=native-datatypes.html rel=prev title='back to &#8220;Native Datatypes&#8221;'><span>&#x261C;</span></a> <a href=regular-expressions.html rel=next title='onward to &#8220;Regular Expressions&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
+1 -1
View File
@@ -4,7 +4,7 @@
<title>Table of contents - Dive Into Python 3</title>
<link rel=stylesheet href=dip3.css>
<style>
h1:before{content:""}
h1:before{content:''}
ol,ul{font-weight:bold}
li ol{font-weight:normal}
ul{list-style:none;margin:0;padding:0}
+22 -59
View File
@@ -119,12 +119,12 @@ import unittest
<a> (3999, 'MMMCMXCIX')) <span>&#x2461;</span></a>
<a> def test_to_roman_known_values(self): <span>&#x2462;</span></a>
"""to_roman should give known result with known input"""
'''to_roman should give known result with known input'''
for integer, numeral in self.known_values:
<a> result = roman1.to_roman(integer) <span>&#x2463;</span></a>
<a> self.assertEqual(numeral, result) <span>&#x2464;</span></a>
if __name__ == "__main__":
if __name__ == '__main__':
unittest.main()</code></pre>
<ol>
<li>To write a test case, first subclass the <code>TestCase</code> class of the <code>unittest</code> module. This class provides many useful methods which you can use in your test case to test specific conditions.
@@ -138,7 +138,7 @@ if __name__ == "__main__":
<pre><code># roman1.py
def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
<a> pass <span>&#x2460;</span></a></code></pre>
<ol>
<li>At this stage, you want to define the <abbr>API</abbr> of the <code>to_roman()</code> function, but you don&#8217;t want to code it yet. (Your test needs to fail first.) To stub it out, use the Python reserved word <code>pass</code>, which does precisely nothing.
@@ -162,7 +162,7 @@ Traceback (most recent call last):
<a>FAILED (failures=1) <span>&#x2463;</span></a></samp></pre>
<ol>
<li>Running the script runs <code>unittest.main()</code>, which runs each test case. Each test case is a method within each class in <code>romantest.py</code> that inherits from <code>unittest.TestCase</code>. For each test case, the <code>unittest</code> module will print out the <code>docstring</code> of the method and whether that test passed or failed. As expected, this test case fails.
<li>For each failed test case, <code>unittest</code> displays the trace information showing exactly what happened. In this case, the call to <code>assertEqual()</code> raised an <code>AssertionError</code> because it was expecting <code>to_roman(1)</code> to return <code>"I"</code>, but it didn&#8217;t. (Since there was no explicit return statement, the function returned <code>None</code>, the Python null value.)
<li>For each failed test case, <code>unittest</code> displays the trace information showing exactly what happened. In this case, the call to <code>assertEqual()</code> raised an <code>AssertionError</code> because it was expecting <code>to_roman(1)</code> to return <code>'I'</code>, but it didn&#8217;t. (Since there was no explicit return statement, the function returned <code>None</code>, the Python null value.)
<li>After the detail of each test, <code>unittest</code> displays a summary of how many tests were performed and how long it took.
<li>Overall, the unit test failed because at least one test case did not pass. When a test case doesn&#8217;t pass, <code>unittest</code> distinguishes between failures and errors. A failure is a call to an <code>assertXYZ</code> method, like <code>assertEqual</code> or <code>assertRaises</code>, that fails because the asserted condition is not true or the expected exception was not raised. An error is any other sort of exception raised in the code you&#8217;re testing or the unit test case itself.
</ol>
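<p>The distinction between a failure and an error can be seen in a small, self-contained sketch. The stub function and the test class below are hypothetical and are not part of <code>romantest.py</code>; they are wrapped in a function so the deliberately broken tests only run when called.

```python
import unittest


def returns_none():
    '''hypothetical stub that forgets to return a value'''
    pass


def run_demo():
    '''run two deliberately broken tests and return the collected result'''
    class FailureVersusError(unittest.TestCase):
        def test_failure(self):
            '''an assertion that does not hold is reported as a failure'''
            self.assertEqual('I', returns_none())

        def test_error(self):
            '''any other exception is reported as an error'''
            returns_none().upper()   # AttributeError: None has no upper()

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FailureVersusError)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_demo()
print(result.testsRun, len(result.failures), len(result.errors))
```

<p>The failed assertion lands in <code>result.failures</code>, while the unexpected <code>AttributeError</code> lands in <code>result.errors</code>.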
@@ -183,8 +183,8 @@ Traceback (most recent call last):
<a> ('I', 1)) <span>&#x2460;</span></a>
def to_roman(n):
"""convert integer to Roman numeral"""
result = ""
'''convert integer to Roman numeral'''
result = ''
for numeral, integer in roman_numeral_map:
<a> while n >= integer: <span>&#x2461;</span></a>
result += numeral
@@ -248,7 +248,7 @@ OK</samp></pre>
<pre><code>
<a>class ToRomanBadInput(unittest.TestCase): <span>&#x2460;</span></a>
<a> def test_too_large(self): <span>&#x2461;</span></a>
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
<a> self.assertRaises(roman2.OutOfRangeError, roman2.to_roman, 4000) <span>&#x2462;</span></a></code></pre>
<ol>
<li>Like the previous test case, you create a class that inherits from <code>unittest.TestCase</code>. You can have more than one test per class (as you&#8217;ll see later in this chapter), but I chose to create a new class here because this test is something different than the last one. We&#8217;ll keep all the good input tests together in one class, and all the bad input tests together in another.
@@ -311,11 +311,11 @@ FAILED (failures=1)</samp></pre>
<p>Now you can write the code to make this test pass.
<p class=d>[<a href=examples/roman2.py>download <code>roman2.py</code></a>]
<pre><code>def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if n > 3999:
<a> raise OutOfRangeError("number out of range (must be less than 4000)") <span>&#x2460;</span></a>
<a> raise OutOfRangeError('number out of range (must be less than 4000)') <span>&#x2460;</span></a>
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -357,15 +357,15 @@ OK</samp></pre>
<pre><code>
class ToRomanBadInput(unittest.TestCase):
def test_too_large(self):
"""to_roman should fail with large input"""
'''to_roman should fail with large input'''
<a> self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, 4000) <span>&#x2460;</span></a>
def test_zero(self):
"""to_roman should fail with 0 input"""
'''to_roman should fail with 0 input'''
<a> self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, 0) <span>&#x2461;</span></a>
def test_negative(self):
"""to_roman should fail with negative input"""
'''to_roman should fail with negative input'''
<a> self.assertRaises(roman3.OutOfRangeError, roman3.to_roman, -1) <span>&#x2462;</span></a></code></pre>
<ol>
<li>The <code>test_too_large()</code> method has not changed since the previous step. I&#8217;m including it here to show where the new code fits.
@@ -407,11 +407,11 @@ FAILED (failures=2)</samp></pre>
<p class=d>[<a href=examples/roman3.py>download <code>roman3.py</code></a>]
<pre><code>def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
<a> if not (0 < n < 4000): <span>&#x2460;</span></a>
<a> raise OutOfRangeError("number out of range (must be 1..3999)") <span>&#x2461;</span></a>
<a> raise OutOfRangeError('number out of range (must be 1..3999)') <span>&#x2461;</span></a>
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -466,7 +466,7 @@ class OutOfRangeError(ValueError): pass
.
.
def test_non_integer(self):
"""to_roman should fail with non-integer input"""
'''to_roman should fail with non-integer input'''
<mark> self.assertRaises(roman4.NotIntegerError, roman4.to_roman, 0.5)</mark></code></pre>
<p>Now check that the test fails properly.
@@ -495,13 +495,13 @@ FAILED (failures=1)</samp></pre>
<p>Write the code that makes the test pass.
<pre><code>def to_roman(n):
"""convert integer to Roman numeral"""
'''convert integer to Roman numeral'''
if not (0 < n < 4000):
raise OutOfRangeError("number out of range (must be 1..3999)")
raise OutOfRangeError('number out of range (must be 1..3999)')
<a> if not isinstance(n, int): <span>&#x2460;</span></a>
<a> raise NotIntegerError("non-integers can not be converted") <span>&#x2461;</span></a>
<a> raise NotIntegerError('non-integers can not be converted') <span>&#x2461;</span></a>
result = ""
result = ''
for numeral, integer in roman_numeral_map:
while n >= integer:
result += numeral
@@ -529,44 +529,7 @@ OK</samp></pre>
<p>Now stop coding.
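<p>For reference, the pieces built up in this chapter can be assembled into one runnable sketch. Several parts are assumptions rather than the chapter&#8217;s own code: the full contents of <code>roman_numeral_map</code> (only its last entry appears above), the base class of <code>NotIntegerError</code>, and the final <code>return</code> statement.

```python
class OutOfRangeError(ValueError): pass
class NotIntegerError(ValueError): pass   # base class is an assumption

# Assumed full mapping, largest value first; only ('I', 1) is shown in the text
roman_numeral_map = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))


def to_roman(n):
    '''convert integer to Roman numeral'''
    if not (0 < n < 4000):
        raise OutOfRangeError('number out of range (must be 1..3999)')
    if not isinstance(n, int):
        raise NotIntegerError('non-integers can not be converted')

    result = ''
    for numeral, integer in roman_numeral_map:
        # Subtract the largest value that still fits, as often as it fits
        while n >= integer:
            result += numeral
            n -= integer
    return result


print(to_roman(1994))
```

<p>Note that the range check runs first, so a value such as <code>0.5</code> passes it and is then rejected by the <code>isinstance()</code> check.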
<!--
<li><a href="#roman.requirements">Requirement #3</a> specifies that <code>to_roman()</code> cannot accept a non-integer number, so here you test to make sure that <code>to_roman()</code> raises a <code>roman.NotIntegerError</code> exception when called with <code>0.5</code>. If <code>to_roman()</code> does not raise a <code>roman.NotIntegerError</code>, this test is considered failed.
-->
<!--
For instance, the <code>testFromRomanCase</code> method (&#8220;<code>from_roman()</code> should only accept uppercase input&#8221;) was an error, because the call to <code>numeral.upper()</code> raised an <code>AttributeError</code> exception, because <code>to_roman()</code> was supposed to return a string but didn't. But <code>testZero</code> (&#8220;<code>to_roman()</code> should fail with 0 input&#8221;) was a failure, because the call to <code>from_roman()</code> did not raise the <code>InvalidRomanNumeral</code> exception that <code>assertRaises</code> was looking for.
-->
<!--
<li>For each failed test case, <code>unittest</code> displays the trace information showing exactly what happened. In this case, the call to <code>assertRaises</code> (also called <code>failUnlessRaises</code>) raised an <code>AssertionError</code> because it was expecting <code>to_roman()</code> to raise an <code>OutOfRangeError</code> and it didn't.
-->
<!--
<p>Given all of this, what would you expect out of a set of functions to convert to and from Roman numerals?
<ol>
<li><code>to_roman</code> should return the Roman numeral representation for all integers <code>1</code> to <code>3999</code>.
<li><code>to_roman</code> should fail when given an integer outside the range <code>1</code> to <code>3999</code>.
<li><code>to_roman</code> should fail when given a non-integer number.
<li><code>from_roman</code> should take a valid Roman numeral and return the number that it represents.
<li><code>from_roman</code> should fail when given an invalid Roman numeral.
<li>If you take a number, convert it to Roman numerals, then convert that back to a number, you should end up with the number
you started with. So <code>from_roman(to_roman(n)) == n</code> for all <var>n</var> in <code>1..3999</code>.
<li><code>to_roman</code> should always return a Roman numeral using uppercase letters.
<li><code>from_roman</code> should only accept uppercase Roman numerals (<i class=foreignphrase><abbr>i.e.</abbr></i> it should fail when given lowercase input).
</ol>
-->
<!--
<ol>
<li>The <code>re.compile</code> function can take an optional second argument, which is a set of one or more flags that control various options about the
compiled regular expression. Here you're specifying the <code>re.VERBOSE</code> flag, which tells Python that there are in-line comments within the regular expression itself. The comments and all the whitespace around them are
<em>not</em> considered part of the regular expression; the <code>re.compile</code> function simply strips them all out when it compiles the expression. This new, &#8220;verbose&#8221; version is identical to the old version, but it is infinitely more readable.
</ol>
-->
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+2 -2
View File
@@ -1,12 +1,12 @@
<!DOCTYPE html>
<head>
<meta charset=utf-8>
<title>What's New In "Dive into Python 3"</title>
<title>What's New In Dive into Python 3</title>
<!--[if IE]><script src=j/html5.js></script><![endif]-->
<link rel=stylesheet href=dip3.css>
<style>
body{counter-reset:h1 -1}
h3:before{content:""}
h3:before{content:''}
</style>
<link rel=stylesheet media='only screen and (max-device-width: 480px)' href=mobile.css>
<link rel=stylesheet media=print href=print.css>
+8 -8
View File
@@ -36,7 +36,7 @@ body{counter-reset:h1 20}
<ul>
<li><a href=http://users.rcn.com/python/download/Descriptor.htm>How-To Guide For Descriptors</a>
<li><a href=http://www.ibm.com/developerworks/linux/library/l-python-elegance-2.html>Charming Python: Python elegance and warts, Part 2</a>
<li><a href="http://www.informit.com/articles/printerfriendly.aspx?p=1309289">Python Descriptors</a>
<li><a href='http://www.informit.com/articles/printerfriendly.aspx?p=1309289'>Python Descriptors</a>
<li><a href=http://docs.python.org/3.0/reference/datamodel.html#invoking-descriptors>Invoking Descriptors</a> in the official Python documentation
</ul>
@@ -55,15 +55,15 @@ body{counter-reset:h1 20}
<p>As Python 3 is relatively new, there is a dearth of compatible libraries. Here are some of the places to look for code that works with Python 3.
<ul>
<li><a href="http://pypi.python.org/pypi?:action=browse&amp;c=533&amp;show=all">Python Package Index: list of Python 3 packages</a>
<li><a href="http://code.activestate.com/recipes/langs/python/tags/python3/">Python Cookbook: list of recipes tagged &#8220;python3&#8221;</a>
<li><a href="http://code.google.com/hosting/search?q=label:python3">Google Project Hosting: list of projects tagged &#8220;python3&#8221;</a>
<li><a href="http://sourceforge.net/search/?words=%22python+3%22">SourceForge: list of projects matching &#8220;Python 3&#8221;</a>
<li><a href="http://github.com/search?type=Repositories&amp;language=python&amp;q=python3">GitHub: list of projects matching &#8220;python3&#8221;</a> (also, <a href="http://github.com/search?type=Repositories&amp;language=python&amp;q=python+3">list of projects matching &#8220;python 3&#8221;</a>)
<li><a href="http://bitbucket.org/repo/all/?name=python3">BitBucket: list of projects matching &#8220;python3&#8221;</a> (and <a href="http://bitbucket.org/repo/all/?name=python+3">those matching &#8220;python 3&#8221;</a>)
<li><a href='http://pypi.python.org/pypi?:action=browse&amp;c=533&amp;show=all'>Python Package Index: list of Python 3 packages</a>
<li><a href='http://code.activestate.com/recipes/langs/python/tags/python3/'>Python Cookbook: list of recipes tagged &#8220;python3&#8221;</a>
<li><a href='http://code.google.com/hosting/search?q=label:python3'>Google Project Hosting: list of projects tagged &#8220;python3&#8221;</a>
<li><a href='http://sourceforge.net/search/?words=%22python+3%22'>SourceForge: list of projects matching &#8220;Python 3&#8221;</a>
<li><a href='http://github.com/search?type=Repositories&amp;language=python&amp;q=python3'>GitHub: list of projects matching &#8220;python3&#8221;</a> (also, <a href='http://github.com/search?type=Repositories&amp;language=python&amp;q=python+3'>list of projects matching &#8220;python 3&#8221;</a>)
<li><a href='http://bitbucket.org/repo/all/?name=python3'>BitBucket: list of projects matching &#8220;python3&#8221;</a> (and <a href='http://bitbucket.org/repo/all/?name=python+3'>those matching &#8220;python 3&#8221;</a>)
</ul>
<p class=nav><a rel=prev href=case-study-porting-chardet-to-python-3.html title="back to &#8220;Case Study: Porting chardet to Python 3&#8221;"><span>&#x261C;</span></a> <a rel=next href=porting-code-to-python-3-with-2to3.html title="onward to &#8220;Porting Code to Python 3 with 2to3&#8221;"><span>&#x261E;</span></a>
<p class=v><a href=case-study-porting-chardet-to-python-3.html rel=prev title='back to &#8220;Case Study: Porting chardet to Python 3&#8221;'><span>&#x261C;</span></a> <a href=porting-code-to-python-3-with-2to3.html rel=next title='onward to &#8220;Porting Code to Python 3 with 2to3&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
+93 -92
View File
@@ -17,7 +17,7 @@ mark{display:inline}
<p id=level>Difficulty level: <span title=advanced>&#x2666;&#x2666;&#x2666;&#x2666;&#x2662;</span>
<h1>XML</h1>
<blockquote class=q>
<p><span>&#x275D;</span> In the archonship of Aristaechmus, Draco enacted his ordinances. <span>&#x275E;</span><br>&mdash; <a href="http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus:text:1999.01.0046;query=chapter%3D%235;layout=;loc=3.1">Aristotle</a>
<p><span>&#x275D;</span> In the archonship of Aristaechmus, Draco enacted his ordinances. <span>&#x275E;</span><br>&mdash; <a href='http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus:text:1999.01.0046;query=chapter%3D%235;layout=;loc=3.1'>Aristotle</a>
</blockquote>
<p id=toc>&nbsp;
<h2 id=divingin>Diving In</h2>
@@ -26,29 +26,29 @@ mark{display:inline}
<p>Here, then, is the <abbr>XML</abbr> data we&#8217;ll be working with in this chapter. It&#8217;s a feed &mdash; specifically, an <a href=http://atompub.org/rfc4287.html>Atom syndication feed</a>.
<p class=d>[<a href=examples/feed.xml>download <code>feed.xml</code></a>]
<pre><code>&lt;?xml version="1.0" encoding="utf-8"?>
&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<pre><code>&lt;?xml version='1.0' encoding='utf-8'?>
&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
&lt;title>dive into mark&lt;/title>
&lt;subtitle>currently between addictions&lt;/subtitle>
&lt;id>tag:diveintomark.org,2001-07-29:/&lt;/id>
&lt;updated>2009-03-27T21:56:07Z&lt;/updated>
&lt;link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
&lt;link rel="self" type="application/atom+xml" href="http://diveintomark.org/feed/"/>
&lt;link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
&lt;link rel='self' type='application/atom+xml' href='http://diveintomark.org/feed/'/>
&lt;entry>
&lt;author>
&lt;name>Mark&lt;/name>
&lt;uri>http://diveintomark.org/&lt;/uri>
&lt;/author>
&lt;title>Dive into history, 2009 edition&lt;/title>
&lt;link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition"/>
&lt;link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
&lt;id>tag:diveintomark.org,2009-03-27:/archives/20090327172042&lt;/id>
&lt;updated>2009-03-27T21:56:07Z&lt;/updated>
&lt;published>2009-03-27T17:20:42Z&lt;/published>
&lt;category scheme="http://diveintomark.org" term="diveintopython"/>
&lt;category scheme="http://diveintomark.org" term="docbook"/>
&lt;category scheme="http://diveintomark.org" term="html"/>
&lt;summary type="html">Putting an entire chapter on one page sounds
&lt;category scheme='http://diveintomark.org' term='diveintopython'/>
&lt;category scheme='http://diveintomark.org' term='docbook'/>
&lt;category scheme='http://diveintomark.org' term='html'/>
&lt;summary type='html'>Putting an entire chapter on one page sounds
bloated, but consider this &amp;amp;mdash; my longest chapter so far
would be 75 printed pages, and it loads in under 5 seconds&amp;amp;hellip;
On dialup.&lt;/summary>
@@ -59,13 +59,13 @@ mark{display:inline}
&lt;uri>http://diveintomark.org/&lt;/uri>
&lt;/author>
&lt;title>Accessibility is a harsh mistress&lt;/title>
&lt;link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress"/>
&lt;link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress'/>
&lt;id>tag:diveintomark.org,2009-03-21:/archives/20090321200928&lt;/id>
&lt;updated>2009-03-22T01:05:37Z&lt;/updated>
&lt;published>2009-03-21T20:09:28Z&lt;/published>
&lt;category scheme="http://diveintomark.org" term="accessibility"/>
&lt;summary type="html">The accessibility orthodoxy does not permit people to
&lt;category scheme='http://diveintomark.org' term='accessibility'/>
&lt;summary type='html'>The accessibility orthodoxy does not permit people to
question the value of features that are rarely useful and rarely used.&lt;/summary>
&lt;/entry>
&lt;entry>
@@ -73,20 +73,20 @@ mark{display:inline}
&lt;name>Mark&lt;/name>
&lt;/author>
&lt;title>A gentle introduction to video encoding, part 1: container formats&lt;/title>
&lt;link rel="alternate" type="text/html"
href="http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats"/>
&lt;link rel='alternate' type='text/html'
href='http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats'/>
&lt;id>tag:diveintomark.org,2008-12-18:/archives/20081218155422&lt;/id>
&lt;updated>2009-01-11T19:39:22Z&lt;/updated>
&lt;published>2008-12-18T15:54:22Z&lt;/published>
&lt;category scheme="http://diveintomark.org" term="asf"/>
&lt;category scheme="http://diveintomark.org" term="avi"/>
&lt;category scheme="http://diveintomark.org" term="encoding"/>
&lt;category scheme="http://diveintomark.org" term="flv"/>
&lt;category scheme="http://diveintomark.org" term="GIVE"/>
&lt;category scheme="http://diveintomark.org" term="mp4"/>
&lt;category scheme="http://diveintomark.org" term="ogg"/>
&lt;category scheme="http://diveintomark.org" term="video"/>
&lt;summary type="html">These notes will eventually become part of a
&lt;category scheme='http://diveintomark.org' term='asf'/>
&lt;category scheme='http://diveintomark.org' term='avi'/>
&lt;category scheme='http://diveintomark.org' term='encoding'/>
&lt;category scheme='http://diveintomark.org' term='flv'/>
&lt;category scheme='http://diveintomark.org' term='GIVE'/>
&lt;category scheme='http://diveintomark.org' term='mp4'/>
&lt;category scheme='http://diveintomark.org' term='ogg'/>
&lt;category scheme='http://diveintomark.org' term='video'/>
&lt;summary type='html'>These notes will eventually become part of a
tech talk on video encoding.&lt;/summary>
&lt;/entry>
&lt;/feed></code></pre>
@@ -120,8 +120,8 @@ mark{display:inline}
<p>Elements can have <i>attributes</i>, which are name-value pairs. Attributes are listed within the start tag of an element and separated by whitespace. <i>Attribute names</i> can not be repeated within an element. <i>Attribute values</i> must be quoted.
<pre class=nd><code><a>&lt;foo <mark>lang="en"</mark>> <span>&#x2460;</span></a>
<a> &lt;bar <mark>lang="fr"</mark>>&lt;/bar> <span>&#x2461;</span></a>
<pre class=nd><code><a>&lt;foo <mark>lang='en'</mark>> <span>&#x2460;</span></a>
<a> &lt;bar <mark>lang='fr'</mark>>&lt;/bar> <span>&#x2461;</span></a>
&lt;/foo>
</code></pre>
<ol>
@@ -133,8 +133,8 @@ mark{display:inline}
<p>Elements can have <i>text content</i>.
<pre class=nd><code>&lt;foo lang="en">
&lt;bar lang="fr"><mark>PapayaWhip</mark>&lt;/bar>
<pre class=nd><code>&lt;foo lang='en'>
&lt;bar lang='fr'><mark>PapayaWhip</mark>&lt;/bar>
&lt;/foo>
</code></pre>
@@ -148,7 +148,7 @@ mark{display:inline}
<p>Just as Python functions can be declared in different <i>modules</i>, <abbr>XML</abbr> elements can be declared in different <i>namespaces</i>. Namespaces usually look like URLs. You use an <code>xmlns</code> declaration to define a <i>default namespace</i>. A namespace declaration looks similar to an attribute, but it has a different purpose.
<pre class=nd><code><a>&lt;feed <mark>xmlns="http://www.w3.org/2005/Atom"</mark>> <span>&#x2460;</span></a>
<pre class=nd><code><a>&lt;feed <mark>xmlns='http://www.w3.org/2005/Atom'</mark>> <span>&#x2460;</span></a>
<a> &lt;title>dive into mark&lt;/title> <span>&#x2461;</span></a>
&lt;/feed>
</code></pre>
@@ -159,7 +159,7 @@ mark{display:inline}
<p>You can also use an <code>xmlns:<var>prefix</var></code> declaration to define a namespace and associate it with a <i>prefix</i>. Then each element in that namespace must be explicitly declared with the prefix.
<pre class=nd><code><a>&lt;atom:feed <mark>xmlns:atom="http://www.w3.org/2005/Atom"</mark>> <span>&#x2460;</span></a>
<pre class=nd><code><a>&lt;atom:feed <mark>xmlns:atom='http://www.w3.org/2005/Atom'</mark>> <span>&#x2460;</span></a>
<a> &lt;atom:title>dive into mark&lt;/atom:title> <span>&#x2461;</span></a>
&lt;/atom:feed></code></pre>
<ol>
@@ -171,7 +171,7 @@ mark{display:inline}
<p>Finally, <abbr>XML</abbr> documents can contain <a href=strings.html#one-ring-to-rule-them-all>character encoding information</a> on the first line, before the root element. (If you&#8217;re curious how a document can contain information which needs to be known before the document can be parsed, <a href=http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info>Section F of the <abbr>XML</abbr> specification</a> details how to resolve this Catch-22.)
<pre class=nd><code>&lt;?xml version="1.0" <mark>encoding="utf-8"</mark>?></code></pre>
<pre class=nd><code>&lt;?xml version='1.0' <mark>encoding='utf-8'</mark>?></code></pre>
<p>And now you know just enough <abbr>XML</abbr> to be dangerous!
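<p>To see that a default namespace and a prefixed namespace describe the same elements, here is a small sketch using the standard library&#8217;s <code>xml.etree.ElementTree</code>. The two inline documents are trimmed-down stand-ins for the snippets above, not the full feed.

```python
import xml.etree.ElementTree as etree

# The same data twice: once with a default namespace, once with a prefix
default_ns = ("<feed xmlns='http://www.w3.org/2005/Atom'>"
              "<title>dive into mark</title></feed>")
prefixed = ("<atom:feed xmlns:atom='http://www.w3.org/2005/Atom'>"
            "<atom:title>dive into mark</atom:title></atom:feed>")

root_a = etree.fromstring(default_ns)
root_b = etree.fromstring(prefixed)

# ElementTree expands both spellings to {namespace}localname
print(root_a.tag)
print(root_b.tag)
print(root_a[0].tag == root_b[0].tag)
```

<p>The prefix itself carries no meaning; only the namespace it is bound to matters to the parser.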
@@ -185,8 +185,8 @@ mark{display:inline}
<p>At the top level is the <i>root element</i>, which every Atom feed shares: the <code>feed</code> element in the <code>http://www.w3.org/2005/Atom</code> namespace.
<pre><code><a>&lt;feed xmlns="http://www.w3.org/2005/Atom" <span>&#x2460;</span></a>
<a> xml:lang="en"> <span>&#x2461;</span></a></code></pre>
<pre><code><a>&lt;feed xmlns='http://www.w3.org/2005/Atom' <span>&#x2460;</span></a>
<a> xml:lang='en'> <span>&#x2461;</span></a></code></pre>
<ol>
<li><code>http://www.w3.org/2005/Atom</code> is the Atom namespace.
<li>Any element can contain an <code>xml:lang</code> attribute, which declares the language of the element and its children. In this case, the <code>xml:lang</code> attribute is declared once on the root element, which means the entire feed is in English.
@@ -194,18 +194,18 @@ mark{display:inline}
<p>An Atom feed contains several pieces of information about the feed itself. These are declared as children of the root-level <code>feed</code> element.
<pre><code>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<pre><code>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
<a> &lt;title>dive into mark&lt;/title> <span>&#x2460;</span></a>
<a> &lt;subtitle>currently between addictions&lt;/subtitle> <span>&#x2461;</span></a>
<a> &lt;id>tag:diveintomark.org,2001-07-29:/&lt;/id> <span>&#x2462;</span></a>
<a> &lt;updated>2009-03-27T21:56:07Z&lt;/updated> <span>&#x2463;</span></a>
<a> &lt;link rel="alternate" type="text/html" href="http://diveintomark.org/"/> <span>&#x2464;</span></a></code></pre>
<a> &lt;link rel='alternate' type='text/html' href='http://diveintomark.org/'/> <span>&#x2464;</span></a></code></pre>
<ol>
<li>The title of this feed is <code>dive into mark</code>.
<li>The subtitle of this feed is <code>currently between addictions</code>.
<li>Every feed needs a globally unique identifier. See <a href=http://www.ietf.org/rfc/rfc4151.txt>RFC 4151</a> for how to create one.
<li>This feed was last updated on March 27, 2009, at 21:56 GMT. This is usually equivalent to the last-modified date of the most recent article.
<li>Now things start to get interesting. This <code>link</code> element has no text content, but it has three attributes: <code>rel</code>, <code>type</code>, and <code>href</code>. The <code>rel</code> value tells you what kind of link this is; <code>rel="alternate"</code> means that this is a link to an alternate representation of this feed. The <code>type="text/html"</code> attribute means that this is a link to an <abbr>HTML</abbr> page. And the link target is given in the <code>href</code> attribute.
<li>Now things start to get interesting. This <code>link</code> element has no text content, but it has three attributes: <code>rel</code>, <code>type</code>, and <code>href</code>. The <code>rel</code> value tells you what kind of link this is; <code>rel='alternate'</code> means that this is a link to an alternate representation of this feed. The <code>type='text/html'</code> attribute means that this is a link to an <abbr>HTML</abbr> page. And the link target is given in the <code>href</code> attribute.
</ol>
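<p>The feed-level metadata described above can be read back with <code>xml.etree.ElementTree</code>. This is a sketch under the assumption that the markup is available as a string; the <code>ATOM</code> constant and the variable names are made up for the example.

```python
import xml.etree.ElementTree as etree

ATOM = '{http://www.w3.org/2005/Atom}'   # namespace in ElementTree notation

feed_xml = '''<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
  <title>dive into mark</title>
  <subtitle>currently between addictions</subtitle>
  <id>tag:diveintomark.org,2001-07-29:/</id>
  <updated>2009-03-27T21:56:07Z</updated>
  <link rel='alternate' type='text/html' href='http://diveintomark.org/'/>
</feed>'''

root = etree.fromstring(feed_xml)
title = root.find(ATOM + 'title').text
updated = root.find(ATOM + 'updated').text
link = root.find(ATOM + 'link')

print(title, updated)
# The link element has no text content, only attributes
print(link.text, link.attrib['rel'], link.attrib['type'], link.attrib['href'])
```
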
<p>Now we know that this is a feed for a site named &#8220;dive into mark&#8221; which is available at <a href=http://diveintomark.org/><code>http://diveintomark.org/</code></a> and was last updated on March 27, 2009.
@@ -222,15 +222,15 @@ mark{display:inline}
&lt;uri>http://diveintomark.org/&lt;/uri>
&lt;/author>
<a> &lt;title>Dive into history, 2009 edition&lt;/title> <span>&#x2461;</span></a>
<a> &lt;link rel="alternate" type="text/html" <span>&#x2462;</span></a>
href="http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition"/>
<a> &lt;link rel='alternate' type='text/html' <span>&#x2462;</span></a>
href='http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition'/>
<a> &lt;id>tag:diveintomark.org,2009-03-27:/archives/20090327172042&lt;/id> <span>&#x2463;</span></a>
<a> &lt;updated>2009-03-27T21:56:07Z&lt;/updated> <span>&#x2464;</span></a>
&lt;published>2009-03-27T17:20:42Z&lt;/published>
<a> &lt;category scheme="http://diveintomark.org" term="diveintopython"/> <span>&#x2465;</span></a>
&lt;category scheme="http://diveintomark.org" term="docbook"/>
&lt;category scheme="http://diveintomark.org" term="html"/>
<a> &lt;summary type="html">Putting an entire chapter on one page sounds <span>&#x2466;</span></a>
<a> &lt;category scheme='http://diveintomark.org' term='diveintopython'/> <span>&#x2465;</span></a>
&lt;category scheme='http://diveintomark.org' term='docbook'/>
&lt;category scheme='http://diveintomark.org' term='html'/>
<a> &lt;summary type='html'>Putting an entire chapter on one page sounds <span>&#x2466;</span></a>
bloated, but consider this &amp;amp;mdash; my longest chapter so far
would be 75 printed pages, and it loads in under 5 seconds&amp;amp;hellip;
On dialup.&lt;/summary>
@@ -242,7 +242,7 @@ mark{display:inline}
<li>Entries, like feeds, need a unique identifier.
<li>Entries have two dates: a first-published date (<code>published</code>) and a last-modified date (<code>updated</code>).
<li>Entries can have an arbitrary number of categories. This article is filed under <code>diveintopython</code>, <code>docbook</code>, and <code>html</code>.
<li>The <code>summary</code> element gives a brief summary of the article. (There is also a <code>content</code> element, not shown here, if you want to include the complete article text in your feed.) This <code>summary</code> element has the Atom-specific <code>type="html"</code> attribute, which specifies that this summary is a snippet of <abbr>HTML</abbr>, not plain text. This is important, since it has <abbr>HTML</abbr>-specific entities in it (<code>&amp;mdash;</code> and <code>&amp;hellip;</code>) which should be rendered as &#8220;&mdash;&#8221; and &#8220;&hellip;&#8221; rather than displayed directly.
<li>The <code>summary</code> element gives a brief summary of the article. (There is also a <code>content</code> element, not shown here, if you want to include the complete article text in your feed.) This <code>summary</code> element has the Atom-specific <code>type='html'</code> attribute, which specifies that this summary is a snippet of <abbr>HTML</abbr>, not plain text. This is important, since it has <abbr>HTML</abbr>-specific entities in it (<code>&amp;mdash;</code> and <code>&amp;hellip;</code>) which should be rendered as &#8220;&mdash;&#8221; and &#8220;&hellip;&#8221; rather than displayed directly.
<li>Finally, the end tag for the <code>entry</code> element, signaling the end of the metadata for this article.
</ol>
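<p>As a rough sketch of how the entry metadata described above could be read back out in Python, the snippet below parses a trimmed copy of the entry. The inlined <code>ENTRY</code> string and the <code>parse_timestamp()</code> helper are assumptions made for illustration; they are not part of the chapter&#8217;s example files.

```python
import xml.etree.ElementTree as etree
from datetime import datetime

# Trimmed, inlined copy of the entry (an assumption, so the sketch runs standalone).
ENTRY = """<entry xmlns='http://www.w3.org/2005/Atom'>
  <id>tag:diveintomark.org,2009-03-27:/archives/20090327172042</id>
  <updated>2009-03-27T21:56:07Z</updated>
  <published>2009-03-27T17:20:42Z</published>
  <category scheme='http://diveintomark.org' term='diveintopython'/>
  <category scheme='http://diveintomark.org' term='docbook'/>
  <category scheme='http://diveintomark.org' term='html'/>
</entry>"""

ATOM = '{http://www.w3.org/2005/Atom}'


def parse_timestamp(text):
    # Hypothetical helper: handles only the UTC 'Z' form used in this entry.
    return datetime.strptime(text, '%Y-%m-%dT%H:%M:%SZ')


entry = etree.fromstring(ENTRY)
published = parse_timestamp(entry.findtext(ATOM + 'published'))
updated = parse_timestamp(entry.findtext(ATOM + 'updated'))

# An entry can carry any number of category elements; collect their terms.
terms = [category.get('term') for category in entry.findall(ATOM + 'category')]

print(updated > published)   # True: the entry was modified after it was published
print(terms)                 # ['diveintopython', 'docbook', 'html']
```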
@@ -255,7 +255,7 @@ mark{display:inline}
<p class=d>[<a href=examples/feed.xml>download <code>feed.xml</code></a>]
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = etree.parse("examples/feed.xml")</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd> <span>&#x2462;</span></a>
<a><samp class=p>>>> </samp><kbd>root</kbd> <span>&#x2463;</span></a>
<samp>&lt;Element {http://www.w3.org/2005/Atom}feed at cd1eb0></samp></pre>
@@ -319,7 +319,7 @@ mark{display:inline}
<a><samp class=p>>>> </samp><kbd>root[3].attrib</kbd> <span>&#x2464;</span></a>
<samp>{}</samp></pre>
<ol>
<li>The <code>attrib</code> property is a dictionary of the element&#8217;s attributes. The original markup here was <code>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"></code>. The <code>xml:</code> prefix refers to a built-in namespace that every <abbr>XML</abbr> document can use without declaring it.
<li>The <code>attrib</code> property is a dictionary of the element&#8217;s attributes. The original markup here was <code>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'></code>. The <code>xml:</code> prefix refers to a built-in namespace that every <abbr>XML</abbr> document can use without declaring it.
<li>The fifth child &mdash; <code>[4]</code> in a <code>0</code>-based list &mdash; is the <code>link</code> element.
<li>The <code>link</code> element has three attributes: <code>href</code>, <code>type</code>, and <code>rel</code>.
<li>The fourth child &mdash; <code>[3]</code> in a <code>0</code>-based list &mdash; is the <code>updated</code> element.
@@ -334,17 +334,17 @@ mark{display:inline}
<pre class=screen>
<samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd>
<samp class=p>>>> </samp><kbd>tree = etree.parse("examples/feed.xml")</kbd>
<samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd>
<samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd>
<a><samp class=p>>>> </samp><kbd>root.findall("{http://www.w3.org/2005/Atom}entry")</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span>&#x2460;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b510>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp>
<samp class=p>>>> </samp><kbd>root.tag</kbd>
<samp>'{http://www.w3.org/2005/Atom}feed'</samp>
<a><samp class=p>>>> </samp><kbd>root.findall("{http://www.w3.org/2005/Atom}feed")</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}feed')</kbd> <span>&#x2461;</span></a>
<samp>[]</samp>
<a><samp class=p>>>> </samp><kbd>root.findall("{http://www.w3.org/2005/Atom}author")</kbd> <span>&#x2462;</span></a>
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span>&#x2462;</span></a>
<samp>[]</samp></pre>
<ol>
<li>The <code>findall()</code> method finds child elements that match a specific query. (More on the query format in a minute.)
@@ -353,22 +353,22 @@ mark{display:inline}
</ol>
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>tree.findall("{http://www.w3.org/2005/Atom}entry")</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span>&#x2460;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b510>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp>
<a><samp class=p>>>> </samp><kbd>tree.findall("{http://www.w3.org/2005/Atom}author")</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span>&#x2461;</span></a>
<samp>[]</samp>
</pre>
<ol>
<li>For convenience, the <code>tree</code> object (returned from the <code>etree.parse()</code> function) has several methods that mirror the methods on the root element. The results are the same as if you had called the <code>tree.getroot().findall()</code> method.
<li>Perhaps surprisingly, this query does not find the <code>author</code> elements in this document. Why not? Because this is just a shortcut for <code>tree.getroot().findall("{http://www.w3.org/2005/Atom}author")</code>, which means &#8220;find all the <code>author</code> elements that are children of the root element.&#8221; The <code>author</code> elements are not children of the root element; they&#8217;re children of the <code>entry</code> elements. Thus the query doesn&#8217;t return any matches.
<li>Perhaps surprisingly, this query does not find the <code>author</code> elements in this document. Why not? Because this is just a shortcut for <code>tree.getroot().findall('{http://www.w3.org/2005/Atom}author')</code>, which means &#8220;find all the <code>author</code> elements that are children of the root element.&#8221; The <code>author</code> elements are not children of the root element; they&#8217;re children of the <code>entry</code> elements. Thus the query doesn&#8217;t return any matches.
</ol>
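<p>The child-versus-descendant distinction can be reproduced on a tiny document. This is a minimal sketch: the inlined <code>SAMPLE</code> feed is hypothetical and stands in for <code>feed.xml</code>, and it uses the <code>.//</code> prefix, which is the form ElementTree accepts when you search from an element rather than from the tree.

```python
import xml.etree.ElementTree as etree

# Hypothetical miniature feed, inlined so the sketch runs without feed.xml.
SAMPLE = """<feed xmlns='http://www.w3.org/2005/Atom'>
  <title>dive into mark</title>
  <entry>
    <author><name>Mark</name></author>
    <title>First entry</title>
  </entry>
  <entry>
    <author><name>Mark</name></author>
    <title>Second entry</title>
  </entry>
</feed>"""

ATOM = '{http://www.w3.org/2005/Atom}'
root = etree.fromstring(SAMPLE)

# A bare tag name matches direct children of the root only.
entries = root.findall(ATOM + 'entry')
direct_authors = root.findall(ATOM + 'author')

# A leading './/' widens the search to descendants at any depth.
all_authors = root.findall('.//' + ATOM + 'author')

print(len(entries), len(direct_authors), len(all_authors))   # 2 0 2
```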
<p>There <em>is</em> a way to search for <em>descendant</em> elements, <i>i.e.</i> children, grandchildren, and any element at any nesting level.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>all_links = tree.findall("//{http://www.w3.org/2005/Atom}link")</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>all_links = tree.findall('//{http://www.w3.org/2005/Atom}link')</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>all_links</kbd>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}link at e181b0>,
&lt;Element {http://www.w3.org/2005/Atom}link at e2b570>,
@@ -400,7 +400,7 @@ mark{display:inline}
<pre class=screen>
# continuing from the previous example
<a><samp class=p>>>> </samp><kbd>it = tree.getiterator("{http://www.w3.org/2005/Atom}link")</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>it = tree.getiterator('{http://www.w3.org/2005/Atom}link')</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>next(it)</kbd> <span>&#x2461;</span></a>
&lt;Element {http://www.w3.org/2005/Atom}link at 122f1b0>
<samp class=p>>>> </samp><kbd>next(it)</kbd>
@@ -428,9 +428,9 @@ StopIteration</samp></pre>
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>from lxml import etree</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = etree.parse("examples/feed.xml")</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd> <span>&#x2462;</span></a>
<a><samp class=p>>>> </samp><kbd>root.findall("{http://www.w3.org/2005/Atom}entry")</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span>&#x2463;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b510>,
&lt;Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp></pre>
@@ -452,16 +452,16 @@ except ImportError:
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>import lxml.etree</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse("examples/feed.xml")</kbd>
<a><samp class=p>>>> </samp><kbd>tree.findall("//{http://www.w3.org/2005/Atom}*[@href]")</kbd> <span>&#x2461;</span></a>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed.xml')</kbd>
<a><samp class=p>>>> </samp><kbd>tree.findall('//{http://www.w3.org/2005/Atom}*[@href]')</kbd> <span>&#x2461;</span></a>
[&lt;Element {http://www.w3.org/2005/Atom}link at eeb8a0>,
&lt;Element {http://www.w3.org/2005/Atom}link at eeb990>,
&lt;Element {http://www.w3.org/2005/Atom}link at eeb960>,
&lt;Element {http://www.w3.org/2005/Atom}link at eeb9c0>]
<a><samp class=p>>>> </samp><kbd>tree.findall("//{http://www.w3.org/2005/Atom}*[@href='http://diveintomark.org/']")</kbd> <span>&#x2462;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}link at eeb930>]</samp>
<samp class=p>>>> </samp><kbd>NS = "{http://www.w3.org/2005/Atom}"</kbd>
<a><samp class=p>>>> </samp><kbd>tree.findall("//{NS}author[{NS}uri]".format(NS=NS))</kbd> <span>&#x2463;</span></a>
<samp class=p>>>> </samp><kbd>NS = '{http://www.w3.org/2005/Atom}'</kbd>
<a><samp class=p>>>> </samp><kbd>tree.findall('//{NS}author[{NS}uri]'.format(NS=NS))</kbd> <span>&#x2463;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}author at eeba80>,
&lt;Element {http://www.w3.org/2005/Atom}author at eebba0>]</samp></pre>
<ol>
@@ -475,18 +475,18 @@ except ImportError:
<pre class=screen>
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse("examples/feed.xml")</kbd>
<a><samp class=p>>>> </samp><kbd>NSMAP = {"atom": "http://www.w3.org/2005/Atom"}</kbd> <span>&#x2460;</span></a>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed.xml')</kbd>
<a><samp class=p>>>> </samp><kbd>NSMAP = {'atom': 'http://www.w3.org/2005/Atom'}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>entries = tree.xpath("//atom:category[@term='accessibility']/..",</kbd> <span>&#x2461;</span></a>
<samp class=p>... </samp><kbd> namespaces=NSMAP)</kbd>
<a><samp class=p>>>> </samp><kbd>entries</kbd> <span>&#x2462;</span></a>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}entry at e2b630>]</samp>
<samp class=p>>>> </samp><kbd>entry = entries[0]</kbd>
<a><samp class=p>>>> </samp><kbd>entry.xpath("./atom:title/text()", namespaces=NSMAP)</kbd> <span>&#x2463;</span></a>
<a><samp class=p>>>> </samp><kbd>entry.xpath('./atom:title/text()', namespaces=NSMAP)</kbd> <span>&#x2463;</span></a>
<samp>['Accessibility is a harsh mistress']</samp></pre>
<ol>
<li>To perform XPath queries on namespaced elements, you need to define a namespace prefix mapping. This is just a Python dictionary.
<li>Here is an XPath query. The XPath expression searches for <code>category</code> elements (in the Atom namespace) that contain a <code>term</code> attribute with the value <code>accessibility</code>. But that&#8217;s not actually the query result. Look at the very end of the query string; did you notice the <code>/..</code> bit? That means &#8220;and then return the parent element of the <code>category</code> element you just found.&#8221; So this single XPath query will find all entries with a child element of <code>&lt;category term="accessibility"></code>.
<li>Here is an XPath query. The XPath expression searches for <code>category</code> elements (in the Atom namespace) that contain a <code>term</code> attribute with the value <code>accessibility</code>. But that&#8217;s not actually the query result. Look at the very end of the query string; did you notice the <code>/..</code> bit? That means &#8220;and then return the parent element of the <code>category</code> element you just found.&#8221; So this single XPath query will find all entries with a child element of <code>&lt;category term='accessibility'></code>.
<li>The <code>xpath()</code> method returns a list of element objects. In this document, there is only one entry with a <code>category</code> whose <code>term</code> is <code>accessibility</code>.
<li>XPath expressions don&#8217;t always return a list of elements. Technically, the <abbr>DOM</abbr> of a parsed <abbr>XML</abbr> document doesn&#8217;t contain elements; it contains <i>nodes</i>. Depending on their type, nodes can be elements, attributes, or even text content. The result of an XPath query is a list of nodes. This query returns a list of text nodes: the text content (<code>text()</code>) of the <code>title</code> element (<code>atom:title</code>) that is a child of the current element (<code>./</code>).
</ol>
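<p>If <code>lxml</code> is not installed, a similar query can be approximated with the standard library. The sketch below swaps in <code>xml.etree.ElementTree</code>, which supports only a limited subset of XPath: the parent step (<code>/..</code>) and attribute predicates work, but <code>text()</code> does not, so it reads the title with <code>findtext()</code> instead. The inlined <code>SAMPLE</code> document is an assumption for illustration.

```python
import xml.etree.ElementTree as etree

# Hypothetical two-entry feed, inlined so the sketch runs standalone.
SAMPLE = """<feed xmlns='http://www.w3.org/2005/Atom'>
  <entry>
    <title>Accessibility is a harsh mistress</title>
    <category scheme='http://diveintomark.org' term='accessibility'/>
  </entry>
  <entry>
    <title>Dive into history, 2009 edition</title>
    <category scheme='http://diveintomark.org' term='diveintopython'/>
  </entry>
</feed>"""

# Prefix-to-namespace mapping, the same shape as the lxml example's NSMAP.
NSMAP = {'atom': 'http://www.w3.org/2005/Atom'}
root = etree.fromstring(SAMPLE)

# Find the matching category elements, then step up to their parent entries.
entries = root.findall(".//atom:category[@term='accessibility']/..", NSMAP)

# ElementTree has no text() node test; findtext() returns the child's text.
titles = [entry.findtext('atom:title', namespaces=NSMAP) for entry in entries]
print(titles)   # ['Accessibility is a harsh mistress']
```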
@@ -499,25 +499,25 @@ except ImportError:
<pre class=screen>
<samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd>
<a><samp class=p>>>> </samp><kbd>new_feed = etree.Element("{http://www.w3.org/2005/Atom}feed",</kbd> <span>&#x2460;</span></a>
<a><samp class=p>... </samp><kbd> attrib={"{http://www.w3.org/XML/1998/namespace}lang": "en"})</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>new_feed = etree.Element('{http://www.w3.org/2005/Atom}feed',</kbd> <span>&#x2460;</span></a>
<a><samp class=p>... </samp><kbd> attrib={'{http://www.w3.org/XML/1998/namespace}lang': 'en'})</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>print(etree.tostring(new_feed))</kbd> <span>&#x2462;</span></a>
<samp>&lt;ns0:feed xmlns:ns0="http://www.w3.org/2005/Atom" xml:lang="en"/></samp></pre>
<samp>&lt;ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
<ol>
<li>To create a new element, instantiate the <code>Element</code> class. You pass the element name (namespace + local name) as the first argument. This statement creates a <code>feed</code> element in the Atom namespace. This will be our new document&#8217;s root element.
<li>To add attributes to the newly created element, pass a dictionary of attribute names and values in the <var>attrib</var> argument. Note that the attribute name should be in the standard ElementTree format, <code>{<var>namespace</var>}<var>localname</var></code>.
<li>At any time, you can serialize any element (and its children) with the ElementTree <code>tostring()</code> function.
</ol>
<p>Was that serialization surprising to you? The way ElementTree serializes namespaced <abbr>XML</abbr> elements is technically accurate but not optimal. The sample <abbr>XML</abbr> document at the beginning of this chapter defined a <i>default namespace</i> (<code>xmlns="http://www.w3.org/2005/Atom"</code>). Defining a default namespace is useful for documents &mdash; like Atom feeds &mdash; where every element is in the same namespace, because you can declare the namespace once and declare each element with just its local name (<code>&lt;feed></code>, <code>&lt;link></code>, <code>&lt;entry></code>). There is no need to use any prefixes unless you want to declare elements from another namespace.
<p>Was that serialization surprising to you? The way ElementTree serializes namespaced <abbr>XML</abbr> elements is technically accurate but not optimal. The sample <abbr>XML</abbr> document at the beginning of this chapter defined a <i>default namespace</i> (<code>xmlns='http://www.w3.org/2005/Atom'</code>). Defining a default namespace is useful for documents &mdash; like Atom feeds &mdash; where every element is in the same namespace, because you can declare the namespace once and declare each element with just its local name (<code>&lt;feed></code>, <code>&lt;link></code>, <code>&lt;entry></code>). There is no need to use any prefixes unless you want to declare elements from another namespace.
<p>An <abbr>XML</abbr> parser won&#8217;t &#8220;see&#8221; any difference between an <abbr>XML</abbr> document with a default namespace and an <abbr>XML</abbr> document with a prefixed namespace. The resulting <abbr>DOM</abbr> of this serialization:
<pre class=nd><code>&lt;ns0:feed xmlns:ns0="http://www.w3.org/2005/Atom" xml:lang="en"/></code></pre>
<pre class=nd><code>&lt;ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'/></code></pre>
<p>is identical to the <abbr>DOM</abbr> of this serialization:
<pre class=nd><code>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"/></code></pre>
<pre class=nd><code>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'/></code></pre>
<p>The only practical difference is that the second serialization is several characters shorter. If we were to recast our entire sample feed with a <code>ns0:</code> prefix in every start and end tag, it would add 4 characters per start tag &times; 79 tags + 4 characters for the namespace declaration itself, for a total of 320 characters. Assuming <a href=strings.html#byte-arrays>UTF-8 encoding</a>, that&#8217;s 320 extra bytes. (After gzipping, the difference drops to 21 bytes, but still, 21 bytes is 21 bytes.) Maybe that doesn&#8217;t matter to you, but for something like an Atom feed, which may be downloaded several thousand times whenever it changes, saving a few bytes per request can quickly add up.
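<p>One way to check the claim that both serializations produce the same parsed result is to parse each one and compare. This is a small sketch using the standard library parser; the variable names are made up for illustration.

```python
import xml.etree.ElementTree as etree

prefixed = "<ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'/>"
default = "<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'/>"

a = etree.fromstring(prefixed)
b = etree.fromstring(default)

# After parsing, the prefix is gone: both tags expand to {namespace}localname.
same_tag = a.tag == b.tag
same_attrib = a.attrib == b.attrib

# For this single empty element the prefix costs 8 characters:
# 4 for 'ns0:' in the tag name and 4 for ':ns0' in the declaration.
extra_chars = len(prefixed) - len(default)

print(a.tag)                               # {http://www.w3.org/2005/Atom}feed
print(same_tag, same_attrib, extra_chars)  # True True 8
```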
@@ -525,13 +525,13 @@ except ImportError:
<pre class=screen>
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
<a><samp class=p>>>> </samp><kbd>NSMAP = {None: "http://www.w3.org/2005/Atom"}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>new_feed = lxml.etree.Element("feed", nsmap=NSMAP)</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>NSMAP = {None: 'http://www.w3.org/2005/Atom'}</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>new_feed = lxml.etree.Element('feed', nsmap=NSMAP)</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd> <span>&#x2462;</span></a>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom"/></samp>
<a><samp class=p>>>> </samp><kbd>new_feed.set("{http://www.w3.org/XML/1998/namespace}lang", "en")</kbd> <span>&#x2463;</span></a>
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom'/></samp>
<a><samp class=p>>>> </samp><kbd>new_feed.set('{http://www.w3.org/XML/1998/namespace}lang', 'en')</kbd> <span>&#x2463;</span></a>
<samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"/></samp></pre>
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
<ol>
<li>To start, define a namespace mapping as a dictionary. Dictionary values are namespaces; dictionary keys are the desired prefix. Using <code>None</code> as a prefix effectively declares a default namespace.
<li>Now you can pass the <code>lxml</code>-specific <var>nsmap</var> argument when you create an element, and <code>lxml</code> will respect the namespace prefixes you&#8217;ve defined.
@@ -542,16 +542,16 @@ except ImportError:
<p>Are <abbr>XML</abbr> documents limited to one element per document? No, of course not. You can easily create child elements, too.
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>title = lxml.etree.SubElement(new_feed, "title",</kbd> <span>&#x2460;</span></a>
<a><samp class=p>... </samp><kbd> attrib={"type":"html"})</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>title = lxml.etree.SubElement(new_feed, 'title',</kbd> <span>&#x2460;</span></a>
<a><samp class=p>... </samp><kbd> attrib={'type':'html'})</kbd> <span>&#x2461;</span></a>
<samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">&lt;title type="html"/>&lt;/feed></samp>
<a><samp class=p>>>> </samp><kbd>title.text = "dive into &amp;hellip;"</kbd> <span>&#x2462;</span></a>
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>&lt;title type='html'/>&lt;/feed></samp>
<a><samp class=p>>>> </samp><kbd>title.text = 'dive into &amp;hellip;'</kbd> <span>&#x2462;</span></a>
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd> <span>&#x2463;</span></a>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">&lt;title type="html">dive into &amp;amp;hellip;&lt;/title>&lt;/feed></samp>
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>&lt;title type='html'>dive into &amp;amp;hellip;&lt;/title>&lt;/feed></samp>
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed, pretty_print=True))</kbd> <span>&#x2464;</span></a>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
&lt;title type="html">dive into &amp;amp;hellip;&lt;/title>
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
&lt;title type='html'>dive into &amp;amp;hellip;&lt;/title>
&lt;/feed></samp></pre>
<ol>
<li>To create a child element of an existing element, instantiate the <code>SubElement</code> class. The only required arguments are the parent element (<var>new_feed</var> in this case) and the new element&#8217;s name. Since this child element will inherit the namespace mapping of its parent, there is no need to redeclare the namespace or prefix here.
@@ -574,8 +574,8 @@ except ImportError:
<p>Here is a fragment of a broken <abbr>XML</abbr> document. I&#8217;ve highlighted the wellformedness error.
<pre class=nd><code>&lt;?xml version="1.0" encoding="utf-8"?>
&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<pre class=nd><code>&lt;?xml version='1.0' encoding='utf-8'?>
&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
&lt;title>dive into <mark>&hellip;</mark>&lt;/title>
...
&lt;/feed></code></pre>
@@ -584,7 +584,7 @@ except ImportError:
<pre class=screen>
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse("examples/feed-broken.xml")</kbd>
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed-broken.xml')</kbd>
<samp class=traceback>Traceback (most recent call last):
File "&lt;stdin>", line 1, in &lt;module>
File "lxml.etree.pyx", line 2693, in lxml.etree.parse (src/lxml/lxml.etree.c:52591)
@@ -601,16 +601,16 @@ lxml.etree.XMLSyntaxError: Entity 'hellip' not defined, line 3, column 28</samp>
<pre class=screen>
<a><samp class=p>>>> </samp><kbd>parser = lxml.etree.XMLParser(recover=True)</kbd> <span>&#x2460;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = lxml.etree.parse("examples/feed-broken.xml", parser)</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed-broken.xml', parser)</kbd> <span>&#x2461;</span></a>
<a><samp class=p>>>> </samp><kbd>parser.error_log</kbd> <span>&#x2462;</span></a>
<samp>examples/feed-broken.xml:3:28:FATAL:PARSER:ERR_UNDECLARED_ENTITY: Entity 'hellip' not defined</samp>
<samp class=p>>>> </samp><kbd>tree.findall("{http://www.w3.org/2005/Atom}title")</kbd>
<samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}title')</kbd>
<samp>[&lt;Element {http://www.w3.org/2005/Atom}title at ead510>]</samp>
<samp class=p>>>> </samp><kbd>title = tree.findall("{http://www.w3.org/2005/Atom}title")[0]</kbd>
<samp class=p>>>> </samp><kbd>title = tree.findall('{http://www.w3.org/2005/Atom}title')[0]</kbd>
<a><samp class=p>>>> </samp><kbd>title.text</kbd> <span>&#x2463;</span></a>
<samp>'dive into '</samp>
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(tree.getroot()))</kbd> <span>&#x2464;</span></a>
<samp>&lt;feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<samp>&lt;feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
&lt;title>dive into &lt;/title>
.
. [rest of serialization snipped for brevity]
@@ -619,7 +619,7 @@ lxml.etree.XMLSyntaxError: Entity 'hellip' not defined, line 3, column 28</samp>
<li>To create a custom parser, instantiate the <code>lxml.etree.XMLParser</code> class. It can take <a href=http://codespeak.net/lxml/parsing.html#parser-options>a number of different named arguments</a>. The one we&#8217;re interested in here is the <var>recover</var> argument. When set to <code>True</code>, the <abbr>XML</abbr> parser will try its best to &#8220;recover&#8221; from wellformedness errors.
<li>To parse an <abbr>XML</abbr> document with your custom parser, pass the <var>parser</var> object as the second argument to the <code>parse()</code> function. Note that <code>lxml</code> does not raise an exception about the undefined <code>&amp;hellip;</code> entity.
<li>The parser keeps a log of the wellformedness errors that it has encountered. (This is actually true regardless of whether it is set to recover from those errors or not.)
<li>Since it didn&#8217;t know what to do with the undefined <code>&amp;hellip;</code> entity, the parser just silently dropped it. The text content of the <code>title</code> element becomes <code>"dive into "</code>.
<li>Since it didn&#8217;t know what to do with the undefined <code>&amp;hellip;</code> entity, the parser just silently dropped it. The text content of the <code>title</code> element becomes <code>'dive into '</code>.
<li>As you can see from the serialization, the <code>&amp;hellip;</code> entity didn&#8217;t get moved; it was simply dropped.
</ol>
@@ -640,6 +640,7 @@ lxml.etree.XMLSyntaxError: Entity 'hellip' not defined, line 3, column 28</samp>
<li><a href=http://codespeak.net/lxml/1.3/xpathxslt.html>XPath and <abbr>XSLT</abbr> with <code>lxml</code></a>
</ul>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next class=todo><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>
+17 -17
View File
@@ -29,7 +29,7 @@ th{text-align:left}
1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
"""Convert a file size to human-readable form.
'''Convert a file size to human-readable form.
Keyword arguments:
size -- file size in bytes
@@ -38,7 +38,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
Returns: string
"""
'''
if size &lt; 0:
raise ValueError('number must be non-negative')
@@ -46,11 +46,11 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
for suffix in SUFFIXES[multiple]:
size /= multiple
if size &lt; multiple:
return "{0:.1f} {1}".format(size, suffix)
return '{0:.1f} {1}'.format(size, suffix)
raise ValueError('number too large')
if __name__ == "__main__":
if __name__ == '__main__':
print(approximate_size(1000000000000, False))
print(approximate_size(1000000000000))</code></pre>
<p>Now let&#8217;s run this program on the command line. On Windows, it will look something like this:
@@ -82,7 +82,7 @@ if __name__ == "__main__":
<p><span>&#x261E;</span>In some languages, functions (that return a value) start with <code>function</code>, and subroutines (that do not return a value) start with <code>sub</code>. There are no subroutines in Python. Everything is a function, all functions return a value (even if it&#8217;s <code>None</code>), and all functions start with <code>def</code>.
</blockquote>
<p>The <code>approximate_size()</code> function takes two arguments &mdash; <var>size</var> and <var>a_kilobyte_is_1024_bytes</var> &mdash; but neither argument specifies a datatype. In Python, variables are never explicitly typed. Python figures out what type a variable is and keeps track of it internally.
<blockquote class="note compare java">
<blockquote class='note compare java'>
<p><span>&#x261E;</span>In Java and other statically-typed languages, you must specify the datatype of the function return value and each function argument. In Python, you never explicitly specify the datatype of anything. Based on what value you assign, Python keeps track of the datatype internally.
</blockquote>
@@ -99,7 +99,7 @@ if __name__ == "__main__":
<p>Now look at the bottom of the script:
<pre><code>
if __name__ == "__main__":
if __name__ == '__main__':
<a> print(approximate_size(1000000000000, False)) <span>&#x2460;</span></a>
<a> print(approximate_size(1000000000000)) <span>&#x2461;</span></a></code></pre>
<ol>
@@ -138,7 +138,7 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
<h3 id=docstrings>Documentation Strings</h3>
<p>You can document a Python function by giving it a documentation string (<code>docstring</code> for short). In this program, the <code>approximate_size()</code> function has a <code>docstring</code>:
<pre><code>def approximate_size(size, a_kilobyte_is_1024_bytes=True):
"""Convert a file size to human-readable form.
'''Convert a file size to human-readable form.
Keyword arguments:
size -- file size in bytes
@@ -147,10 +147,10 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
Returns: string
"""</code></pre>
'''</code></pre>
<aside>Every function deserves a decent docstring.</aside>
<p>Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including carriage returns, leading white space, and other quote characters. You can use them anywhere, but you&#8217;ll see them most often used when defining a <code>docstring</code>.
<blockquote class="note compare perl5">
<blockquote class='note compare perl5'>
<p><span>&#x261E;</span>Triple quotes are also an easy way to define a string with both single and double quotes, like <code>qq/.../</code> in Perl 5.
</blockquote>
<p>Everything between the triple quotes is the function&#8217;s <code>docstring</code>, which documents what the function does. A <code>docstring</code>, if it exists, must be the first thing defined in a function (that is, on the next line after the function declaration). You don&#8217;t technically need to give your function a <code>docstring</code>, but you always should. I know you&#8217;ve heard this in every programming class you&#8217;ve ever taken, but Python gives you an added incentive: the <code>docstring</code> is available at runtime as an attribute of the function.
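<p>As a quick illustration of the docstring being available at runtime, here is a minimal sketch with two throwaway functions (both names are hypothetical). A function without a <code>docstring</code> still has the attribute; its value is simply <code>None</code>.

```python
def with_docstring():
    '''Return the answer.'''
    return 42


def without_docstring():
    # A comment is not a docstring; only a string literal placed first counts.
    return 42


# The docstring is stored on the function object itself.
print(with_docstring.__doc__)      # Return the answer.
print(without_docstring.__doc__)   # None
```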
@@ -182,7 +182,7 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
<li>When you want to use functions defined in imported modules, you need to include the module name. So you can&#8217;t just say <code>approximate_size</code>; it must be <code>humansize.approximate_size</code>. If you&#8217;ve used classes in Java, this should feel vaguely familiar.
<li>Instead of calling the function as you would expect to, you asked for one of the function&#8217;s attributes, <code>__doc__</code>.
</ol>
<blockquote class="note compare perl5">
<blockquote class='note compare perl5'>
<p><span>&#x261E;</span><code>import</code> in Python is like <code>require</code> in Perl. Once you <code>import</code> a Python module, you access its functions with <code><var>module</var>.<var>function</var></code>; once you <code>require</code> a Perl module, you access its functions with <code><var>module</var>::<var>function</var></code>.
</blockquote>
<h3 id=importsearchpath>The <code>import</code> Search Path</h3>
@@ -212,7 +212,7 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
<ol>
<li>Importing the <code>sys</code> module makes all of its functions and attributes available.
<li><code>sys.path</code> is a list of directory names that constitute the current search path. (Yours will look different, depending on your operating system, what version of Python you&#8217;re running, and where it was originally installed.) Python will look through these directories (in this order) for a <code>.py</code> file whose name matches what you&#8217;re trying to import.
<li>Actually, I lied; the truth is more complicated than that, because not all modules are stored as <code>.py</code> files. Some, like the <code>sys</code> module, are "built-in modules"; they are actually baked right into Python itself. Built-in modules behave just like regular modules, but their Python source code is not available, because they are not written in Python! (The <code>sys</code> module is written in <abbr>C</abbr>.)
<li>Actually, I lied; the truth is more complicated than that, because not all modules are stored as <code>.py</code> files. Some, like the <code>sys</code> module, are <i>built-in modules</i>; they are actually baked right into Python itself. Built-in modules behave just like regular modules, but their Python source code is not available, because they are not written in Python! (The <code>sys</code> module is written in <abbr>C</abbr>.)
<li>You can add a new directory to Python&#8217;s search path at runtime by adding the directory name to <code>sys.path</code>, and then Python will look in that directory as well, whenever you try to import a module. The effect lasts as long as Python is running.
<li>By using <code>sys.path.insert(0, <var>new_path</var>)</code>, you inserted a new directory as the first item of the <code>sys.path</code> list, and therefore at the beginning of Python&#8217;s search path. This is almost always what you want. In case of naming conflicts (for example, if Python ships with version 2 of a particular library but you want to use version 3), this ensures that your modules will be found and used instead of the modules that came with Python.
</ol>
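<p>As a minimal sketch of the last two points, the following adds a directory to the front of the search path. The directory name is hypothetical and used only for illustration; substitute a directory that holds your own modules.

```python
import sys

# Hypothetical directory name, used only for illustration.
new_path = '/home/you/diveintopython3/examples'

# Insert at position 0 so this directory is searched before all others.
sys.path.insert(0, new_path)

# The new directory is now the first entry in the search path.
print(sys.path[0])
```

<p>The change affects only the running Python process; the next time you start Python, <code>sys.path</code> is back to its default.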
@@ -234,18 +234,18 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
<a> for suffix in SUFFIXES[multiple]: <span>&#x2464;</span></a>
size /= multiple
if size &lt; multiple:
return "{0:.1f} {1}".format(size, suffix)
return '{0:.1f} {1}'.format(size, suffix)
raise ValueError('number too large')</code></pre>
<ol>
<li>Code blocks are defined by their indentation. By "code block," I mean functions, <code>if</code> statements, <code>for</code> loops, <code>while</code> loops, and so forth. Indenting starts a block and unindenting ends it. There are no explicit braces, brackets, or keywords. This means that whitespace is significant, and must be consistent. In this example, the function code is indented four spaces. It doesn&#8217;t need to be four spaces, it just needs to be consistent. The first line that is not indented marks the end of the function.
<li>Code blocks are defined by their indentation. By &#8220;code block,&#8221; I mean functions, <code>if</code> statements, <code>for</code> loops, <code>while</code> loops, and so forth. Indenting starts a block and unindenting ends it. There are no explicit braces, brackets, or keywords. This means that whitespace is significant and must be consistent. In this example, the function code is indented four spaces. It doesn&#8217;t need to be four spaces; it just needs to be consistent. The first line that is not indented marks the end of the function.
<li>In Python, an <code>if</code> statement is followed by a code block. If the <code>if</code> expression evaluates to true, the indented block is executed; otherwise, execution falls through to the <code>else</code> block (if any). (Note the lack of parentheses around the expression.)
<li>This line is inside the <code>if</code> code block. This <code>raise</code> statement will raise an exception (of type <code>ValueError</code>), but only if <code>size &lt; 0</code>.
<li>This is <em>not</em> the end of the function. Completely blank lines don&#8217;t count. They can make the code more readable, but they don&#8217;t count as code block delimiters. The function continues on the next line.
<li>The <code>for</code> loop also marks the start of a code block. Code blocks can contain multiple lines, as long as they are all indented the same amount. This <code>for</code> loop has three lines of code in it. There is no other special syntax for multi-line code blocks. Just indent and get on with your life.
</ol>
<p>After some initial protests and several snide analogies to Fortran, you will make peace with this and start seeing its benefits. One major benefit is that all Python programs look similar, since indentation is a language requirement and not a matter of style. This makes it easier to read and understand other people&#8217;s Python code.
<blockquote class="note compare java">
<blockquote class='note compare java'>
<p><span>&#x261E;</span>Python uses line breaks to separate statements and a colon and indentation to separate code blocks. <abbr>C++</abbr> and Java use semicolons to separate statements and curly braces to separate code blocks.
</blockquote>
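<p>To see the &#8220;consistent, not necessarily four spaces&#8221; rule in action, here is a small sketch that compiles two snippets of source code held in strings. The function name <code>double</code> and both snippets are made up for this illustration. The first snippet uses unusual but consistent indentation; the second unindents to a level that matches no enclosing block.

```python
# Each block is indented consistently, though not always by four spaces.
consistent = '''
def double(n):
  if n < 0:
        raise ValueError('number must be non-negative')
  return n * 2
'''

# The last line unindents to a level that matches no enclosing block.
inconsistent = '''
def double(n):
    result = n * 2
  return result
'''

# The consistent snippet compiles and runs normally.
namespace = {}
exec(compile(consistent, '<consistent>', 'exec'), namespace)
print(namespace['double'](21))

# The inconsistent snippet is rejected before any of it runs.
try:
    compile(inconsistent, '<inconsistent>', 'exec')
    error_name = None
except IndentationError as e:
    error_name = type(e).__name__
print(error_name)
```

<p>Python rejects the second snippet at compile time, so a function with mismatched indentation never gets the chance to run.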
<p class=a>&#x2042;
@@ -254,10 +254,10 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
<aside>Everything in Python is an object.</aside>
<p>Python modules are objects and have several useful attributes. You can use this to easily test your modules as you write them, by including a special block of code that executes when you run the Python file on the command line. Take the last few lines of <code>humansize.py</code>:
<pre><code>
if __name__ == "__main__":
if __name__ == '__main__':
print(approximate_size(1000000000000, False))
print(approximate_size(1000000000000))</code></pre>
<blockquote class="note compare clang">
<blockquote class='note compare clang'>
<p><span>&#x261E;</span>Like <abbr>C</abbr>, Python uses <code>==</code> for comparison and <code>=</code> for assignment. Unlike <abbr>C</abbr>, Python does not allow <code>=</code> assignment inside an expression, so there&#8217;s no chance of accidentally assigning the value you thought you were comparing.
</blockquote>
<p>So what makes this <code>if</code> statement special? Well, modules are objects, and all modules have a built-in attribute <code>__name__</code>. A module&#8217;s <code>__name__</code> depends on how you&#8217;re using the module. If you <code>import</code> the module, then <code>__name__</code> is the module&#8217;s filename, without a directory path or file extension.
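<p>A quick way to watch <code>__name__</code> change is to compare an imported module with the file doing the importing. This sketch uses the standard library&#8217;s <code>string</code> module purely as a convenient stand-in; any module you import behaves the same way.

```python
import string

# An imported module reports its own name, with no path or extension.
print(string.__name__)

# The file you run directly reports a special name instead
# (expected to be '__main__' when run from the command line).
print(__name__)

if __name__ == '__main__':
    print('running as a standalone program')
```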
@@ -280,7 +280,7 @@ if __name__ == "__main__":
<li><a href=http://www.python.org/dev/peps/pep-0008/>PEP 8: Style Guide for Python Code</a> discusses good indentation style.
<li><a href=http://docs.python.org/3.0/reference/><cite>Python Reference Manual</cite></a> explains what it means to say that <a href=http://docs.python.org/3.0/reference/datamodel.html#objects-values-and-types>everything in Python is an object</a>, because some people are <a href=http://www.douglasadams.com/dna/pedants.html>pedants</a> and like to discuss that sort of thing at great length.
</ul>
<p class=nav><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next href=native-datatypes.html title="onward to &#8220;Native Datatypes&#8221;"><span>&#x261E;</span></a>
<p class=v><a rel=prev class=todo><span>&#x261C;</span></a> <a rel=next href=native-datatypes.html title='onward to &#8220;Native Datatypes&#8221;'><span>&#x261E;</span></a>
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
<script src=j/jquery.js></script>
<script src=j/dip3.js></script>