mirror of
https://github.com/kennethreitz/dive-into-python3.git
synced 2026-06-05 15:00:18 +00:00
quoting attribute values is a hard habit to break
+2
-2
@@ -12,7 +12,7 @@ h1:before{content:""}
|
||||
<p>You are here: <a href=index.html>Home</a> <span>‣</span> <a href=table-of-contents.html>Dive Into Python 3</a> <span>‣</span>
|
||||
<h1>About the book</h1>
|
||||
<p>The text of <cite>Dive Into Python 3</cite> is licensed under the <a href=http://creativecommons.org/licenses/by-sa/3.0/ rel=license>Creative Commons Attribution-ShareAlike 3.0 Unported License</a>.
|
||||
<p>The <code>chardet</code> library referenced in <a href=case-study-porting-chardet-to-python-3.html>Case study: porting <code>chardet</code> to Python 3</a> is licensed under the LGPL 2.1 or later. The alphametics solver referenced in <a href=advanced-iterators.html>Advanced Iterators</a> is based on <a href="http://code.activestate.com/recipes/576615/">Raymond Hettinger's solver for Python 2</a>, which he has graciously relicensed under the MIT license so I could port it to Python 3. <a href=advanced-classes.html>Advanced Classes</a> and <a href=special-method-names.html>Special Method Names</a> contain snippets of code from the Python standard library which are released under the Python Software Foundation License version 2. All other example code is my original work and is licensed under the MIT license. Full licensing terms are included in each source code file.
|
||||
<p>The <code>chardet</code> library referenced in <a href=case-study-porting-chardet-to-python-3.html>Case study: porting <code>chardet</code> to Python 3</a> is licensed under the LGPL 2.1 or later. The alphametics solver referenced in <a href=advanced-iterators.html>Advanced Iterators</a> is based on <a href=http://code.activestate.com/recipes/576615/>Raymond Hettinger's solver for Python 2</a>, which he has graciously relicensed under the MIT license so I could port it to Python 3. <a href=advanced-classes.html>Advanced Classes</a> and <a href=special-method-names.html>Special Method Names</a> contain snippets of code from the Python standard library which are released under the Python Software Foundation License version 2. All other example code is my original work and is licensed under the MIT license. Full licensing terms are included in each source code file.
|
||||
<p>The dynamic highlighting effects in the online edition are built on top of <a href=http://jquery.com>jQuery</a>, which is dual-licensed under the MIT and GPL licenses.
|
||||
<p>The online edition loads as quickly as it does because
|
||||
<ol>
|
||||
@@ -22,7 +22,7 @@ h1:before{content:""}
|
||||
<li>The text uses <a href=http://www.alanwood.net/unicode/unicode_samples.html>Unicode characters</a> in place of graphics wherever possible.
|
||||
<li>The entire book was <a href=http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition>lovingly hand-authored in HTML 5</a> to avoid markup cruft.
|
||||
</ol>
|
||||
<p>Send corrections and feedback to <a href="mailto:mark@diveintomark.org">mark@diveintomark.org</a>.
|
||||
<p>Send corrections and feedback to <a href=mailto:mark@diveintomark.org>mark@diveintomark.org</a>.
|
||||
<p class=c>© 2001–9 Mark Pilgrim
|
||||
<script src=jquery.js></script>
|
||||
<script src=dip3.js></script>
|
||||
|
||||
@@ -184,7 +184,7 @@ gen = ord_map(unique_characters)</code></pre>
|
||||
|
||||
<h2 id=permutations>Calculating Permutations… The Lazy Way!</h2>
|
||||
|
||||
<p>First of all, what the heck are permutations? Permutations are a mathematical concept. (There are actually several definitions, depending on what kind of math you’re doing. Here I’m talking about combinatorics, but if that doesn’t mean anything to you, don’t worry about it. As always, <a href="http://en.wikipedia.org/wiki/Permutation">Wikipedia is your friend</a>.)
|
||||
<p>First of all, what the heck are permutations? Permutations are a mathematical concept. (There are actually several definitions, depending on what kind of math you’re doing. Here I’m talking about combinatorics, but if that doesn’t mean anything to you, don’t worry about it. As always, <a href=http://en.wikipedia.org/wiki/Permutation>Wikipedia is your friend</a>.)
|
||||
|
||||
<p>The idea is that you take a list of things (could be numbers, could be letters, could be dancing bears) and find all the possible ways to split them up into smaller lists. All the smaller lists have the same size, which can be as small as 1 and as large as the total number of items. Oh, and nothing can be repeated. Mathematicians say things like “let’s find the permutations of 3 different items taken 2 at a time,” which means you have a sequence of 3 items and you want to find all the possible ordered pairs.
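<p>As a minimal sketch of “3 different items taken 2 at a time,” the standard library’s <code>itertools.permutations()</code> function can list the ordered pairs. The variable names <code>items</code> and <code>pairs</code> are placeholders chosen for this illustration only.

```python
from itertools import permutations

# Three different items (placeholder values for illustration).
items = [1, 2, 3]

# Take them 2 at a time: every ordered pair, with no item repeated.
pairs = list(permutations(items, 2))

print(pairs)
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
print(len(pairs))  # 6
```

<p>Order matters here, so <code>(1, 2)</code> and <code>(2, 1)</code> count as two different permutations.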
|
||||
|
||||
@@ -559,11 +559,11 @@ NameError: name '__import__' is not defined</samp></pre>
|
||||
<h2 id=furtherreading>Further Reading</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="http://blip.tv/file/1947373/">Watch Raymond Hettinger’s "Easy AI with Python" talk</a> at PyCon 2009
|
||||
<li><a href="http://code.activestate.com/recipes/576615/">Recipe 576615: Alphametics solver</a>, Raymond Hettinger’s original alphametics solver for Python 2
|
||||
<li><a href="http://code.activestate.com/recipes/users/178123/">More of Raymond Hettinger’s recipes</a> in the ActiveState Code repository
|
||||
<li><a href="http://en.wikipedia.org/wiki/Verbal_arithmetic">Alphametics on Wikipedia</a>
|
||||
<li><a href="http://www.tkcs-collins.com/truman/alphamet/index.shtml">Alphametics Index</a>, including <a href="http://www.tkcs-collins.com/truman/alphamet/alphamet.shtml">lots of puzzles</a> and <a href="http://www.tkcs-collins.com/truman/alphamet/alpha_gen.shtml">a generator to make your own</a>
|
||||
<li><a href=http://blip.tv/file/1947373/>Watch Raymond Hettinger’s “Easy AI with Python” talk</a> at PyCon 2009
|
||||
<li><a href=http://code.activestate.com/recipes/576615/>Recipe 576615: Alphametics solver</a>, Raymond Hettinger’s original alphametics solver for Python 2
|
||||
<li><a href=http://code.activestate.com/recipes/users/178123/>More of Raymond Hettinger’s recipes</a> in the ActiveState Code repository
|
||||
<li><a href=http://en.wikipedia.org/wiki/Verbal_arithmetic>Alphametics on Wikipedia</a>
|
||||
<li><a href=http://www.tkcs-collins.com/truman/alphamet/index.shtml>Alphametics Index</a>, including <a href=http://www.tkcs-collins.com/truman/alphamet/alphamet.shtml>lots of puzzles</a> and <a href=http://www.tkcs-collins.com/truman/alphamet/alpha_gen.shtml>a generator to make your own</a>
|
||||
</ul>
|
||||
|
||||
<p>Many, many thanks to Raymond Hettinger for agreeing to relicense his code so I could port it to Python 3 and use it as the basis for this chapter.
|
||||
|
||||
@@ -20,7 +20,7 @@ $(document).ready(function() {
|
||||
$(this).next("pre").find("div.w").append(" " + $(this).html());
|
||||
this.parentNode.removeChild(this);
|
||||
});
|
||||
|
||||
|
||||
/* create skip links */
|
||||
var postelm = $(this).next().get(0);
|
||||
var postid = postelm.id || ("postautopre" + i);
|
||||
|
||||
+1
-1
@@ -9,7 +9,7 @@ out = open(output_file, 'w', encoding="utf-8") # encoding argument! important!
|
||||
for line in open(input_file).readlines():
|
||||
# replace entities with Unicode characters
|
||||
for e in re.findall('&(.+?);', line):
|
||||
if e in ('lt', 'gt', 'amp'):
|
||||
if e in ('lt', 'gt', 'amp', 'nbsp'):
|
||||
continue
|
||||
n = html.entities.name2codepoint.get(e)
|
||||
if not n:
|
||||
|
||||
@@ -150,7 +150,7 @@ body{counter-reset:h1 2}
|
||||
</ol>
|
||||
<h3 id=numbers-in-a-boolean-context>Numbers In A Boolean Context</h3>
|
||||
<aside>Zero values are false, and non-zero values are true.</aside>
|
||||
<p>You can use numbers <a href="#booleans">in a boolean context</a>, such as an <code>if</code> statement. Zero values are false, and non-zero values are true.
|
||||
<p>You can use numbers <a href=#booleans>in a boolean context</a>, such as an <code>if</code> statement. Zero values are false, and non-zero values are true.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd> <span>①</span></a>
|
||||
<samp class=p>... </samp><kbd> if anything:</kbd>
|
||||
@@ -452,10 +452,10 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
|
||||
<samp>yes, it's true</samp></pre>
|
||||
<h2 id=furtherreading>Further Reading</h2>
|
||||
<ul>
|
||||
<li><a href="http://docs.python.org/3.0/library/fractions.html">The <code>fractions</code> module</a>
|
||||
<li><a href="http://docs.python.org/3.0/library/math.html">The <code>math</code> module</a>
|
||||
<li><a href="http://www.python.org/dev/peps/pep-0237/"><abbr>PEP</abbr> 237: Unifying Long Integers and Integers</a>
|
||||
<li><a href="http://www.python.org/dev/peps/pep-0238/"><abbr>PEP</abbr> 238: Changing the Division Operator</a>
|
||||
<li><a href=http://docs.python.org/3.0/library/fractions.html>The <code>fractions</code> module</a>
|
||||
<li><a href=http://docs.python.org/3.0/library/math.html>The <code>math</code> module</a>
|
||||
<li><a href=http://www.python.org/dev/peps/pep-0237/><abbr>PEP</abbr> 237: Unifying Long Integers and Integers</a>
|
||||
<li><a href=http://www.python.org/dev/peps/pep-0238/><abbr>PEP</abbr> 238: Changing the Division Operator</a>
|
||||
</ul>
|
||||
<p class=nav><a rel=prev href=your-first-python-program.html title="back to “Your First Python Program”"><span>☜</span></a> <a rel=next href=strings.html title="onward to “Strings”"><span>☞</span></a>
|
||||
<p class=c>© 2001–9 <a href=about.html>Mark Pilgrim</a>
|
||||
|
||||
@@ -295,7 +295,7 @@ body{counter-reset:h1 4}
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212-1234')</kbd> <span>③</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>Always read regular expressions from left to right. This one matches the beginning of the string, and then <code>(\d{3})</code>. What’s <code>\d{3}</code>? Well, the <code>{3}</code> means “match exactly three numeric digits”; it’s a variation on the <a href="#re.nm" title="7.4. Using the {n,m} Syntax"><code>{n,m} syntax</code></a> you saw earlier. <code>\d</code> means “any numeric digit” (<code>0</code> through <code>9</code>). Putting it in parentheses means “match exactly three numeric digits, <em>and then remember them as a group that I can ask for later</em>”. Then match a literal hyphen. Then match another group of exactly three digits. Then another literal hyphen. Then another group of exactly four digits. Then match the end of the string.
|
||||
<li>Always read regular expressions from left to right. This one matches the beginning of the string, and then <code>(\d{3})</code>. What’s <code>\d{3}</code>? Well, the <code>{3}</code> means “match exactly three numeric digits”; it’s a variation on the <a href=#nmsyntax><code>{n,m} syntax</code></a> you saw earlier. <code>\d</code> means “any numeric digit” (<code>0</code> through <code>9</code>). Putting it in parentheses means “match exactly three numeric digits, <em>and then remember them as a group that I can ask for later</em>”. Then match a literal hyphen. Then match another group of exactly three digits. Then another literal hyphen. Then another group of exactly four digits. Then match the end of the string.
|
||||
<li>To get access to the groups that the regular expression parser remembered along the way, use the <code>groups()</code> method on the object that the <code>search()</code> method returns. It will return a tuple of however many groups were defined in the regular expression. In this case, you defined three groups, one with three digits, one with three digits, and one with four digits.
|
||||
<li>This regular expression is not the final answer, because it doesn’t handle a phone number with an extension on the end. For that, you’ll need to expand the regular expression.
|
||||
</ol>
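<p>The three notes above can be tried out in a short script. This is only a sketch: the pattern string is an assumption, reconstructed from the description in the first note (start of string, three digits, hyphen, three digits, hyphen, four digits, end of string), and the name <code>phone_pattern</code> is a placeholder.

```python
import re

# Assumed pattern, rebuilt from the prose description of the three groups.
phone_pattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$')

# search() returns a match object; groups() returns the remembered groups.
match = phone_pattern.search('800-555-1212')
print(match.groups())  # ('800', '555', '1212')

# With an extension on the end, the $ anchor fails and search() returns None.
print(phone_pattern.search('800-555-1212-1234'))  # None
```

<p>Because a failed search returns <code>None</code>, calling <code>groups()</code> directly on the result of a non-matching search would raise an <code>AttributeError</code>, so check the result first.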
|
||||
|
||||
@@ -218,7 +218,7 @@ AttributeError</samp></pre>
|
||||
<td><a href=http://www.python.org/doc/3.0/reference/datamodel.html#object.__call__><code>my_instance.__call__()</code></a>
|
||||
</table>
|
||||
|
||||
<p>The <a href="http://docs.python.org/3.0/library/zipfile.html"><code>zipfile</code> module</a> uses this to define a class that can decrypt an encrypted zip file with a given password. The zip decryption algorithm requires you to store state during decryption. Defining the decryptor as a class allows you to maintain this state within a single instance of the decryptor class. The state is initialized in the <code>__init__()</code> method and updated as the file is decrypted. But since the class is also “callable” like a function, you can pass the instance as the first argument of the <code>map()</code> function, like so:
|
||||
<p>The <a href=http://docs.python.org/3.0/library/zipfile.html><code>zipfile</code> module</a> uses this to define a class that can decrypt an encrypted zip file with a given password. The zip decryption algorithm requires you to store state during decryption. Defining the decryptor as a class allows you to maintain this state within a single instance of the decryptor class. The state is initialized in the <code>__init__()</code> method and updated as the file is decrypted. But since the class is also “callable” like a function, you can pass the instance as the first argument of the <code>map()</code> function, like so:
|
||||
|
||||
<pre><code>
|
||||
# excerpt from zipfile.py
|
||||
|
||||
+16
-16
@@ -19,7 +19,7 @@ My alphabet starts where your alphabet ends! <span>❞</span><br>— Dr
|
||||
</blockquote>
|
||||
<p id=toc>
|
||||
<h2 id=boring-stuff>Some Boring Stuff You Need To Understand Before You Can Dive In</h2>
|
||||
<p class=f>Did you know that the people of <a href="http://en.wikipedia.org/wiki/Bougainville_Province">Bougainville</a> have the smallest alphabet in the world? Their <a href="http://en.wikipedia.org/wiki/Rotokas_alphabet">Rotokas alphabet</a> is composed of only 12 letters: A, E, G, I, K, O, P, R, S, T, U, and V. On the other end of the spectrum, languages like Chinese, Japanese, and Korean have thousands of characters. English, of course, has 26 letters — 52 if you count uppercase and lowercase separately — plus a handful of <i class=baa>!@#$%&</i> punctuation marks.
|
||||
<p class=f>Did you know that the people of <a href=http://en.wikipedia.org/wiki/Bougainville_Province>Bougainville</a> have the smallest alphabet in the world? Their <a href=http://en.wikipedia.org/wiki/Rotokas_alphabet>Rotokas alphabet</a> is composed of only 12 letters: A, E, G, I, K, O, P, R, S, T, U, and V. On the other end of the spectrum, languages like Chinese, Japanese, and Korean have thousands of characters. English, of course, has 26 letters — 52 if you count uppercase and lowercase separately — plus a handful of <i class=baa>!@#$%&</i> punctuation marks.
|
||||
|
||||
<p>When people talk about “text,” they’re thinking of “characters and symbols on the computer screen.” But computers don’t deal in characters and symbols; they deal in bits and bytes. Every piece of text you’ve ever seen on a computer screen is actually stored in a particular <i>character encoding</i>. Very roughly speaking, the character encoding provides a mapping between the stuff you see on your screen and the stuff your computer actually stores in memory and on disk. There are many different character encodings, some optimized for particular languages like Russian or Chinese or English, and others that can be used for multiple languages.
|
||||
|
||||
@@ -209,7 +209,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<samp class=p>>>> </samp><kbd>"{0:.1f} {1}".format(698.25, 'GB')</kbd>
|
||||
<samp>'698.3 GB'</samp></pre>
|
||||
|
||||
<p>For all the gory details on format specifiers, consult the <a href="http://docs.python.org/3.0/library/string.html#format-specification-mini-language">Format Specification Mini-Language</a> in the official Python documentation.
|
||||
<p>For all the gory details on format specifiers, consult the <a href=http://docs.python.org/3.0/library/string.html#format-specification-mini-language>Format Specification Mini-Language</a> in the official Python documentation.
|
||||
|
||||
<h2 id=common-string-methods>Other Common String Methods</h2>
|
||||
|
||||
@@ -373,7 +373,7 @@ FIXME: move this to the intro of the upcoming files chapter?
|
||||
<p>Python 3 assumes that your source code — <i>i.e.</i> each <code>.py</code> file — is encoded in UTF-8.
|
||||
|
||||
<blockquote class="note compare python2">
|
||||
<p><span>☞</span>In Python 2, the default encoding for <code>.py</code> files was <abbr>ASCII</abbr>. In Python 3, <a href="http://www.python.org/dev/peps/pep-3120/">the default encoding is UTF-8</a>.
|
||||
<p><span>☞</span>In Python 2, the default encoding for <code>.py</code> files was <abbr>ASCII</abbr>. In Python 3, <a href=http://www.python.org/dev/peps/pep-3120/>the default encoding is UTF-8</a>.
|
||||
</blockquote>
|
||||
|
||||
<p>If you would like to use a different encoding within your Python code, you can put an encoding declaration on the first line of each file. This declaration defines a <code>.py</code> file to be windows-1252:
|
||||
@@ -385,40 +385,40 @@ FIXME: move this to the intro of the upcoming files chapter?
|
||||
<pre><code>#!/usr/bin/python3
|
||||
# -*- coding: windows-1252 -*-</code></pre>
|
||||
|
||||
<p>For more information, consult <a href="http://www.python.org/dev/peps/pep-0263/"><abbr>PEP</abbr> 263: Defining Python Source Code Encodings</a>.
|
||||
<p>For more information, consult <a href=http://www.python.org/dev/peps/pep-0263/><abbr>PEP</abbr> 263: Defining Python Source Code Encodings</a>.
|
||||
|
||||
<h2 id=furtherreading>Further Reading</h2>
|
||||
|
||||
<p>On Unicode in Python:
|
||||
|
||||
<ul>
|
||||
<li><a href="http://docs.python.org/3.0/howto/unicode.html">Python Unicode HOWTO</a>
|
||||
<li><a href="http://docs.python.org/3.0/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit">What’s New In Python 3: Text vs. Data Instead Of Unicode vs. 8-bit</a>
|
||||
<li><a href=http://docs.python.org/3.0/howto/unicode.html>Python Unicode HOWTO</a>
|
||||
<li><a href=http://docs.python.org/3.0/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit>What’s New In Python 3: Text vs. Data Instead Of Unicode vs. 8-bit</a>
|
||||
</ul>
|
||||
|
||||
<p>On Unicode in general:
|
||||
|
||||
<ul>
|
||||
<li><a href="http://www.joelonsoftware.com/articles/Unicode.html">The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a>
|
||||
<li><a href="http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode">On the Goodness of Unicode</a>
|
||||
<li><a href="http://www.tbray.org/ongoing/When/200x/2003/04/13/Strings">On Character Strings</a>
|
||||
<li><a href="http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF">Characters vs. Bytes</a>
|
||||
<li><a href=http://www.joelonsoftware.com/articles/Unicode.html>The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)</a>
|
||||
<li><a href=http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode>On the Goodness of Unicode</a>
|
||||
<li><a href=http://www.tbray.org/ongoing/When/200x/2003/04/13/Strings>On Character Strings</a>
|
||||
<li><a href=http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF>Characters vs. Bytes</a>
|
||||
</ul>
|
||||
|
||||
<p>On character encoding in other formats:
|
||||
|
||||
<ul>
|
||||
<li><a href="http://feedparser.org/docs/character-encoding.html">Character encoding in XML</a>
|
||||
<li><a href="http://blog.whatwg.org/the-road-to-html-5-character-encoding">Character encoding in HTML</a>
|
||||
<li><a href=http://feedparser.org/docs/character-encoding.html>Character encoding in XML</a>
|
||||
<li><a href=http://blog.whatwg.org/the-road-to-html-5-character-encoding>Character encoding in HTML</a>
|
||||
</ul>
|
||||
|
||||
<p>On strings and string formatting:
|
||||
|
||||
<ul>
|
||||
<li><a href="http://docs.python.org/3.0/library/string.html"><code>string</code> — Common string operations</a>
|
||||
<li><a href="http://docs.python.org/3.0/library/string.html#formatstrings">Format String Syntax</a>
|
||||
<li><a href="http://docs.python.org/3.0/library/string.html#format-specification-mini-language">Format Specification Mini-Language</a>
|
||||
<li><a href="http://www.python.org/dev/peps/pep-3101/"><abbr>PEP</abbr> 3101: Advanced String Formatting</a>
|
||||
<li><a href=http://docs.python.org/3.0/library/string.html><code>string</code> — Common string operations</a>
|
||||
<li><a href=http://docs.python.org/3.0/library/string.html#formatstrings>Format String Syntax</a>
|
||||
<li><a href=http://docs.python.org/3.0/library/string.html#format-specification-mini-language>Format Specification Mini-Language</a>
|
||||
<li><a href=http://www.python.org/dev/peps/pep-3101/><abbr>PEP</abbr> 3101: Advanced String Formatting</a>
|
||||
</ul>
|
||||
|
||||
<p class=nav><a rel=prev href=native-datatypes.html title="back to “Native Datatypes”"><span>☜</span></a> <a rel=next href=regular-expressions.html title="onward to “Regular Expressions”"><span>☞</span></a>
|
||||
|
||||
+3
-3
@@ -17,8 +17,8 @@ body{counter-reset:h1 8}
|
||||
</blockquote>
|
||||
<p id=toc>
|
||||
<h2 id=divingin>(Not) Diving In</h2>
|
||||
<p class=f>In this chapter, you’re going to write and debug a set of utility functions to convert to and from Roman numerals. You saw the mechanics of constructing and validating Roman numerals in <a href="regular-expressions.html#romannumerals">“Case study: roman numerals”</a>. Now step back and consider what it would take to expand that into a two-way utility.
|
||||
<p><a href="regular-expressions.html#romannumerals">The rules for Roman numerals</a> lead to a number of interesting observations:
|
||||
<p class=f>In this chapter, you’re going to write and debug a set of utility functions to convert to and from Roman numerals. You saw the mechanics of constructing and validating Roman numerals in <a href=regular-expressions.html#romannumerals>“Case study: roman numerals”</a>. Now step back and consider what it would take to expand that into a two-way utility.
|
||||
<p><a href=regular-expressions.html#romannumerals>The rules for Roman numerals</a> lead to a number of interesting observations:
|
||||
<ol>
|
||||
<li>There is only one correct way to represent a particular number as a Roman numeral.
|
||||
<li>The converse is also true: if a string of characters is a valid Roman numeral, it represents only one number (that is, it can only be interpreted one way).
|
||||
@@ -249,7 +249,7 @@ OK</samp></pre>
|
||||
<li>The <code>unittest.TestCase</code> class provides the <code>assertRaises</code> method, which takes the following arguments: the exception you’re expecting, the function you’re testing, and the arguments you’re passing to that function. (If the function you’re testing takes more than one argument, pass them all to <code>assertRaises</code>, in order, and it will pass them right along to the function you’re testing.)
|
||||
</ol>
|
||||
<p>Pay close attention to this last line of code. Instead of calling <code>to_roman()</code> directly and manually checking that it raises a particular exception (by wrapping it in a <code>try...except</code> block [FIXME xref]), the <code>assertRaises</code> method has encapsulated all of that for us. All you do is tell it what exception you’re expecting (<code>roman2.OutOfRangeError</code>), the function (<code>to_roman()</code>), and the function’s arguments (<code>4000</code>). The <code>assertRaises</code> method takes care of calling <code>to_roman()</code> and checking that it raises <code>roman2.OutOfRangeError</code>.
|
||||
<p>Also note that you’re passing the <code>to_roman()</code> function itself as an argument; you’re not calling it, and you’re not passing the name of it as a string. Have I mentioned recently how handy it is that <a href="your-first-python-program.html#everythingisanobject">everything in Python is an object</a>?
|
||||
<p>Also note that you’re passing the <code>to_roman()</code> function itself as an argument; you’re not calling it, and you’re not passing the name of it as a string. Have I mentioned recently how handy it is that <a href=your-first-python-program.html#everythingisanobject>everything in Python is an object</a>?
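<p>Here is a self-contained sketch of that calling convention. The <code>OutOfRangeError</code> class and the <code>to_roman()</code> stub below are hypothetical stand-ins for the real <code>roman2</code> module, which is not shown in this excerpt; only the range check matters for the illustration.

```python
import unittest


class OutOfRangeError(ValueError):
    """Hypothetical stand-in for roman2.OutOfRangeError."""


def to_roman(n):
    """Hypothetical stub: performs only the range check."""
    if not (0 < n < 4000):
        raise OutOfRangeError('number out of range (must be 1..3999)')
    return 'stub'


class ToRomanBadInput(unittest.TestCase):
    def test_too_large(self):
        # Pass the exception class, the function object itself (not a call,
        # not a string), and then the arguments for that function.
        self.assertRaises(OutOfRangeError, to_roman, 4000)


# Run the test case programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToRomanBadInput)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.wasSuccessful())  # True
```

<p>Writing <code>self.assertRaises(OutOfRangeError, to_roman(4000))</code> instead would raise the exception before <code>assertRaises</code> ever ran, which is why the function and its arguments are passed separately.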
|
||||
<p>So what happens when you run the test suite with this new test?
|
||||
<pre class=screen>
|
||||
<samp class=p>you@localhost:~$ </samp><kbd>python3 romantest2.py -v</kbd>
|
||||
|
||||
+4
-3
@@ -34,11 +34,12 @@ h3:before{content:""}
|
||||
|
||||
<p>Iterators are everywhere in Python 3, and I understand them a lot better than I did five years ago when I wrote “Dive Into Python”. You need to understand them too, because lots of functions that used to return lists in Python 2 will now return iterators in Python 3. At a minimum, you should read <a href=iterators.html#a-fibonacci-iterator>the second half of the Iterators chapter</a> and <a href=advanced-iterators.html#generator-expressions>the second half of the Advanced Iterators chapter</a>.
|
||||
|
||||
<p>By popular request, I’ve added an appendix on <a href=special-method-names.html>Special Method Names</a>, which is kind of like <a href="http://www.python.org/doc/3.0/reference/datamodel.html#special-method-names">the Python docs “Data Model” chapter</a> but with more snark.
|
||||
<p>By popular request, I’ve added an appendix on <a href=special-method-names.html>Special Method Names</a>, which is kind of like <a href=http://www.python.org/doc/3.0/reference/datamodel.html#special-method-names>the Python docs “Data Model” chapter</a> but with more snark.
|
||||
|
||||
<p>That’s it for now; the book’s not finished yet! The file I/O subsystem is totally different now; I hope to write about that soon. There are much better choices for XML processing now; I hope to write about that, too.
|
||||
<p>When I was writing “Dive Into Python”, all of the available XML libraries sucked. Then Fredrik Lundh wrote <a href=http://effbot.org/zone/element-index.htm>ElementTree</a>, which doesn’t suck at all. Then the Python gods wisely <a href=http://docs.python.org/3.0/library/xml.etree.elementtree.html>incorporated ElementTree into the standard library</a>, and now it forms the basis for <a href=xml.html>my new XML chapter</a>. The old ways of parsing XML are still around, but you should avoid them, because they suck!
|
||||
|
||||
<p>That’s it for now; the book’s not finished yet! The file I/O subsystem is totally different now; I hope to write about that soon.
|
||||
|
||||
<!--<p class=nav><a rel=prev class=todo><span>☜</span></a> <a rel=next href=your-first-python-program.html><span>☞</span></a>-->
|
||||
<p class=c>© 2001–9 <a href=about.html>Mark Pilgrim</a>
|
||||
<script src=jquery.js></script>
|
||||
<script src=dip3.js></script>
|
||||
|
||||
@@ -14,7 +14,7 @@ mark{display:inline}
|
||||
<p id=level>Difficulty level: <span title=beginner>♦♦♦♢♢</span>
|
||||
<h1>XML</h1>
|
||||
<blockquote class=q>
|
||||
<p><span>❝</span> FIXME <span>❞</span><br>— FIXME
|
||||
<p><span>❝</span> In the archonship of Aristaechmus, Draco enacted his ordinances. <span>❞</span><br>— <a href="http://www.perseus.tufts.edu/cgi-bin/ptext?doc=Perseus:text:1999.01.0046;query=chapter%3D%235;layout=;loc=3.1">Aristotle</a>
|
||||
</blockquote>
|
||||
<p id=toc>
|
||||
<h2 id=divingin>Diving In</h2>
|
||||
@@ -270,39 +270,67 @@ mark{display:inline}
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<samp class=p>>>> </samp><kbd>root.tag</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>root.tag</kbd> <span>①</span></a>
|
||||
<samp>'{http://www.w3.org/2005/Atom}feed'</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(root)</kbd>
|
||||
<samp>9</samp>
|
||||
<samp class=p>>>> </samp><kbd>for child in root:</kbd>
|
||||
<samp class=p>... </samp><kbd> print(child)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>len(root)</kbd> <span>②</span></a>
|
||||
<samp>8</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>for child in root:</kbd> <span>③</span></a>
|
||||
<a><samp class=p>... </samp><kbd> print(child)</kbd> <span>④</span></a>
|
||||
<samp class=p>... </samp>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}title at e2b5d0>
|
||||
<Element {http://www.w3.org/2005/Atom}subtitle at e2b4e0>
|
||||
<Element {http://www.w3.org/2005/Atom}id at e2b6c0>
|
||||
<Element {http://www.w3.org/2005/Atom}updated at e2b6f0>
|
||||
<Element {http://www.w3.org/2005/Atom}link at e181b0>
|
||||
<Element {http://www.w3.org/2005/Atom}link at e2b4b0>
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b720>
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b510>
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b750></samp></pre>
|
||||
<ol>
|
||||
<li>Continuing from the previous example, the root element is <code>{http://www.w3.org/2005/Atom}feed</code>.
|
||||
<li>The “length” of the root element is the number of child elements.
|
||||
<li>You can use the element itself as an iterator to loop through all of its child elements.
|
||||
<li>As you can see from the output, there are indeed 8 child elements: all of the feed-level metadata (<code>title</code>, <code>subtitle</code>, <code>id</code>, <code>updated</code>, and <code>link</code>) followed by the three <code>entry</code> elements.
|
||||
</ol>
|
||||
|
||||
<p>You may have guessed this already, but I want to point it out explicitly: the list of child elements only includes <em>direct</em> children. Each of the <code>entry</code> elements contains its own children, but those are not included in the list. They would be included in the list of each <code>entry</code>’s children, but they are not included in the list of the <code>feed</code>’s children. There are ways to find elements no matter how deeply nested they are; we’ll look at two such ways later in this chapter.
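<p>To see the “direct children only” behavior in isolation, here is a small sketch. The miniature feed in <code>feed_xml</code> is a made-up document, not the real <code>feed.xml</code>, and the names <code>ATOM</code>, <code>direct_tags</code>, and <code>entry_children</code> are placeholders for this illustration.

```python
import xml.etree.ElementTree as etree

ATOM = '{http://www.w3.org/2005/Atom}'

# Hypothetical miniature feed: one title and two entries,
# each entry with a title of its own.
feed_xml = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>dive into mark</title>
  <entry><title>first</title></entry>
  <entry><title>second</title></entry>
</feed>"""

root = etree.fromstring(feed_xml)

# len() counts direct children only; the nested titles are not counted.
print(len(root))  # 3

# Iterating over an element yields its direct children, in document order.
direct_tags = [child.tag for child in root]
print(direct_tags)

# The nested title shows up only in the entry's own list of children.
entry_children = [child.tag for child in root[1]]
print(entry_children)
```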
|
||||
|
||||
<h3 id=xml-attributes>Attributes Are Dictionaries</h3>
|
||||
|
||||
<p>FIXME
|
||||
<p>XML isn’t just a collection of elements; each element can also have its own set of attributes. Once you have a reference to a specific element, you can easily get its attributes as a Python dictionary.
|
||||
|
||||
<p>To refresh your memory, here are the first few lines of <code>feed.xml</code>, the XML document we’re working with.
|
||||
|
||||
<pre><code><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
|
||||
<title>dive into mark</title>
|
||||
<subtitle>currently between addictions</subtitle>
|
||||
<id>tag:diveintomark.org,2001-07-29:/</id>
|
||||
<updated>2009-03-27T21:56:07Z</updated>
|
||||
<link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
|
||||
<link rel="self" type="application/atom+xml" href="http://diveintomark.org/feed/"/>
|
||||
...</code></pre>
|
||||
<pre class=screen>
|
||||
>>> root.attrib
|
||||
{'{http://www.w3.org/XML/1998/namespace}lang': 'en'}
|
||||
>>> root[4]
|
||||
<Element {http://www.w3.org/2005/Atom}link at e181b0>
|
||||
>>> root[4].attrib
|
||||
{'href': 'http://diveintomark.org/', 'type': 'text/html', 'rel': 'alternate'}
|
||||
>>> root[3]
|
||||
<Element {http://www.w3.org/2005/Atom}updated at e2b4e0>
|
||||
>>> root[3].attrib
|
||||
{}
|
||||
</pre>
|
||||
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
|
||||
FIXME
|
||||
# continuing from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>root.attrib</kbd> <span>①</span></a>
|
||||
<samp>{'{http://www.w3.org/XML/1998/namespace}lang': 'en'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[4]</kbd> <span>②</span></a>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}link at e181b0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[4].attrib</kbd> <span>③</span></a>
|
||||
<samp>{'href': 'http://diveintomark.org/',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[3]</kbd> <span>④</span></a>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}updated at e2b4e0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[3].attrib</kbd> <span>⑤</span></a>
|
||||
<samp>{}</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>attrib</code> property is a dictionary of the element’s attributes. The original markup here was <code><feed xmlns
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
</ol>
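<p>The interactive session above can be reproduced as a standalone script. This is a sketch under stated assumptions: the string <code>feed_xml</code> holds only the first few lines of the feed shown earlier, with a closing <code>&lt;/feed&gt;</code> tag added here so that it parses on its own, and the <code>etree</code> alias is chosen for this illustration.

```python
import xml.etree.ElementTree as etree

# First few lines of the feed, closed early so the fragment is well-formed.
feed_xml = """<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>dive into mark</title>
  <subtitle>currently between addictions</subtitle>
  <id>tag:diveintomark.org,2001-07-29:/</id>
  <updated>2009-03-27T21:56:07Z</updated>
  <link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
  <link rel="self" type="application/atom+xml" href="http://diveintomark.org/feed/"/>
</feed>"""

root = etree.fromstring(feed_xml)

# xml:lang appears under the XML namespace; the xmlns declaration
# itself is not reported as an attribute.
print(root.attrib)

# The fifth child (index 4) is the first link element; attrib is a plain dict.
link = root[4]
print(link.attrib['href'])  # http://diveintomark.org/

# The updated element has no attributes, so its dictionary is empty.
print(root[3].attrib)  # {}
```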
|
||||
|
||||
<h2 id=xml-find>Searching For Nodes Within An XML Document</h2>
|
||||
|
||||
|
||||