mirror of
https://github.com/kennethreitz/dive-into-python3.git
synced 2026-06-05 15:00:18 +00:00
colorize interactive shell examples
+15
-15
@@ -105,29 +105,29 @@ class OrderedDict(dict, collections.MutableMapping):
|
||||
<p>FIXME
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import ordereddict</kbd>
|
||||
<samp class=p>>>> </samp><kbd>od = ordereddict.OrderedDict()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>klass = od.__class__</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>type(klass)</kbd>
|
||||
<samp><class 'abc.ABCMeta'></samp>
|
||||
<samp class=p>>>> </samp><kbd>klass.__name__</kbd>
|
||||
<samp>'OrderedDict'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import ordereddict</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>od = ordereddict.OrderedDict()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>klass = od.__class__</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>type(klass)</kbd>
|
||||
<samp class=pp><class 'abc.ABCMeta'></samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>klass.__name__</kbd>
|
||||
<samp class=pp>'OrderedDict'</samp>
|
||||
<!--
|
||||
<samp class=p>>>> </samp><kbd>klass.__doc__</kbd>
|
||||
<samp>FIXME</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>klass.__doc__</kbd>
|
||||
<samp class=pp>FIXME</samp>
|
||||
-->
|
||||
<samp class=p>>>> </samp><kbd>klass.__module__</kbd>
|
||||
<samp>'ordereddict'</samp>
|
||||
<samp class=p>>>> </samp><kbd>klass.__bases__</kbd>
|
||||
<samp>(<class 'dict'>, <class '_abcoll.MutableMapping'>)</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>klass.__module__</kbd>
|
||||
<samp class=pp>'ordereddict'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>klass.__bases__</kbd>
|
||||
<samp class=pp>(<class 'dict'>, <class '_abcoll.MutableMapping'>)</samp></pre>
|
||||
<ol>
|
||||
<li>FIXME
|
||||
</ol>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from previous example
|
||||
<samp class=p>>>> </samp><kbd>klass.__dict__</kbd>
|
||||
<samp>{'__abstractmethods__': frozenset(),
|
||||
<samp class=p>>>> </samp><kbd class=pp>klass.__dict__</kbd>
|
||||
<samp class=pp>{'__abstractmethods__': frozenset(),
|
||||
'__delitem__': <function __delitem__ at 0x00DCB6A8>,
|
||||
'__dict__': <attribute '__dict__' of 'OrderedDict' objects>,
|
||||
'__doc__': None,
|
||||
|
||||
+163
-163
@@ -90,11 +90,11 @@ if __name__ == '__main__':
|
||||
<p>The first thing this alphametics solver does is find all the letters (A–Z) in the puzzle.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.findall('[0-9]+', '16 2-by-4s in rows of 8')</kbd> <span class=u>①</span></a>
|
||||
<samp>['16', '2', '4', '8']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.findall('[A-Z]+', 'SEND + MORE == MONEY')</kbd> <span class=u>②</span></a>
|
||||
<samp>['SEND', 'MORE', 'MONEY']</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.findall('[0-9]+', '16 2-by-4s in rows of 8')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>['16', '2', '4', '8']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.findall('[A-Z]+', 'SEND + MORE == MONEY')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>['SEND', 'MORE', 'MONEY']</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>re</code> module is Python’s implementation of <a href=regular-expressions.html>regular expressions</a>. It has a nifty function called <code>findall()</code> which takes a regular expression pattern and a string, and finds all occurrences of the pattern within the string. In this case, the pattern matches sequences of numbers. The <code>findall()</code> function returns a list of all the substrings that matched the pattern.
|
||||
<li>Here the regular expression pattern matches sequences of letters. Again, the return value is a list, and each item in the list is a string that matched the regular expression pattern.
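The two calls above can be wrapped in a small helper. This is only a sketch: the name find_words is hypothetical and is not taken from the alphametics solver itself.

```python
import re


def find_words(puzzle):
    """Return every run of uppercase ASCII letters found in the puzzle."""
    # '[A-Z]+' matches one or more consecutive uppercase letters, so the
    # spaces, '+' and '==' in the puzzle are skipped.
    return re.findall('[A-Z]+', puzzle)


words = find_words('SEND + MORE == MONEY')
numbers = re.findall('[0-9]+', '16 2-by-4s in rows of 8')
print(words)
print(numbers)
```

Because findall() returns a plain list, the result can be passed directly to ''.join() or iterated over.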
|
||||
@@ -107,17 +107,17 @@ if __name__ == '__main__':
|
||||
<p>Set comprehensions make it trivial to find the unique items in a sequence. [FIXME-not sure if I’m going to cover set comprehensions in an earlier chapter; if not, this is certainly an abrupt and inadequate introduction to the topic.]
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_list = ['a', 'c', 'b', 'a', 'd', 'b']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>{c for c in a_list}</kbd> <span class=u>①</span></a>
|
||||
<samp>{'a', 'c', 'b', 'd'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>a_string = 'EAST IS EAST'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>{c for c in a_string}</kbd> <span class=u>②</span></a>
|
||||
<samp>{'A', ' ', 'E', 'I', 'S', 'T'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>words = ['SEND', 'MORE', 'MONEY']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>''.join(words)</kbd> <span class=u>③</span></a>
|
||||
<samp>'SENDMOREMONEY'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>{c for c in ''.join(words)}</kbd> <span class=u>④</span></a>
|
||||
<samp>{'E', 'D', 'M', 'O', 'N', 'S', 'R', 'Y'}</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list = ['a', 'c', 'b', 'a', 'd', 'b']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>{c for c in a_list}</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>{'a', 'c', 'b', 'd'}</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_string = 'EAST IS EAST'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>{c for c in a_string}</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>{'A', ' ', 'E', 'I', 'S', 'T'}</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>words = ['SEND', 'MORE', 'MONEY']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>''.join(words)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'SENDMOREMONEY'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>{c for c in ''.join(words)}</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>{'E', 'D', 'M', 'O', 'N', 'S', 'R', 'Y'}</samp></pre>
|
||||
<ol>
|
||||
<li>Given a list of several strings, a set comprehension with the identity function will return a set of unique strings from the list. This makes sense if you think of it like a <code>for</code> loop. Take the first item from the list, put it in the set. Second. Third. Fourth — wait, that’s in the set already, so it only gets listed once. Fifth. Sixth — again, a duplicate, so it only gets listed once. The end result? All the unique items in the original list, without any duplicates. The original list doesn’t even need to be sorted first.
|
||||
<li>The same technique works with strings, since a string is just a sequence of characters.
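As a minimal sketch of the same idea outside the interactive shell: the order in which a set displays its items is not guaranteed, so the example sorts the result before printing it. The variable names mirror the shell session above.

```python
words = ['SEND', 'MORE', 'MONEY']

# Join the words into one string, then keep each character once.
unique_characters = {c for c in ''.join(words)}

# Sets are unordered, so sort before displaying for a stable result.
ordered = sorted(unique_characters)
print(ordered)
print(len(unique_characters))
```

Calling set(''.join(words)) gives the same set; the comprehension form becomes useful once each item needs to be transformed or filtered.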
|
||||
@@ -138,8 +138,8 @@ if __name__ == '__main__':
|
||||
<p>Like many programming languages, Python has an <code>assert</code> statement. Here’s how it works.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>assert 1 + 1 == 2</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>assert 1 + 1 == 3</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>assert 1 + 1 == 2</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>assert 1 + 1 == 3</kbd> <span class=u>②</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
AssertionError</samp></pre>
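An assert statement can also carry a message after a comma, which becomes the text of the AssertionError. The sketch below assumes a hypothetical helper, check_letter_count, that rejects puzzles with more distinct letters than there are digits.

```python
def check_letter_count(unique_characters):
    """Hypothetical helper: fail if there are more letters than digits."""
    # The expression after the comma is attached to the AssertionError.
    assert len(unique_characters) <= 10, 'Too many letters'
    return True


ok = check_letter_count(set('SENDMOREMONEY'))

message = None
try:
    check_letter_count(set('ABCDEFGHIJK'))  # 11 distinct letters
except AssertionError as error:
    message = str(error)

print(ok, message)
```

Note that assertions are skipped when Python runs with the -O option, so they suit sanity checks rather than validation that must always happen.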
|
||||
@@ -168,16 +168,16 @@ AssertionError</samp></pre>
|
||||
<p>A generator expression is like a <a href=generators.html>generator function</a> without the function.
|
||||
|
||||
<pre class=screen>
|
||||
<samp>>>> </samp><kbd>unique_characters = {'E', 'D', 'M', 'O', 'N', 'S', 'R', 'Y'}</kbd>
|
||||
<a><samp>>>> </samp><kbd>gen = (ord(c) for c in unique_characters)</kbd> <span class=u>①</span></a>
|
||||
<a><samp>>>> </samp><kbd>gen</kbd> <span class=u>②</span></a>
|
||||
<samp><generator object <genexpr> at 0x00BADC10></samp>
|
||||
<a><samp>>>> </samp><kbd>next(gen)</kbd> <span class=u>③</span></a>
|
||||
<samp>69</samp>
|
||||
<samp>>>> </samp><kbd>next(gen)</kbd>
|
||||
<samp>68</samp>
|
||||
<a><samp>>>> </samp><kbd>tuple(ord(c) for c in unique_characters)</kbd> <span class=u>④</span></a>
|
||||
<samp>(69, 68, 77, 79, 78, 83, 82, 89)</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>unique_characters = {'E', 'D', 'M', 'O', 'N', 'S', 'R', 'Y'}</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>gen = (ord(c) for c in unique_characters)</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>gen</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><generator object <genexpr> at 0x00BADC10></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(gen)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>69</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(gen)</kbd>
|
||||
<samp class=pp>68</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tuple(ord(c) for c in unique_characters)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>(69, 68, 77, 79, 78, 83, 82, 89)</samp></pre>
|
||||
<ol>
|
||||
<li>A generator expression is like an anonymous function that yields values. The expression itself looks like a list comprehension [FIXME have we introduced this yet?], but it’s wrapped in parentheses instead of square brackets.
|
||||
<li>The generator expression returns… an iterator.
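One property worth seeing in code is that a generator expression produces its values lazily and only once. The sketch below avoids asserting specific numbers for next(), because the iteration order of a set is not guaranteed.

```python
unique_characters = {'E', 'D', 'M', 'O', 'N', 'S', 'R', 'Y'}

# Nothing is computed yet; gen is an iterator waiting to be consumed.
gen = (ord(c) for c in unique_characters)

first = next(gen)       # computes exactly one value
remaining = list(gen)   # consumes the other seven values
leftover = list(gen)    # the generator is now exhausted

print(first, remaining, leftover)
```

To walk the values a second time, build a new generator expression or store the results in a tuple or list.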
|
||||
@@ -202,21 +202,21 @@ gen = ord_map(unique_characters)</code></pre>
|
||||
<p>The idea is that you take a list of things (could be numbers, could be letters, could be dancing bears) and find all the possible ways to split them up into smaller lists. All the smaller lists have the same size, which can be as small as 1 and as large as the total number of items. Oh, and nothing can be repeated. Mathematicians say things like “let’s find the permutations of 3 different items taken 2 at a time,” which means you have a sequence of 3 items and you want to find all the possible ordered pairs.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import itertools</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>perms = itertools.permutations([1, 2, 3], 2)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>next(perms)</kbd> <span class=u>③</span></a>
|
||||
<samp>(1, 2)</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>(1, 3)</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<a><samp>(2, 1)</samp> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>(2, 3)</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>(3, 1)</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>(3, 2)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>next(perms)</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import itertools</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>perms = itertools.permutations([1, 2, 3], 2)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>(1, 2)</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>(1, 3)</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<a><samp class=pp>(2, 1)</samp> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>(2, 3)</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>(3, 1)</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>(3, 2)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
StopIteration</samp></pre>
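In a script, a for loop is the usual way to consume the iterator, since it stops cleanly instead of surfacing StopIteration. This sketch also checks the count against the formula n! / (n - r)! for n items taken r at a time.

```python
import itertools
import math

items = [1, 2, 3]
r = 2

pairs = []
for pair in itertools.permutations(items, r):
    # Each pair is a tuple; order matters, so (1, 2) and (2, 1) both appear.
    pairs.append(pair)

# Number of ordered arrangements of r items chosen from n.
expected_count = math.factorial(len(items)) // math.factorial(len(items) - r)

print(pairs)
print(len(pairs), expected_count)
```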
|
||||
@@ -231,26 +231,26 @@ StopIteration</samp></pre>
|
||||
<p>The <code>permutations()</code> function doesn’t have to take a list. It can take any sequence — even a string.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>perms = itertools.permutations('ABC', 3)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<a><samp>('A', 'B', 'C')</samp> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>('A', 'C', 'B')</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>('B', 'A', 'C')</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>('B', 'C', 'A')</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>('C', 'A', 'B')</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp>('C', 'B', 'A')</samp>
|
||||
<samp class=p>>>> </samp><kbd>next(perms)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>perms = itertools.permutations('ABC', 3)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<a><samp class=pp>('A', 'B', 'C')</samp> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>('A', 'C', 'B')</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>('B', 'A', 'C')</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>('B', 'C', 'A')</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>('C', 'A', 'B')</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=pp>('C', 'B', 'A')</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(perms)</kbd>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
StopIteration</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(itertools.permutations('ABC', 3))</kbd> <span class=u>③</span></a>
|
||||
<samp>[('A', 'B', 'C'), ('A', 'C', 'B'),
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(itertools.permutations('ABC', 3))</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>[('A', 'B', 'C'), ('A', 'C', 'B'),
|
||||
('B', 'A', 'C'), ('B', 'C', 'A'),
|
||||
('C', 'A', 'B'), ('C', 'B', 'A')]</samp></pre>
|
||||
<ol>
|
||||
@@ -263,13 +263,13 @@ StopIteration</samp>
|
||||
|
||||
<h2 id=more-itertools>Other Fun Stuff in the <code>itertools</code> Module</h2>
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>list(itertools.product('ABC', '123'))</kbd> <span class=u>①</span></a>
|
||||
<samp>[('A', '1'), ('A', '2'), ('A', '3'),
|
||||
<samp class=p>>>> </samp><kbd class=pp>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(itertools.product('ABC', '123'))</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>[('A', '1'), ('A', '2'), ('A', '3'),
|
||||
('B', '1'), ('B', '2'), ('B', '3'),
|
||||
('C', '1'), ('C', '2'), ('C', '3')]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(itertools.combinations('ABC', 2))</kbd> <span class=u>②</span></a>
|
||||
<samp>[('A', 'B'), ('A', 'C'), ('B', 'C')]</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(itertools.combinations('ABC', 2))</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>[('A', 'B'), ('A', 'C'), ('B', 'C')]</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>itertools.product()</code> function returns an iterator containing the Cartesian product of two sequences.
|
||||
<li>The <code>itertools.combinations()</code> function returns an iterator containing all the possible combinations of the given sequence of the given length. This is like the <code>itertools.permutations()</code> function, except combinations don’t include items that are duplicates of other items in a different order. So <code>itertools.permutations('ABC', 2)</code> will return both <code>('A', 'B')</code> and <code>('B', 'A')</code> (among others), but <code>itertools.combinations('ABC', 2)</code> will not return <code>('B', 'A')</code> because it is a duplicate of <code>('A', 'B')</code> in a different order.
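The relationship between the two functions can be checked directly. As a small sketch: every combination also appears among the permutations, and collapsing each permutation to sorted order leaves exactly the combinations.

```python
import itertools

perms = set(itertools.permutations('ABC', 2))
combos = set(itertools.combinations('ABC', 2))

# Ignoring order, each permutation reduces to one of the combinations.
collapsed = {tuple(sorted(p)) for p in perms}

print(len(perms), len(combos))
print(collapsed == combos)
```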
|
||||
@@ -277,21 +277,21 @@ StopIteration</samp>
|
||||
|
||||
<p class=d>[<a href=examples/favorite-people.txt>download <code>favorite-people.txt</code></a>]
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>names = list(open('examples/favorite-people.txt'))</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>names</kbd>
|
||||
<samp>['Dora\n', 'Ethan\n', 'Wesley\n', 'John\n', 'Anne\n',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>names = list(open('examples/favorite-people.txt'))</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>names</kbd>
|
||||
<samp class=pp>['Dora\n', 'Ethan\n', 'Wesley\n', 'John\n', 'Anne\n',
|
||||
'Mike\n', 'Chris\n', 'Sarah\n', 'Alex\n', 'Lizzie\n']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>names = [name.rstrip() for name in names]</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>names</kbd>
|
||||
<samp>['Dora', 'Ethan', 'Wesley', 'John', 'Anne',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>names = [name.rstrip() for name in names]</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>names</kbd>
|
||||
<samp class=pp>['Dora', 'Ethan', 'Wesley', 'John', 'Anne',
|
||||
'Mike', 'Chris', 'Sarah', 'Alex', 'Lizzie']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>names = sorted(names)</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>names</kbd>
|
||||
<samp>['Alex', 'Anne', 'Chris', 'Dora', 'Ethan',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>names = sorted(names)</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>names</kbd>
|
||||
<samp class=pp>['Alex', 'Anne', 'Chris', 'Dora', 'Ethan',
|
||||
'John', 'Lizzie', 'Mike', 'Sarah', 'Wesley']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>names = sorted(names, key=len)</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>names</kbd>
|
||||
<samp>['Alex', 'Anne', 'Dora', 'John', 'Mike',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>names = sorted(names, key=len)</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>names</kbd>
|
||||
<samp class=pp>['Alex', 'Anne', 'Dora', 'John', 'Mike',
|
||||
'Chris', 'Ethan', 'Sarah', 'Lizzie', 'Wesley']</samp></pre>
|
||||
<ol>
|
||||
<li>This idiom returns a list of the lines in a text file.
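A self-contained sketch of the read, strip, and sort steps follows. It writes its own temporary file (the file name and the four names are made up for the example) and uses a with block so the file is closed promptly, which list(open(...)) leaves to the garbage collector.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical stand-in for examples/favorite-people.txt.
    path = os.path.join(folder, 'people.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('Wesley\nDora\nEthan\nAnne\n')

    # Iterating over a file object yields one line at a time.
    with open(path, encoding='utf-8') as handle:
        names = [line.rstrip() for line in handle]

alphabetical = sorted(names)
by_length = sorted(names, key=len)

print(alphabetical)
print(by_length)
```

Python's sort is stable, so names of equal length keep the relative order they had before sorting by length.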
|
||||
@@ -304,19 +304,19 @@ StopIteration</samp>
|
||||
|
||||
<pre class=screen>
|
||||
<p>…continuing from the previous interactive shell…
|
||||
<samp class=p>>>> </samp><kbd>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>groups = itertools.groupby(names, len)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>groups</kbd>
|
||||
<samp><itertools.groupby object at 0x00BB20C0></samp>
|
||||
<samp class=p>>>> </samp><kbd>list(groups)</kbd>
|
||||
<samp>[(4, <itertools._grouper object at 0x00BA8BF0>),
|
||||
<samp class=p>>>> </samp><kbd class=pp>import itertools</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>groups = itertools.groupby(names, len)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>groups</kbd>
|
||||
<samp class=pp><itertools.groupby object at 0x00BB20C0></samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>list(groups)</kbd>
|
||||
<samp class=pp>[(4, <itertools._grouper object at 0x00BA8BF0>),
|
||||
(5, <itertools._grouper object at 0x00BB4050>),
|
||||
(6, <itertools._grouper object at 0x00BB4030>)]</samp>
|
||||
<samp class=p>>>> </samp><kbd>groups = itertools.groupby(names, len)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>for name_length, name_iter in groups:</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>... </samp><kbd> print('Names with {0:d} letters:'.format(name_length))</kbd>
|
||||
<samp class=p>... </samp><kbd> for name in name_iter:</kbd>
|
||||
<samp class=p>... </samp><kbd> print(name)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>groups = itertools.groupby(names, len)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>for name_length, name_iter in groups:</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>... </samp><kbd class=pp> print('Names with {0:d} letters:'.format(name_length))</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> for name in name_iter:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> print(name)</kbd>
|
||||
<samp class=p>... </samp>
|
||||
<samp>Names with 4 letters:
|
||||
Alex
|
||||
@@ -338,18 +338,18 @@ Wesley</samp></pre>
|
||||
<!-- YO DAWG, WE HEARD YOU LIKE LOOPING, SO WE PUT AN ITERATOR IN YOUR ITERATOR SO YOU CAN LOOP WHILE YOU LOOP. -->
|
||||
<p>Are you watching closely?
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>list(range(0, 3))</kbd>
|
||||
<samp>[0, 1, 2]</samp>
|
||||
<samp class=p>>>> </samp><kbd>list(range(10, 13))</kbd>
|
||||
<samp>[10, 11, 12]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(itertools.chain(range(0, 3), range(10, 13)))</kbd> <span class=u>①</span></a>
|
||||
<samp>[0, 1, 2, 10, 11, 12]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(zip(range(0, 3), range(10, 13)))</kbd> <span class=u>②</span></a>
|
||||
<samp>[(0, 10), (1, 11), (2, 12)]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(zip(range(0, 3), range(10, 14)))</kbd> <span class=u>③</span></a>
|
||||
<samp>[(0, 10), (1, 11), (2, 12)]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>list(itertools.zip_longest(range(0, 3), range(10, 14)))</kbd> <span class=u>④</span></a>
|
||||
<samp>[(0, 10), (1, 11), (2, 12), (None, 13)]</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>list(range(0, 3))</kbd>
|
||||
<samp class=pp>[0, 1, 2]</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>list(range(10, 13))</kbd>
|
||||
<samp class=pp>[10, 11, 12]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(itertools.chain(range(0, 3), range(10, 13)))</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>[0, 1, 2, 10, 11, 12]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(zip(range(0, 3), range(10, 13)))</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>[(0, 10), (1, 11), (2, 12)]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(zip(range(0, 3), range(10, 14)))</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>[(0, 10), (1, 11), (2, 12)]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>list(itertools.zip_longest(range(0, 3), range(10, 14)))</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>[(0, 10), (1, 11), (2, 12), (None, 13)]</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>itertools.chain()</code> function takes two iterators and returns an iterator that contains all the items from the first iterator, followed by all the items from the second iterator. (Actually, it can take any number of iterators, and it chains them all in the order they were passed to the function.)
|
||||
<li>The <code>zip()</code> function does something prosaic that turns out to be extremely useful: it takes any number of sequences and returns an iterator with the first items of each sequence, then the second items of each, then the third, and so on.
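The three functions can be compared side by side on inputs of unequal length. This sketch also passes the fillvalue argument of itertools.zip_longest(), which replaces the default None padding; the value -1 is an arbitrary choice for the example.

```python
import itertools

first = range(0, 3)
second = range(10, 14)

# chain() runs through the first iterable, then the second.
chained = list(itertools.chain(first, second))

# zip() stops as soon as the shortest input is exhausted.
paired = list(zip(first, second))

# zip_longest() keeps going and pads the shorter input.
padded = list(itertools.zip_longest(first, second, fillvalue=-1))

print(chained)
print(paired)
print(padded)
```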
|
||||
@@ -360,13 +360,13 @@ Wesley</samp></pre>
|
||||
<p id=dict-zip>OK, that was all very interesting, but how does it relate to the alphametics solver? Here’s how:
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>characters = ('S', 'M', 'E', 'D', 'O', 'N', 'R', 'Y')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>guess = ('1', '2', '0', '3', '4', '5', '6', '7')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>tuple(zip(characters, guess))</kbd> <span class=u>①</span></a>
|
||||
<samp>(('S', '1'), ('M', '2'), ('E', '0'), ('D', '3'),
|
||||
<samp class=p>>>> </samp><kbd class=pp>characters = ('S', 'M', 'E', 'D', 'O', 'N', 'R', 'Y')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>guess = ('1', '2', '0', '3', '4', '5', '6', '7')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tuple(zip(characters, guess))</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>(('S', '1'), ('M', '2'), ('E', '0'), ('D', '3'),
|
||||
('O', '4'), ('N', '5'), ('R', '6'), ('Y', '7'))</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>dict(zip(characters, guess))</kbd> <span class=u>②</span></a>
|
||||
<samp>{'E': '0', 'D': '3', 'M': '2', 'O': '4',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>dict(zip(characters, guess))</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>{'E': '0', 'D': '3', 'M': '2', 'O': '4',
|
||||
'N': '5', 'S': '1', 'R': '6', 'Y': '7'}</samp></pre>
|
||||
<ol>
|
||||
<li>Given a list of letters and a list of digits (each represented here as 1-character strings), the <code>zip</code> function will create a pairing of letters and digits, in order.
|
||||
@@ -391,11 +391,11 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>Python strings have many methods. You learned about some of those methods in <a href=strings.html>the Strings chapter</a>: <code>lower()</code>, <code>count()</code>, and <code>format()</code>. Now I want to introduce you to a powerful but little-known string manipulation technique: the <code>translate()</code> method.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>translation_table = {ord('A'): ord('O')}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>translation_table</kbd> <span class=u>②</span></a>
|
||||
<samp>{65: 79}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>'MARK'.translate(translation_table)</kbd> <span class=u>③</span></a>
|
||||
<samp>'MORK'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>translation_table = {ord('A'): ord('O')}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>translation_table</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>{65: 79}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>'MARK'.translate(translation_table)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'MORK'</samp></pre>
|
||||
<ol>
|
||||
<li>String translation starts with a translation table, which is just a dictionary that maps one character to another. Actually, “character” is incorrect: the translation table really maps one integer code point (the value returned by <code>ord()</code>) to another.
|
||||
<li>Remember, the keys and values here are integers, not 1-character strings. The <code>ord()</code> function returns the Unicode code point of a character, which, in the case of A–Z, matches its <abbr>ASCII</abbr> value: an integer from 65 to 90, small enough to fit in a single byte.
|
||||
@@ -405,17 +405,17 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>What does this have to do with solving alphametic puzzles? As it turns out, everything.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>characters = tuple(ord(c) for c in 'SMEDONRY')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>characters</kbd>
|
||||
<samp>(83, 77, 69, 68, 79, 78, 82, 89)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>guess = tuple(ord(c) for c in '91570682')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>guess</kbd>
|
||||
<samp>(57, 49, 53, 55, 48, 54, 56, 50)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>translation_table = dict(zip(characters, guess))</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>translation_table</kbd>
|
||||
<samp>{68: 55, 69: 53, 77: 49, 78: 54, 79: 48, 82: 56, 83: 57, 89: 50}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>'SEND + MORE == MONEY'.translate(translation_table)</kbd> <span class=u>④</span></a>
|
||||
<samp>'9567 + 1085 == 10652'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>characters = tuple(ord(c) for c in 'SMEDONRY')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>characters</kbd>
|
||||
<samp class=pp>(83, 77, 69, 68, 79, 78, 82, 89)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>guess = tuple(ord(c) for c in '91570682')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>guess</kbd>
|
||||
<samp class=pp>(57, 49, 53, 55, 48, 54, 56, 50)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>translation_table = dict(zip(characters, guess))</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>translation_table</kbd>
|
||||
<samp class=pp>{68: 55, 69: 53, 77: 49, 78: 54, 79: 48, 82: 56, 83: 57, 89: 50}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>'SEND + MORE == MONEY'.translate(translation_table)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'9567 + 1085 == 10652'</samp></pre>
|
||||
<ol>
|
||||
<li>Using a <a href=#generator-expressions>generator expression</a>, we quickly compute the byte values for each character in a string. <var>characters</var> is an example of the value of <var>sorted_characters</var> in the <code>alphametics.solve()</code> function.
|
||||
<li>Using another generator expression, we quickly compute the byte values for each digit in this string. The result, <var>guess</var>, is of the form <a href=#guess>returned by the <code>itertools.permutations()</code> function</a> in the <code>alphametics.solve()</code> function.
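The steps above fit together as in the following sketch. It also shows str.maketrans(), a built-in that builds the same kind of table from two equal-length strings; that call is an alternative offered here for comparison and is not what the shell session above uses.

```python
letters = 'SMEDONRY'
digits = '91570682'

# Build the table by hand: integer keys and integer values.
characters = tuple(ord(c) for c in letters)
guess = tuple(ord(c) for c in digits)
translation_table = dict(zip(characters, guess))

equation = 'SEND + MORE == MONEY'.translate(translation_table)

# str.maketrans() produces an equivalent dictionary in one call.
alternative_table = str.maketrans(letters, digits)

print(equation)
print(alternative_table == translation_table)
```

Characters with no entry in the table, such as the spaces and the + and == signs, pass through unchanged.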
|
||||
@@ -432,36 +432,36 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>This is the final piece of the puzzle (or rather, the final piece of the puzzle solver). After all that fancy string manipulation, we’re left with a string like <code>'9567 + 1085 == 10652'</code>. But that’s a string, and what good is a string? Enter <code>eval()</code>, the universal Python evaluation tool.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>eval('1 + 1 == 2')</kbd>
|
||||
<samp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd>eval('1 + 1 == 3')</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>eval('9567 + 1085 == 10652')</kbd>
|
||||
<samp>True</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('1 + 1 == 2')</kbd>
|
||||
<samp class=pp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('1 + 1 == 3')</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('9567 + 1085 == 10652')</kbd>
|
||||
<samp class=pp>True</samp></pre>
|
||||
|
||||
<p>But wait, there’s more! The <code>eval()</code> function isn’t limited to boolean expressions. It can handle <em>any</em> Python expression and returns <em>any</em> datatype.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>eval('"A" + "B"')</kbd>
|
||||
<samp>'AB'</samp>
|
||||
<samp class=p>>>> </samp><Kbd>eval('"MARK".translate({65: 79})')</kbd>
|
||||
<samp>'MORK'</samp>
|
||||
<samp class=p>>>> </samp><kbd>eval('"AAAAA".count("A")')</kbd>
|
||||
<samp>5</samp>
|
||||
<samp class=p>>>> </samp><kbd>eval('["*"] * 5')</kbd>
|
||||
<samp>['*', '*', '*', '*', '*']</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('"A" + "B"')</kbd>
|
||||
<samp class=pp>'AB'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('"MARK".translate({65: 79})')</kbd>
|
||||
<samp class=pp>'MORK'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('"AAAAA".count("A")')</kbd>
|
||||
<samp class=pp>5</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval('["*"] * 5')</kbd>
|
||||
<samp class=pp>['*', '*', '*', '*', '*']</samp></pre>
|
||||
|
||||
<p>But wait, that’s not all!
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>x = 5</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("x * 5")</kbd> <span class=u>①</span></a>
|
||||
<samp>25</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("pow(x, 2)")</kbd> <span class=u>②</span></a>
|
||||
<samp>25</samp>
|
||||
<samp class=p>>>> </samp><kbd>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("math.sqrt(x)")</kbd> <span class=u>③</span></a>
|
||||
<samp>2.2360679774997898</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x = 5</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("x * 5")</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>25</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("pow(x, 2)")</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>25</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("math.sqrt(x)")</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>2.2360679774997898</samp></pre>
|
||||
<ol>
|
||||
<li>The expression that <code>eval()</code> takes can reference global variables defined outside the <code>eval()</code>. If called within a function, it can reference local variables too.
|
||||
<li>And functions.
|
||||
@@ -471,12 +471,12 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>Hey, wait a minute…
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import subprocess</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("subprocess.getoutput('ls ~')")</kbd> <span class=u>①</span></a>
|
||||
<samp>'Desktop Library Pictures \
|
||||
<samp class=p>>>> </samp><kbd class=pp>import subprocess</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("subprocess.getoutput('ls ~')")</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'Desktop Library Pictures \
|
||||
Documents Movies Public \
|
||||
Music Sites'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("subprocess.getoutput('rm -rf /')")</kbd> <span class=u>②</span></a></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("subprocess.getoutput('rm -rf /')")</kbd> <span class=u>②</span></a></pre>
|
||||
<ol>
|
||||
<li>The <code>subprocess</code> module allows you to run arbitrary shell commands and get the result as a Python string.
|
||||
<li>Don’t do this.
|
||||
@@ -485,7 +485,7 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>It’s even worse than that, because there’s a global <code>__import__()</code> function that takes a module name as a string, imports the module, and returns a reference to it. Combined with the power of <code>eval()</code>, you can construct a single expression that will wipe out all your files:
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("__import__('subprocess').getoutput('rm -rf /')")</kbd> <span class=u>①</span></a></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("__import__('subprocess').getoutput('rm -rf /')")</kbd> <span class=u>①</span></a></pre>
|
||||
<ol>
|
||||
<li>Don’t do this either.
|
||||
</ol>
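<p>If you want to see the same mechanism without risking your files, here is a harmless sketch. It assumes nothing beyond the standard library; the variable names are made up for illustration, and <code>math</code> stands in for <code>subprocess</code>.

```python
import math

# Same shape as the destructive expression, but with no side effects.
expression = "__import__('math').sqrt(16)"

# __import__() takes a module name as a string and returns the module object,
# so the import, the attribute lookup, and the call all happen inside one string.
result = eval(expression)
print(result)

# The object returned by __import__() is the very same module that "import math" binds.
same_module = eval("__import__('math')") is math
print(same_module)
```

<p>The string never mentions a name that was imported beforehand, which is exactly why an untrusted string can reach any module it likes.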
|
||||
@@ -497,15 +497,15 @@ for guess in itertools.permutations(digits, len(characters)):
|
||||
<p>But surely there’s <em>some</em> way to evaluate expressions safely? To put <code>eval()</code> in a sandbox where it can’t access or harm the outside world? Well, yeah, but it’s tricky.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>x = 5</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("x * 5", {}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x = 5</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("x * 5", {}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "<string>", line 1, in <module>
|
||||
NameError: name 'x' is not defined</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("x * 5", {"x": x}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("math.sqrt(x)", {"x": x}, {})</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("x * 5", {"x": x}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("math.sqrt(x)", {"x": x}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "<string>", line 1, in <module>
|
||||
@@ -519,10 +519,10 @@ NameError: name 'math' is not defined</samp></pre>
|
||||
<p>Gee, that was easy. Lemme make an alphametics web service now!
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("pow(5, 2)", {}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp>25</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("__import__('math').sqrt(5)", {}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp>2.2360679774997898</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("pow(5, 2)", {}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>25</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("__import__('math').sqrt(5)", {}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>2.2360679774997898</samp></pre>
|
||||
<ol>
|
||||
<li>Even though you’ve passed empty dictionaries for the global and local namespaces, all of Python’s built-in functions are still available during evaluation. So <code>pow(5, 2)</code> works, because <code>5</code> and <code>2</code> are literals, and <code>pow()</code> is a built-in function.
|
||||
<li>Unfortunately (and if you don’t see why it’s unfortunate, read on), the <code>__import__()</code> function is also a built-in function, so it works too.
|
||||
@@ -531,7 +531,7 @@ NameError: name 'math' is not defined</samp></pre>
|
||||
<p>Yeah, that means you can still do nasty things, even if you explicitly set the global and local namespaces to empty dictionaries when calling <code>eval()</code>:
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>eval("__import__('subprocess').getoutput('rm -rf /')", {}, {})</kbd> <span class=u>①</span></a></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>eval("__import__('subprocess').getoutput('rm -rf /')", {}, {})</kbd> <span class=u>①</span></a></pre>
|
||||
<ol>
|
||||
<li>Please don’t do this.
|
||||
</ol>
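<p>Why are the built-in functions still reachable when the namespaces are empty? A small sketch makes it visible. The name <var>empty_globals</var> is hypothetical; the behavior shown is what <code>eval()</code> does to a globals dictionary that has no <code>"__builtins__"</code> key.

```python
# Start with a globals dictionary that really is empty.
empty_globals = {}
print(len(empty_globals))

# Evaluate an expression that needs only a built-in function.
value = eval("pow(5, 2)", empty_globals, {})
print(value)

# eval() quietly added a "__builtins__" entry before evaluating,
# which is how pow() (and __import__()) were found.
has_builtins = "__builtins__" in empty_globals
print(has_builtins)
```

<p>In other words, the dictionary you passed in is no longer empty by the time your expression runs.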
|
||||
@@ -539,14 +539,14 @@ NameError: name 'math' is not defined</samp></pre>
|
||||
<p>Oops. I’m glad I didn’t make that alphametics web service. Is there <em>any</em> way to use <code>eval()</code> safely?
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>eval("__import__('math').sqrt(5)",</kbd>
|
||||
<a><samp class=p>... </samp><kbd> {"__builtins__":None}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval("__import__('math').sqrt(5)",</kbd>
|
||||
<a><samp class=p>... </samp><kbd class=pp> {"__builtins__":None}, {})</kbd> <span class=u>①</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "<string>", line 1, in <module>
|
||||
NameError: name '__import__' is not defined</samp>
|
||||
<samp class=p>>>> </samp><kbd>eval("__import__('subprocess').getoutput('rm -rf /')",</kbd>
|
||||
<a><samp class=p>... </samp><kbd> {"__builtins__":None}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>eval("__import__('subprocess').getoutput('rm -rf /')",</kbd>
|
||||
<a><samp class=p>... </samp><kbd class=pp> {"__builtins__":None}, {})</kbd> <span class=u>②</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "<string>", line 1, in <module>
|
||||
|
||||
@@ -795,23 +795,23 @@ TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp></pre>
|
||||
<aside>Each item in a string is a string. Each item in a byte array is an integer.</aside>
|
||||
<p>This error doesn’t occur the first time the <code>feed()</code> method gets called; it occurs the <em>second time</em>, after <var>self._mLastChar</var> has been set to the last byte of <var>aBuf</var>. Well, what’s the problem with that? Getting a single element from a byte array yields an integer, not a byte array. To see the difference, follow me to the interactive shell:
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>aBuf = b'\xEF\xBB\xBF'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>len(aBuf)</kbd>
|
||||
<samp>3</samp>
|
||||
<samp class=p>>>> </samp><kbd>mLastChar = aBuf[-1]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>mLastChar</kbd> <span class=u>②</span></a>
|
||||
<samp>191</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>type(mLastChar)</kbd> <span class=u>③</span></a>
|
||||
<samp><class 'int'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>mLastChar + aBuf</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>aBuf = b'\xEF\xBB\xBF'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(aBuf)</kbd>
|
||||
<samp class=pp>3</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>mLastChar = aBuf[-1]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>mLastChar</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>191</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>type(mLastChar)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><class 'int'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>mLastChar + aBuf</kbd> <span class=u>④</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: unsupported operand type(s) for +: 'int' and 'bytes'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>mLastChar = aBuf[-1:]</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd>mLastChar</kbd>
|
||||
<samp>b'\xbf'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>mLastChar + aBuf</kbd> <span class=u>⑥</span></a>
|
||||
<samp>b'\xbf\xef\xbb\xbf'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>mLastChar = aBuf[-1:]</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>mLastChar</kbd>
|
||||
<samp class=pp>b'\xbf'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>mLastChar + aBuf</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp>b'\xbf\xef\xbb\xbf'</samp></pre>
|
||||
<ol>
|
||||
<li>Define a byte array of length 3.
|
||||
<li>The last element of the byte array is 191.
|
||||
@@ -901,7 +901,7 @@ def feed(self, aBuf):
|
||||
<pre class=screen><samp class=p>C:\home\chardet> </samp><kbd>python test.py tests\*\*</kbd>
|
||||
<samp>tests\ascii\howto.diveintomark.org.xml ascii with confidence 1.0
|
||||
tests\Big5\0804.blogspot.com.xml</samp>
|
||||
<samp>Traceback (most recent call last):
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "test.py", line 10, in <module>
|
||||
u.feed(line)
|
||||
File "C:\home\chardet\chardet\universaldetector.py", line 116, in feed
|
||||
|
||||
@@ -183,10 +183,8 @@ pre a, .w a {
|
||||
.w a {
|
||||
text-decoration: underline;
|
||||
}
|
||||
kbd, mark {
|
||||
font-weight: bold;
|
||||
}
|
||||
mark {
|
||||
font-weight: bold;
|
||||
display: inline-block;
|
||||
width: 100%;
|
||||
background: #ff8;
|
||||
|
||||
+34
-34
@@ -56,15 +56,15 @@ def plural(noun):
|
||||
|
||||
<p>Let’s look at regular expression substitutions in more detail.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search('[abc]', 'Mark')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search('[abc]', 'Mark')</kbd> <span class=u>①</span></a>
|
||||
<_sre.SRE_Match object at 0x001C1FA8>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('[abc]', 'o', 'Mark')</kbd> <span class=u>②</span></a>
|
||||
<samp>'Mork'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('[abc]', 'o', 'rock')</kbd> <span class=u>③</span></a>
|
||||
<samp>'rook'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('[abc]', 'o', 'caps')</kbd> <span class=u>④</span></a>
|
||||
<samp>'oops'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('[abc]', 'o', 'Mark')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'Mork'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('[abc]', 'o', 'rock')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'rook'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('[abc]', 'o', 'caps')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'oops'</samp></pre>
|
||||
<ol>
|
||||
<li>Does the string <code>Mark</code> contain <code>a</code>, <code>b</code>, or <code>c</code>? Yes, it contains <code>a</code>.
|
||||
<li>OK, now find <code>a</code>, <code>b</code>, or <code>c</code>, and replace it with <code>o</code>. <code>Mark</code> becomes <code>Mork</code>.
|
||||
@@ -92,14 +92,14 @@ def plural(noun):
|
||||
<p>Let’s look at negation regular expressions in more detail.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search('[^aeiou]y$', 'vacancy')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search('[^aeiou]y$', 'vacancy')</kbd> <span class=u>①</span></a>
|
||||
<_sre.SRE_Match object at 0x001C1FA8>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search('[^aeiou]y$', 'boy')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search('[^aeiou]y$', 'boy')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp>
|
||||
<samp class=p>>>> </samp><kbd>re.search('[^aeiou]y$', 'day')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>re.search('[^aeiou]y$', 'day')</kbd>
|
||||
<samp class=p>>>> </samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search('[^aeiou]y$', 'pita')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search('[^aeiou]y$', 'pita')</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li><code>vacancy</code> matches this regular expression, because it ends in <code>cy</code>, and <code>c</code> is not <code>a</code>, <code>e</code>, <code>i</code>, <code>o</code>, or <code>u</code>.
|
||||
@@ -107,12 +107,12 @@ def plural(noun):
|
||||
<li><code>pita</code> does not match, because it does not end in <code>y</code>.
|
||||
</ol>
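<p>The same checks can be run in one pass. This is only a sketch: the word list and the variable names are assumptions chosen for illustration, and the pattern is the one from the example above.

```python
import re

# A final "y" preceded by any character that is not a, e, i, o, or u.
pattern = re.compile('[^aeiou]y$')

words = ['vacancy', 'agency', 'boy', 'day', 'pita']

# re.search() returns a match object (truthy) or None (falsy).
matches = [word for word in words if pattern.search(word)]
print(matches)
```

<p>Only the words that end in a consonant followed by <code>y</code> survive the filter; <code>boy</code> and <code>day</code> are rejected by the negated character class, and <code>pita</code> is rejected by the <code>y$</code>.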
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('y$', 'ies', 'vacancy')</kbd> <span class=u>①</span></a>
|
||||
<samp>'vacancies'</samp>
|
||||
<samp class=p>>>> </samp><kbd>re.sub('y$', 'ies', 'agency')</kbd>
|
||||
<samp>'agencies'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('([^aeiou])y$', r'\1ies', 'vacancy')</kbd> <span class=u>②</span></a>
|
||||
<samp>'vacancies'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('y$', 'ies', 'vacancy')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'vacancies'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>re.sub('y$', 'ies', 'agency')</kbd>
|
||||
<samp class=pp>'agencies'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('([^aeiou])y$', r'\1ies', 'vacancy')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'vacancies'</samp></pre>
|
||||
<ol>
|
||||
<li>This regular expression turns <code>vacancy</code> into <code>vacancies</code> and <code>agency</code> into <code>agencies</code>, which is what you wanted. Note that it would also turn <code>boy</code> into <code>boies</code>, but that will never happen in the function because you did that <code>re.search</code> first to find out whether you should do this <code>re.sub</code>.
|
||||
<li>Just in passing, I want to point out that it is possible to combine these two regular expressions (one to find out if the rule applies, and another to actually apply it) into a single regular expression. Here’s what that would look like. Most of it should look familiar: you’re using a remembered group, which you learned in <a href=regular-expressions.html#phonenumbers>Case study: Parsing Phone Numbers</a>. The group is used to remember the character before the letter <code>y</code>. Then in the substitution string, you use a new syntax, <code>\1</code>, which means “hey, that first group you remembered? put it right here.” In this case, you remember the <code>c</code> before the <code>y</code>; when you do the substitution, you substitute <code>c</code> in place of <code>c</code>, and <code>ies</code> in place of <code>y</code>. (If you have more than one remembered group, you can use <code>\2</code> and <code>\3</code> and so on.)
|
||||
@@ -313,23 +313,23 @@ def plural(noun):
|
||||
<p>How the heck does <em>that</em> work? Let’s look at an interactive example first.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>def make_counter(x):</kbd>
|
||||
<samp class=p>... </samp><kbd> print('entering make_counter')</kbd>
|
||||
<samp class=p>... </samp><kbd> while True:</kbd>
|
||||
<a><samp class=p>... </samp><kbd> yield x</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>... </samp><kbd> print('incrementing x')</kbd>
|
||||
<samp class=p>... </samp><kbd> x = x + 1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>def make_counter(x):</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> print('entering make_counter')</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> while True:</kbd>
|
||||
<a><samp class=p>... </samp><kbd class=pp> yield x</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>... </samp><kbd class=pp> print('incrementing x')</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> x = x + 1</kbd>
|
||||
<samp class=p>... </samp>
|
||||
<a><samp class=p>>>> </samp><kbd>counter = make_counter(2)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>counter</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>counter = make_counter(2)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>counter</kbd> <span class=u>③</span></a>
|
||||
<generator object at 0x001C9C10>
|
||||
<a><samp class=p>>>> </samp><kbd>next(counter)</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(counter)</kbd> <span class=u>④</span></a>
|
||||
<samp>entering make_counter
|
||||
2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>next(counter)</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(counter)</kbd> <span class=u>⑤</span></a>
|
||||
<samp>incrementing x
|
||||
3</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>next(counter)</kbd> <span class=u>⑥</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(counter)</kbd> <span class=u>⑥</span></a>
|
||||
<samp>incrementing x
|
||||
4</samp></pre>
|
||||
<ol>
|
||||
@@ -362,10 +362,10 @@ def plural(noun):
|
||||
<p>So you have a function that spits out successive Fibonacci numbers. Sure, you could do that with recursion, but this way is easier to read. Also, it works well with <code>for</code> loops.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from fibonacci import fib</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>for n in fib(1000):</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd> print(n, end=' ')</kbd> <span class=u>②</span></a>
|
||||
<samp>0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from fibonacci import fib</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>for n in fib(1000):</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd class=pp> print(n, end=' ')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987</samp></pre>
|
||||
<ol>
|
||||
<li>You can use a generator like <code>fib()</code> in a <code>for</code> loop directly. The <code>for</code> loop will automatically call the <code>next()</code> function to get values from the <code>fib()</code> generator and assign them to the <code>for</code> loop index variable (<var>n</var>).
|
||||
<li>Each time through the <code>for</code> loop, <var>n</var> gets a new value from the <code>yield</code> statement in <code>fib()</code>, and all you have to do is print it out. Once <code>fib()</code> runs out of numbers (<var>a</var> becomes bigger than <var>max</var>, which in this case is <code>1000</code>), then the <code>for</code> loop exits gracefully.
|
||||
|
||||
+134
-115
@@ -187,10 +187,10 @@ Cache-Control: max-age=31536000, public</samp></pre>
|
||||
|
||||
<p>Let’s say you want to download a resource over <abbr>HTTP</abbr>, such as <a href=xml.html>an Atom feed</a>. Being a feed, you’re not just going to download it once; you’re going to download it over and over again. (Most feed readers will check for changes once an hour.) Let’s do it the quick-and-dirty way first, and then see how you can do better.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import urllib.request</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>data = urllib.request.urlopen('http://diveintopython3.org/examples/feed.xml').read()</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>print(data)</kbd>
|
||||
<samp><?xml version='1.0' encoding='utf-8'?>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import urllib.request</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>data = urllib.request.urlopen('http://diveintopython3.org/examples/feed.xml').read()</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(data)</kbd>
|
||||
<samp class=pp><?xml version='1.0' encoding='utf-8'?>
|
||||
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
|
||||
<title>dive into mark</title>
|
||||
<subtitle>currently between addictions</subtitle>
|
||||
@@ -212,10 +212,10 @@ Cache-Control: max-age=31536000, public</samp></pre>
|
||||
<p>To see why this is inefficient and rude, let’s turn on the debugging features of Python’s <abbr>HTTP</abbr> library and see what’s being sent “on the wire.”
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from http.client import HTTPConnection</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>HTTPConnection.debuglevel = 1</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>from urllib.request import urlopen</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>response = urlopen('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from http.client import HTTPConnection</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>HTTPConnection.debuglevel = 1</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from urllib.request import urlopen</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response = urlopen('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<samp><a>send: b'GET /examples/feed.xml HTTP/1.1 <span class=u>③</span></a>
|
||||
<a>Host: diveintopython3.org <span class=u>④</span></a>
|
||||
<a>Accept-Encoding: identity <span class=u>⑤</span></a>
|
||||
@@ -236,7 +236,7 @@ reply: 'HTTP/1.1 200 OK'
|
||||
|
||||
<pre class=screen>
|
||||
# continued from previous example
|
||||
<a><samp class=p>>>> </samp><kbd>print(response.headers.as_string())</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(response.headers.as_string())</kbd> <span class=u>①</span></a>
|
||||
<samp><a>Date: Sun, 31 May 2009 19:23:06 GMT <span class=u>②</span></a>
|
||||
Server: Apache
|
||||
<a>Last-Modified: Sun, 31 May 2009 06:39:55 GMT <span class=u>③</span></a>
|
||||
@@ -248,9 +248,9 @@ Expires: Mon, 01 Jun 2009 19:23:06 GMT
|
||||
Vary: Accept-Encoding
|
||||
Connection: close
|
||||
Content-Type: application/xml</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>data = response.read()</kbd> <span class=u>⑦</span></a>
|
||||
<samp class=p>>>> </samp><kbd>len(data)</kbd>
|
||||
<samp>3070</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>data = response.read()</kbd> <span class=u>⑦</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(data)</kbd>
|
||||
<samp class=pp>3070</samp></pre>
|
||||
<ol>
|
||||
<li>The <var>response</var> returned from the <code>urllib.request.urlopen()</code> function contains all the <abbr>HTTP</abbr> headers the server sent back. It also contains methods to download the actual data; we’ll get to that in a minute.
|
||||
<li>The server tells you when it handled your request.
|
||||
@@ -267,7 +267,7 @@ Content-Type: application/xml</samp>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the <a href=#whats-on-the-wire>previous example</a>
|
||||
<samp class=p>>>> </samp><kbd>response2 = urlopen('http://diveintopython3.org/examples/feed.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response2 = urlopen('http://diveintopython3.org/examples/feed.xml')</kbd>
|
||||
<samp>send: b'GET /examples/feed.xml HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
Accept-Encoding: identity
|
||||
@@ -282,7 +282,7 @@ reply: 'HTTP/1.1 200 OK'
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>print(response2.headers.as_string())</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(response2.headers.as_string())</kbd> <span class=u>①</span></a>
|
||||
<samp>Date: Mon, 01 Jun 2009 03:58:00 GMT
|
||||
Server: Apache
|
||||
Last-Modified: Sun, 31 May 2009 22:51:11 GMT
|
||||
@@ -294,11 +294,11 @@ Expires: Tue, 02 Jun 2009 03:58:00 GMT
|
||||
Vary: Accept-Encoding
|
||||
Connection: close
|
||||
Content-Type: application/xml</samp>
|
||||
<samp class=p>>>> </samp><kbd>data2 = response2.read()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>len(data2)</kbd> <span class=u>②</span></a>
|
||||
<samp>3070</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>data2 == data</kbd> <span class=u>③</span></a>
|
||||
<samp>True</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>data2 = response2.read()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(data2)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>3070</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>data2 == data</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>True</samp></pre>
|
||||
<ol>
|
||||
<li>The server is still sending the same array of “smart” headers: <code>Cache-Control</code> and <code>Expires</code> to allow caching, <code>Last-Modified</code> and <code>ETag</code> to enable “not-modified” tracking. Even the <code>Vary: Accept-Encoding</code> header hints that the server would support compression, if only you would ask for it. But you didn’t.
|
||||
<li>Once again, fetching this data downloads the whole 3070 bytes…
|
||||
@@ -314,15 +314,15 @@ Content-Type: application/xml</samp>
|
||||
<p>To use <code>httplib2</code>, create an instance of the <code>httplib2.Http</code> class.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import httplib2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>h = httplib2.Http('.cache')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>response.status</kbd> <span class=u>③</span></a>
|
||||
<samp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>content[:52]</kbd> <span class=u>④</span></a>
|
||||
<samp>b"<?xml version='1.0' encoding='utf-8'?>\r\n<feed xmlns="</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(content)</kbd>
|
||||
<samp>3070</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import httplib2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>h = httplib2.Http('.cache')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.status</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>content[:52]</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>b"<?xml version='1.0' encoding='utf-8'?>\r\n<feed xmlns="</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(content)</kbd>
|
||||
<samp class=pp>3070</samp></pre>
|
||||
<ol>
|
||||
<li>The primary interface to <code>httplib2</code> is the <code>Http</code> object. For reasons you’ll see in the next section, you should always pass a directory name when you create an <code>Http</code> object. The directory does not need to exist; <code>httplib2</code> will create it if necessary.
|
||||
<li>Once you have an <code>Http</code> object, retrieving data is as simple as calling the <code>request()</code> method with the address of the data you want. This will issue an <abbr>HTTP</abbr> <code>GET</code> request for that <abbr>URL</abbr>. (Later in this chapter, you’ll see how to issue other <abbr>HTTP</abbr> requests, like <code>POST</code>.)
|
||||
@@ -340,13 +340,13 @@ Content-Type: application/xml</samp>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the <a href=#introducing-httplib2>previous example</a>
|
||||
<a><samp class=p>>>> </samp><kbd>response2, content2 = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>response2.status</kbd> <span class=u>②</span></a>
|
||||
<samp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>content2[:52]</kbd> <span class=u>③</span></a>
|
||||
<samp>b"<?xml version='1.0' encoding='utf-8'?>\r\n<feed xmlns="</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(content2)</kbd>
|
||||
<samp>3070</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response2, content2 = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response2.status</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>content2[:52]</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>b"<?xml version='1.0' encoding='utf-8'?>\r\n<feed xmlns="</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(content2)</kbd>
|
||||
<samp class=pp>3070</samp></pre>
|
||||
<ol>
|
||||
<li>This shouldn’t be terribly surprising. It’s the same thing you did last time, except you’re putting the result into two new variables.
|
||||
<li>The <abbr>HTTP</abbr> <code>status</code> is once again <code>200</code>, just like last time.
|
||||
@@ -359,16 +359,16 @@ Content-Type: application/xml</samp>
|
||||
# NOT continued from previous example!
|
||||
# Please exit out of the interactive shell
|
||||
# and launch a new one.
|
||||
<samp class=p>>>> </samp><kbd>import httplib2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>httplib2.debuglevel = 1</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>h = httplib2.Http('.cache')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>len(content)</kbd> <span class=u>④</span></a>
|
||||
<samp>3070</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response.status</kbd> <span class=u>⑤</span></a>
|
||||
<samp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response.fromcache</kbd> <span class=u>⑥</span></a>
|
||||
<samp>True</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import httplib2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>httplib2.debuglevel = 1</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>h = httplib2.Http('.cache')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/examples/feed.xml')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(content)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>3070</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.status</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.fromcache</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp>True</samp></pre>
|
||||
<ol>
|
||||
<li>Let’s turn on debugging and see <a href=#whats-on-the-wire>what’s on the wire</a>. This is the <code>httplib2</code> equivalent of turning on debugging in <code>http.client</code>. <code>httplib2</code> will print all the data being sent to the server and some key information being sent back.
|
||||
<li>Create an <code>httplib2.Http</code> object with the same directory name as before.
|
||||
@@ -388,8 +388,8 @@ Content-Type: application/xml</samp>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<samp class=p>>>> </samp><kbd>response2, content2 = h.request('http://diveintopython3.org/examples/feed.xml',</kbd>
|
||||
<a><samp class=p>... </samp><kbd> headers={'cache-control':'no-cache'})</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response2, content2 = h.request('http://diveintopython3.org/examples/feed.xml',</kbd>
|
||||
<a><samp class=p>... </samp><kbd class=pp> headers={'cache-control':'no-cache'})</kbd> <span class=u>①</span></a>
|
||||
<samp><a>connect: (diveintopython3.org, 80) <span class=u>②</span></a>
|
||||
send: b'GET /examples/feed.xml HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
@@ -398,12 +398,12 @@ accept-encoding: deflate, gzip
|
||||
cache-control: no-cache'
|
||||
reply: 'HTTP/1.1 200 OK'
|
||||
…further debugging information omitted…</samp>
|
||||
<samp class=p>>>> </samp><kbd>response2.status</kbd>
|
||||
<samp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response2.fromcache</kbd> <span class=u>③</span></a>
|
||||
<samp>False</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(dict(response2.items()))</kbd> <span class=u>④</span></a>
|
||||
<samp>{'status': '200',
|
||||
<samp class=p>>>> </samp><kbd class=pp>response2.status</kbd>
|
||||
<samp class=pp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response2.fromcache</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>False</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(dict(response2.items()))</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>{'status': '200',
|
||||
'content-length': '3070',
|
||||
'content-location': 'http://diveintopython3.org/examples/feed.xml',
|
||||
'accept-ranges': 'bytes',
|
||||
@@ -431,18 +431,18 @@ reply: 'HTTP/1.1 200 OK'
|
||||
<p>But what about the case where the data <em>might</em> have changed, but hasn’t? <abbr>HTTP</abbr> defines <a href=#last-modified><code>Last-Modified</code></a> and <a href=#etags><code>ETag</code></a> headers for this purpose. These headers are called <i>validators</i>. If the local cache is no longer fresh, a client can send the validators with the next request to see if the data has actually changed. If the data hasn’t changed, the server sends back a <code>304</code> status code <em>and no data</em>. So there’s still a round-trip over the network, but you end up downloading fewer bytes.
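<p>The validator round-trip can be sketched without touching the network. This is a minimal illustration, not how <code>httplib2</code> is implemented: the function names <code>conditional_headers</code> and <code>choose_body</code> are hypothetical, and the cached headers are assumed to be a plain dictionary with lowercase keys.

```python
def conditional_headers(cached_headers):
    """Build the validator headers a client could send to revalidate a cached copy.

    cached_headers is assumed to be a dict of lowercase header names, such as
    the headers saved alongside an earlier response.
    """
    headers = {}
    etag = cached_headers.get('etag')
    if etag:
        # The server compares this value against the resource's current ETag.
        headers['If-None-Match'] = etag
    last_modified = cached_headers.get('last-modified')
    if last_modified:
        # The server compares this date against the resource's modification time.
        headers['If-Modified-Since'] = last_modified
    return headers


def choose_body(status, cached_body, new_body):
    """On a 304 the server sent no data, so reuse the cached body."""
    return cached_body if status == 304 else new_body
```

<p>A <code>304</code> reply carries no body, so the client falls back to the copy it already has; any other status means the server sent fresh data.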
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import httplib2</kbd>
|
||||
<samp class=p>>>> </samp><kbd>httplib2.debuglevel = 1</kbd>
|
||||
<samp class=p>>>> </samp><kbd>h = httplib2.Http('.cache')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import httplib2</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>httplib2.debuglevel = 1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>h = httplib2.Http('.cache')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/')</kbd> <span class=u>①</span></a>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET / HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
reply: 'HTTP/1.1 200 OK'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(dict(response.items()))</kbd> <span class=u>②</span></a>
|
||||
<samp>{'-content-encoding': 'gzip',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(dict(response.items()))</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>{'-content-encoding': 'gzip',
|
||||
'accept-ranges': 'bytes',
|
||||
'connection': 'close',
|
||||
'content-length': '6657',
|
||||
@@ -454,8 +454,8 @@ reply: 'HTTP/1.1 200 OK'</samp>
|
||||
'server': 'Apache',
|
||||
'status': '304',
|
||||
'vary': 'Accept-Encoding,User-Agent'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(content)</kbd> <span class=u>③</span></a>
|
||||
<samp>6657</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(content)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>6657</samp></pre>
|
||||
<ol>
|
||||
<li>Instead of the feed, this time we’re going to download the site’s home page, which is <abbr>HTML</abbr>. Since this is the first time you’ve ever requested this page, <code>httplib2</code> has little to work with, and it sends out a minimum of headers with the request.
|
||||
<li>The response contains a multitude of <abbr>HTTP</abbr> headers… but no caching information. However, it does include both an <code>ETag</code> and <code>Last-Modified</code> header.
|
||||
@@ -464,7 +464,7 @@ reply: 'HTTP/1.1 200 OK'</samp>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/')</kbd> <span class=u>①</span></a>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET / HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
@@ -473,14 +473,14 @@ Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
<a>reply: 'HTTP/1.1 304 Not Modified' <span class=u>④</span></a></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response.fromcache</kbd> <span class=u>⑤</span></a>
|
||||
<samp>True</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response.status</kbd> <span class=u>⑥</span></a>
|
||||
<samp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>response.dict['status']</kbd> <span class=u>⑦</span></a>
|
||||
<samp>'304'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(content)</kbd> <span class=u>⑧</span></a>
|
||||
<samp>6657</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.fromcache</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>True</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.status</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp>200</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response.dict['status']</kbd> <span class=u>⑦</span></a>
|
||||
<samp class=pp>'304'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(content)</kbd> <span class=u>⑧</span></a>
|
||||
<samp class=pp>6657</samp></pre>
|
||||
<ol>
|
||||
<li>You request the same page again, with the same <code>Http</code> object (and the same local cache).
|
||||
<li><code>httplib2</code> sends the <code>ETag</code> validator back to the server in the <code>If-None-Match</code> header.
|
||||
@@ -497,15 +497,15 @@ user-agent: Python-httplib2/$Rev: 259 $'
|
||||
<p><abbr>HTTP</abbr> supports <a href=#compression>two types of compression</a>. <code>httplib2</code> supports both of them.
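<p>As a rough sketch of what decompressing a response body involves, the standard library can handle both encodings. The helper name <code>decode_body</code> is hypothetical, and this assumes the <code>deflate</code> payload uses the zlib wrapper; it is not a description of <code>httplib2</code>’s internals.

```python
import gzip
import zlib


def decode_body(body, content_encoding):
    """Return the decompressed bytes for a body with the given Content-Encoding."""
    if content_encoding == 'gzip':
        return gzip.decompress(body)
    if content_encoding == 'deflate':
        # Assumes zlib-wrapped deflate data; some servers send raw deflate instead.
        return zlib.decompress(body)
    # No encoding, or one this sketch does not handle: return the bytes unchanged.
    return body
```

<p>The client advertises what it can decode in the <code>accept-encoding</code> request header, and the server names the encoding it actually used in the response.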
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/')</kbd>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET / HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
<a>accept-encoding: deflate, gzip <span class=u>①</span></a>
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
reply: 'HTTP/1.1 200 OK'</samp>
|
||||
<samp class=p>>>> </samp><kbd>print(dict(response.items()))</kbd>
|
||||
<samp><a>{'-content-encoding': 'gzip', <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(dict(response.items()))</kbd>
|
||||
<samp class=pp><a>{'-content-encoding': 'gzip', <span class=u>②</span></a>
|
||||
'accept-ranges': 'bytes',
|
||||
'connection': 'close',
|
||||
'content-length': '6657',
|
||||
@@ -524,57 +524,76 @@ reply: 'HTTP/1.1 200 OK'</samp>
|
||||
|
||||
<h3 id=httplib2-redirects>How <code>httplib2</code> Handles Redirects</h3>
|
||||
|
||||
<p>FIXME
|
||||
<p><abbr>HTTP</abbr> defines <a href=#redirects>two kinds of redirects</a>: temporary and permanent. There’s nothing special to do with temporary redirects except follow them, which <code>httplib2</code> does automatically.
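<p>The difference between the two kinds of redirect can be sketched with a toy lookup table standing in for the server. Everything here is hypothetical (the <code>follow</code> function, the <code>responses</code> table, and the <code>permanent_map</code> dictionary); it only illustrates that a client may remember a permanent redirect but has to re-check a temporary one.

```python
def follow(url, responses, permanent_map, max_hops=5):
    """Follow redirects through a fake server table.

    responses maps a URL to a (status, location) pair.
    permanent_map records permanent (301) redirects seen so far, so later
    requests can skip straight to the new address.
    """
    for _ in range(max_hops):
        # Apply any permanent redirect remembered from an earlier request.
        url = permanent_map.get(url, url)
        status, location = responses[url]
        if status == 301:
            permanent_map[url] = location  # permanent: remember it
            url = location
        elif status == 302:
            url = location  # temporary: follow it, but remember nothing
        else:
            return url, status
    raise RuntimeError('too many redirects')
```

<p>With this table-driven stand-in, requesting a temporarily redirected address twice asks the old address both times, while a permanently redirected address is only asked once.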
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')</kbd> <span class=u>①</span></a>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET /examples/feed-302.xml HTTP/1.1
|
||||
<a>send: b'GET /examples/feed-302.xml HTTP/1.1 <span class=u>②</span></a>
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
<mark>reply: 'HTTP/1.1 302 Found'</mark>
|
||||
<mark>send: b'GET /examples/feed.xml HTTP/1.1</mark>
|
||||
<a>reply: 'HTTP/1.1 302 Found' <span class=u>③</span></a>
|
||||
<a>send: b'GET /examples/feed.xml HTTP/1.1 <span class=u>④</span></a>
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
reply: 'HTTP/1.1 200 OK'</samp>
|
||||
<samp class=p>>>> </samp><kbd>print(dict(response.items()))</kbd>
|
||||
<samp>{'status': '200',
|
||||
reply: 'HTTP/1.1 200 OK'</samp></pre>
|
||||
<ol>
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
</ol>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(dict(response.items()))</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>{'status': '200',
|
||||
'content-length': '3070',
|
||||
<mark> 'content-location': 'http://diveintopython3.org/examples/feed.xml',</mark>
|
||||
<a> 'content-location': 'http://diveintopython3.org/examples/feed.xml', <span class=u>②</span></a>
|
||||
'accept-ranges': 'bytes',
|
||||
'expires': 'Thu, 04 Jun 2009 02:21:41 GMT',
|
||||
'vary': 'Accept-Encoding',
|
||||
'server': 'Apache',
|
||||
'last-modified': 'Wed, 03 Jun 2009 02:20:15 GMT',
|
||||
'connection': 'close',
|
||||
'-content-encoding': 'gzip',
|
||||
<a> '-content-encoding': 'gzip', <span class=u>③</span></a>
|
||||
'etag': '"bfe-4cbbf5c0"',
|
||||
'cache-control': 'max-age=86400',
|
||||
'date': 'Wed, 03 Jun 2009 02:21:41 GMT',
|
||||
'content-type': 'application/xml'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')</kbd>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET /examples/feed-302.xml HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
reply: 'HTTP/1.1 302 Found'</samp></pre>
|
||||
'content-type': 'application/xml'}</samp></pre>
|
||||
<ol>
|
||||
<li>FIXME
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
</ol>
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>response, content = h.request('http://diveintopython3.org/examples/feed-301.xml')</kbd>
|
||||
# continued from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')</kbd> <span class=u>①</span></a>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
<a>send: b'GET /examples/feed-302.xml HTTP/1.1 <span class=u>②</span></a>
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
<a>reply: 'HTTP/1.1 302 Found' <span class=u>③</span></a></samp></pre>
|
||||
<ol>
|
||||
<li>
|
||||
<li>
|
||||
<li>
|
||||
</ol>
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response, content = h.request('http://diveintopython3.org/examples/feed-301.xml')</kbd>
|
||||
<samp>connect: (diveintopython3.org, 80)
|
||||
send: b'GET /examples/feed-301.xml HTTP/1.1
|
||||
Host: diveintopython3.org
|
||||
accept-encoding: deflate, gzip
|
||||
user-agent: Python-httplib2/$Rev: 259 $'
|
||||
reply: 'HTTP/1.1 301 Moved Permanently'</samp>
|
||||
<samp class=p>>>> </samp><kbd>print(dict(response.items()))</kbd>
|
||||
<samp>{'status': '200',
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(dict(response.items()))</kbd>
|
||||
<samp class=pp>{'status': '200',
|
||||
'content-length': '3070',
|
||||
'content-location': 'http://diveintopython3.org/examples/feed.xml',
|
||||
'accept-ranges': 'bytes',
|
||||
@@ -588,9 +607,9 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
|
||||
'cache-control': 'max-age=86400',
|
||||
'date': 'Wed, 03 Jun 2009 02:21:41 GMT',
|
||||
'content-type': 'application/xml'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>response2, content2 = h.request('http://diveintopython3.org/examples/feed-301.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>response2.fromcache</kbd>
|
||||
<samp>True</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response2, content2 = h.request('http://diveintopython3.org/examples/feed-301.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>response2.fromcache</kbd>
|
||||
<samp class=pp>True</samp></pre>
|
||||
<ol>
|
||||
<li>FIXME
|
||||
</ol>
|
||||
@@ -602,18 +621,18 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
|
||||
<p>FIXME
|
||||
|
||||
<pre>
|
||||
<samp class=p>>>> </samp><kbd>import httplib2</kbd>
|
||||
<samp class=p>>>> </samp><kbd>from urllib.parse import urlencode</kbd>
|
||||
<samp class=p>>>> </samp><kbd>h = httplib2.Http('.cache')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>data = {'status': 'Test update from Python 3'}</kbd>
|
||||
<samp class=p>>>> </samp><kbd>h.add_credentials('diveintomark', '<var>MY_SECRET_PASSWORD</var>')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>resp, content = h.request('http://twitter.com/statuses/update.xml', 'POST', urlencode(data))</kbd>
|
||||
<samp class=p>>>> </samp><kbd>resp.status</kbd>
|
||||
<samp>200</samp>
|
||||
<samp class=p>>>> </samp><kbd>from xml.etree import ElementTree as etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd>tree = etree.fromstring(content)</kbd>
|
||||
<samp class=p>>>> </samp><kbd>print(etree.tostring(tree))</kbd>
|
||||
<samp><status>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import httplib2</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from urllib.parse import urlencode</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>h = httplib2.Http('.cache')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>data = {'status': 'Test update from Python 3'}</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>h.add_credentials('diveintomark', '<var>MY_SECRET_PASSWORD</var>')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>resp, content = h.request('http://twitter.com/statuses/update.xml', 'POST', urlencode(data))</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>resp.status</kbd>
|
||||
<samp class=pp>200</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from xml.etree import ElementTree as etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree = etree.fromstring(content)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(etree.tostring(tree))</kbd>
|
||||
<samp class=pp><status>
|
||||
<created_at>Sat May 30 19:11:38 +0000 2009</created_at>
|
||||
<id>1973974228</id>
|
||||
<text>Test update from Python 3</text>
|
||||
@@ -662,11 +681,11 @@ reply: 'HTTP/1.1 301 Moved Permanently'</samp>
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<samp class=p>>>> </samp><kbd>tree.findtext('id')</kbd>
|
||||
<samp>'1973974228'</samp>
|
||||
<samp class=p>>>> </samp><kbd>resp, delete_content = h.request('http://twitter.com/statuses/destroy/{0}.xml'.format(tree.findtext('id')), 'DELETE')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>resp.status</kbd>
|
||||
<samp>200</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree.findtext('id')</kbd>
|
||||
<samp class=pp>'1973974228'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>resp, delete_content = h.request('http://twitter.com/statuses/destroy/{0}.xml'.format(tree.findtext('id')), 'DELETE')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>resp.status</kbd>
|
||||
<samp class=pp>200</samp></pre>
|
||||
|
||||
<p class=a>⁂
|
||||
|
||||
|
||||
+33
-33
@@ -95,14 +95,14 @@ body{counter-reset:h1 6}
|
||||
|
||||
<p>Instantiating classes in Python is straightforward. To instantiate a class, simply call the class as if it were a function, passing the arguments that the <code>__init__()</code> method requires. The return value will be the newly created object.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import fibonacci2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>fib = fibonacci2.Fib(100)</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>fib</kbd> <span class=u>②</span></a>
|
||||
<samp><fibonacci2.Fib object at 0x00DB8810></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>fib.__class__</kbd> <span class=u>③</span></a>
|
||||
<samp><class 'fibonacci2.Fib'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>fib.__doc__</kbd> <span class=u>④</span></a>
|
||||
<samp>'iterator that yields numbers in the Fibonacci sequence'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import fibonacci2</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>fib = fibonacci2.Fib(100)</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>fib</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><fibonacci2.Fib object at 0x00DB8810></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>fib.__class__</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><class 'fibonacci2.Fib'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>fib.__doc__</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'iterator that yields numbers in the Fibonacci sequence'</samp></pre>
|
||||
<ol>
|
||||
<li>You are creating an instance of the <code>Fib</code> class (defined in the <code>fibonacci2</code> module) and assigning the newly created instance to the variable <var>fib</var>. You are passing one parameter, <code>100</code>, which will end up as the <var>max</var> argument in <code>Fib</code>’s <code>__init__()</code> method.
|
||||
<li><var>fib</var> is now an instance of the <code>Fib</code> class.
|
||||
@@ -144,13 +144,13 @@ body{counter-reset:h1 6}
|
||||
<p>Instance variables are specific to one instance of a class. For example, if you create two <code>Fib</code> instances with different maximum values, they will each remember their own values.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import fibonacci2</kbd>
|
||||
<samp class=p>>>> </samp><kbd>fib1 = fibonacci2.Fib(100)</kbd>
|
||||
<samp class=p>>>> </samp><kbd>fib2 = fibonacci2.Fib(200)</kbd>
|
||||
<samp class=p>>>> </samp><kbd>fib1.max</kbd>
|
||||
<samp>100</samp>
|
||||
<samp class=p>>>> </samp><kbd>fib2.max</kbd>
|
||||
<samp>200</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import fibonacci2</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>fib1 = fibonacci2.Fib(100)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>fib2 = fibonacci2.Fib(200)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>fib1.max</kbd>
|
||||
<samp class=pp>100</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>fib2.max</kbd>
|
||||
<samp class=pp>200</samp></pre>
|
||||
|
||||
<p class=a>⁂
|
||||
|
||||
@@ -185,10 +185,10 @@ body{counter-reset:h1 6}
|
||||
<p>Thoroughly confused yet? Excellent. Let’s see how to call this iterator:
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from fibonacci2 import Fib</kbd>
|
||||
<samp class=p>>>> </samp><kbd>for n in Fib(1000):</kbd>
|
||||
<samp class=p>... </samp><kbd> print(n, end=' ')</kbd>
|
||||
<samp>0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from fibonacci2 import Fib</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>for n in Fib(1000):</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> print(n, end=' ')</kbd>
|
||||
<samp class=pp>0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987</samp></pre>
|
||||
|
||||
<p>Why, it’s exactly the same! Byte for byte identical to how you called <a href=generators.html#a-fibonacci-generator>Fibonacci-as-a-generator</a> (modulo one capital letter). But how?
|
||||
|
||||
@@ -260,20 +260,20 @@ rules = LazyRules()</code></pre>
|
||||
<p>Before we continue, let’s take a closer look at <var>rules_filename</var>. It’s not defined within the <code>__init__()</code> method. In fact, it’s not defined within <em>any</em> method. It’s defined at the class level. It’s a <i>class variable</i>, and although you can access it just like an instance variable (<var>self.rules_filename</var>), it is shared across all instances of the <code>LazyRules</code> class.
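<p>The sharing behavior can be reproduced with a stripped-down stand-in. The class below is an assumption for illustration: it keeps only the <var>rules_filename</var> class variable and none of the real <code>LazyRules</code> methods.

```python
class LazyRules:
    # Defined at the class level, so every instance shares it.
    rules_filename = 'plural6-rules.txt'


r1 = LazyRules()
r2 = LazyRules()

# Changing the attribute on the class changes what every instance sees.
LazyRules.rules_filename = 'papayawhip.txt'
shared_values = (r1.rules_filename, r2.rules_filename)

# Assigning through an instance creates an instance attribute on r1 only;
# the class variable and r2 are unaffected.
r1.rules_filename = 'override.txt'
after_override = (r1.rules_filename, r2.rules_filename, LazyRules.rules_filename)
```

<p>The last assignment is the usual pitfall: it does not modify the shared value, it shadows it on one instance.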
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import plural6</kbd>
|
||||
<samp class=p>>>> </samp><kbd>r1 = plural6.LazyRules()</kbd>
|
||||
<samp class=p>>>> </samp><kbd>r2 = plural6.LazyRules()</kbd>
|
||||
<samp class=p>>>> </samp><kbd>r1.rules_filename</kbd> <span class=u>①</span>
|
||||
<samp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd>r2.rules_filename</kbd>
|
||||
<samp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd>r1.__class__.rules_filename</kbd> <span class=u>②</span>
|
||||
<samp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd>r1.__class__.rules_filename = 'papayawhip.txt'</kbd> <span class=u>③</span>
|
||||
<samp class=p>>>> </samp><kbd>r1.rules_filename</kbd>
|
||||
<samp>'papayawhip.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd>r2.rules_filename</kbd> <span class=u>④</span>
|
||||
<samp>'papayawhip.txt'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import plural6</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r1 = plural6.LazyRules()</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r2 = plural6.LazyRules()</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r1.rules_filename</kbd> <span class=u>①</span>
|
||||
<samp class=pp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r2.rules_filename</kbd>
|
||||
<samp class=pp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r1.__class__.rules_filename</kbd> <span class=u>②</span>
|
||||
<samp class=pp>'plural6-rules.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r1.__class__.rules_filename = 'papayawhip.txt'</kbd> <span class=u>③</span>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r1.rules_filename</kbd>
|
||||
<samp class=pp>'papayawhip.txt'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>r2.rules_filename</kbd> <span class=u>④</span>
|
||||
<samp class=pp>'papayawhip.txt'</samp></pre>
|
||||
<ol>
|
||||
<li>FIXME
|
||||
<li>
|
||||
|
||||
+6
-40
@@ -14,7 +14,8 @@
|
||||
//
|
||||
// Changes from upstream:
|
||||
// - use class=pp instead of class=prettyprint to declare blocks-to-colorize
|
||||
|
||||
// - removed support for <xmp>
|
||||
// - added support for <kbd> and <samp>
|
||||
|
||||
/**
|
||||
* @fileoverview
|
||||
@@ -35,9 +36,6 @@
|
||||
* <script type="text/javascript" src="/path/to/prettify.js"></script>
|
||||
* 2) define style rules. See the example page for examples.
|
||||
* 3) mark the <pre> and <code> tags in your source with class=pp.
|
||||
* You can also use the (html deprecated) <xmp> tag, but the pretty printer
|
||||
* needs to do more substantial DOM manipulations to support that, so some
|
||||
* css styles may not be preserved.
|
||||
* That's it. I wanted to keep the API as simple as possible, so there's no
|
||||
* need to specify which language the code is in.
|
||||
*
|
||||
@@ -271,11 +269,6 @@ window['_pr_isIE6'] = function () {
|
||||
.replace(pr_nbspEnt, ' ');
|
||||
}
|
||||
|
||||
/** is the given node's innerHTML normally unescaped? */
|
||||
function isRawContent(node) {
|
||||
return 'XMP' === node.tagName;
|
||||
}
|
||||
|
||||
function normalizedHtml(node, out) {
|
||||
switch (node.nodeType) {
|
||||
case 1: // an element
|
||||
@@ -548,10 +541,6 @@ window['_pr_isIE6'] = function () {
|
||||
|
||||
if (PR_innerHtmlWorks) {
|
||||
var content = node.innerHTML;
|
||||
// XMP tags contain unescaped entities so require special handling.
|
||||
if (isRawContent(node)) {
|
||||
content = textToHtml(content);
|
||||
}
|
||||
return content;
|
||||
}
|
||||
|
||||
@@ -1283,7 +1272,8 @@ window['_pr_isIE6'] = function () {
|
||||
var codeSegments = [
|
||||
document.getElementsByTagName('pre'),
|
||||
document.getElementsByTagName('code'),
|
||||
document.getElementsByTagName('xmp') ];
|
||||
document.getElementsByTagName('kbd'),
|
||||
document.getElementsByTagName('samp') ];
|
||||
var elements = [];
|
||||
for (var i = 0; i < codeSegments.length; ++i) {
|
||||
for (var j = 0, n = codeSegments[i].length; j < n; ++j) {
|
||||
@@ -1321,7 +1311,7 @@ window['_pr_isIE6'] = function () {
|
||||
var nested = false;
|
||||
for (var p = cs.parentNode; p; p = p.parentNode) {
|
||||
if ((p.tagName === 'pre' || p.tagName === 'code' ||
|
||||
p.tagName === 'xmp') &&
|
||||
p.tagName === 'kbd' || p.tagName === 'samp') &&
|
||||
p.className && p.className.indexOf('pp') >= 0) {
|
||||
nested = true;
|
||||
break;
|
||||
@@ -1358,31 +1348,7 @@ window['_pr_isIE6'] = function () {
|
||||
var cs = prettyPrintingJob.sourceNode;
|
||||
|
||||
// push the prettified html back into the tag.
|
||||
if (!isRawContent(cs)) {
|
||||
// just replace the old html with the new
|
||||
cs.innerHTML = newContent;
|
||||
} else {
|
||||
// we need to change the tag to a <pre> since <xmp>s do not allow
|
||||
// embedded tags such as the span tags used to attach styles to
|
||||
// sections of source code.
|
||||
var pre = document.createElement('PRE');
|
||||
for (var i = 0; i < cs.attributes.length; ++i) {
|
||||
var a = cs.attributes[i];
|
||||
if (a.specified) {
|
||||
var aname = a.name.toLowerCase();
|
||||
if (aname === 'class') {
|
||||
pre.className = a.value; // For IE 6
|
||||
} else {
|
||||
pre.setAttribute(a.name, a.value);
|
||||
}
|
||||
}
|
||||
}
|
||||
pre.innerHTML = newContent;
|
||||
|
||||
// remove the old
|
||||
cs.parentNode.replaceChild(pre, cs);
|
||||
cs = pre;
|
||||
}
|
||||
cs.innerHTML = newContent;
|
||||
|
||||
// Replace <br>s with line-feeds so that copying and pasting works
|
||||
// on IE 6.
|
||||
|
||||
+198
-198
@@ -43,28 +43,28 @@ body{counter-reset:h1 2}
|
||||
raise ValueError('number must be non-negative')</code></pre>
|
||||
<p><var>size</var> is an integer, <code>0</code> is an integer, and <code><</code> is a numerical operator. The result of the expression <code>size < 0</code> is always a boolean. You can test this yourself in the Python interactive shell:
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>size = 1</kbd>
|
||||
<samp class=p>>>> </samp><kbd>size < 0</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>size = 0</kbd>
|
||||
<samp class=p>>>> </samp><kbd>size < 0</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>size = -1</kbd>
|
||||
<samp class=p>>>> </samp><kbd>size < 0</kbd>
|
||||
<samp>True</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size = 1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size < 0</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size = 0</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size < 0</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size = -1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>size < 0</kbd>
|
||||
<samp class=pp>True</samp></pre>
|
||||
<p class=a>⁂
|
||||
|
||||
<h2 id=numbers>Numbers</h2>
|
||||
<p>Numbers are awesome. There are so many to choose from. Python supports both integers and floating point numbers. There’s no type declaration to distinguish them; Python tells them apart by the presence or absence of a decimal point.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>type(1)</kbd> <span class=u>①</span></a>
|
||||
<samp><class 'int'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>1 + 1</kbd> <span class=u>②</span></a>
|
||||
<samp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>1 + 1.0</kbd> <span class=u>③</span></a>
|
||||
<samp>2.0</samp>
|
||||
<samp class=p>>>> </samp><kbd>type(2.0)</kbd>
|
||||
<samp><class 'float'></samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>type(1)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp><class 'int'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>1 + 1</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>1 + 1.0</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>2.0</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>type(2.0)</kbd>
|
||||
<samp class=pp><class 'float'></samp></pre>
|
||||
<ol>
|
||||
<li>You can use the <code>type()</code> function to check the type of any value or variable. As you might expect, <code>1</code> is an <code>int</code>.
|
||||
<li>Adding an <code>int</code> to an <code>int</code> yields an <code>int</code>.
|
||||
@@ -73,18 +73,18 @@ body{counter-reset:h1 2}
|
||||
<h3 id=number-coercion>Coercing Integers To Floats And Vice-Versa</h3>
|
||||
<p>As you just saw, some operators (like addition) will coerce integers to floating point numbers as needed. You can also coerce them by yourself.
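<p>One detail worth seeing side by side: <code>int()</code> truncates toward zero, which is not the same as rounding down for negative numbers. The comparison with <code>math.floor()</code> below is an addition for contrast and uses only the standard library.

```python
import math

values = [2.5, -2.5]

# int() drops the fractional part, moving toward zero.
truncated = [int(v) for v in values]

# math.floor() always rounds down, toward negative infinity.
floored = [math.floor(v) for v in values]

# Mixing an int and a float in arithmetic coerces the result to float.
mixed = 1 + 1.0
```

<p>Here <var>truncated</var> is <code>[2, -2]</code>, <var>floored</var> is <code>[2, -3]</code>, and <var>mixed</var> is the float <code>2.0</code>.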
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>float(2)</kbd> <span class=u>①</span></a>
|
||||
<samp>2.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>int(2.0)</kbd> <span class=u>②</span></a>
|
||||
<samp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>int(2.5)</kbd> <span class=u>③</span></a>
|
||||
<samp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>int(-2.5)</kbd> <span class=u>④</span></a>
|
||||
<samp>-2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>1.12345678901234567890</kbd> <span class=u>⑤</span></a>
|
||||
<samp>1.1234567890123457</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>type(1000000000000000)</kbd> <span class=u>⑥</span></a>
|
||||
<samp><class 'int'></samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>float(2)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>2.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>int(2.0)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>int(2.5)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>int(-2.5)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>-2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>1.12345678901234567890</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>1.1234567890123457</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>type(1000000000000000)</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp><class 'int'></samp></pre>
|
||||
<ol>
|
||||
<li>You can explicitly coerce an <code>int</code> to a <code>float</code> by calling the <code>float()</code> function.
|
||||
<li>Unsurprisingly, you can also coerce a <code>float</code> to an <code>int</code> by calling <code>int()</code>.
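<p>A minimal sketch of the coercions shown above, written as a plain script rather than a shell session. The variable names are illustrative; the point to notice is that <code>int()</code> truncates toward zero instead of rounding.

```python
# int() drops the fractional part (truncates toward zero); it never rounds.
values = [2.0, 2.5, -2.5, 2.999]
truncated = [int(v) for v in values]
print(truncated)   # [2, 2, -2, 2]

# float() converts in the other direction.
as_float = float(2)
print(as_float)    # 2.0

# round() is a separate built-in for when rounding is what you want.
rounded = round(2.999)
print(rounded)     # 3
```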
|
||||
@@ -99,18 +99,18 @@ body{counter-reset:h1 2}
|
||||
<h3 id=common-numerical-operations>Common Numerical Operations</h3>
|
||||
<p>You can do all kinds of things with numbers.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>11 / 2</kbd> <span class=u>①</span></a>
|
||||
<samp>5.5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>11 // 2</kbd> <span class=u>②</span></a>
|
||||
<samp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>−11 // 2</kbd> <span class=u>③</span></a>
|
||||
<samp>−6</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>11.0 // 2</kbd> <span class=u>④</span></a>
|
||||
<samp>5.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>11 ** 2</kbd> <span class=u>⑤</span></a>
|
||||
<samp>121</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>11 % 2</kbd> <span class=u>⑥</span></a>
|
||||
<samp>1</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>11 / 2</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>5.5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>11 // 2</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>−11 // 2</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>−6</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>11.0 // 2</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>5.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>11 ** 2</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>121</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>11 % 2</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp>1</samp>
|
||||
</pre>
|
||||
<ol>
|
||||
<li>The <code>/</code> operator performs floating point division. It returns a <code>float</code> even if both the numerator and denominator are <code>int</code>s.
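<p>The operators above can be collected into one short script. This is only a sketch with made-up variable names; it also uses the built-in <code>divmod()</code>, which the example above does not show, to illustrate how <code>//</code> and <code>%</code> fit together.

```python
true_div = 11 / 2        # always a float, even for two ints
floor_pos = 11 // 2      # floor division on ints gives an int
floor_neg = -11 // 2     # floors toward negative infinity, not toward zero
floor_float = 11.0 // 2  # a float operand makes the result a float
power = 11 ** 2
remainder = 11 % 2

print(true_div, floor_pos, floor_neg, floor_float, power, remainder)
# 5.5 5 -6 5.0 121 1

# divmod() returns the floor quotient and the remainder together,
# and they always satisfy a == q * b + r.
q, r = divmod(-11, 2)
print(q, r)              # -6 1
```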
|
||||
@@ -126,14 +126,14 @@ body{counter-reset:h1 2}
|
||||
<h3 id=fractions>Fractions</h3>
|
||||
<p>Python isn’t limited to integers and floating point numbers. It can also do all the fancy math you learned in high school and promptly forgot about.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import fractions</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>x = fractions.Fraction(1, 3)</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>x</kbd>
|
||||
<samp>Fraction(1, 3)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>x * 2</kbd> <span class=u>③</span></a>
|
||||
<samp>Fraction(2, 3)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>fractions.Fraction(6, 4)</kbd> <span class=u>④</span></a>
|
||||
<samp>Fraction(3, 2)</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import fractions</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>x = fractions.Fraction(1, 3)</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x</kbd>
|
||||
<samp class=pp>Fraction(1, 3)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>x * 2</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>Fraction(2, 3)</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>fractions.Fraction(6, 4)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>Fraction(3, 2)</samp></pre>
|
||||
<ol>
|
||||
<li>To start using fractions, import the <code>fractions</code> module.
|
||||
<li>To define a fraction, create a <code>Fraction</code> object and pass in the numerator and denominator.
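<p>One reason to reach for <code>Fraction</code> is that its arithmetic is exact. The following sketch (variable names are illustrative) repeats the operations above and contrasts them with the same sum done in floating point.

```python
from fractions import Fraction

third = Fraction(1, 3)
doubled = third * 2            # Fraction(2, 3)
reduced = Fraction(6, 4)       # automatically reduced to Fraction(3, 2)

# Exact: three tenths really are three tenths.
exact_sum = Fraction(1, 10) * 3
print(exact_sum == Fraction(3, 10))   # True

# Floating point accumulates a tiny representation error.
float_sum = 0.1 + 0.1 + 0.1
print(float_sum == 0.3)               # False
```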
|
||||
@@ -143,13 +143,13 @@ body{counter-reset:h1 2}
|
||||
<h3 id=trig>Trigonometry</h3>
|
||||
<p>You can also do basic trigonometry in Python.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>math.pi</kbd> <span class=u>①</span></a>
|
||||
<samp>3.1415926535897931</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>math.sin(math.pi / 2)</kbd> <span class=u>②</span></a>
|
||||
<samp>1.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>math.tan(math.pi / 4)</kbd> <span class=u>③</span></a>
|
||||
<samp>0.99999999999999989</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import math</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>math.pi</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>3.1415926535897931</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>math.sin(math.pi / 2)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>1.0</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>math.tan(math.pi / 4)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>0.99999999999999989</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>math</code> module has a constant for π, the ratio of a circle’s circumference to its diameter.
|
||||
<li>The <code>math</code> module has all the basic trigonometric functions, including <code>sin()</code>, <code>cos()</code>, <code>tan()</code>, and variants like <code>asin()</code>.
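<p>Because floating point results such as <code>math.tan(math.pi / 4)</code> can land a hair away from the mathematically exact answer, comparing with <code>==</code> is fragile. A small sketch, assuming Python 3.5 or later for <code>math.isclose()</code>:

```python
import math

sin_quarter_turn = math.sin(math.pi / 2)
tan_eighth_turn = math.tan(math.pi / 4)

# Compare with a tolerance instead of testing for exact equality.
sin_ok = math.isclose(sin_quarter_turn, 1.0)
tan_ok = math.isclose(tan_eighth_turn, 1.0)
print(sin_ok, tan_ok)   # True True
```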
|
||||
@@ -159,26 +159,26 @@ body{counter-reset:h1 2}
|
||||
<aside>Zero values are false, and non-zero values are true.</aside>
|
||||
<p>You can use numbers <a href=#booleans>in a boolean context</a>, such as an <code>if</code> statement. Zero values are false, and non-zero values are true.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>... </samp><kbd>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("no, it's false")</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>def is_it_true(anything):</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>... </samp><kbd class=pp>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("no, it's false")</kbd>
|
||||
<samp class=p>...</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true(1)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true(1)</kbd> <span class=u>②</span></a>
|
||||
<samp>yes, it's true</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(-1)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(-1)</kbd>
|
||||
<samp>yes, it's true</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(0)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(0)</kbd>
|
||||
<samp>no, it's false</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true(0.1)</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true(0.1)</kbd> <span class=u>③</span></a>
|
||||
<samp>yes, it's true</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(0.0)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(0.0)</kbd>
|
||||
<samp>no, it's false</samp>
|
||||
<samp class=p>>>> </samp><kbd>import fractions</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true(fractions.Fraction(1, 2))</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import fractions</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true(fractions.Fraction(1, 2))</kbd> <span class=u>④</span></a>
|
||||
<samp>yes, it's true</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(fractions.Fraction(0, 1))</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(fractions.Fraction(0, 1))</kbd>
|
||||
<samp>no, it's false</samp></pre>
|
||||
<ol>
|
||||
<li>Did you know you can define your own functions in the Python interactive shell? Just press <kbd>ENTER</kbd> at the end of each line, and <kbd>ENTER</kbd> on a blank line to finish.
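<p>The same truth tests can be run without defining a helper function, since the built-in <code>bool()</code> applies the rule an <code>if</code> statement uses. A brief sketch with illustrative names:

```python
from fractions import Fraction

samples = [1, -1, 0, 0.1, 0.0, Fraction(1, 2), Fraction(0, 1)]

# bool() reports how each value behaves in a boolean context.
truthiness = [bool(x) for x in samples]
print(truthiness)
# [True, True, False, True, False, True, False]
```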
|
||||
@@ -199,17 +199,17 @@ body{counter-reset:h1 2}
|
||||
<h3 id=creatinglists>Creating A List</h3>
|
||||
<p>Creating a list is easy: use square brackets to wrap a comma-separated list of values.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list = ['a', 'b', 'mpilgrim', 'z', 'example']</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list = ['a', 'b', 'mpilgrim', 'z', 'example']</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
['a', 'b', 'mpilgrim', 'z', 'example']
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[0]</kbd> <span class=u>②</span></a>
|
||||
<samp>'a'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[4]</kbd> <span class=u>③</span></a>
|
||||
<samp>'example'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[-1]</kbd> <span class=u>④</span></a>
|
||||
<samp>'example'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[-3]</kbd> <span class=u>⑤</span></a>
|
||||
<samp>'mpilgrim'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[0]</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'a'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[4]</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'example'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[-1]</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'example'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[-3]</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>'mpilgrim'</samp></pre>
|
||||
<ol>
|
||||
<li>First, you define a list of five items. Note that they retain their original order. This is not an accident. A list is an ordered set of items.
|
||||
<li>A list can be used like a zero-based array. The first item of any non-empty list is always <code>a_list[0]</code>.
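<p>Negative indices count backward from the end of the list. One way to picture this, sketched below with an illustrative variable <code>n</code>, is that <code>a_list[-k]</code> refers to the same item as <code>a_list[len(a_list) - k]</code>.

```python
a_list = ['a', 'b', 'mpilgrim', 'z', 'example']
n = len(a_list)

first = a_list[0]
last = a_list[-1]          # same item as a_list[n - 1]
third_from_end = a_list[-3]  # same item as a_list[n - 3]

print(first, last, third_from_end)   # a example mpilgrim
print(a_list[-1] is a_list[n - 1])   # True
```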
|
||||
@@ -221,19 +221,19 @@ body{counter-reset:h1 2}
|
||||
<aside>a_list[0] is the first item of a_list.</aside>
|
||||
<p>Once you’ve defined a list, you can get any part of it as a new list. This is called <i>slicing</i> the list.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 'b', 'mpilgrim', 'z', 'example']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[1:3]</kbd> <span class=u>①</span></a>
|
||||
<samp>['b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[1:-1]</kbd> <span class=u>②</span></a>
|
||||
<samp>['b', 'mpilgrim', 'z']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[0:3]</kbd> <span class=u>③</span></a>
|
||||
<samp>['a', 'b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[:3]</kbd> <span class=u>④</span></a>
|
||||
<samp>['a', 'b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[3:]</kbd> <span class=u>⑤</span></a>
|
||||
<samp>['z', 'example']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list[:]</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 'b', 'mpilgrim', 'z', 'example']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[1:3]</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>['b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[1:-1]</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>['b', 'mpilgrim', 'z']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[0:3]</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>['a', 'b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[:3]</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>['a', 'b', 'mpilgrim']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[3:]</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>['z', 'example']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list[:]</kbd> <span class=u>⑥</span></a>
|
||||
['a', 'b', 'mpilgrim', 'z', 'example']</pre>
|
||||
<ol>
|
||||
<li>You can get a part of a list, called a “slice”, by specifying two indices. The return value is a new list containing all the items of the list, in order, starting with the first slice index (in this case <code>a_list[1]</code>), up to but not including the second slice index (in this case <code>a_list[3]</code>).
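<p>Two properties of slices are worth checking for yourself: a slice is always a new list, and splitting a list at any index with <code>[:i]</code> and <code>[i:]</code> loses nothing. A short sketch, with illustrative variable names:

```python
a_list = ['a', 'b', 'mpilgrim', 'z', 'example']

middle = a_list[1:3]                  # items at index 1 and 2, not 3
head, tail = a_list[:3], a_list[3:]   # split at index 3
whole_copy = a_list[:]                # shallow copy of the entire list

print(middle)                    # ['b', 'mpilgrim']
print(head + tail == a_list)     # True: nothing lost, nothing duplicated
print(whole_copy == a_list)      # True: same contents
print(whole_copy is a_list)      # False: but a different list object
```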
|
||||
@@ -246,19 +246,19 @@ body{counter-reset:h1 2}
|
||||
<h3 id=extendinglists>Adding Items To A List</h3>
|
||||
<p>There are four ways to add items to a list.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_list = ['a']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list = a_list + [2.0, 3]</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 2.0, 3]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.append(True)</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 2.0, 3, True]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.extend(['four', 'e'])</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 2.0, 3, True, 'four', 'e']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.insert(1, 'a')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 'a', 2.0, 3, True, 'four', 'e']</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list = ['a']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list = a_list + [2.0, 3]</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 2.0, 3]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.append(True)</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 2.0, 3, True]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.extend(['four', 'e'])</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 2.0, 3, True, 'four', 'e']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.insert(1, 'a')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 'a', 2.0, 3, True, 'four', 'e']</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>+</code> operator concatenates lists. A list can contain any number of items; there is no size limit (other than available memory). A list can contain items of any datatype; they don’t all need to be the same type. Here we have a list containing a string, a floating point number, and an integer.
|
||||
<li>The <code>append()</code> method adds a single item to the end of the list. (Now we have <em>four</em> different datatypes in the list!)
|
||||
@@ -267,21 +267,21 @@ body{counter-reset:h1 2}
|
||||
</ol>
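<p>One difference between the four approaches is easy to miss: the <code>+</code> operator builds a new list, while <code>append()</code>, <code>extend()</code>, and <code>insert()</code> change the existing list in place. The sketch below uses a second name, <code>alias</code>, purely for illustration.

```python
original = ['a']
alias = original                  # two names for the same list object

original = original + [2.0, 3]    # + creates a new list and rebinds the name
print(original)                   # ['a', 2.0, 3]
print(alias)                      # ['a']  (the old list is untouched)

alias.append(True)                # the methods mutate the list in place
alias.extend(['four', 'e'])
alias.insert(1, 'a')
print(alias)                      # ['a', 'a', True, 'four', 'e']
```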
|
||||
<p>Let’s look closer at the difference between <code>append()</code> and <code>extend()</code>.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_list = ['a', 'b', 'c']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.extend(['d', 'e', 'f'])</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 'b', 'c', 'd', 'e', 'f']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(a_list)</kbd> <span class=u>②</span></a>
|
||||
<samp>6</samp>
|
||||
<samp class=p>>>> </samp><kbd>a_list[-1]</kbd>
|
||||
<samp>'f'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.append(['g', 'h', 'i'])</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['a', 'b', 'c', 'd', 'e', 'f', ['g', 'h', 'i']]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(a_list)</kbd> <span class=u>④</span></a>
|
||||
<samp>7</samp>
|
||||
<samp class=p>>>> </samp><kbd>a_list[-1]</kbd>
|
||||
<samp>['g', 'h', 'i']</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list = ['a', 'b', 'c']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.extend(['d', 'e', 'f'])</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 'b', 'c', 'd', 'e', 'f']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(a_list)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>6</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list[-1]</kbd>
|
||||
<samp class=pp>'f'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.append(['g', 'h', 'i'])</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['a', 'b', 'c', 'd', 'e', 'f', ['g', 'h', 'i']]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(a_list)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>7</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list[-1]</kbd>
|
||||
<samp class=pp>['g', 'h', 'i']</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>extend()</code> method takes a single argument, usually a list (any iterable works), and adds each of its items to <var>a_list</var>.
|
||||
<li>If you start with a list of three items and extend it with a list of another three items, you end up with a list of six items.
|
||||
@@ -290,16 +290,16 @@ body{counter-reset:h1 2}
|
||||
</ol>
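<p>A compact way to remember the difference: <code>extend()</code> grows the list by the number of items in its argument, while <code>append()</code> always grows it by exactly one. The sketch below uses illustrative list names.

```python
extended = ['a', 'b', 'c']
extended.extend(['d', 'e', 'f'])    # adds three separate items
print(len(extended))                # 6

appended = ['a', 'b', 'c']
appended.append(['d', 'e', 'f'])    # adds one item, which happens to be a list
print(len(appended))                # 4
print(appended[-1])                 # ['d', 'e', 'f']

# extend() accepts any iterable, so a string contributes its characters.
letters = ['a']
letters.extend('bc')
print(letters)                      # ['a', 'b', 'c']
```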
|
||||
<h3 id=searchinglists>Searching For Values In A List</h3>
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_list = ['a', 'b', 'new', 'mpilgrim', 'new']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>'mpilgrim' in a_list</kbd> <span class=u>①</span></a>
|
||||
<samp>True</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.index('mpilgrim')</kbd> <span class=u>②</span></a>
|
||||
<samp>3</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.index('new')</kbd> <span class=u>③</span></a>
|
||||
<samp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>'c' in a_list</kbd> <span class=u>④</span></a>
|
||||
<samp>False</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list.index('c')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list = ['a', 'b', 'new', 'mpilgrim', 'new']</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>'mpilgrim' in a_list</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>True</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.index('mpilgrim')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>3</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.index('new')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>'c' in a_list</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>False</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list.index('c')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
  File "<stdin>", line 1, in <module>
|
||||
ValueError: list.index(x): x not in list</samp></pre>
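<p>Since <code>index()</code> raises an exception for a missing value, code that is not sure the value is present usually either checks with <code>in</code> first or catches the <code>ValueError</code>. The helper below, <code>find_index()</code>, is a hypothetical name used only for this sketch; it is not part of the list API.

```python
def find_index(items, value):
    """Return the index of the first match, or None if value is absent."""
    try:
        return items.index(value)
    except ValueError:
        return None

a_list = ['a', 'b', 'new', 'mpilgrim', 'new']

print(find_index(a_list, 'new'))   # 2 (first occurrence only)
print(find_index(a_list, 'c'))     # None
print(a_list.count('new'))         # 2 (count() reports every occurrence)
```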
|
||||
@@ -314,17 +314,17 @@ ValueError: list.index(x): x not in list</samp></pre>
|
||||
<aside>Empty lists are false; all other lists are true.</aside>
|
||||
<p>You can also use a list in <a href=#booleans>a boolean context</a>, such as an <code>if</code> statement.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("no, it's false")</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("no, it's false")</kbd>
|
||||
<samp class=p>...</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true([])</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true([])</kbd> <span class=u>②</span></a>
|
||||
<samp>no, it's false</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true(['a'])</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true(['a'])</kbd> <span class=u>③</span></a>
|
||||
<samp>yes, it's true</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true([False])</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true([False])</kbd> <span class=u>④</span></a>
|
||||
<samp>yes, it's true</samp></pre>
|
||||
<ol>
|
||||
<li>In a boolean context, an empty list is false.
|
||||
@@ -347,14 +347,14 @@ ValueError: list.index(x): x not in list</samp></pre>
|
||||
<h3 id=creating-dictionaries>Creating A Dictionary</h3>
|
||||
<p>Creating a dictionary is easy. The syntax is similar to <a href=#sets>sets</a>, but instead of values, you have key-value pairs. Once you have a dictionary, you can look up values by their key.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict = {'server':'db.diveintopython3.org', 'database':'mysql'}</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['server']</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict = {'server':'db.diveintopython3.org', 'database':'mysql'}</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['server']</kbd> <span class=u>②</span></a>
|
||||
'db.diveintopython3.org'
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['database']</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['database']</kbd> <span class=u>③</span></a>
|
||||
'mysql'
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['db.diveintopython3.org']</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['db.diveintopython3.org']</kbd> <span class=u>④</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
KeyError: 'db.diveintopython3.org'</samp></pre>
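<p>Lookups go from key to value only, which is why asking for a value as if it were a key raises <code>KeyError</code>. When a key might be missing, the <code>in</code> operator and the <code>get()</code> method avoid the exception. A small sketch with illustrative variable names:

```python
a_dict = {'server': 'db.diveintopython3.org', 'database': 'mysql'}

has_server = 'server' in a_dict                       # membership tests keys
has_host_key = 'db.diveintopython3.org' in a_dict     # a value is not a key
has_host_value = 'db.diveintopython3.org' in a_dict.values()

missing = a_dict.get('port')             # None instead of KeyError
fallback = a_dict.get('port', 3306)      # or a default of your choosing

print(has_server, has_host_key, has_host_value)   # True False True
print(missing, fallback)                          # None 3306
```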
|
||||
@@ -367,20 +367,20 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
|
||||
<h3 id=modifying-dictionaries>Modifying A Dictionary</h3>
|
||||
<p>Dictionaries do not have any predefined size limit. You can add new key-value pairs to a dictionary at any time, or you can modify the value of an existing key. Continuing from the previous example:
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['database'] = 'blog'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['user'] = 'mark'</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict</kbd> <span class=u>③</span></a>
|
||||
<samp>{'server': 'db.diveintopython3.org', 'user': 'mark', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['user'] = 'dora'</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict['User'] = 'mark'</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'User': 'mark', 'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['database'] = 'blog'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'server': 'db.diveintopython3.org', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['user'] = 'mark'</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>{'server': 'db.diveintopython3.org', 'user': 'mark', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['user'] = 'dora'</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict['User'] = 'mark'</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'User': 'mark', 'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}</samp></pre>
|
||||
<ol>
|
||||
<li>You cannot have duplicate keys in a dictionary. Assigning a value to an existing key will wipe out the old value.
|
||||
<li>You can add new key-value pairs at any time. This syntax is identical to modifying existing values.
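<p>The sequence of assignments above can be replayed as a script to confirm how many keys result. Note that on Python 3.7 and later, dictionaries preserve insertion order, so the order in which the keys are displayed may differ from the output shown above.

```python
a_dict = {'server': 'db.diveintopython3.org', 'database': 'mysql'}

a_dict['database'] = 'blog'   # existing key: the old value is replaced
a_dict['user'] = 'mark'       # new key: a pair is added
a_dict['user'] = 'dora'       # existing key again: replaced, not duplicated
a_dict['User'] = 'mark'       # keys are case-sensitive, so this is a new key

print(len(a_dict))            # 4
print(a_dict['user'], a_dict['User'])   # dora mark
```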
|
||||
@@ -395,16 +395,16 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
|
||||
1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}</code></pre>
|
||||
<p>Let's tear that apart in the interactive shell.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],</kbd>
|
||||
<samp class=p>... </samp><kbd> 1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>len(SUFFIXES)</kbd> <span class=u>①</span></a>
|
||||
<samp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>SUFFIXES[1000]</kbd> <span class=u>②</span></a>
|
||||
<samp>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>SUFFIXES[1024]</kbd> <span class=u>③</span></a>
|
||||
<samp>['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>SUFFIXES[1000][3]</kbd> <span class=u>④</span></a>
|
||||
<samp>'TB'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> 1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(SUFFIXES)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>2</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>SUFFIXES[1000]</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>SUFFIXES[1024]</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>SUFFIXES[1000][3]</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'TB'</samp></pre>
|
||||
<ol>
|
||||
<li>As with <a href=#lists>lists</a><!-- and <a href=#sets>sets</a>-->, the <code>len()</code> function gives you the number of items in a dictionary.
|
||||
<li><code>1000</code> is a key in the <code>SUFFIXES</code> dictionary; its value is a list of eight items (eight strings, to be precise).
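<p>An expression like <code>SUFFIXES[1000][3]</code> is two lookups chained together: a dictionary lookup that returns a list, followed by a list index. The sketch below splits it into steps, using the illustrative names <code>units</code> and <code>terabyte</code>.

```python
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}

units = SUFFIXES[1000]    # step 1: dictionary lookup by key gives a list
terabyte = units[3]       # step 2: list lookup by zero-based index

print(terabyte)                        # TB
print(SUFFIXES[1000][3] == terabyte)   # True: the chained form is equivalent
print(len(SUFFIXES), len(units))       # 2 8
```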
|
||||
@@ -415,15 +415,15 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
|
||||
<aside>Empty dictionaries are false; all other dictionaries are true.</aside>
|
||||
<p>You can also use a dictionary in <a href=#booleans>a boolean context</a>, such as an <code>if</code> statement.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("no, it's false")</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("no, it's false")</kbd>
|
||||
<samp class=p>...</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true({})</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true({})</kbd> <span class=u>①</span></a>
|
||||
<samp>no, it's false</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>is_it_true({'a': 1})</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>is_it_true({'a': 1})</kbd> <span class=u>②</span></a>
|
||||
<samp>yes, it's true</samp></pre>
|
||||
<ol>
|
||||
<li>In a boolean context, an empty dictionary is false.
|
||||
@@ -435,35 +435,35 @@ KeyError: 'db.diveintopython3.org'</samp></pre>
|
||||
<p><code>None</code> is a special constant in Python. It is a null value. <code>None</code> is not the same as <code>False</code>. <code>None</code> is not <code>0</code>. <code>None</code> is not an empty string. Comparing <code>None</code> to anything other than <code>None</code> will always return <code>False</code>.
|
||||
<p><code>None</code> is the only null value. It has its own datatype (<code>NoneType</code>). You can assign <code>None</code> to any variable, but you cannot create other <code>NoneType</code> objects. All variables whose value is <code>None</code> are equal to each other.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>type(None)</kbd>
|
||||
<samp><class 'NoneType'></samp>
|
||||
<samp class=p>>>> </samp><kbd>None == False</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>None == 0</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>None == ''</kbd>
|
||||
<samp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd>None == None</kbd>
|
||||
<samp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd>x = None</kbd>
|
||||
<samp class=p>>>> </samp><kbd>x == None</kbd>
|
||||
<samp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd>y = None</kbd>
|
||||
<samp class=p>>>> </samp><kbd>x == y</kbd>
|
||||
<samp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>type(None)</kbd>
|
||||
<samp class=pp><class 'NoneType'></samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>None == False</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>None == 0</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>None == ''</kbd>
|
||||
<samp class=pp>False</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>None == None</kbd>
|
||||
<samp class=pp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x = None</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x == None</kbd>
|
||||
<samp class=pp>True</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>y = None</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x == y</kbd>
|
||||
<samp class=pp>True</samp>
|
||||
</pre>
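<p>Because there is only one <code>None</code> object, everyday Python code usually tests for it with the identity operator <code>is</code> rather than <code>==</code>. The comparisons above still hold; the sketch below simply shows the more common spelling, with illustrative variable names.

```python
x = None
y = None

same_object = x is y          # both names refer to the single None object
x_is_none = x is None         # the usual way to test for None
zero_is_none = 0 is None      # 0 is a different object entirely
empty_is_none = '' is None    # so is the empty string

print(same_object, x_is_none)         # True True
print(zero_is_none, empty_is_none)    # False False
print(type(x).__name__)               # NoneType
```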
|
||||
<h3 id=none-in-a-boolean-context><code>None</code> In A Boolean Context</h3>
|
||||
<p>In <a href=#booleans>a boolean context</a>, <code>None</code> is false and <code>not None</code> is true.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd>    if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("yes, it's true")</kbd>
|
||||
<samp class=p>... </samp><kbd>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd>        print("no, it's false")</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>def is_it_true(anything):</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> if anything:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp> print('yes, it's true')</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>    else:</kbd>
|
||||
<samp class=p>... </samp><kbd class=pp>        print("no, it's false")</kbd>
|
||||
<samp class=p>...</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(None)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(None)</kbd>
|
||||
<samp>no, it's false</samp>
|
||||
<samp class=p>>>> </samp><kbd>is_it_true(not None)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>is_it_true(not None)</kbd>
|
||||
<samp>yes, it's true</samp></pre>
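<p>As a small aside, here is a sketch (not part of the original examples) that contrasts a plain truthiness check with an explicit <code>is None</code> check. The helper name <code>describe</code> is hypothetical. It only illustrates that <code>None</code> is not the only value that is false in a boolean context, so the two checks can disagree.

```python
def describe(value):
    """Classify a value (hypothetical helper, for illustration only)."""
    if value is None:
        # Identity test: true only for None itself.
        return 'None'
    if not value:
        # Truthiness test: also true for 0, '', [] and False.
        return 'false but not None'
    return 'true'


for sample in (None, 0, '', 'a'):
    print(repr(sample), '->', describe(sample))
```

<p>If you need to know whether a value is really <code>None</code>, rather than merely false, the identity test is the safer of the two.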
|
||||
<p class=a>⁂
|
||||
|
||||
|
||||
+3
-3
@@ -22,9 +22,9 @@ body{counter-reset:h1 10}
|
||||
<h2 id=divingin>Diving In</h2>
|
||||
<p class=f>Despite your best efforts to write comprehensive unit tests, bugs happen. What do I mean by “bug”? A bug is a test case you haven’t written yet.
|
||||
|
||||
<pre class=screen><samp class=p>>>> </samp><kbd>import roman7</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>roman7.from_roman('')</kbd> <span class=u>①</span></a>
|
||||
<samp>0</samp></pre>
|
||||
<pre class=screen><samp class=p>>>> </samp><kbd class=pp>import roman7</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>roman7.from_roman('')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>0</samp></pre>
|
||||
<ol>
|
||||
<li>Remember in the [FIXME-xref] previous section when you kept seeing that an empty string would match the regular expression you were using to check for valid Roman numerals? Well, it turns out that this is still true for the final version of the regular expression. And that’s a bug; you want an empty string to raise an <code>InvalidRomanNumeralError</code> exception just like any other sequence of characters that don’t represent a valid Roman numeral.
|
||||
</ol>
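<p>To make the bug concrete, here is a minimal sketch of one way to reject the empty string before the regular expression ever sees it. This is an illustration under stated assumptions, not the actual <code>roman7</code> code: the exception class is a local stand-in for the one named above, and <code>ROMAN_PATTERN</code> and <code>check_roman</code> are hypothetical names.

```python
import re


class InvalidRomanNumeralError(ValueError):
    """Local stand-in for the exception named in the text."""


# Compact Roman numeral pattern. Every part of it is optional,
# which is why the empty string matches.
ROMAN_PATTERN = re.compile(
    r'^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$')


def check_roman(s):
    """Return True for a valid Roman numeral, otherwise raise (hypothetical helper)."""
    if not s:
        # The pattern alone would accept '', so test for it explicitly.
        raise InvalidRomanNumeralError('Input can not be blank')
    if not ROMAN_PATTERN.search(s):
        raise InvalidRomanNumeralError('Invalid Roman numeral: {0}'.format(s))
    return True
```

<p>The explicit check comes first because the pattern cannot tell “no digits in any place” apart from “nothing at all.”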
|
||||
|
||||
+134
-134
@@ -31,17 +31,17 @@ body{counter-reset:h1 4}
|
||||
<h2 id=streetaddresses>Case Study: Street Addresses</h2>
|
||||
<p>This series of examples was inspired by a real-life problem I had in my day job several years ago, when I needed to scrub and standardize street addresses exported from a legacy system before importing them into a newer system. (See, I don’t just make this stuff up; it’s actually useful.) This example shows how I approached the problem.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>s = '100 NORTH MAIN ROAD'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>s.replace('ROAD', 'RD.')</kbd> <span class=u>①</span></a>
|
||||
<samp>'100 NORTH MAIN RD.'</samp>
|
||||
<samp class=p>>>> </samp><kbd>s = '100 NORTH BROAD ROAD'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>s.replace('ROAD', 'RD.')</kbd> <span class=u>②</span></a>
|
||||
<samp>'100 NORTH BRD. RD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s[:-4] + s[-4:].replace('ROAD', 'RD.')</kbd> <span class=u>③</span></a>
|
||||
<samp>'100 NORTH BROAD RD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>import re</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('ROAD$', 'RD.', s)</kbd> <span class=u>⑤</span></a>
|
||||
<samp>'100 NORTH BROAD RD.'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>s = '100 NORTH MAIN ROAD'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.replace('ROAD', 'RD.')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'100 NORTH MAIN RD.'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>s = '100 NORTH BROAD ROAD'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.replace('ROAD', 'RD.')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'100 NORTH BRD. RD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s[:-4] + s[-4:].replace('ROAD', 'RD.')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'100 NORTH BROAD RD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import re</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('ROAD$', 'RD.', s)</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>'100 NORTH BROAD RD.'</samp></pre>
|
||||
<ol>
|
||||
<li>My goal is to standardize a street address so that <code>'ROAD'</code> is always abbreviated as <code>'RD.'</code>. At first glance, I thought this was simple enough that I could just use the string method <code>replace()</code>. After all, all the data was already uppercase, so case mismatches would not be a problem. And the search string, <code>'ROAD'</code>, was a constant. And in this deceptively simple example, <code>s.replace()</code> does indeed work.
|
||||
<li>Life, unfortunately, is full of counterexamples, and I quickly discovered this one. The problem here is that <code>'ROAD'</code> appears twice in the address, once as part of the street name <code>'BROAD'</code> and once as its own word. The <code>replace()</code> method sees these two occurrences and blindly replaces both of them; meanwhile, I see my addresses getting destroyed.
|
||||
@@ -52,18 +52,18 @@ body{counter-reset:h1 4}
|
||||
<aside>^ matches the start of a string. $ matches the end of a string.</aside>
|
||||
<p>Continuing with my story of scrubbing addresses, I soon discovered that the previous example, matching <code>'ROAD'</code> at the end of the address, was not good enough, because not all addresses included a street designation at all. Some addresses simply ended with the street name. I got away with it most of the time, but if the street name was <code>'BROAD'</code>, then the regular expression would match <code>'ROAD'</code> at the end of the string as part of the word <code>'BROAD'</code>, which is not what I wanted.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>s = '100 BROAD'</kbd>
|
||||
<samp class=p>>>> </samp><kbd>re.sub('ROAD$', 'RD.', s)</kbd>
|
||||
<samp>'100 BRD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub('\\bROAD$', 'RD.', s)</kbd> <span class=u>①</span></a>
|
||||
<samp>'100 BROAD'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub(r'\bROAD$', 'RD.', s)</kbd> <span class=u>②</span></a>
|
||||
<samp>'100 BROAD'</samp>
|
||||
<samp class=p>>>> </samp><kbd>s = '100 BROAD ROAD APT. 3'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub(r'\bROAD$', 'RD.', s)</kbd> <span class=u>③</span></a>
|
||||
<samp>'100 BROAD ROAD APT. 3'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.sub(r'\bROAD\b', 'RD.', s)</kbd> <span class=u>④</span></a>
|
||||
<samp>'100 BROAD RD. APT. 3'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>s = '100 BROAD'</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>re.sub('ROAD$', 'RD.', s)</kbd>
|
||||
<samp class=pp>'100 BRD.'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub('\\bROAD$', 'RD.', s)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'100 BROAD'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub(r'\bROAD$', 'RD.', s)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'100 BROAD'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>s = '100 BROAD ROAD APT. 3'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub(r'\bROAD$', 'RD.', s)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'100 BROAD ROAD APT. 3'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.sub(r'\bROAD\b', 'RD.', s)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'100 BROAD RD. APT. 3'</samp></pre>
|
||||
<ol>
|
||||
<li>What I <em>really</em> wanted was to match <code>'ROAD'</code> when it was at the end of the string <em>and</em> it was its own word (and not a part of some larger word). To express this in a regular expression, you use <code>\b</code>, which means “a word boundary must occur right here.” In Python, this is complicated by the fact that the <code>'\'</code> character in a string must itself be escaped. This is sometimes referred to as the backslash plague, and it is one reason why regular expressions are easier in Perl than in Python. On the downside, Perl mixes regular expressions with other syntax, so if you have a bug, it may be hard to tell whether it’s a bug in syntax or a bug in your regular expression.
|
||||
<li>To work around the backslash plague, you can use what is called a <i>raw string</i> [FIXME reference to strings chapter], by prefixing the string with the letter <code>r</code>. This tells Python that nothing in this string should be escaped; <code>'\t'</code> is a tab character, but <code>r'\t'</code> is really the backslash character <code>\</code> followed by the letter <code>t</code>. I recommend always using raw strings when dealing with regular expressions; otherwise, things get too confusing too quickly (and regular expressions are confusing enough already).
|
||||
@@ -95,17 +95,17 @@ body{counter-reset:h1 4}
|
||||
<h3 id=thousands>Checking For Thousands</h3>
|
||||
<p>What would it take to validate that an arbitrary string is a valid Roman numeral? Let’s take it one digit at a time. Since Roman numerals are always written highest to lowest, let’s start with the highest: the thousands place. For numbers 1000 and higher, the thousands are represented by a series of <code>M</code> characters.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>pattern = '^M?M?M?$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M')</kbd> <span class=u>②</span></a>
|
||||
<samp><SRE_Match object at 0106FB58></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MM')</kbd> <span class=u>③</span></a>
|
||||
<samp><SRE_Match object at 0106C290></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMM')</kbd> <span class=u>④</span></a>
|
||||
<samp><SRE_Match object at 0106AA38></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMM')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, '')</kbd> <span class=u>⑥</span></a>
|
||||
<samp><SRE_Match object at 0106F4A8></samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'M')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><SRE_Match object at 0106FB58></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MM')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><SRE_Match object at 0106C290></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMM')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><SRE_Match object at 0106AA38></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMM')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, '')</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp><SRE_Match object at 0106F4A8></samp></pre>
|
||||
<ol>
|
||||
<li>This pattern has three parts. <code>^</code> matches what follows only at the beginning of the string. If this were not specified, the pattern would match no matter where the <code>M</code> characters were, which is not what you want. You want to make sure that the <code>M</code> characters, if they’re there, are at the beginning of the string. <code>M?</code> optionally matches a single <code>M</code> character. Since this is repeated three times, you’re matching anywhere from zero to three <code>M</code> characters in a row. And <code>$</code> matches the end of the string. When combined with the <code>^</code> character at the beginning, this means that the pattern must match the entire string, with no other characters before or after the <code>M</code> characters.
|
||||
<li>The essence of the <code>re</code> module is the <code>search()</code> function, which takes a regular expression (<var>pattern</var>) and a string (<code>'M'</code>) to try to match against the regular expression. If a match is found, <code>search()</code> returns an object which has various methods to describe the match; if no match is found, <code>search()</code> returns <code>None</code>, the Python null value. All you care about at the moment is whether the pattern matches, which you can tell by just looking at the return value of <code>search()</code>. <code>'M'</code> matches this regular expression, because the first optional <code>M</code> matches and the second and third optional <code>M</code> characters are ignored.
|
||||
@@ -141,17 +141,17 @@ body{counter-reset:h1 4}
|
||||
</ul>
|
||||
<p>This example shows how to validate the hundreds place of a Roman numeral.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCM')</kbd> <span class=u>②</span></a>
|
||||
<samp><SRE_Match object at 01070390></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MD')</kbd> <span class=u>③</span></a>
|
||||
<samp><SRE_Match object at 01073A50></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMCCC')</kbd> <span class=u>④</span></a>
|
||||
<samp><SRE_Match object at 010748A8></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMC')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, '')</kbd> <span class=u>⑥</span></a>
|
||||
<samp><SRE_Match object at 01071D98></samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCM')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><SRE_Match object at 01070390></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MD')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><SRE_Match object at 01073A50></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMCCC')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><SRE_Match object at 010748A8></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMC')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, '')</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp><SRE_Match object at 01071D98></samp></pre>
|
||||
<ol>
|
||||
<li>This pattern starts out the same as the previous one, checking for the beginning of the string (<code>^</code>), then the thousands place (<code>M?M?M?</code>). Then it has the new part, in parentheses, which defines a set of three mutually exclusive patterns, separated by vertical bars: <code>CM</code>, <code>CD</code>, and <code>D?C?C?C?</code> (which is an optional <code>D</code> followed by zero to three optional <code>C</code> characters). The regular expression parser checks for each of these patterns in order (from left to right), takes the first one that matches, and ignores the rest.
|
||||
<li><code>'MCM'</code> matches because the first <code>M</code> matches, the second and third <code>M</code> characters are ignored, and the <code>CM</code> matches (so the <code>CD</code> and <code>D?C?C?C?</code> patterns are never even considered). <code>MCM</code> is the Roman numeral representation of <code>1900</code>.
|
||||
@@ -167,17 +167,17 @@ body{counter-reset:h1 4}
|
||||
<aside>{1,4} matches between 1 and 4 occurrences of a pattern.</aside>
|
||||
<p>In the previous section, you were dealing with a pattern where the same character could be repeated up to three times. There is another way to express this in regular expressions, which some people find more readable. First look at the method we already used in the previous example.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import re</kbd>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M')</kbd> <span class=u>①</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MM')</kbd> <span class=u>②</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMM')</kbd> <span class=u>③</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMM')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import re</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'M')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MM')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMM')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMM')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>This matches the start of the string, and then the first optional <code>M</code>, but not the second and third <code>M</code> (but that’s okay because they’re optional), and then the end of the string.
|
||||
@@ -186,14 +186,14 @@ body{counter-reset:h1 4}
|
||||
<li>This matches the start of the string, and then all three optional <code>M</code>, but then does not match the end of the string (because there is still one unmatched <code>M</code>), so the pattern does not match and returns <code>None</code>.
|
||||
</ol>
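<p>If you want to convince yourself that the <code>M?M?M?</code> spelling and the <code>{n,m}</code> spelling mentioned in the aside accept exactly the same strings, a quick loop will do it. This is only a sketch; the names <code>optional_ms</code> and <code>counted_ms</code> are mine, not part of the original examples.

```python
import re

optional_ms = re.compile('^M?M?M?$')   # three separate optional M characters
counted_ms = re.compile('^M{0,3}$')    # zero to three M characters

for candidate in ['', 'M', 'MM', 'MMM', 'MMMM']:
    # search() returns a match object or None; compare only whether it matched.
    first = optional_ms.search(candidate) is not None
    second = counted_ms.search(candidate) is not None
    print(repr(candidate), first, second)
```

<p>Both patterns should agree on every candidate, including rejecting <code>'MMMM'</code>.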
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>pattern = '^M{0,3}$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M')</kbd> <span class=u>②</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MM')</kbd> <span class=u>③</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMM')</kbd> <span class=u>④</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEDA8></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMM')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>pattern = '^M{0,3}$'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'M')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MM')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EE090></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMM')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEDA8></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMM')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>This pattern says: “Match the start of the string, then anywhere from zero to three <code>M</code> characters, then the end of the string.” The 0 and 3 can be any numbers; if you want to match at least one but no more than three <code>M</code> characters, you could say <code>M{1,3}</code>.
|
||||
@@ -205,16 +205,16 @@ body{counter-reset:h1 4}
|
||||
<h3 id=tensandones>Checking For Tens And Ones</h3>
|
||||
<p>Now let’s expand the Roman numeral regular expression to cover the tens and ones place. This example shows the check for tens.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMXL')</kbd> <span class=u>①</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCML')</kbd> <span class=u>②</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMLX')</kbd> <span class=u>③</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMLXXX')</kbd> <span class=u>④</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMLXXXX')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMXL')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCML')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMLX')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMLXXX')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMLXXXX')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>This matches the start of the string, then the first optional <code>M</code>, then <code>CM</code>, then <code>XL</code>, then the end of the string. Remember, the <code>(A|B|C)</code> syntax means “match exactly one of A, B, or C”. You match <code>XL</code>, so you ignore the <code>XC</code> and <code>L?X?X?X?</code> choices, and then move on to the end of the string. <code>MCMXL</code> is the Roman numeral representation of <code>1940</code>.
|
||||
@@ -226,18 +226,18 @@ body{counter-reset:h1 4}
|
||||
<aside>(A|B) matches either pattern A or pattern B.</aside>
|
||||
<p>The expression for the ones place follows the same pattern. I’ll spare you the details and show you the end result.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'</kbd>
|
||||
</pre><p>So what does that look like using this alternate <code>{n,m}</code> syntax? This example shows the new syntax.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MDLV')</kbd> <span class=u>①</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMDCLXVI')</kbd> <span class=u>②</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMDCCCLXXXVIII')</kbd> <span class=u>③</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'I')</kbd> <span class=u>④</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MDLV')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMDCLXVI')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMDCCCLXXXVIII')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'I')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp></pre>
|
||||
<ol>
|
||||
<li>This matches the start of the string, then one of a possible three <code>M</code> characters, then <code>D?C{0,3}</code>. Of that, it matches the optional <code>D</code> and zero of three possible <code>C</code> characters. Moving on, it matches <code>L?X{0,3}</code> by matching the optional <code>L</code> and zero of three possible <code>X</code> characters. Then it matches <code>V?I{0,3}</code> by matching the optional <code>V</code> and zero of three possible <code>I</code> characters, and finally the end of the string. <code>MDLV</code> is the Roman numeral representation of <code>1555</code>.
|
||||
<li>This matches the start of the string, then two of a possible three <code>M</code> characters, then the <code>D?C{0,3}</code> with a <code>D</code> and one of three possible <code>C</code> characters; then <code>L?X{0,3}</code> with an <code>L</code> and one of three possible <code>X</code> characters; then <code>V?I{0,3}</code> with a <code>V</code> and one of three possible <code>I</code> characters; then the end of the string. <code>MMDCLXVI</code> is the Roman numeral representation of <code>2666</code>.
|
||||
@@ -257,7 +257,7 @@ body{counter-reset:h1 4}
|
||||
</ul>
|
||||
<p>This will be more clear with an example. Let’s revisit the compact regular expression you’ve been working with, and make it a verbose regular expression. This example shows how.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>pattern = '''
|
||||
<samp class=p>>>> </samp><kbd class=pp>pattern = '''
|
||||
^ # beginning of string
|
||||
M{0,3} # thousands - 0 to 3 M's
|
||||
(CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
|
||||
@@ -268,13 +268,13 @@ body{counter-reset:h1 4}
|
||||
# or 5-8 (V, followed by 0 to 3 I's)
|
||||
$ # end of string
|
||||
'''</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M', re.VERBOSE)</kbd> <span class=u>①</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MCMLXXXIX', re.VERBOSE)</kbd> <span class=u>②</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'MMMDCCCLXXXVIII', re.VERBOSE)</kbd> <span class=u>③</span></a>
|
||||
<samp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>re.search(pattern, 'M')</kbd> <span class=u>④</span></a></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'M', re.VERBOSE)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MCMLXXXIX', re.VERBOSE)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'MMMDCCCLXXXVIII', re.VERBOSE)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><_sre.SRE_Match object at 0x008EEB48></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>re.search(pattern, 'M')</kbd> <span class=u>④</span></a></pre>
|
||||
<ol>
|
||||
<li>The most important thing to remember when using verbose regular expressions is that you need to pass an extra argument when working with them: <code>re.VERBOSE</code> is a constant defined in the <code>re</code> module that signals that the pattern should be treated as a verbose regular expression. As you can see, this pattern has quite a bit of whitespace (all of which is ignored), and several comments (all of which are ignored). Once you ignore the whitespace and the comments, this is exactly the same regular expression as you saw in the previous section, but it’s a lot more readable.
|
||||
<li>This matches the start of the string, then one of a possible three <code>M</code>, then <code>CM</code>, then <code>L</code> and three of a possible three <code>X</code>, then <code>IX</code>, then the end of the string.
|
||||
@@ -302,10 +302,10 @@ body{counter-reset:h1 4}
|
||||
<p>Quite a variety! In each of these cases, I need to know that the area code was <code>800</code>, the trunk was <code>555</code>, and the rest of the phone number was <code>1212</code>. For those with an extension, I need to know that the extension was <code>1234</code>.
|
||||
<p>Let’s work through developing a solution for phone number parsing. This example shows the first step.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212-1234')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212-1234')</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>Always read regular expressions from left to right. This one matches the beginning of the string, and then <code>(\d{3})</code>. What’s <code>\d{3}</code>? Well, the <code>{3}</code> means “match exactly three numeric digits”; it’s a variation on the <a href=#nmsyntax><code>{n,m} syntax</code></a> you saw earlier. <code>\d</code> means “any numeric digit” (<code>0</code> through <code>9</code>). Putting it in parentheses means “match exactly three numeric digits, <em>and then remember them as a group that I can ask for later</em>”. Then match a literal hyphen. Then match another group of exactly three digits. Then another literal hyphen. Then another group of exactly four digits. Then match the end of the string.
|
||||
@@ -313,12 +313,12 @@ body{counter-reset:h1 4}
|
||||
<li>This regular expression is not the final answer, because it doesn’t handle a phone number with an extension on the end. For that, you’ll need to expand the regular expression.
|
||||
</ol>
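<p>Before expanding the pattern, it may help to see the remembered groups pulled out one at a time. The following is a sketch using the same regular expression; the variable names (<code>phone_pattern</code>, <code>area_code</code>, <code>trunk</code>, <code>rest</code>) are my own choices for illustration.

```python
import re

phone_pattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$')

match = phone_pattern.search('800-555-1212')

# groups() returns every remembered group as a tuple, in order.
area_code, trunk, rest = match.groups()
print(area_code, trunk, rest)

# group(n) returns a single remembered group; numbering starts at 1.
print(match.group(1))

# With an extension on the end, the pattern does not match at all.
print(phone_pattern.search('800-555-1212-1234'))
```

<p>The last call prints <code>None</code>, which is the same failure shown in the interactive session above.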
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212-1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800 555 1212 1234')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212-1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800 555 1212 1234')</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212')</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>This regular expression is almost identical to the previous one. Just as before, you match the beginning of the string, then a remembered group of three digits, then a hyphen, then a remembered group of three digits, then a hyphen, then a remembered group of four digits. What’s new is that you then match another hyphen, and a remembered group of one or more digits, then the end of the string.
|
||||
@@ -328,14 +328,14 @@ body{counter-reset:h1 4}
|
||||
</ol>
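<p>In a script, calling <code>.groups()</code> on a failed search raises <code>AttributeError</code>, since <code>search()</code> returns <code>None</code>. A small wrapper avoids that. This is only a sketch; <code>parse_phone</code> is a hypothetical helper name, not something defined in the text.

```python
import re

# Same pattern as in the example above: hyphens only, extension required.
phone_pattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$')

def parse_phone(text):
    """Return the four remembered groups, or None when the text does not match."""
    match = phone_pattern.search(text)
    if match is None:
        return None
    return match.groups()

samples = ('800-555-1212-1234', '800 555 1212 1234', '800-555-1212')
results = {text: parse_phone(text) for text in samples}
```

<p>Only the first sample parses; the other two have the wrong separators or no extension, so the wrapper reports <code>None</code> for them instead of raising.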
|
||||
<p>The next example shows the regular expression to handle separators between the different parts of the phone number.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800 555 1212 1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212-1234').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('80055512121234')</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800 555 1212 1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212-1234').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('80055512121234')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>Hang on to your hat. You’re matching the beginning of the string, then a group of three digits, then <code>\D+</code>. What the heck is that? Well, <code>\D</code> matches any character <em>except</em> a numeric digit, and <code>+</code> means “1 or more”. So <code>\D+</code> matches one or more characters that are not digits. This is what you’re using instead of a literal hyphen, to try to match different separators.
|
||||
@@ -346,14 +346,14 @@ body{counter-reset:h1 4}
|
||||
</ol>
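<p>To see what <code>\D+</code> actually consumes, it can help to look at the separators on their own. The sketch below uses <code>re.findall()</code> for that and then runs the full pattern over a few inputs; the sample strings and variable names are illustrative assumptions.

```python
import re

# \D+ on its own: every run of one or more non-digit characters.
separators = re.findall(r'\D+', '800.555.1212 x1234')

# The full pattern from the example above, with \D+ between the parts.
phone_pattern = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$')

parsed = {}
for text in ('800 555 1212 1234', '800.555.1212 x1234', '80055512121234'):
    match = phone_pattern.search(text)
    parsed[text] = match.groups() if match else None
```

<p>The input with no separators at all fails, because <code>+</code> insists on at least one non-digit character between the parts.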
|
||||
<p>The next example shows the regular expression for handling phone numbers <em>without</em> separators.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('80055512121234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800.555.1212 x1234').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>④</span></a>
|
||||
<samp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('(800)5551212 x1234')</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('80055512121234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800.555.1212 x1234').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('(800)5551212 x1234')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>The only change you’ve made since that last step is changing all the <code>+</code> to <code>*</code>. Instead of <code>\D+</code> between the parts of the phone number, you now match on <code>\D*</code>. Remember that <code>+</code> means “1 or more”? Well, <code>*</code> means “zero or more”. So now you should be able to parse phone numbers even when there is no separator character at all.
|
||||
@@ -364,12 +364,12 @@ body{counter-reset:h1 4}
|
||||
</ol>
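<p>The effect of swapping <code>+</code> for <code>*</code> is easiest to see side by side. This sketch compiles both versions and runs them over the same inputs; <code>strict</code>, <code>relaxed</code> and <code>groups_or_none</code> are hypothetical names chosen for illustration.

```python
import re

# "+" requires at least one separator character and at least one extension digit.
strict = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$')
# "*" allows zero of each, so separators and the extension become optional.
relaxed = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

def groups_or_none(pattern, text):
    """Return the remembered groups, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.groups() if match else None

no_separators_strict = groups_or_none(strict, '80055512121234')
no_separators_relaxed = groups_or_none(relaxed, '80055512121234')
no_extension = groups_or_none(relaxed, '800-555-1212')
leading_paren = groups_or_none(relaxed, '(800)5551212 x1234')
```

<p>When the extension is missing, the fourth group still exists but holds an empty string, which is why the tuple always has four items. The relaxed pattern still rejects the input that begins with a parenthesis, since <code>^</code> must be followed immediately by three digits.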
|
||||
<p>The next example shows how to handle leading characters in phone numbers.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('(800)5551212 ext. 1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('work 1-(800) 555.1212 #1234')</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('(800)5551212 ext. 1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('work 1-(800) 555.1212 #1234')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp></pre>
|
||||
<ol>
|
||||
<li>This is the same as in the previous example, except now you’re matching <code>\D*</code>, zero or more non-numeric characters, before the first remembered group (the area code). Notice that you’re not remembering these non-numeric characters (they’re not in parentheses). If you find them, you’ll just skip over them and then start remembering the area code whenever you get to it.
|
||||
@@ -379,13 +379,13 @@ body{counter-reset:h1 4}
|
||||
</ol>
|
||||
<p>Let’s back up for a second. So far the regular expressions have all matched from the beginning of the string. But now you see that there may be an indeterminate amount of stuff at the beginning of the string that you want to ignore. Rather than trying to match it all just so you can skip over it, let’s take a different approach: don’t explicitly match the beginning of the string at all. This approach is shown in the next example.
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('work 1-(800) 555.1212 #1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('80055512121234').groups()</kbd> <span class=u>④</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('work 1-(800) 555.1212 #1234').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('80055512121234').groups()</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp></pre>
|
||||
<ol>
|
||||
<li>Note the lack of <code>^</code> in this regular expression. You are not matching the beginning of the string anymore. There’s nothing that says you need to match the entire input with your regular expression. The regular expression engine will do the hard work of figuring out where the input string starts to match, and go from there.
|
||||
<li>Now you can successfully parse a phone number that includes leading characters and a leading digit, plus any number of any kind of separators around each part of the phone number.
|
||||
@@ -395,7 +395,7 @@ body{counter-reset:h1 4}
|
||||
<p>See how quickly a regular expression can get out of control? Take a quick glance at any of the previous iterations. Can you tell the difference between one and the next?
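<p>One way to tell the iterations apart is to run all of them against the same inputs and record which ones match. The patterns below are copied from the examples in this section; the numbering, the sample list and the variable names are illustrative assumptions.

```python
import re

# The successive patterns from this section, keyed by iteration number.
iterations = {
    1: r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$',
    2: r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$',
    3: r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$',
    4: r'^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$',
    5: r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$',
}
compiled = {number: re.compile(pattern) for number, pattern in iterations.items()}

samples = [
    '800-555-1212-1234',
    '800 555 1212 1234',
    '80055512121234',
    '800-555-1212',
    '(800)5551212 x1234',
    'work 1-(800) 555.1212 #1234',
]

# For each input, list the iterations whose pattern finds a match.
matching = {
    text: [number for number, pattern in compiled.items() if pattern.search(text)]
    for text in samples
}

for text, numbers in matching.items():
    print('{0!r:32} matched by {1}'.format(text, numbers))
```

<p>Each later iteration accepts everything the earlier ones did, plus the inputs that motivated the change.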
|
||||
<p>While you still understand the final answer (and it is the final answer; if you’ve discovered a case it doesn’t handle, I don’t want to know about it), let’s write it out as a verbose regular expression, before you forget why you made the choices you made.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>phonePattern = re.compile(r'''
|
||||
<samp class=p>>>> </samp><kbd class=pp>phonePattern = re.compile(r'''
|
||||
# don't match beginning of string, number can start anywhere
|
||||
(\d{3}) # area code is 3 digits (e.g. '800')
|
||||
\D* # optional separator is any number of non-digits
|
||||
@@ -406,10 +406,10 @@ body{counter-reset:h1 4}
|
||||
(\d*) # extension is optional and can be any number of digits
|
||||
$ # end of string
|
||||
''', re.VERBOSE)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('work 1-(800) 555.1212 #1234').groups()</kbd> <span class=u>①</span></a>
|
||||
<samp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp>('800', '555', '1212', '')</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('work 1-(800) 555.1212 #1234').groups()</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '1234')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>phonePattern.search('800-555-1212').groups()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>('800', '555', '1212', '')</samp></pre>
|
||||
<ol>
|
||||
<li>Other than being spread out over multiple lines, this is exactly the same regular expression as the last step, so it’s no surprise that it parses the same inputs.
|
||||
<li>Final sanity check. Yes, this still works. You’re done.
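<p>The equivalence of the compact and verbose forms can be checked mechanically. The verbose pattern below is a sketch written out from the compact one, with comments of my own; it is not a copy of the listing above. The last pattern illustrates one consequence of <code>re.VERBOSE</code>: since whitespace in the pattern is ignored, a literal space has to be written as a character class (or escaped).

```python
import re

compact = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

verbose = re.compile(r'''
    (\d{3})   # area code: three digits
    \D*       # optional separator: any number of non-digits
    (\d{3})   # trunk: three digits
    \D*       # optional separator
    (\d{4})   # rest of the number: four digits
    \D*       # optional separator
    (\d*)     # optional extension: any number of digits
    $         # end of string
''', re.VERBOSE)

samples = ['work 1-(800) 555.1212 #1234', '800-555-1212']
same = all(
    compact.search(text).groups() == verbose.search(text).groups()
    for text in samples
)

# In verbose mode the plain spaces are ignored; [ ] matches one literal space.
spaced = re.compile(r'(\d{3}) [ ] (\d{4})', re.VERBOSE)
spaced_groups = spaced.search('555 1212').groups()
```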
|
||||
|
||||
+22
-22
@@ -147,12 +147,12 @@ td a:link, td a:visited{border:0}
|
||||
else:
|
||||
<a> raise AttributeError <span class=u>②</span></a></code>
|
||||
|
||||
<samp class=p>>>> </samp><kbd>dyn = Dynamo()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span class=u>③</span></a>
|
||||
<samp>'PapayaWhip'</samp>
|
||||
<samp class=p>>>> </samp><kbd>dyn.color = 'LemonChiffon'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span class=u>④</span></a>
|
||||
<samp>'LemonChiffon'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>dyn = Dynamo()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>dyn.color</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'PapayaWhip'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>dyn.color = 'LemonChiffon'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>dyn.color</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'LemonChiffon'</samp></pre>
|
||||
<ol>
|
||||
<li>The attribute name is passed into the <code>__getattr__()</code> method as a string. If the name is <code>'color'</code>, the method returns a value. (In this case, it’s just a hard-coded string, but you would normally do some sort of computation and return the result.)
|
||||
<li>If the attribute name is unknown, the <code>__getattr__()</code> method needs to raise an <code>AttributeError</code> exception, otherwise your code will silently fail when accessing undefined attributes. (Technically, if the method doesn’t raise an exception or explicitly return a value, it returns <code>None</code>, the Python null value. This means that <em>all</em> attributes not explicitly defined will be <code>None</code>, which is almost certainly not what you want.)
|
||||
@@ -170,12 +170,12 @@ td a:link, td a:visited{border:0}
|
||||
else:
|
||||
raise AttributeError</code>
|
||||
|
||||
<samp class=p>>>> </samp><kbd>dyn = SuperDynamo()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span class=u>①</span></a>
|
||||
<samp>'PapayaWhip'</samp>
|
||||
<samp class=p>>>> </samp><kbd>dyn.color = 'LemonChiffon'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>dyn.color</kbd> <span class=u>②</span></a>
|
||||
<samp>'PapayaWhip'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>dyn = SuperDynamo()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>dyn.color</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'PapayaWhip'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>dyn.color = 'LemonChiffon'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>dyn.color</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'PapayaWhip'</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>__getattribute__()</code> method is called to provide a value for <var>dyn.color</var>.
|
||||
<li>Even after explicitly setting <var>dyn.color</var>, the <code>__getattribute__()</code> method <em>is still called</em> to provide a value for <var>dyn.color</var>. If present, the <code>__getattribute__()</code> method <em>is called unconditionally</em> for every attribute and method lookup, even for attributes that you explicitly set after creating an instance.
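<p>The difference between the two hooks can be reproduced with a pair of small classes. This is a sketch under stated assumptions: <code>LazyColor</code> and <code>FixedColor</code> are hypothetical names, and unlike the example above, <code>FixedColor</code> delegates unknown names to <code>object.__getattribute__()</code> rather than raising <code>AttributeError</code>, so that ordinary lookups keep working.

```python
class LazyColor:
    """__getattr__() is consulted only when normal attribute lookup fails."""
    def __getattr__(self, key):
        if key == 'color':
            return 'PapayaWhip'
        raise AttributeError(key)

class FixedColor:
    """__getattribute__() intercepts every lookup, even for attributes set later."""
    def __getattribute__(self, key):
        if key == 'color':
            return 'PapayaWhip'
        # Hand everything else to the default lookup machinery.
        return object.__getattribute__(self, key)

lazy = LazyColor()
before = lazy.color             # not found normally, so __getattr__() supplies it
lazy.color = 'LemonChiffon'     # stored on the instance; __getattr__() is now bypassed
after = lazy.color

fixed = FixedColor()
fixed.color = 'LemonChiffon'    # stored as well, but __getattribute__() still answers first
still = fixed.color
stored = fixed.__dict__['color']
```

<p>The assignment to <code>fixed.color</code> is not lost; the value sits in the instance dictionary, but attribute access never reaches it.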
|
||||
@@ -194,8 +194,8 @@ td a:link, td a:visited{border:0}
|
||||
def swim(self):
|
||||
pass</code>
|
||||
|
||||
<samp class=p>>>> </samp><kbd>hero = Rastan()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>hero.swim()</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>hero = Rastan()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>hero.swim()</kbd> <span class=u>②</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "<stdin>", line 3, in __getattribute__
|
||||
@@ -361,10 +361,10 @@ class FieldStorage:
|
||||
<p>Using the appropriate special methods, you can define your own classes that act like numbers. That is, you can add them, subtract them, and perform other mathematical operations on them. This is how <a href=advanced-classes.html#implementing-fractions><dfn>fractions</dfn> are implemented</a> — the <code><dfn>Fraction</dfn></code> class implements these special methods, then you can do things like this:
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from fractions import Fraction</kbd>
|
||||
<samp class=p>>>> </samp><kbd>x = Fraction(1, 3)</kbd>
|
||||
<samp class=p>>>> </samp><kbd>x / 3</kbd>
|
||||
<samp>Fraction(1, 9)</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from fractions import Fraction</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x = Fraction(1, 3)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x / 3</kbd>
|
||||
<samp class=pp>Fraction(1, 9)</samp></pre>
|
||||
|
||||
<p>Here is the comprehensive list of special methods you need to implement a number-like class.
|
||||
|
||||
@@ -430,10 +430,10 @@ class FieldStorage:
|
||||
<p>That’s all well and good if <var>x</var> is an instance of a class that implements those methods. But what if it doesn’t implement one of them? Or worse, what if it implements it, but it can’t handle certain kinds of arguments? For example:
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from fractions import Fraction</kbd>
|
||||
<samp class=p>>>> </samp><kbd>x = Fraction(1, 3)</kbd>
|
||||
<samp class=p>>>> </samp><kbd>1 / x</kbd>
|
||||
<samp>Fraction(3, 1)</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from fractions import Fraction</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>x = Fraction(1, 3)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>1 / x</kbd>
|
||||
<samp class=pp>Fraction(3, 1)</samp></pre>
|
||||
|
||||
<p>This is <em>not</em> a case of taking a <code>Fraction</code> and dividing it by an integer (as in the previous example). That case was straightforward: <code>x / 3</code> calls <code>x.__truediv__(3)</code>, and the <code>__truediv__()</code> method of the <code>Fraction</code> class handles all the math. But integers don’t “know” how to do arithmetic operations with fractions. So why does this example work?
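<p>The general mechanism behind this is Python’s reflected operator methods: when the left operand’s method returns <code>NotImplemented</code>, Python tries the right operand’s reflected method, which for <code>/</code> is <code>__rtruediv__()</code>. The sketch below shows this with a toy class; <code>Quantity</code> is a hypothetical name and is much simpler than the real <code>Fraction</code> class.

```python
class Quantity:
    """Hypothetical toy number that supports division from either side."""
    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Handles: Quantity / number
        return Quantity(self.value / other)

    def __rtruediv__(self, other):
        # Handles: number / Quantity, tried only after the number's own
        # __truediv__() has returned NotImplemented.
        return Quantity(other / self.value)

x = Quantity(0.5)
left = x / 2                        # calls x.__truediv__(2)
right = 1 / x                       # falls back to x.__rtruediv__(1)
int_attempt = (1).__truediv__(x)    # the integer declines to handle it
```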
|
||||
|
||||
|
||||
+91
-91
@@ -84,13 +84,13 @@ My alphabet starts where your alphabet ends! <span class=u>❞</span><br>&m
|
||||
<p>In Python 3, all strings are sequences of Unicode characters. There is no such thing as a Python string encoded in UTF-8, or a Python string encoded as CP-1252. “Is this string UTF-8?” is an invalid question. UTF-8 is a way of encoding characters as a sequence of bytes. If you want to take a string and turn it into a sequence of bytes in a particular character encoding, Python 3 can help you with that. If you want to take a sequence of bytes and turn it into a string, Python 3 can help you with that too. Bytes are not characters; bytes are bytes. Characters are an abstraction. A string is a sequence of those abstractions.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>s = '深入 Python'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>len(s)</kbd> <span class=u>②</span></a>
|
||||
<samp>9</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s[0]</kbd> <span class=u>③</span></a>
|
||||
<samp>'深'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s + ' 3'</kbd> <span class=u>④</span></a>
|
||||
<samp>'深入 Python 3'</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s = '深入 Python'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(s)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>9</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s[0]</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'深'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s + ' 3'</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'深入 Python 3'</samp></pre>
|
||||
<ol>
|
||||
<li>To create a string, enclose it in quotes. Python strings can be defined with either single quotes (<code>'</code>) or double quotes (<code>"</code>).<!--"-->
|
||||
<li>The built-in <code><dfn>len</dfn>()</code> function returns the length of the string, <i>i.e.</i> the number of characters. This is the same function you use to <a href=native-datatypes.html#extendinglists>find the length of a list</a>. A string is like a list of characters.
|
||||
@@ -141,10 +141,10 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<p>Python 3 supports <dfn>formatting</dfn> values into strings. Although this can include very complicated expressions, the most basic usage is to insert a value into a string with a single placeholder.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>username = 'mark'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>password = 'PapayaWhip'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>"{0}'s password is {1}".format(username, password)</kbd> <span class=u>②</span></a>
|
||||
<samp>"mark's password is PapayaWhip"</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>username = 'mark'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>password = 'PapayaWhip'</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>"{0}'s password is {1}".format(username, password)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>"mark's password is PapayaWhip"</samp></pre>
|
||||
<ol>
|
||||
<li>No, my password is not really <kbd>PapayaWhip</kbd>.
|
||||
<li>There’s a lot going on here. First, that’s a method call on a string literal. <em>Strings are objects</em>, and objects have methods. Second, the whole expression evaluates to a string. Third, <code>{0}</code> and <code>{1}</code> are <i>replacement fields</i>, which are replaced by the arguments passed to the <code><dfn>format</dfn>()</code> method.
|
||||
@@ -155,12 +155,12 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<p>The previous example shows the simplest case, where the replacement fields are simply integers. Integer replacement fields are treated as positional indices into the argument list of the <code>format()</code> method. That means that <code>{0}</code> is replaced by the first argument (<var>username</var> in this case), <code>{1}</code> is replaced by the second argument (<var>password</var>), <i class=baa>&</i>c. You can have as many positional indices as you have arguments, and you can have as many arguments as you want. But replacement fields are much more powerful than that.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import humansize</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>si_suffixes = humansize.SUFFIXES[1000]</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>si_suffixes</kbd>
|
||||
<samp>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>'1000{0[0]} = 1{0[1]}'.format(si_suffixes)</kbd> <span class=u>②</span></a>
|
||||
<samp>'1000KB = 1MB'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import humansize</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>si_suffixes = humansize.SUFFIXES[1000]</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>si_suffixes</kbd>
|
||||
<samp class=pp>['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>'1000{0[0]} = 1{0[1]}'.format(si_suffixes)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'1000KB = 1MB'</samp>
|
||||
</pre>
|
||||
<ol>
|
||||
<li>Rather than calling any function in the <code>humansize</code> module, you’re just grabbing one of the data structures it defines: the list of “SI” (powers-of-1000) suffixes.
|
||||
@@ -181,10 +181,10 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<p>Just to blow your mind, here’s an example that combines all of the above:
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import humansize</kbd>
|
||||
<samp class=p>>>> </samp><kbd>import sys</kbd>
|
||||
<samp class=p>>>> </samp><kbd>'1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(sys)</kbd>
|
||||
<samp>'1MB = 1000KB'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import humansize</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import sys</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>'1MB = 1000{0.modules[humansize].SUFFIXES[1000][0]}'.format(sys)</kbd>
|
||||
<samp class=pp>'1MB = 1000KB'</samp></pre>
|
||||
|
||||
<p>Here’s how it works:
|
||||
|
||||
@@ -213,8 +213,8 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<p>Within a replacement field, a colon (<code>:</code>) marks the start of the format specifier. The format specifier “<code>.1</code>” means “round to the nearest tenth” (<i>i.e.</i> display only one digit after the decimal point). The format specifier “<code>f</code>” means “fixed-point number” (as opposed to exponential notation or some other decimal representation). Thus, given a <var>size</var> of <code>698.24</code> and <var>suffix</var> of <code>'GB'</code>, the formatted string would be <code>'698.2 GB'</code>, because <code>698.24</code> gets rounded to one decimal place, then the suffix is appended after the number.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>'{0:.1f} {1}'.format(698.24, 'GB')</kbd>
|
||||
<samp>'698.2 GB'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>'{0:.1f} {1}'.format(698.24, 'GB')</kbd>
|
||||
<samp class=pp>'698.2 GB'</samp></pre>
|
||||
|
||||
<p>For all the gory details on format specifiers, consult the <a href=http://docs.python.org/3.0/library/string.html#format-specification-mini-language>Format Specification Mini-Language</a> in the official Python documentation.
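<p>A few more format specifiers, as a short sketch. The values and variable names are illustrative assumptions; the specifiers themselves (precision, alignment with width, and percentage) are part of the format specification mini-language.

```python
size, suffix = 698.24, 'GB'

# .1f: fixed-point notation with one digit after the decimal point
one_decimal = '{0:.1f} {1}'.format(size, suffix)

# .2f: the same idea with two digits
two_decimals = '{0:.2f}'.format(3.14159)

# >8.2f: right-align the result in a field eight characters wide
padded = '{0:>8.2f}'.format(3.14159)

# .1%: multiply by 100, show one decimal place, and append a percent sign
percent = '{0:.1%}'.format(0.256)
```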
|
||||
|
||||
@@ -229,18 +229,18 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
|
||||
<samp class=p>... </samp><kbd>sult of years of scientif-</kbd>
|
||||
<samp class=p>... </samp><kbd>ic study combined with the</kbd>
|
||||
<samp class=p>... </samp><kbd>experience of years.'''</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>s.splitlines()</kbd> <span class=u>②</span></a>
|
||||
<samp>['Finished files are the re-',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.splitlines()</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>['Finished files are the re-',
|
||||
'sult of years of scientif-',
|
||||
'ic study combined with the',
|
||||
'experience of years.']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(s.lower())</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(s.lower())</kbd> <span class=u>③</span></a>
|
||||
<samp>finished files are the re-
|
||||
sult of years of scientif-
|
||||
ic study combined with the
|
||||
experience of years.</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s.lower().count('f')</kbd> <span class=u>④</span></a>
|
||||
<samp>6</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.lower().count('f')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>6</samp></pre>
|
||||
<ol>
|
||||
<li>You can input <dfn>multiline</dfn> strings in the Python interactive shell. Once you start a multiline string with triple quotation marks, just hit <kbd>ENTER</kbd> and the interactive shell will prompt you to continue the string. Typing the closing triple quotation marks ends the string, and the next <kbd>ENTER</kbd> will execute the command (in this case, assigning the string to <var>s</var>).
|
||||
<li>The <code><dfn>splitlines</dfn>()</code> method takes one multiline string and returns a list of strings, one for each line of the original. Note that the line-ending characters at the end of each line are not included.
|
||||
@@ -251,16 +251,16 @@ experience of years.</samp>
|
||||
<p>Here’s another common case. Let’s say you have a list of key-value pairs in the form <code><var>key1</var>=<var>value1</var>&<var>key2</var>=<var>value2</var></code>, and you want to split them up and make a dictionary of the form <code>{key1: value1, key2: value2}</code>.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>query = 'user=pilgrim&database=master&password=PapayaWhip'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list = query.split('&')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list</kbd>
|
||||
<samp>['user=pilgrim', 'database=master', 'password=PapayaWhip']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_list_of_lists = [v.split('=', 1) for v in a_list]</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_list_of_lists</kbd>
|
||||
<samp>[['user', 'pilgrim'], ['database', 'master'], ['password', 'PapayaWhip']]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>a_dict = dict(a_list_of_lists)</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>a_dict</kbd>
|
||||
<samp>{'password': 'PapayaWhip', 'user': 'pilgrim', 'database': 'master'}</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>query = 'user=pilgrim&database=master&password=PapayaWhip'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list = query.split('&')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list</kbd>
|
||||
<samp class=pp>['user=pilgrim', 'database=master', 'password=PapayaWhip']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_list_of_lists = [v.split('=', 1) for v in a_list]</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_list_of_lists</kbd>
|
||||
<samp class=pp>[['user', 'pilgrim'], ['database', 'master'], ['password', 'PapayaWhip']]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_dict = dict(a_list_of_lists)</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_dict</kbd>
|
||||
<samp class=pp>{'password': 'PapayaWhip', 'user': 'pilgrim', 'database': 'master'}</samp></pre>
|
||||
|
||||
<ol>
|
||||
<li>The <code><dfn>split</dfn>()</code> string method takes one argument, a delimiter, and splits a string into a list of strings based on the delimiter. Here, the delimiter is an ampersand character, but it could be anything.
|
||||
@@ -275,21 +275,21 @@ experience of years.</samp>
|
||||
<p><dfn>Bytes</dfn> are bytes; characters are an abstraction. An immutable sequence of Unicode characters is called a <i>string</i>. An immutable sequence of numbers-between-0-and-255 is called a <i>bytes</i> object.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>by = b'abcd\x65'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>by</kbd>
|
||||
<samp>b'abcde'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>type(by)</kbd> <span class=u>②</span></a>
|
||||
<samp><class 'bytes'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(by)</kbd> <span class=u>③</span></a>
|
||||
<samp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by += b'\xff'</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>by</kbd>
|
||||
<samp>b'abcde\xff'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(by)</kbd> <span class=u>⑤</span></a>
|
||||
<samp>6</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by[0]</kbd> <span class=u>⑥</span></a>
|
||||
<samp>97</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by[0] = 102</kbd> <span class=u>⑦</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by = b'abcd\x65'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by</kbd>
|
||||
<samp class=pp>b'abcde'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>type(by)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><class 'bytes'></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(by)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by += b'\xff'</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by</kbd>
|
||||
<samp class=pp>b'abcde\xff'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(by)</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>6</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by[0]</kbd> <span class=u>⑥</span></a>
|
||||
<samp class=pp>97</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by[0] = 102</kbd> <span class=u>⑦</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: 'bytes' object does not support item assignment</samp></pre>
|
||||
@@ -304,15 +304,15 @@ TypeError: 'bytes' object does not support item assignment</samp></pre>
|
||||
</ol>
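<p>The distinction between characters and bytes shows up as soon as a string contains non-ASCII text. This sketch encodes a string to UTF-8 and back, then pokes at the resulting <code>bytes</code> object; the variable names are illustrative.

```python
text = '深入 Python'

# Encoding turns a sequence of characters into a sequence of bytes.
data = text.encode('utf-8')
char_count = len(text)      # counts characters
byte_count = len(data)      # counts bytes; each Chinese character takes three in UTF-8

# Decoding with the same encoding restores the original string.
round_trip = data.decode('utf-8')

# Indexing a bytes object gives an integer; slicing gives another bytes object.
first_byte = data[0]
first_slice = data[:1]

# A bytes object is immutable, so item assignment fails.
try:
    data[0] = 102
except TypeError as exc:
    error = str(exc)
```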
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>by = b'abcd\x65'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>barr = bytearray(by)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>barr</kbd>
|
||||
<samp>bytearray(b'abcde')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(barr)</kbd> <span class=u>②</span></a>
|
||||
<samp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>barr[0] = 102</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>barr</kbd>
|
||||
<samp>bytearray(b'fbcde')</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by = b'abcd\x65'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>barr = bytearray(by)</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>barr</kbd>
|
||||
<samp class=pp>bytearray(b'abcde')</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(barr)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>5</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>barr[0] = 102</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>barr</kbd>
|
||||
<samp class=pp>bytearray(b'fbcde')</samp></pre>
|
||||
<ol>
|
||||
<li>To convert a <code>bytes</code> object into a mutable <code>bytearray</code> object, use the built-in <code>bytearray()</code> function.
|
||||
<li>All the methods and operations you can do on a <code>bytes</code> object, you can do on a <code>bytearray</code> object too.
|
||||
@@ -322,18 +322,18 @@ TypeError: 'bytes' object does not support item assignment</samp></pre>
|
||||
<p>The one thing you <em>can never do</em> is mix bytes and strings.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>by = b'd'</kbd>
|
||||
<samp class=p>>>> </samp><kbd>s = 'abcde'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>by + s</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by = b'd'</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>s = 'abcde'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by + s</kbd> <span class=u>①</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: can't concat bytes to str</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s.count(by)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.count(by)</kbd> <span class=u>②</span></a>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
TypeError: Can't convert 'bytes' object to str implicitly</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>s.count(by.decode('ascii'))</kbd> <span class=u>③</span></a>
|
||||
<samp>1</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>s.count(by.decode('ascii'))</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>1</samp></pre>
|
||||
<ol>
|
||||
<li>You can’t concatenate bytes and strings. They are two different data types.
|
||||
<li>You can’t count the occurrences of bytes in a string, because there are no bytes in a string. A string is a sequence of characters. Perhaps you meant “count the occurrences of the string that you would get after decoding this sequence of bytes in a particular character encoding”? Well then, you’ll need to say that explicitly. Python 3 won’t <dfn>implicitly</dfn> convert bytes to strings or strings to bytes.
|
||||
@@ -343,29 +343,29 @@ TypeError: Can't convert 'bytes' object to str implicitly</samp>
|
||||
<p>And here is the link between strings and bytes: <code>bytes</code> objects have a <code><dfn>decode</dfn>()</code> method that takes a character encoding and returns a string, and strings have an <code><dfn>encode</dfn>()</code> method that takes a character encoding and returns a <code>bytes</code> object. In the previous example, the decoding was relatively straightforward: converting a sequence of bytes in the <abbr>ASCII</abbr> encoding into a string of characters. But the same process works with any encoding that supports the characters of the string, even legacy (non-Unicode) encodings.
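<p>The phrase "any encoding that supports the characters of the string" is the important qualifier. As a rough sketch (the helper name <code>roundtrips</code> is hypothetical, not something from the standard library), you could check which encodings can carry a given string there and back again:

```python
a_string = '深入 Python'

def roundtrips(text, encoding):
    """Return True if text survives encode() followed by decode()."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The encoding has no byte sequence for some character in text.
        return False

results = {enc: roundtrips(a_string, enc)
           for enc in ('utf-8', 'gb18030', 'big5', 'ascii')}
print(results)
```

<p>Here <abbr>ASCII</abbr> is the odd one out: it has no way to represent the two Chinese characters, so <code>encode()</code> raises <code>UnicodeEncodeError</code> instead of guessing.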
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>a_string = '深入 Python'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>len(a_string)</kbd>
|
||||
<samp>9</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by = a_string.encode('utf-8')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>by</kbd>
|
||||
<samp>b'\xe6\xb7\xb1\xe5\x85\xa5 Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(by)</kbd>
|
||||
<samp>13</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by = a_string.encode('gb18030')</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd>by</kbd>
|
||||
<samp>b'\xc9\xee\xc8\xeb Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(by)</kbd>
|
||||
<samp>11</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>by = a_string.encode('big5')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>by</kbd>
|
||||
<samp>b'\xb2`\xa4J Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd>len(by)</kbd>
|
||||
<samp>11</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>roundtrip = by.decode('big5')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd>roundtrip</kbd>
|
||||
<samp>'深入 Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd>a_string == roundtrip</kbd>
|
||||
<samp>True</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>a_string = '深入 Python'</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(a_string)</kbd>
|
||||
<samp class=pp>9</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by = a_string.encode('utf-8')</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by</kbd>
|
||||
<samp class=pp>b'\xe6\xb7\xb1\xe5\x85\xa5 Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(by)</kbd>
|
||||
<samp class=pp>13</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by = a_string.encode('gb18030')</kbd> <span class=u>③</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by</kbd>
|
||||
<samp class=pp>b'\xc9\xee\xc8\xeb Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(by)</kbd>
|
||||
<samp class=pp>11</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>by = a_string.encode('big5')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>by</kbd>
|
||||
<samp class=pp>b'\xb2`\xa4J Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>len(by)</kbd>
|
||||
<samp class=pp>11</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>roundtrip = by.decode('big5')</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roundtrip</kbd>
|
||||
<samp class=pp>'深入 Python'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>a_string == roundtrip</kbd>
|
||||
<samp class=pp>True</samp></pre>
|
||||
<ol>
|
||||
<li>This is a string. It has nine characters.
|
||||
<li>This is a <code>bytes</code> object. It has 13 bytes. It is the sequence of bytes you get when you take <var>a_string</var> and encode it in UTF-8.
|
||||
|
||||
+19
-19
@@ -202,8 +202,8 @@ while n >= integer:
|
||||
print('subtracting {0} from input, adding {1} to output'.format(integer, numeral))</code></pre>
|
||||
<p>With the debug <code>print()</code> statements, the output looks like this:
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import roman1</kbd>
|
||||
<samp class=p>>>> </samp><kbd>roman1.to_roman(1424)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import roman1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roman1.to_roman(1424)</kbd>
|
||||
<samp>subtracting 1000 from input, adding M to output
|
||||
subtracting 400 from input, adding CD to output
|
||||
subtracting 10 from input, adding X to output
|
||||
@@ -229,13 +229,13 @@ OK</samp></pre>
|
||||
<aside>The Pythonic way to halt and catch fire is to raise an exception.</aside>
|
||||
<p>It is not enough to test that functions succeed when given good input; you must also test that they fail when given bad input. And not just any sort of failure; they must fail in the way you expect.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import roman1</kbd>
|
||||
<samp class=p>>>> </samp><kbd>roman1.to_roman(4000)</kbd>
|
||||
<samp>'MMMM'</samp>
|
||||
<samp class=p>>>> </samp><kbd>roman1.to_roman(5000)</kbd>
|
||||
<samp>'MMMMM'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>roman1.to_roman(9000)</kbd> <span class=u>①</span></a>
|
||||
<samp>'MMMMMMMMM'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import roman1</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roman1.to_roman(4000)</kbd>
|
||||
<samp class=pp>'MMMM'</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roman1.to_roman(5000)</kbd>
|
||||
<samp class=pp>'MMMMM'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>roman1.to_roman(9000)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'MMMMMMMMM'</samp></pre>
|
||||
<ol>
|
||||
<li>That’s definitely not what you wanted — that’s not even a valid Roman numeral! In fact, each of these numbers is outside the range of acceptable input, but the function returns a bogus value anyway. Silently returning bad values is <em>baaaaaaad</em>; if a program is going to fail, it is far better that it fail quickly and noisily. “Halt and catch fire,” as the saying goes. The Pythonic way to halt and catch fire is to raise an exception.
|
||||
</ol>
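<p>To make "halt and catch fire" concrete, here is one way the range check might look. This is a sketch under assumptions, not the book's own listing: the exception name <code>OutOfRangeError</code>, the table name <code>ROMAN_NUMERAL_MAP</code>, and the upper limit of 3999 are chosen for illustration, based only on the note that 4000 and above are outside the acceptable range.

```python
class OutOfRangeError(ValueError):
    """Hypothetical exception for input that cannot be a Roman numeral."""

# (numeral, integer) pairs, largest first; an assumed lookup table.
ROMAN_NUMERAL_MAP = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))

def to_roman(n):
    """Convert an integer to a Roman numeral, failing loudly on large input."""
    if n > 3999:
        raise OutOfRangeError('number out of range (must be less than 4000)')
    result = ''
    for numeral, integer in ROMAN_NUMERAL_MAP:
        while n >= integer:
            result += numeral
            n -= integer
    return result

print(to_roman(1424))
try:
    to_roman(9000)
except OutOfRangeError as exc:
    print('refused:', exc)
```

<p>The check comes first, before any output is built, so a caller never receives a half-plausible string like <code>'MMMMMMMMM'</code>.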
|
||||
@@ -344,11 +344,11 @@ OK</samp></pre>
|
||||
<p>Along with testing numbers that are too large, you need to test numbers that are too small. As <a href=#divingin>we noted in our functional requirements</a>, Roman numerals cannot express <code>0</code> or negative numbers.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import roman2</kbd>
|
||||
<samp class=p>>>> </samp><kbd>roman2.to_roman(0)</kbd>
|
||||
<samp>''</samp>
|
||||
<samp class=p>>>> </samp><kbd>roman2.to_roman(-1)</kbd>
|
||||
<samp>''</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import roman2</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roman2.to_roman(0)</kbd>
|
||||
<samp class=pp>''</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>roman2.to_roman(-1)</kbd>
|
||||
<samp class=pp>''</samp></pre>
|
||||
|
||||
<p>Well <em>that’s</em> not good. Let’s add tests for each of these conditions.
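<p>Such tests might look roughly like the sketch below. Everything named here is an assumption made for the sake of a self-contained example: the <code>OutOfRangeError</code> exception, the <code>ToRomanBadInput</code> test case, and the stand-in <code>to_roman()</code>, which already contains the range check that the tests demand.

```python
import unittest

class OutOfRangeError(ValueError):
    """Hypothetical exception; the name is an assumption for this sketch."""

ROMAN_NUMERAL_MAP = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))

def to_roman(n):
    """Stand-in converter that rejects anything outside 1..3999."""
    if not (0 < n < 4000):
        raise OutOfRangeError('number out of range (must be 1..3999)')
    result = ''
    for numeral, integer in ROMAN_NUMERAL_MAP:
        while n >= integer:
            result += numeral
            n -= integer
    return result

class ToRomanBadInput(unittest.TestCase):
    def test_zero(self):
        """to_roman should fail with 0 input"""
        self.assertRaises(OutOfRangeError, to_roman, 0)

    def test_negative(self):
        """to_roman should fail with negative input"""
        self.assertRaises(OutOfRangeError, to_roman, -1)

# Run the two tests programmatically instead of calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToRomanBadInput)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

<p>Against a converter that lacks the range check, both tests would fail, because the empty string comes back instead of an exception.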
|
||||
|
||||
@@ -441,11 +441,11 @@ OK</samp></pre>
|
||||
<p>There was one more <a href=#divingin>functional requirement</a> for converting numbers to Roman numerals: dealing with non-integers.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import roman3</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>roman3.to_roman(0.5)</kbd> <span class=u>①</span></a>
|
||||
<samp>''</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>roman3.to_roman(1.5)</kbd> <span class=u>②</span></a>
|
||||
<samp>'I'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import roman3</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>roman3.to_roman(0.5)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>''</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>roman3.to_roman(1.5)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'I'</samp></pre>
|
||||
<ol>
|
||||
<li>Oh, that’s bad.
|
||||
<li>Oh, that’s even worse. Both of these cases should raise an exception. Instead, they give bogus results.
|
||||
|
||||
@@ -254,11 +254,11 @@ mark{display:inline}
|
||||
|
||||
<p class=d>[<a href=examples/feed.xml>download <code>feed.xml</code></a>]
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>root</kbd> <span class=u>④</span></a>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}feed at cd1eb0></samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import xml.etree.ElementTree as etree</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree = etree.parse('examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root = tree.getroot()</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><Element {http://www.w3.org/2005/Atom}feed at cd1eb0></samp></pre>
|
||||
<ol>
|
||||
<li>The ElementTree library is part of the Python standard library, in <code>xml.etree.ElementTree</code>.
|
||||
<li>The primary entry point for the ElementTree library is the <code>parse()</code> function, which can take a filename or a file-like object [FIXME xref]. This function parses the entire document at once. If memory is tight, there are ways to <a href=http://effbot.org/zone/element-iterparse.htm>parse an <abbr>XML</abbr> document incrementally instead</a>.
|
||||
@@ -276,14 +276,14 @@ mark{display:inline}
|
||||
|
||||
<pre class=screen>
|
||||
# continued from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>root.tag</kbd> <span class=u>①</span></a>
|
||||
<samp>'{http://www.w3.org/2005/Atom}feed'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>len(root)</kbd> <span class=u>②</span></a>
|
||||
<samp>8</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>for child in root:</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>... </samp><kbd> print(child)</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.tag</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'{http://www.w3.org/2005/Atom}feed'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>len(root)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>8</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>for child in root:</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>... </samp><kbd class=pp> print(child)</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>... </samp>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}title at e2b5d0>
|
||||
<samp class=pp><Element {http://www.w3.org/2005/Atom}title at e2b5d0>
|
||||
<Element {http://www.w3.org/2005/Atom}subtitle at e2b4e0>
|
||||
<Element {http://www.w3.org/2005/Atom}id at e2b6c0>
|
||||
<Element {http://www.w3.org/2005/Atom}updated at e2b6f0>
|
||||
@@ -306,18 +306,18 @@ mark{display:inline}
|
||||
|
||||
<pre class=screen>
|
||||
# continuing from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>root.attrib</kbd> <span class=u>①</span></a>
|
||||
<samp>{'{http://www.w3.org/XML/1998/namespace}lang': 'en'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[4]</kbd> <span class=u>②</span></a>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}link at e181b0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[4].attrib</kbd> <span class=u>③</span></a>
|
||||
<samp>{'href': 'http://diveintomark.org/',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.attrib</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>{'{http://www.w3.org/XML/1998/namespace}lang': 'en'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root[4]</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp><Element {http://www.w3.org/2005/Atom}link at e181b0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root[4].attrib</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>{'href': 'http://diveintomark.org/',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[3]</kbd> <span class=u>④</span></a>
|
||||
<samp><Element {http://www.w3.org/2005/Atom}updated at e2b4e0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root[3].attrib</kbd> <span class=u>⑤</span></a>
|
||||
<samp>{}</samp></pre>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root[3]</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><Element {http://www.w3.org/2005/Atom}updated at e2b4e0></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root[3].attrib</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>{}</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>attrib</code> property is a dictionary of the element’s attributes. The original markup here was <code><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'></code>. The <code>xml:</code> prefix refers to a built-in namespace that every <abbr>XML</abbr> document can use without declaring it.
|
||||
<li>The fifth child — <code>[4]</code> in a <code>0</code>-based list — is the <code>link</code> element.
|
||||
@@ -333,19 +333,19 @@ mark{display:inline}
|
||||
<p>So far, we’ve worked with this <abbr>XML</abbr> document “from the top down,” starting with the root element, getting its child elements, and so on throughout the document. But many uses of <abbr>XML</abbr> require you to find specific elements. Etree can do that, too.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>①</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<samp class=p>>>> </samp><kbd class=pp>import xml.etree.ElementTree as etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree = etree.parse('examples/feed.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>root = tree.getroot()</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b510>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp>
|
||||
<samp class=p>>>> </samp><kbd>root.tag</kbd>
|
||||
<samp>'{http://www.w3.org/2005/Atom}feed'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}feed')</kbd> <span class=u>②</span></a>
|
||||
<samp>[]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span class=u>③</span></a>
|
||||
<samp>[]</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>root.tag</kbd>
|
||||
<samp class=pp>'{http://www.w3.org/2005/Atom}feed'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.findall('{http://www.w3.org/2005/Atom}feed')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>[]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>[]</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>findall()</code> method finds child elements that match a specific query. (More on the query format in a minute.)
|
||||
<li>Each element — including the root element, but also child elements — has a <code>findall()</code> method. It finds all matching elements among the element’s children. But why aren’t there any results? Although it may not be obvious, this particular query only searches the element’s children. Since the root <code>feed</code> element has no child named <code>feed</code>, this query returns an empty list.
|
||||
@@ -353,12 +353,12 @@ mark{display:inline}
|
||||
</ol>
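<p>The children-only behavior is easy to reproduce on a tiny document. The sketch below does not use the book's <code>feed.xml</code>; it parses a made-up three-line Atom fragment from a string, and the names <code>ATOM</code> and <code>xml_text</code> are assumptions for illustration.

```python
import xml.etree.ElementTree as etree

ATOM = '{http://www.w3.org/2005/Atom}'

# A tiny stand-in document, not the book's feed.xml.
xml_text = """<feed xmlns='http://www.w3.org/2005/Atom'>
  <title>example</title>
  <entry><author><name>Mark</name></author></entry>
  <entry><author><name>Mark</name></author></entry>
</feed>"""
root = etree.fromstring(xml_text)

entries = root.findall(ATOM + 'entry')    # direct children: found
feeds = root.findall(ATOM + 'feed')       # the root is not its own child
authors = root.findall(ATOM + 'author')   # grandchildren: not searched
print(len(entries), len(feeds), len(authors))
```

<p>Only the <code>entry</code> query finds anything, because <code>entry</code> is the only one of the three tags that appears directly under the root.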
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>①</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b510>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span class=u>②</span></a>
|
||||
<samp>[]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree.findall('{http://www.w3.org/2005/Atom}author')</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>[]</samp>
|
||||
</pre>
|
||||
<ol>
|
||||
<li>For convenience, the <code>tree</code> object (returned from the <code>etree.parse()</code> function) has several methods that mirror the methods on the root element. The results are the same as if you had called the <code>tree.getroot().findall()</code> method.
|
||||
@@ -368,26 +368,26 @@ mark{display:inline}
|
||||
<p>There <em>is</em> a way to search for <em>descendant</em> elements, <i>i.e.</i> children, grandchildren, and any element at any nesting level.
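<p>As a self-contained sketch of the difference between a child search and a descendant search, the snippet below uses a made-up Atom fragment (not the book's <code>feed.xml</code>) and the <code>.//</code> prefix, which is how the current ElementTree documentation spells "anywhere below this element." The names <code>ATOM</code>, <code>xml_text</code>, <code>child_links</code>, and <code>lazy_hrefs</code> are assumptions for illustration.

```python
import xml.etree.ElementTree as etree

ATOM = '{http://www.w3.org/2005/Atom}'

# A tiny stand-in document with one feed-level link and two entry-level links.
xml_text = """<feed xmlns='http://www.w3.org/2005/Atom'>
  <link href='http://example.org/'/>
  <entry><link href='http://example.org/1'/></entry>
  <entry><link href='http://example.org/2'/></entry>
</feed>"""
root = etree.fromstring(xml_text)

# Children only: just the feed-level link.
child_links = root.findall(ATOM + 'link')

# Descendants at any depth, in document order.
all_links = root.findall('.//' + ATOM + 'link')
hrefs = [link.attrib['href'] for link in all_links]

# iter() walks the same elements lazily instead of building a list first.
lazy_hrefs = [link.get('href') for link in root.iter(ATOM + 'link')]
print(len(child_links), hrefs)
```

<p>Both the list-returning query and the iterator visit the same three elements; the iterator simply avoids building the whole list up front.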
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>all_links = tree.findall('//{http://www.w3.org/2005/Atom}link')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>all_links</kbd>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}link at e181b0>,
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>all_links = tree.findall('//{http://www.w3.org/2005/Atom}link')</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>all_links</kbd>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}link at e181b0>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at e2b570>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at e2b480>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at e2b5a0>]</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>all_links[0].attrib</kbd> <span class=u>②</span></a>
|
||||
<samp>{'href': 'http://diveintomark.org/',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>all_links[0].attrib</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>{'href': 'http://diveintomark.org/',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>all_links[1].attrib</kbd> <span class=u>③</span></a>
|
||||
<samp>{'href': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>all_links[1].attrib</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>{'href': 'http://diveintomark.org/archives/2009/03/27/dive-into-history-2009-edition',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>all_links[2].attrib</kbd>
|
||||
<samp>{'href': 'http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress',
|
||||
<samp class=p>>>> </samp><kbd class=pp>all_links[2].attrib</kbd>
|
||||
<samp class=pp>{'href': 'http://diveintomark.org/archives/2009/03/21/accessibility-is-a-harsh-mistress',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp>
|
||||
<samp class=p>>>> </samp><kbd>all_links[3].attrib</kbd>
|
||||
<samp>{'href': 'http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats',
|
||||
<samp class=p>>>> </samp><kbd class=pp>all_links[3].attrib</kbd>
|
||||
<samp class=pp>{'href': 'http://diveintomark.org/archives/2008/12/18/give-part-1-container-formats',
|
||||
'type': 'text/html',
|
||||
'rel': 'alternate'}</samp></pre>
|
||||
<ol>
|
||||
@@ -400,16 +400,16 @@ mark{display:inline}
|
||||
|
||||
<pre class=screen>
|
||||
# continuing from the previous example
|
||||
<a><samp class=p>>>> </samp><kbd>it = tree.getiterator('{http://www.w3.org/2005/Atom}link')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>next(it)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>it = tree.getiterator('{http://www.w3.org/2005/Atom}link')</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>next(it)</kbd> <span class=u>②</span></a>
|
||||
<Element {http://www.w3.org/2005/Atom}link at 122f1b0>
|
||||
<samp class=p>>>> </samp><kbd>next(it)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(it)</kbd>
|
||||
<Element {http://www.w3.org/2005/Atom}link at 122f1e0>
|
||||
<samp class=p>>>> </samp><kbd>next(it)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(it)</kbd>
|
||||
<Element {http://www.w3.org/2005/Atom}link at 122f210>
|
||||
<samp class=p>>>> </samp><kbd>next(it)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(it)</kbd>
|
||||
<Element {http://www.w3.org/2005/Atom}link at 122f1b0>
|
||||
<samp class=p>>>> </samp><kbd>next(it)</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>next(it)</kbd>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
StopIteration</samp></pre>
|
||||
@@ -427,11 +427,11 @@ StopIteration</samp></pre>
|
||||
<p><a href=http://codespeak.net/lxml/><code>lxml</code></a> is an open source third-party library that builds on the popular <a href=http://www.xmlsoft.org/>libxml2 parser</a>. It provides a 100% compatible ElementTree <abbr>API</abbr>, then extends it with full XPath support and a few other niceties. There are <a href=http://pypi.python.org/pypi/lxml/>installers available for Windows</a>; Linux users should always try to use distribution-specific tools like <code>yum</code> or <code>apt-get</code> to install precompiled binaries from their repositories. Otherwise you’ll need to <a href=http://codespeak.net/lxml/installation.html>install <code>lxml</code> manually</a>.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>from lxml import etree</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>tree = etree.parse('examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>root = tree.getroot()</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>④</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>from lxml import etree</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree = etree.parse('examples/feed.xml')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root = tree.getroot()</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>root.findall('{http://www.w3.org/2005/Atom}entry')</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}entry at e2b4e0>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b510>,
|
||||
<Element {http://www.w3.org/2005/Atom}entry at e2b540>]</samp></pre>
|
||||
<ol>
|
||||
@@ -451,18 +451,18 @@ except ImportError:
|
||||
<p>But <code>lxml</code> is more than just a faster ElementTree. Its <code>findall()</code> method includes support for more complicated expressions.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import lxml.etree</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed.xml')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>tree.findall('//{http://www.w3.org/2005/Atom}*[@href]')</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import lxml.etree</kbd> <span class=u>①</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree = lxml.etree.parse('examples/feed.xml')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree.findall('//{http://www.w3.org/2005/Atom}*[@href]')</kbd> <span class=u>②</span></a>
|
||||
[<Element {http://www.w3.org/2005/Atom}link at eeb8a0>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at eeb990>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at eeb960>,
|
||||
<Element {http://www.w3.org/2005/Atom}link at eeb9c0>]
|
||||
<a><samp class=p>>>> </samp><kbd>tree.findall("//{http://www.w3.org/2005/Atom}*[@href='http://diveintomark.org/']")</kbd> <span class=u>③</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}link at eeb930>]</samp>
|
||||
<samp class=p>>>> </samp><kbd>NS = '{http://www.w3.org/2005/Atom}'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>tree.findall('//{NS}author[{NS}uri]'.format(NS=NS))</kbd> <span class=u>④</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}author at eeba80>,
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree.findall("//{http://www.w3.org/2005/Atom}*[@href='http://diveintomark.org/']")</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}link at eeb930>]</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>NS = '{http://www.w3.org/2005/Atom}'</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree.findall('//{NS}author[{NS}uri]'.format(NS=NS))</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}author at eeba80>,
|
||||
<Element {http://www.w3.org/2005/Atom}author at eebba0>]</samp></pre>
|
||||
<ol>
|
||||
<li>In this example, I’m going to <code>import lxml.etree</code> (instead of, say, <code>from lxml import etree</code>), to emphasize that these features are specific to <code>lxml</code>.
|
||||
@@ -474,16 +474,16 @@ except ImportError:
|
||||
<p>Not enough for you? <code>lxml</code> also integrates support for arbitrary XPath expressions. I’m not going to go into depth about XPath syntax; that could be a whole book unto itself! But I will show you how it integrates into <code>lxml</code>.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed.xml')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>NSMAP = {'atom': 'http://www.w3.org/2005/Atom'}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>entries = tree.xpath("//atom:category[@term='accessibility']/..",</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>... </samp><kbd> namespaces=NSMAP)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>entries</kbd> <span class=u>③</span></a>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}entry at e2b630>]</samp>
|
||||
<samp class=p>>>> </samp><kbd>entry = entries[0]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>entry.xpath('./atom:title/text()', namespaces=NSMAP)</kbd> <span class=u>④</span></a>
|
||||
<samp>['Accessibility is a harsh mistress']</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import lxml.etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree = lxml.etree.parse('examples/feed.xml')</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>NSMAP = {'atom': 'http://www.w3.org/2005/Atom'}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>entries = tree.xpath("//atom:category[@term='accessibility']/..",</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>... </samp><kbd class=pp> namespaces=NSMAP)</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>entries</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}entry at e2b630>]</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>entry = entries[0]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>entry.xpath('./atom:title/text()', namespaces=NSMAP)</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>['Accessibility is a harsh mistress']</samp></pre>
|
||||
<ol>
|
||||
<li>To perform XPath queries on namespaced elements, you need to define a namespace prefix mapping. This is just a Python dictionary.
|
||||
<li>Here is an XPath query. The XPath expression searches for <code>category</code> elements (in the Atom namespace) that contain a <code>term</code> attribute with the value <code>accessibility</code>. But that’s not actually the query result. Look at the very end of the query string; did you notice the <code>/..</code> bit? That means “and then return the parent element of the <code>category</code> element you just found.” So this single XPath query will find all entries with a child element of <code><category term='accessibility'></code>.
|
||||
@@ -498,11 +498,11 @@ except ImportError:
|
||||
<p>Python’s support for <abbr>XML</abbr> is not limited to parsing existing documents. You can also create <abbr>XML</abbr> documents from scratch.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import xml.etree.ElementTree as etree</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>new_feed = etree.Element('{http://www.w3.org/2005/Atom}feed',</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd> attrib={'{http://www.w3.org/XML/1998/namespace}lang': 'en'})</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>print(etree.tostring(new_feed))</kbd> <span class=u>③</span></a>
|
||||
<samp><ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import xml.etree.ElementTree as etree</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>new_feed = etree.Element('{http://www.w3.org/2005/Atom}feed',</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd class=pp> attrib={'{http://www.w3.org/XML/1998/namespace}lang': 'en'})</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(etree.tostring(new_feed))</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><ns0:feed xmlns:ns0='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
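<p>Writing out <code>{<var>namespace</var>}<var>localname</var></code> by hand is error-prone, so a small helper can build those names. The sketch below assumes a hypothetical helper called <code>qname()</code> (it is not part of ElementTree) and passes <code>encoding='unicode'</code> to <code>tostring()</code> so that the result is a string rather than bytes on current Python 3 releases.

```python
import xml.etree.ElementTree as etree

ATOM_NS = 'http://www.w3.org/2005/Atom'
XML_NS = 'http://www.w3.org/XML/1998/namespace'


def qname(namespace, localname):
    """Hypothetical helper: build an ElementTree-style qualified name."""
    return '{%s}%s' % (namespace, localname)


# Root element in the Atom namespace, with an xml:lang attribute.
new_feed = etree.Element(qname(ATOM_NS, 'feed'),
                         attrib={qname(XML_NS, 'lang'): 'en'})

# encoding='unicode' asks for a str instead of a bytes object.
serialized = etree.tostring(new_feed, encoding='unicode')
print(serialized)
```

<p>ElementTree chooses the prefix for the Atom namespace itself, which is why the serialized root carries a generated prefix rather than a default namespace declaration.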
|
||||
<ol>
|
||||
<li>To create a new element, instantiate the <code>Element</code> class. You pass the element name (namespace + local name) as the first argument. This statement creates a <code>feed</code> element in the Atom namespace. This will be our new document’s root element.
|
||||
<li>To add attributes to the newly created element, pass a dictionary of attribute names and values in the <var>attrib</var> argument. Note that the attribute name should be in the standard ElementTree format, <code>{<var>namespace</var>}<var>localname</var></code>.
|
||||
@@ -524,14 +524,14 @@ except ImportError:
|
||||
<p>The built-in ElementTree library does not offer this fine-grained control over serializing namespaced elements, but <code>lxml</code> does.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>NSMAP = {None: 'http://www.w3.org/2005/Atom'}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>new_feed = lxml.etree.Element('feed', nsmap=NSMAP)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd> <span class=u>③</span></a>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom'/></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>new_feed.set('{http://www.w3.org/XML/1998/namespace}lang', 'en')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import lxml.etree</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>NSMAP = {None: 'http://www.w3.org/2005/Atom'}</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>new_feed = lxml.etree.Element('feed', nsmap=NSMAP)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(new_feed))</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom'/></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>new_feed.set('{http://www.w3.org/XML/1998/namespace}lang', 'en')</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(new_feed))</kbd>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'/></samp></pre>
|
||||
<ol>
|
||||
<li>To start, define a namespace mapping as a dictionary. Dictionary values are namespaces; dictionary keys are the desired prefix. Using <code>None</code> as a prefix effectively declares a default namespace.
|
||||
<li>Now you can pass the <code>lxml</code>-specific <var>nsmap</var> argument when you create an element, and <code>lxml</code> will respect the namespace prefixes you’ve defined.
|
||||
@@ -542,15 +542,15 @@ except ImportError:
|
||||
<p>Are <abbr>XML</abbr> documents limited to one element per document? No, of course not. You can easily create child elements, too.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>title = lxml.etree.SubElement(new_feed, 'title',</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd> attrib={'type':'html'})</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'><title type='html'/></feed></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>title.text = 'dive into &hellip;'</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed))</kbd> <span class=u>④</span></a>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'><title type='html'>dive into &amp;hellip;</title></feed></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(new_feed, pretty_print=True))</kbd> <span class=u>⑤</span></a>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>title = lxml.etree.SubElement(new_feed, 'title',</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>... </samp><kbd class=pp> attrib={'type':'html'})</kbd> <span class=u>②</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(new_feed))</kbd>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'><title type='html'/></feed></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>title.text = 'dive into &hellip;'</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(new_feed))</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'><title type='html'>dive into &amp;hellip;</title></feed></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(new_feed, pretty_print=True))</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
|
||||
<title type='html'>dive into &amp;hellip;</title>
|
||||
</feed></samp></pre>
|
||||
<ol>
|
||||
@@ -583,8 +583,8 @@ except ImportError:
|
||||
<p>That’s an error, because the <code>&hellip;</code> entity is not defined in <abbr>XML</abbr>. (It is defined in <abbr>HTML</abbr>.) If you try to parse this broken feed with the default settings, <code>lxml</code> will choke on the undefined entity.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import lxml.etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed-broken.xml')</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import lxml.etree</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree = lxml.etree.parse('examples/feed-broken.xml')</kbd>
|
||||
<samp class=traceback>Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "lxml.etree.pyx", line 2693, in lxml.etree.parse (src/lxml/lxml.etree.c:52591)
|
||||
@@ -600,17 +600,17 @@ lxml.etree.XMLSyntaxError: Entity 'hellip' not defined, line 3, column 28</samp>
|
||||
<p>To parse this broken <abbr>XML</abbr> document, despite its well-formedness error, you need to create a custom <abbr>XML</abbr> parser.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>parser = lxml.etree.XMLParser(recover=True)</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>tree = lxml.etree.parse('examples/feed-broken.xml', parser)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>parser.error_log</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>parser = lxml.etree.XMLParser(recover=True)</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>tree = lxml.etree.parse('examples/feed-broken.xml', parser)</kbd> <span class=u>②</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>parser.error_log</kbd> <span class=u>③</span></a>
|
||||
<samp>examples/feed-broken.xml:3:28:FATAL:PARSER:ERR_UNDECLARED_ENTITY: Entity 'hellip' not defined</samp>
|
||||
<samp class=p>>>> </samp><kbd>tree.findall('{http://www.w3.org/2005/Atom}title')</kbd>
|
||||
<samp>[<Element {http://www.w3.org/2005/Atom}title at ead510>]</samp>
|
||||
<samp class=p>>>> </samp><kbd>title = tree.findall('{http://www.w3.org/2005/Atom}title')[0]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>title.text</kbd> <span class=u>④</span></a>
|
||||
<samp>'dive into '</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(lxml.etree.tounicode(tree.getroot()))</kbd> <span class=u>⑤</span></a>
|
||||
<samp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
|
||||
<samp class=p>>>> </samp><kbd class=pp>tree.findall('{http://www.w3.org/2005/Atom}title')</kbd>
|
||||
<samp class=pp>[<Element {http://www.w3.org/2005/Atom}title at ead510>]</samp>
|
||||
<samp class=p>>>> </samp><kbd class=pp>title = tree.findall('{http://www.w3.org/2005/Atom}title')[0]</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>title.text</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>'dive into '</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(lxml.etree.tounicode(tree.getroot()))</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp><feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
|
||||
<title>dive into </title>
|
||||
.
|
||||
. [rest of serialization snipped for brevity]
|
||||
|
||||
@@ -110,17 +110,17 @@ if __name__ == '__main__':
|
||||
<p>You can also pass values into a function by name.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>from humansize import approximate_size</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd>approximate_size(4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>①</span></a>
|
||||
<samp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>approximate_size(size=4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>②</span></a>
|
||||
<samp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>approximate_size(a_kilobyte_is_1024_bytes=False, size=4000)</kbd> <span class=u>③</span></a>
|
||||
<samp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>approximate_size(a_kilobyte_is_1024_bytes=False, 4000)</kbd> <span class=u>④</span></a>
|
||||
<samp class=p>>>> </samp><kbd class=pp>from humansize import approximate_size</kbd>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>①</span></a>
|
||||
<samp class=pp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(size=4000, a_kilobyte_is_1024_bytes=False)</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(a_kilobyte_is_1024_bytes=False, size=4000)</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp>'4.0 KB'</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(a_kilobyte_is_1024_bytes=False, 4000)</kbd> <span class=u>④</span></a>
|
||||
<samp class=traceback> File "<stdin>", line 1
|
||||
SyntaxError: non-keyword arg after keyword arg</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>approximate_size(size=4000, False)</kbd> <span class=u>⑤</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>approximate_size(size=4000, False)</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=traceback> File "<stdin>", line 1
|
||||
SyntaxError: non-keyword arg after keyword arg</samp></pre>
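<p>The calling rules in the session above can be tried without the <code>humansize</code> module. The sketch below uses <code>describe_size()</code>, a hypothetical and much simplified stand-in for <code>approximate_size()</code> that only handles a single unit, and a hypothetical <code>is_valid_call()</code> helper that checks whether a call expression compiles at all.

```python
def describe_size(size, a_kilobyte_is_1024_bytes=True):
    """Hypothetical, simplified stand-in for approximate_size()."""
    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    suffix = 'KiB' if a_kilobyte_is_1024_bytes else 'KB'
    return '{0:.1f} {1}'.format(size / multiple, suffix)


# Positional only, positional then named, and named in any order all work.
by_position = describe_size(4000, False)
mixed = describe_size(4000, a_kilobyte_is_1024_bytes=False)
by_name = describe_size(a_kilobyte_is_1024_bytes=False, size=4000)


def is_valid_call(source):
    """Return True if the call expression compiles, False on a SyntaxError."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


# A positional argument after a named one is rejected before anything runs.
bad_call_compiles = is_valid_call('describe_size(size=4000, False)')
print(by_position, mixed, by_name, bad_call_compiles)
```

<p>Because the error is raised at compile time, the function is never called; the wording of the message varies between Python versions.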
|
||||
<ol>
|
||||
@@ -163,10 +163,10 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
|
||||
<p>In case you missed it, I just said that Python functions have attributes, and that those attributes are available at runtime. A function, like everything else in Python, is an object.
|
||||
<p>Run the interactive Python shell and follow along:
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import humansize</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>print(humansize.approximate_size(4096, True))</kbd> <span class=u>②</span></a>
|
||||
<samp>4.0 KiB</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>print(humansize.approximate_size.__doc__)</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import humansize</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(humansize.approximate_size(4096, True))</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>4.0 KiB</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(humansize.approximate_size.__doc__)</kbd> <span class=u>③</span></a>
|
||||
<samp>Convert a file size to human-readable form.
|
||||
|
||||
Keyword arguments:
|
||||
@@ -188,20 +188,20 @@ SyntaxError: non-keyword arg after keyword arg</samp></pre>
|
||||
<h3 id=importsearchpath>The <code>import</code> Search Path</h3>
|
||||
<p>Before this goes any further, I want to briefly mention the library search path. Python looks in several places when you try to import a module. Specifically, it looks in all the directories defined in <code>sys.path</code>. This is just a list, and you can easily view it or modify it with standard list methods. (You’ll learn more about lists in <a href=native-datatypes.html#lists>Native Datatypes</a>.)
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd>import sys</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>sys.path</kbd> <span class=u>②</span></a>
|
||||
<samp>['',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import sys</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>sys.path</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>['',
|
||||
'/usr/lib/python30.zip',
|
||||
'/usr/lib/python3.0',
|
||||
'/usr/lib/python3.0/plat-linux2@EXTRAMACHDEPPATH@',
|
||||
'/usr/lib/python3.0/lib-dynload',
|
||||
'/usr/lib/python3.0/dist-packages',
|
||||
'/usr/local/lib/python3.0/dist-packages']</samp>
|
||||
<a><samp class=p>>>> </samp><kbd>sys</kbd> <span class=u>③</span></a>
|
||||
<samp><module 'sys' (built-in)></samp>
|
||||
<a><samp class=p>>>> </samp><kbd>sys.path.insert(0, '/home/mark/py')</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd>sys.path</kbd> <span class=u>⑤</span></a>
|
||||
<samp>['/home/mark/py',
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>sys</kbd> <span class=u>③</span></a>
|
||||
<samp class=pp><module 'sys' (built-in)></samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>sys.path.insert(0, '/home/mark/py')</kbd> <span class=u>④</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>sys.path</kbd> <span class=u>⑤</span></a>
|
||||
<samp class=pp>['/home/mark/py',
|
||||
'',
|
||||
'/usr/lib/python30.zip',
|
||||
'/usr/lib/python3.0',
|
||||
@@ -261,9 +261,9 @@ if __name__ == '__main__':
|
||||
</blockquote>
|
||||
<p>So what makes this <code>if</code> statement special? Well, modules are objects, and all modules have a built-in attribute <code>__name__</code>. A module’s <code>__name__</code> depends on how you’re using the module. If you <code>import</code> the module, then <code>__name__</code> is the module’s filename, without a directory path or file extension.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import humansize</kbd>
|
||||
<samp class=p>>>> </samp><kbd>humansize.__name__</kbd>
|
||||
<samp>'humansize'</samp></pre>
|
||||
<samp class=p>>>> </samp><kbd class=pp>import humansize</kbd>
|
||||
<samp class=p>>>> </samp><kbd class=pp>humansize.__name__</kbd>
|
||||
<samp class=pp>'humansize'</samp></pre>
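<p>The same check works for any imported module. This minimal sketch uses the standard library's <code>string</code> module in place of <code>humansize</code>, and a hypothetical <code>run_mode()</code> helper that makes the comparison behind the <code>if</code> statement explicit.

```python
import string

# An imported module reports its own name, without path or extension.
imported_name = string.__name__


def run_mode(module_name):
    """Hypothetical helper: classify a __name__ value."""
    return 'script' if module_name == '__main__' else 'imported'


print(imported_name, run_mode(imported_name))

# The value here depends on how this code itself is being run.
print(run_mode(__name__))
```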
|
||||
<p>But you can also run the module directly as a standalone program, in which case <code>__name__</code> will be a special default value, <code>__main__</code>. Python will evaluate this <code>if</code> statement, find a true expression, and execute the <code>if</code> code block. In this case, the block prints two values.
|
||||
<pre class=screen>
|
||||
<samp class=p>c:\home\diveintopython3> </samp><kbd>c:\python30\python.exe humansize.py</kbd>
|
||||
|
||||