diff --git a/comprehensions.html b/comprehensions.html
index 92d9b98..fa30792 100644
--- a/comprehensions.html
+++ b/comprehensions.html
@@ -20,30 +20,203 @@ body{counter-reset:h1 3}
 </blockquote>
 <p id=toc>&nbsp;
 <h2 id=divingin>Diving In</h2>
-<p class=f>FIXME
+<p class=f>This chapter will teach you about list comprehensions, dictionary comprehensions, and set comprehensions: three related concepts centered around one very powerful technique. But first, I want to take a little detour into two modules that will help you navigate your local file system.
+
+<h2 id=os>The <code>os</code> module</h2>
+
+<p>Python 3 comes with a module called <code>os</code>, which stands for &#8220;operating system.&#8221; The <a href=http://docs.python.org/3.1/library/os.html><code>os</code> module</a> contains a plethora of functions to get information on&nbsp;&mdash;&nbsp;and in some cases, to manipulate&nbsp;&mdash;&nbsp;local directories, files, processes, and environment variables. Python does its best to offer a unified <abbr>API</abbr> across <a href=installing-python.html>all supported operating systems</a> so your programs can run on any computer with as little platform-specific code as possible.
+
+<h3 id=getcwd>The Current Working Directory</h3>
+
+<p>When you&#8217;re just getting started with Python, you&#8217;re going to spend a lot of time in <a href=installing-python.html#idle>the Python Shell</a>. Throughout this book, you will see examples that go like this:
+
+<ol>
+<li>Import one of the modules in the <a href=examples/><code>examples</code> folder</a>
+<li>Call a function in that module
+<li>Explain the result
+</ol>
+
+<p>If you don&#8217;t know about the current working directory, step 1 will probably fail with an <code>ImportError</code>. Why? Because Python will look for the example module in <a href=your-first-python-program.html#importsearchpath>the import search path</a>, but it won&#8217;t find it because the <code>examples</code> folder isn&#8217;t one of the directories in the search path. To get past this, you can do one of two things:
+
+<ol>
+<li>Add the <code>examples</code> folder to the import search path
+<li>Change the current working directory to the <code>examples</code> folder
+</ol>
+
+<p>The current working directory is an invisible property that Python holds in memory at all times. There is always a current working directory, whether you&#8217;re in the Python Shell, running your own Python script from the command line, or running a Python <abbr>CGI</abbr> script on a web server somewhere.
+
+<p>The <code>os</code> module contains two functions to deal with the current working directory.
+
+<pre class=screen>
+<a><samp class=p>>>> </samp><kbd class=pp>import os</kbd>                                            <span class=u>&#x2460;</span></a>
+<a><samp class=p>>>> </samp><kbd class=pp>print(os.getcwd())</kbd>                                   <span class=u>&#x2461;</span></a>
+<samp class=pp>C:\Python31</samp>
+<a><samp class=p>>>> </samp><kbd class=pp>os.chdir('/Users/pilgrim/diveintopython3/examples')</kbd>  <span class=u>&#x2462;</span></a>
+<a><samp class=p>>>> </samp><kbd class=pp>print(os.getcwd())</kbd>                                   <span class=u>&#x2463;</span></a>
+<samp class=pp>C:\Users\pilgrim\diveintopython3\examples</samp></pre>
+<ol>
+<li>When you run the graphical Python Shell, the current working directory starts as the directory where the Python Shell executable is. On Windows, this depends on where you installed Python; the default directory is <code>C:\Python31</code>. If you run the Python Shell from the command line, the current working directory starts as the directory you were in when you ran <code>python3</code>.
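+<li>The <code>os.getcwd()</code> function returns the current working directory as a string.
+<li>The <code>os.chdir()</code> function changes the current working directory to the directory you pass it.
+<li>After the call to <code>os.chdir()</code>, <code>os.getcwd()</code> reports the new directory. Note that the pathname was given with forward slashes but comes back in Windows form, with a drive letter and backslashes.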
+</ol>
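+
+<p>The two functions can also be exercised outside the interactive shell. The following is a minimal sketch, not part of the book's examples: it assumes a scratch directory created with <code>tempfile.mkdtemp()</code> as a stand-in for the <code>examples</code> folder, and the names <var>original_dir</var>, <var>scratch_dir</var>, and <var>changed</var> are made up for illustration.
+

```python
import os
import tempfile

# Remember where we started so the change can be undone afterwards.
original_dir = os.getcwd()

# tempfile.mkdtemp() creates a scratch directory; it stands in for the
# book's examples folder, which may not exist on your machine.
scratch_dir = tempfile.mkdtemp()

os.chdir(scratch_dir)
current_dir = os.getcwd()

# realpath() resolves symbolic links, so both sides name the same place
# even on systems where the temporary directory is itself a link.
changed = os.path.realpath(current_dir) == os.path.realpath(scratch_dir)
print(changed)

# Restore the starting directory, then remove the scratch directory.
os.chdir(original_dir)
os.rmdir(scratch_dir)
```

+
+<p>Saving the starting directory and restoring it at the end is a design choice of this sketch; the current working directory is shared by the whole process, so changing it quietly can surprise other code.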
+
+<h3 id=ospath>The <code>os.path</code> module</h3>
+
+<p>FIXME The <code>os.path</code> module has several functions for manipulating files and directories. Here, we're looking at handling pathnames and listing the contents of a directory.
+<pre class=screen>
+<samp class=p>>>> </samp><kbd>import os</kbd>
+<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap\\", "mahadeva.mp3")</kbd> <span>&#x2460;</span> <span>&#x2461;</span>
+'c:\\music\\ap\\mahadeva.mp3'
+<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap", "mahadeva.mp3")</kbd>   <span>&#x2462;</span>
+'c:\\music\\ap\\mahadeva.mp3'
+<samp class=p>>>> </samp><kbd>os.path.expanduser("~")</kbd>       <span>&#x2463;</span>
+'c:\\Documents and Settings\\mpilgrim\\My Documents'
+<samp class=p>>>> </samp><kbd>os.path.join(os.path.expanduser("~"), "Python")</kbd> <span>&#x2464;</span>
+'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'</pre>
+<ol>
+<li><code>os.path</code> is a reference to a module -- which module depends on your platform. Just as <a href="#crossplatform.example" title="Example 6.2. Supporting Platform-Specific Functionality"><code>getpass</code></a> encapsulates differences between platforms by setting <var>getpass</var> to a platform-specific function, <code>os</code> encapsulates differences between platforms by setting <var>path</var> to a platform-specific module.
+<li>The <code>join</code> function of <code>os.path</code> constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing
+            with pathnames on Windows is annoying because the backslash character must be escaped.)
+<li>In this slightly less trivial case, <code>join</code> will add an extra backslash to the pathname before joining it to the filename. I was overjoyed when I discovered this, since
+<code>addSlashIfNecessary</code> is one of the stupid little functions I always need to write when building up my toolbox in a new language. <em>Do not</em> write this stupid little function in Python; smart people have already taken care of it for you.
+<li><code>expanduser</code> will expand a pathname that uses <code>~</code> to represent the current user's home directory. This works on any platform where users have a home directory, like Windows,
+<abbr>UNIX</abbr>, and Mac OS X; it has no effect on Mac OS.
+<li>Combining these techniques, you can easily construct pathnames for directories and files under the user's home directory.
+</ol>
+
+<p>FIXME
+
+<pre class=screen><samp class=p>>>> </samp><kbd>os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd>      <span>&#x2460;</span>
+('c:\\music\\ap', 'mahadeva.mp3')
+<samp class=p>>>> </samp><kbd>(filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd> <span>&#x2461;</span>
+<samp class=p>>>> </samp><kbd>filepath</kbd>      <span>&#x2462;</span>
+'c:\\music\\ap'
+<samp class=p>>>> </samp><kbd>filename</kbd>      <span>&#x2463;</span>
+'mahadeva.mp3'
+<samp class=p>>>> </samp><kbd>(shortname, extension) = os.path.splitext(filename)</kbd>                 <span>&#x2464;</span>
+<samp class=p>>>> </samp><kbd>shortname</kbd>
+'mahadeva'
+<samp class=p>>>> </samp><kbd>extension</kbd>
+'.mp3'</pre>
+<ol>
+<li>The <code>split</code> function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use
+<a href="#odbchelper.multiassign" title="3.4.2. Assigning Multiple Values at Once">multi-variable assignment</a> to return multiple values from a function?  Well, <code>split</code> is such a function.
+<li>You assign the return value of the <code>split</code> function into a tuple of two variables. Each variable receives the value of the corresponding element of the returned tuple.
+<li>The first variable, <var>filepath</var>, receives the value of the first element of the tuple returned from <code>split</code>, the file path.
+<li>The second variable, <var>filename</var>, receives the value of the second element of the tuple returned from <code>split</code>, the filename.
+<li><code>os.path</code> also contains a function <code>splitext</code>, which splits a filename and returns a tuple containing the filename and the file extension.  You use the same technique
+            to assign each of them to separate variables.
+</ol>
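+
+<p>The Windows-style results above can be reproduced on any platform, because the platform-specific modules behind <code>os.path</code> can be imported directly. The sketch below assumes the standard library modules <code>ntpath</code> (Windows rules) and <code>posixpath</code> (<abbr>UNIX</abbr> rules), which are not mentioned in the text above; the variable names are illustrative only.
+

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # UNIX-style path rules, importable on any platform

# join() adds the separator only when it is missing.
with_slash = ntpath.join("c:\\music\\ap\\", "mahadeva.mp3")
without_slash = ntpath.join("c:\\music\\ap", "mahadeva.mp3")

# split() separates the directory from the filename;
# splitext() separates the filename from its extension.
filepath, filename = ntpath.split(with_slash)
shortname, extension = ntpath.splitext(filename)

# The same kind of call under UNIX rules uses forward slashes instead.
unix_path = posixpath.join("/music/ap", "mahadeva.mp3")

print(with_slash == without_slash)
print(filepath, filename, shortname, extension)
print(unix_path)
```

+
+<p>In ordinary programs you would keep calling <code>os.path</code> and let Python pick the right module; importing <code>ntpath</code> by name is only useful when you need one platform's rules regardless of where the code runs.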
+
+<p>FIXME
+
+<pre class=screen><samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd>              <span>&#x2460;</span>
+<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
+'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3', 
+'spinning.mp3']</samp>
+<samp class=p>>>> </samp><kbd>dirname = "c:\\"</kbd>
+<samp class=p>>>> </samp><kbd>os.listdir(dirname)</kbd>            <span>&#x2461;</span>
+<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'cygwin',
+'docbook', 'Documents and Settings', 'Incoming', 'Inetpub', 'IO.SYS',
+'MSDOS.SYS', 'Music', 'NTDETECT.COM', 'ntldr', 'pagefile.sys',
+'Program Files', 'Python20', 'RECYCLER',
+'System Volume Information', 'TEMP', 'WINNT']</samp>
+<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
+<samp class=p>...    </samp><kbd>if os.path.isfile(os.path.join(dirname, f))]</kbd> <span>&#x2462;</span>
+<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'IO.SYS', 'MSDOS.SYS',
+'NTDETECT.COM', 'ntldr', 'pagefile.sys']</samp>
+<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
+<samp class=p>...    </samp><kbd>if os.path.isdir(os.path.join(dirname, f))]</kbd>  <span>&#x2463;</span>
+<samp>['cygwin', 'docbook', 'Documents and Settings', 'Incoming',
+'Inetpub', 'Music', 'Program Files', 'Python20', 'RECYCLER',
+'System Volume Information', 'TEMP', 'WINNT']</samp></pre>
+<ol>
+<li>The <code>listdir</code> function takes a pathname and returns a list of the contents of the directory.
+<li><code>listdir</code> returns both files and folders, with no indication of which is which.
+<li>You can use <a href="#apihelper.filter" title="4.5. Filtering Lists">list filtering</a> and the <code>isfile</code> function of the <code>os.path</code> module to separate the files from the folders. <code>isfile</code> takes a pathname and returns <code>True</code> if the path represents a file, and <code>False</code> otherwise. Here you're using <code>os.path.join</code> to ensure a full pathname, but <code>isfile</code> also works with a partial path, relative to the current working directory. You can use <code>os.getcwd()</code> to get the current working directory.
+<li><code>os.path</code> also has an <code>isdir</code> function, which returns <code>True</code> if the path represents a directory, and <code>False</code> otherwise. You can use this to get a list of the subdirectories within a directory.
+</ol>
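+
+<p>The same filtering can be tried against a directory whose contents are known in advance. This is a sketch under stated assumptions: the temporary directory, the file names <code>a.txt</code> and <code>b.txt</code>, and the folder name <code>subdir</code> are all invented for the example.
+

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as dirname:
    # Build a small, known directory: two files and one subdirectory.
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(dirname, name), "w") as handle:
            handle.write("placeholder")
    os.mkdir(os.path.join(dirname, "subdir"))

    # listdir() makes no promise about order, so sort for stable output.
    entries = sorted(os.listdir(dirname))

    # Join each name to the directory before testing it, as in the text.
    files = [e for e in entries
             if os.path.isfile(os.path.join(dirname, e))]
    folders = [e for e in entries
               if os.path.isdir(os.path.join(dirname, e))]

print(files)
print(folders)
```
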
+
+<h2 id=glob>The <code>glob</code> module</h2>
+
+<p>FIXME
+
+<pre><code>def listDirectory(directory, fileExtList):
+    "get list of file info objects for files of particular extensions"
+    fileList = [os.path.normcase(f)
+                for f in os.listdir(directory)]            <span>&#x2460;</span> <span>&#x2461;</span>
+    fileList = [os.path.join(directory, f) 
+               for f in fileList
+                if os.path.splitext(f)[1] in fileExtList]  <span>&#x2462;</span> <span>&#x2463;</span> <span>&#x2464;</span></code></pre>
+<ol>
+<li><code>os.listdir(directory)</code> returns a list of all the files and folders in <var>directory</var>.
+<li>Iterating through the list with <var>f</var>, you use <code>os.path.normcase(f)</code> to normalize the case according to operating system defaults. <code>normcase</code> is a useful little function that compensates for case-insensitive operating systems that think that <code>mahadeva.mp3</code> and <code>mahadeva.MP3</code> are the same file. For instance, on Windows, <code>normcase</code> will convert the entire filename to lowercase (and forward slashes to backslashes); on <abbr>UNIX</abbr>-compatible systems, including Mac OS X, it will return the filename unchanged.
+<li>Iterating through the normalized list with <var>f</var> again, you use <code>os.path.splitext(f)</code> to split each filename into name and extension.
+<li>For each file, you see if the extension is in the list of file extensions you care about (<var>fileExtList</var>, which was passed to the <code>listDirectory</code> function).
+<li>For each file you care about, you use <code>os.path.join(directory, f)</code> to construct the full pathname of the file, and return a list of the full pathnames.
+</ol>
+
+<blockquote class=note>
+<p><span class=u>&#x261E;</span>Whenever possible, you should use the functions in <code>os</code> and <code>os.path</code> for file, directory, and path manipulations. These modules are wrappers for platform-specific modules, so functions like <code>os.path.split()</code> work on <abbr>UNIX</abbr>, Windows, Mac OS X, and any other platform supported by Python.
+</blockquote>
+
+<p>There is one other way to get the contents of a directory. It's very powerful, and it uses the sort of wildcards that you may already be familiar with from working on the command line.
+
+<pre class=screen>
+<samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd>               <span>&#x2460;</span>
+<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
+'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3',
+'spinning.mp3']</samp>
+<samp class=p>>>> </samp><kbd>import glob</kbd>
+<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\*.mp3')</kbd>           <span>&#x2461;</span>
+<samp>['c:\\music\\_singles\\a_time_long_forgotten_con.mp3',
+'c:\\music\\_singles\\hellraiser.mp3',
+'c:\\music\\_singles\\kairo.mp3',
+'c:\\music\\_singles\\long_way_home1.mp3',
+'c:\\music\\_singles\\sidewinder.mp3',
+'c:\\music\\_singles\\spinning.mp3']</samp>
+<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\s*.mp3')</kbd>          <span>&#x2462;</span>
+<samp>['c:\\music\\_singles\\sidewinder.mp3',
+'c:\\music\\_singles\\spinning.mp3']</samp>
+<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\*\\*.mp3')</kbd><span>&#x2463;</span>
+</pre>
+<ol>
+<li>As you saw earlier, <code>os.listdir</code> simply takes a directory path and lists all files and directories in that directory.
+<li>The <code>glob</code> function of the <code>glob</code> module, on the other hand, takes a wildcard pattern and returns the path of every file and directory that matches it. Here the pattern is a directory path plus <code>"*.mp3"</code>, which matches all <code>.mp3</code> files in that directory. Note that each element of the returned list already includes the full path of the file.
+<li>If you want to find all the files in a specific directory that start with "s" and end with ".mp3", you can do that too.
+<li>Now consider this scenario: you have a <code>music</code> directory, with several subdirectories within it, with <code>.mp3</code> files within each subdirectory. You can get a list of all of those with a single call to <code>glob</code>, by using two wildcards at once. One wildcard is the <code>"*.mp3"</code> (to match <code>.mp3</code> files), and one wildcard is <em>within the directory path itself</em>, to match any subdirectory within <code>c:\music</code>. That's a crazy amount of power packed into one deceptively simple-looking function!
+</ol>
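+
+<p>The two-wildcard case can be sketched against a throwaway directory tree. Everything about the layout here is an assumption made for illustration: the temporary <var>music</var> directory, the two album folders, and the file names inside them.
+

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as music:
    # Hypothetical layout: two album folders, each holding one .mp3 file
    # and one text file that the pattern should not match.
    for album, track in (("ap", "mahadeva.mp3"), ("_singles", "kairo.mp3")):
        os.mkdir(os.path.join(music, album))
        for name in (track, "notes.txt"):
            with open(os.path.join(music, album, name), "w") as handle:
                handle.write("placeholder")

    # One wildcard matches any subdirectory, the other any .mp3 file.
    pattern = os.path.join(music, "*", "*.mp3")
    matches = sorted(glob.glob(pattern))

    # Reduce each full path to "album/file" so the output is portable.
    found = [os.path.relpath(m, music).replace(os.sep, "/") for m in matches]

print(found)
```

+
+<p>The results are sorted because <code>glob.glob()</code>, like <code>os.listdir()</code>, does not promise any particular order.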
 
 <h2 id=list-comprehensions>List Comprehensions</h2>
 
-<p>FIXME
-<!--
 <p>One of the most powerful features of Python is the list comprehension, which provides a compact way of mapping a list into another list by applying a function to each
    of the elements of the list.
-<div class=example><h3>Example 3.24. Introducing List Comprehensions</h3><pre class=screen><samp class=p>>>> </samp><kbd>li = [1, 9, 8, 4]</kbd>
-<samp class=p>>>> </samp><kbd>[elem*2 for elem in li]</kbd>      <span>&#x2460;</span>
+
+<pre class=screen><samp class=p>>>> </samp><kbd>li = [1, 9, 8, 4]</kbd>
+<samp class=p>>>> </samp><kbd>[elem * 2 for elem in li]</kbd>      <span>&#x2460;</span>
 [2, 18, 16, 8]
 <samp class=p>>>> </samp><kbd>li</kbd>         <span>&#x2461;</span>
 [1, 9, 8, 4]
-<samp class=p>>>> </samp><kbd>li = [elem*2 for elem in li]</kbd> <span>&#x2462;</span>
+<samp class=p>>>> </samp><kbd>li = [elem * 2 for elem in li]</kbd> <span>&#x2462;</span>
 <samp class=p>>>> </samp><kbd>li</kbd>
 [2, 18, 16, 8]</pre>
 <ol>
 <li>To make sense of this, look at it from right to left. <var>li</var> is the list you're mapping. Python loops through <var>li</var> one element at a time, temporarily assigning the value of each element to the variable <var>elem</var>. Python then applies the function <code><var>elem</var>*2</code> and appends that result to the returned list.
 <li>Note that list comprehensions do not change the original list.
 <li>It is safe to assign the result of a list comprehension to the variable that you're mapping. Python constructs the new list in memory, and when the list comprehension is complete, it assigns the result to the variable.
+</ol>
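+
+<p>For comparison, here is a short sketch that puts the comprehension next to the explicit loop it replaces, and adds a filtering clause of the kind used in the directory listings earlier. The names <var>doubled</var>, <var>doubled_loop</var>, and <var>even_doubled</var> are illustrative only.
+

```python
li = [1, 9, 8, 4]

# The comprehension from the example above.
doubled = [elem * 2 for elem in li]

# The same mapping written as an explicit loop.
doubled_loop = []
for elem in li:
    doubled_loop.append(elem * 2)

# A comprehension may also filter: keep only the even elements, doubled.
even_doubled = [elem * 2 for elem in li if elem % 2 == 0]

print(doubled, doubled_loop, even_doubled, li)
```
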
 
-<p>Here are the list comprehensions in the <code>buildConnectionString</code> function that you declared in <a href="#odbchelper">Chapter 2</a>:<pre><code>
-["%s=%s" % (k, v) for k, v in params.items()]</pre><p>First, notice that you're calling the <code>items</code> function of the <var>params</var> dictionary. This function returns a list of tuples of all the data in the dictionary.
-<div class=example><h3 id="odbchelper.items">Example 3.25. The <code>keys</code>, <code>values</code>, and <code>items</code> Functions</h3><pre class=screen><samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
+<p>FIXME Here are the list comprehensions in the <code>buildConnectionString</code> function that you declared in <a href="#odbchelper">Chapter 2</a>:
+
+<pre><code>["%s=%s" % (k, v) for k, v in params.items()]</code></pre>
+
+<p>First, notice that you're calling the <code>items</code> function of the <var>params</var> dictionary. This function returns a list of tuples of all the data in the dictionary.
+
+<pre class=screen>
+<samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
 <samp class=p>>>> </samp><kbd>params.keys()</kbd>   <span>&#x2460;</span>
 ['server', 'uid', 'database', 'pwd']
 <samp class=p>>>> </samp><kbd>params.values()</kbd> <span>&#x2461;</span>
@@ -55,9 +228,13 @@ body{counter-reset:h1 3}
             (remember that elements in a dictionary are unordered), but it is a list.
 <li>The <code>values</code> method returns a list of all the values. The list is in the same order as the list returned by <code>keys</code>, so <code>params.values()[n] == params[params.keys()[n]]</code> for all values of <var>n</var>.
 <li>The <code>items</code> method returns a list of tuples of the form <code>(<var>key</var>, <var>value</var>)</code>. The list contains all the data in the dictionary.
+</ol>
+
 <p>Now let's see what <code>buildConnectionString</code> does. It takes a list, <code><var>params</var>.<code>items</code>()</code>, and maps it to a new list by applying string formatting to each element. The new list will have the same number of elements
 as <code><var>params</var>.<code>items</code>()</code>, but each element in the new list will be a string that contains both a key and its associated value from the <var>params</var> dictionary.
-<div class=example><h3>Example 3.26. List Comprehensions in <code>buildConnectionString</code>, Step by Step</h3><pre class=screen><samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
+
+<pre class=screen>
+<samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
 <samp class=p>>>> </samp><kbd>params.items()</kbd>
 [('server', 'mpilgrim'), ('uid', 'sa'), ('database', 'master'), ('pwd', 'secret')]
 <samp class=p>>>> </samp><kbd>[k for k, v in params.items()]</kbd>                <span>&#x2460;</span>
@@ -71,7 +248,7 @@ as <code><var>params</var>.<code>items</code>()</code>, but each element in the
 <li>Here you're doing the same thing, but ignoring the value of <var>k</var>, so this list comprehension ends up being equivalent to <code><var>params</var>.<code>values</code>()</code>.
 <li>Combining the previous two examples with some simple <a href="#odbchelper.stringformatting" title="3.5. Formatting Strings">string formatting</a>, you get a list of strings that include both the key and value of each element of the dictionary. This looks suspiciously
             like the <a href="#odbchelper.output">output</a> of the program. All that remains is to join the elements in this list into a single string.
--->
+</ol>
 
 <p class=a>&#x2042;
 
@@ -89,7 +266,7 @@ as <code><var>params</var>.<code>items</code>()</code>, but each element in the
 
 <h2 id=furtherreading>Further Reading</h2>
 <ul>
-<li>FIXME
+<li><a href=http://docs.python.org/3.1/library/os.html><code>os</code> module</a>
 </ul>
 <p class=v><a href=native-datatypes.html rel=prev title='back to &#8220;Native Datatypes&#8221;'><span class=u>&#x261C;</span></a> <a href=strings.html rel=next title='onward to &#8220;Strings&#8221;'><span class=u>&#x261E;</span></a>
 <p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>
diff --git a/dip2 b/dip2
index f4716ec..59210bb 100755
--- a/dip2
+++ b/dip2
@@ -181,132 +181,11 @@ stat</samp>
 <li><a href="http://www.python.org/doc/current/lib/"><i class=citetitle>Python Library Reference</i></a> documents the <a href="http://www.python.org/doc/current/lib/module-sys.html"><code>sys</code></a> module.
 
 </ul>
-<h2 id="fileinfo.os">6.5. Working with Directories</h2>
-<p>The <code>os.path</code> module has several functions for manipulating files and directories. Here, we're looking at handling pathnames and listing
-   the contents of a directory.
-<div class=example><h3 id="fileinfo.os.path.join.example">Example 6.16. Constructing Pathnames</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import os</kbd>
-<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap\\", "mahadeva.mp3")</kbd> <span>&#x2460;</span> <span>&#x2461;</span>
-'c:\\music\\ap\\mahadeva.mp3'
-<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap", "mahadeva.mp3")</kbd>   <span>&#x2462;</span>
-'c:\\music\\ap\\mahadeva.mp3'
-<samp class=p>>>> </samp><kbd>os.path.expanduser("~")</kbd>       <span>&#x2463;</span>
-'c:\\Documents and Settings\\mpilgrim\\My Documents'
-<samp class=p>>>> </samp><kbd>os.path.join(os.path.expanduser("~"), "Python")</kbd> <span>&#x2464;</span>
-'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'</pre>
-<ol>
-<li><code>os.path</code> is a reference to a module -- which module depends on your platform. Just as <a href="#crossplatform.example" title="Example 6.2. Supporting Platform-Specific Functionality"><code>getpass</code></a> encapsulates differences between platforms by setting <var>getpass</var> to a platform-specific function, <code>os</code> encapsulates differences between platforms by setting <var>path</var> to a platform-specific module.
-<li>The <code>join</code> function of <code>os.path</code> constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing
-            with pathnames on Windows is annoying because the backslash character must be escaped.)
-<li>In this slightly less trivial case, <code>join</code> will add an extra backslash to the pathname before joining it to the filename. I was overjoyed when I discovered this, since
-<code>addSlashIfNecessary</code> is one of the stupid little functions I always need to write when building up my toolbox in a new language. <em>Do not</em> write this stupid little function in Python; smart people have already taken care of it for you.
-<li><code>expanduser</code> will expand a pathname that uses <code>~</code> to represent the current user's home directory. This works on any platform where users have a home directory, like Windows,
-<abbr>UNIX</abbr>, and Mac OS X; it has no effect on Mac OS.
-<li>Combining these techniques, you can easily construct pathnames for directories and files under the user's home directory.
-<div class=example><h3 id="splittingpathnames.example">Example 6.17. Splitting Pathnames</h3><pre class=screen><samp class=p>>>> </samp><kbd>os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd>      <span>&#x2460;</span>
-('c:\\music\\ap', 'mahadeva.mp3')
-<samp class=p>>>> </samp><kbd>(filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd> <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>filepath</kbd>      <span>&#x2462;</span>
-'c:\\music\\ap'
-<samp class=p>>>> </samp><kbd>filename</kbd>      <span>&#x2463;</span>
-'mahadeva.mp3'
-<samp class=p>>>> </samp><kbd>(shortname, extension) = os.path.splitext(filename)</kbd>                 <span>&#x2464;</span>
-<samp class=p>>>> </samp><kbd>shortname</kbd>
-'mahadeva'
-<samp class=p>>>> </samp><kbd>extension</kbd>
-'.mp3'</pre>
-<ol>
-<li>The <code>split</code> function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use
-<a href="#odbchelper.multiassign" title="3.4.2. Assigning Multiple Values at Once">multi-variable assignment</a> to return multiple values from a function?  Well, <code>split</code> is such a function.
-<li>You assign the return value of the <code>split</code> function into a tuple of two variables. Each variable receives the value of the corresponding element of the returned tuple.
-<li>The first variable, <var>filepath</var>, receives the value of the first element of the tuple returned from <code>split</code>, the file path.
-<li>The second variable, <var>filename</var>, receives the value of the second element of the tuple returned from <code>split</code>, the filename.
-<li><code>os.path</code> also contains a function <code>splitext</code>, which splits a filename and returns a tuple containing the filename and the file extension.  You use the same technique
-            to assign each of them to separate variables.
-<div class=example><h3 id="fileinfo.listdir.example">Example 6.18. Listing Directories</h3><pre class=screen><samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd>              <span>&#x2460;</span>
-<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
-'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3', 
-'spinning.mp3']</samp>
-<samp class=p>>>> </samp><kbd>dirname = "c:\\"</kbd>
-<samp class=p>>>> </samp><kbd>os.listdir(dirname)</kbd>            <span>&#x2461;</span>
-<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'cygwin',
-'docbook', 'Documents and Settings', 'Incoming', 'Inetpub', 'IO.SYS',
-'MSDOS.SYS', 'Music', 'NTDETECT.COM', 'ntldr', 'pagefile.sys',
-'Program Files', 'Python20', 'RECYCLER',
-'System Volume Information', 'TEMP', 'WINNT']</samp>
-<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
-<samp class=p>...    </samp>if os.path.isfile(os.path.join(dirname, f))] <span>&#x2462;</span>
-<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'IO.SYS', 'MSDOS.SYS',
-'NTDETECT.COM', 'ntldr', 'pagefile.sys']</samp>
-<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
-<samp class=p>...    </samp>if os.path.isdir(os.path.join(dirname, f))]  <span>&#x2463;</span>
-<samp>['cygwin', 'docbook', 'Documents and Settings', 'Incoming',
-'Inetpub', 'Music', 'Program Files', 'Python20', 'RECYCLER',
-'System Volume Information', 'TEMP', 'WINNT']</span></pre>
-<ol>
-<li>The <code>listdir</code> function takes a pathname and returns a list of the contents of the directory.
-<li><code>listdir</code> returns both files and folders, with no indication of which is which.
-<li>You can use <a href="#apihelper.filter" title="4.5. Filtering Lists">list filtering</a> and the <code>isfile</code> function of the <code>os.path</code> module to separate the files from the folders. <code>isfile</code> takes a pathname and returns 1 if the path represents a file, and 0 otherwise. Here you're using <code><code>os.path</code>.<code>join</code></code> to ensure a full pathname, but <code>isfile</code> also works with a partial path, relative to the current working directory. You can use <code>os.getcwd()</code> to get the current working directory.
-<li><code>os.path</code> also has a <code>isdir</code> function which returns 1 if the path represents a directory, and 0 otherwise. You can use this to get a list of the subdirectories
-            within a directory.
-<div class=example><h3>Example 6.19. Listing Directories in <code>fileinfo.py</code></h3><pre><code>
-def listDirectory(directory, fileExtList):    
-    "get list of file info objects for files of particular extensions" 
-    fileList = [os.path.normcase(f)
-                for f in os.listdir(directory)]            <span>&#x2460;</span> <span>&#x2461;</span>
-    fileList = [os.path.join(directory, f) 
-               for f in fileList
-                if os.path.splitext(f)[1] in fileExtList]  <span>&#x2462;</span> <span>&#x2463;</span> <span>&#x2464;</span></pre>
-<ol>
-<li><code>os.listdir(directory)</code> returns a list of all the files and folders in <var>directory</var>.
-<li>Iterating through the list with <var>f</var>, you use <code>os.path.normcase(f)</code> to normalize the case according to operating system defaults. <code>normcase</code> is a useful little function that compensates for case-insensitive operating systems that think that <code>mahadeva.mp3</code> and <code>mahadeva.MP3</code> are the same file. For instance, on Windows and Mac OS, <code>normcase</code> will convert the entire filename to lowercase; on <abbr>UNIX</abbr>-compatible systems, it will return the filename unchanged.
-<li>Iterating through the normalized list with <var>f</var> again, you use <code>os.path.splitext(f)</code> to split each filename into name and extension.
-<li>For each file, you see if the extension is in the list of file extensions you care about (<var>fileExtList</var>, which was passed to the <code>listDirectory</code> function).
-<li>For each file you care about, you use <code>os.path.join(directory, f)</code> to construct the full pathname of the file, and return a list of the full pathnames.
-<table id="tip.os" class=note border="0" summary="">
-
-<td rowspan="2" align="center" valign="top" width="1%"><img src="images/note.png" alt="Note" title="" width="24" height="24"><td colspan="2" align="left" valign="top" width="99%">Whenever possible, you should use the functions in <code>os</code> and <code>os.path</code> for file, directory, and path manipulations. These modules are wrappers for platform-specific modules, so functions like
-<code>os.path.split</code> work on <abbr>UNIX</abbr>, Windows, Mac OS, and any other platform supported by Python.
-<p>There is one other way to get the contents of a directory. It's very powerful, and it uses the sort of wildcards that you
-may already be familiar with from working on the command line.
-<div class=example><h3 id="fileinfo.os.glob.example">Example 6.20. Listing Directories with <code>glob</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd>               <span>&#x2460;</span>
-<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
-'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3',
-'spinning.mp3']</samp>
-<samp class=p>>>> </samp><kbd>import glob</kbd>
-<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\*.mp3')</kbd>           <span>&#x2461;</span>
-<samp>['c:\\music\\_singles\\a_time_long_forgotten_con.mp3',
-'c:\\music\\_singles\\hellraiser.mp3',
-'c:\\music\\_singles\\kairo.mp3',
-'c:\\music\\_singles\\long_way_home1.mp3',
-'c:\\music\\_singles\\sidewinder.mp3',
-'c:\\music\\_singles\\spinning.mp3']</samp>
-<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\s*.mp3')</kbd>          <span>&#x2462;</span>
-<samp>['c:\\music\\_singles\\sidewinder.mp3',
-'c:\\music\\_singles\\spinning.mp3']</samp>
-<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\*\\*.mp3')</kbd><span>&#x2463;</span>
-</pre>
-<ol>
-<li>As you saw earlier, <code>os.listdir</code> simply takes a directory path and lists all files and directories in that directory.
-<li>The <code>glob</code> module, on the other hand, takes a wildcard and returns the full path of all files and directories matching the wildcard.
-             Here the wildcard is a directory path plus "*.mp3", which will match all <code>.mp3</code> files. Note that each element of the returned list already includes the full path of the file.
-<li>If you want to find all the files in a specific directory that start with "s" and end with ".mp3", you can do that too.
-<li>Now consider this scenario: you have a <code>music</code> directory, with several subdirectories within it, with <code>.mp3</code> files within each subdirectory. You can get a list of all of those with a single call to <code>glob</code>, by using two wildcards at once. One wildcard is the <code>"*.mp3"</code> (to match <code>.mp3</code> files), and one wildcard is <em>within the directory path itself</em>, to match any subdirectory within <code>c:\music</code>. That's a crazy amount of power packed into one deceptively simple-looking function!
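<p>The wildcard behaviour described above can be tried without a <code>c:\music</code> folder. The following is a minimal Python 3 sketch, not the book's own example: the temporary directory, its file names, and the helper <code>names_matching()</code> are assumptions invented for illustration. Only <code>glob.glob</code>, <code>os.path</code>, and <code>tempfile</code> come from the standard library.

```python
import glob
import os
import tempfile


def demo_glob():
    """Build a throwaway music tree and query it with three wildcard patterns."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: two subdirectories holding a few empty files.
        layout = {
            "_singles": ["kairo.mp3", "sidewinder.mp3", "spinning.mp3"],
            "albums": ["intro.mp3", "notes.txt"],
        }
        for subdir, names in layout.items():
            os.makedirs(os.path.join(root, subdir))
            for name in names:
                open(os.path.join(root, subdir, name), "w").close()

        def names_matching(*pattern_parts):
            # glob.glob returns full paths in arbitrary order,
            # so reduce them to base names and sort for a stable result.
            matches = glob.glob(os.path.join(root, *pattern_parts))
            return sorted(os.path.basename(path) for path in matches)

        all_singles = names_matching("_singles", "*.mp3")   # every .mp3 in one directory
        s_singles = names_matching("_singles", "s*.mp3")    # only names starting with "s"
        every_mp3 = names_matching("*", "*.mp3")            # wildcard in the directory part too
    return all_singles, s_singles, every_mp3


all_singles, s_singles, every_mp3 = demo_glob()
print(all_singles)
print(s_singles)
print(every_mp3)
```

<p>The third pattern is the two-wildcard case: the first <code>*</code> stands for any subdirectory, the second for any file name ending in <code>.mp3</code>, so <code>notes.txt</code> is skipped.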
-<div class=itemizedlist>
-<h3>Further Reading on the <code>os</code> Module</h3>
-<ul>
-<li><a href="http://www.faqts.com/knowledge-base/index.phtml/fid/199/">Python Knowledge Base</a> answers <a href="http://www.faqts.com/knowledge-base/index.phtml/fid/240">questions about the <code>os</code> module</a>.
-
-<li><a href="http://www.python.org/doc/current/lib/"><i class=citetitle>Python Library Reference</i></a> documents the <a href="http://www.python.org/doc/current/lib/module-os.html"><code>os</code></a> module and the <a href="http://www.python.org/doc/current/lib/module-os.path.html"><code>os.path</code></a> module.
-
-</ul>
 
 
 
 
 
-[HTML stuff was here]
 
 
 
@@ -690,731 +569,6 @@ def main(argv):
 
 
 
-[HTTP web services stuff was here]
-
-
-
-
-
-[unit testing stuff was here]
-
-
-
-
-<div class=chapter>
-<h2 id="roman1.5">Chapter 14. Test-First Programming</h2>
-<h2 id="roman.stage1">14.1. <code>roman.py</code>, stage 1</h2>
-<p>Now that the unit tests are complete, it's time to start writing the code that the test cases are attempting to test. You're
-   going to do this in stages, so you can see all the unit tests fail, then watch them pass one by one as you fill in the gaps
-   in <code>roman.py</code>.
-<div class=example><h3>Example 14.1. <code>roman1.py</code></h3>
-<p>This file is available in <code>py/roman/stage1/</code> in the examples directory.
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-"""Convert to and from Roman numerals"""
-
-#Define exceptions
-class RomanError(Exception): pass                <span>&#x2460;</span>
-class OutOfRangeError(RomanError): pass          <span>&#x2461;</span>
-class NotIntegerError(RomanError): pass
-class InvalidRomanNumeralError(RomanError): pass <span>&#x2462;</span>
-
-def to_roman(n):
-    """convert integer to Roman numeral"""
-    pass     <span>&#x2463;</span>
-
-def from_roman(s):
-    """convert Roman numeral to integer"""
-    pass
-</code></pre>
-<ol>
-<li>This is how you define your own custom exceptions in Python. Exceptions are classes, and you create your own by subclassing existing exceptions. It is strongly recommended (but not
-            required) that you subclass <code>Exception</code>, which is the base class that nearly all built-in exceptions inherit from. Here I am defining <code>RomanError</code> (inherited from <code>Exception</code>) to act as the base class for all my other custom exceptions to follow. This is a matter of style; I could just as easily
-            have inherited each individual exception from the <code>Exception</code> class directly.
-<li>The <code>OutOfRangeError</code> and <code>NotIntegerError</code> exceptions will eventually be used by <code>to_roman()</code> to flag various forms of invalid input, as specified in <a href="#roman.tobadinput.example" title="Example 13.3. Testing bad input to to_roman"><code>ToRomanBadInput</code></a>.
-<li>The <code>InvalidRomanNumeralError</code> exception will eventually be used by <code>from_roman()</code> to flag invalid input, as specified in <a href="#roman.frombadinput.example" title="Example 13.4. Testing bad input to from_roman"><code>FromRomanBadInput</code></a>.
-<li>At this stage, you want to define the <abbr>API</abbr> of each of your functions, but you don't want to code them yet, so you stub them out using the Python reserved word <a href="#fileinfo.class.simplest" title="Example 5.3. The Simplest Python Class"><code>pass</code></a>.
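<p>One practical consequence of giving the custom exceptions a shared base class is that a caller can catch all of them with a single <code>except</code> clause. Here is a small Python 3 sketch of that idea; the helper <code>describe()</code> is hypothetical and exists only for this illustration.

```python
class RomanError(Exception):
    """Shared base class for the module's custom exceptions."""


class OutOfRangeError(RomanError):
    pass


class NotIntegerError(RomanError):
    pass


class InvalidRomanNumeralError(RomanError):
    pass


def describe(exc_class):
    """Raise the given exception class and catch it through the base class."""
    try:
        raise exc_class("demo")
    except RomanError as error:
        # Every subclass lands here because it inherits from RomanError.
        return type(error).__name__


caught = [describe(cls) for cls in
          (OutOfRangeError, NotIntegerError, InvalidRomanNumeralError)]
print(caught)
```

<p>Had each exception inherited from <code>Exception</code> directly, the same caller would need to list all three classes, or catch <code>Exception</code> and risk swallowing unrelated errors.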
-<p>Now for the big moment (drum roll please): you're finally going to run the unit test against this stubby little module. At
-this point, every test case should fail. In fact, if any test case passes in stage 1, you should go back to <code>romantest.py</code> and re-evaluate why you coded a test so useless that it passes with do-nothing functions.
-<p>Run <code>romantest1.py</code> with the <code>-v</code> command-line option, which will give more verbose output so you can see exactly what's going on as each test case runs. 
-With any luck, your output should look like this:
-<div class=example><h3 id="roman.stage1.output">Example 14.2. Output of <code>romantest1.py</code> against <code>roman1.py</code></h3><pre class=screen><samp>from_roman should only accept uppercase input ... ERROR
-to_roman should always return uppercase ... ERROR
-from_roman should fail with malformed antecedents ... FAIL
-from_roman should fail with repeated pairs of numerals ... FAIL
-from_roman should fail with too many repeated numerals ... FAIL
-from_roman should give known result with known input ... FAIL
-to_roman should give known result with known input ... FAIL
-from_roman(to_roman(n))==n for all n ... FAIL
-to_roman should fail with non-integer input ... FAIL
-to_roman should fail with negative input ... FAIL
-to_roman should fail with large input ... FAIL
-to_roman should fail with 0 input ... FAIL
-
-======================================================================
-ERROR: from_roman should only accept uppercase input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 154, in testFromRomanCase
-    roman1.from_roman(numeral.upper())
-AttributeError: 'None' object has no attribute 'upper'</samp><samp>
-======================================================================
-ERROR: to_roman should always return uppercase
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 148, in testToRomanCase
-    self.assertEqual(numeral, numeral.upper())
-AttributeError: 'None' object has no attribute 'upper'</samp><samp>
-======================================================================
-FAIL: from_roman should fail with malformed antecedents
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 133, in testMalformedAntecedent
-    self.assertRaises(roman1.InvalidRomanNumeralError, roman1.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with repeated pairs of numerals
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 127, in testRepeatedPairs
-    self.assertRaises(roman1.InvalidRomanNumeralError, roman1.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with too many repeated numerals
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 122, in testTooManyRepeatedNumerals
-    self.assertRaises(roman1.InvalidRomanNumeralError, roman1.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should give known result with known input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 99, in testFromRomanKnownValues
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</samp><samp>
-======================================================================
-FAIL: to_roman should give known result with known input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 93, in testToRomanKnownValues
-    self.assertEqual(numeral, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: I != None</samp><samp>
-======================================================================
-FAIL: from_roman(to_roman(n))==n for all n
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 141, in testSanity
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</samp><samp>
-======================================================================
-FAIL: to_roman should fail with non-integer input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 116, in testNonInteger
-    self.assertRaises(roman1.NotIntegerError, roman1.to_roman, 0.5)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: NotIntegerError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with negative input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 112, in testNegative
-    self.assertRaises(roman1.OutOfRangeError, roman1.to_roman, -1)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with large input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 104, in testTooLarge
-    self.assertRaises(roman1.OutOfRangeError, roman1.to_roman, 4000)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with 0 input               </samp><span>&#x2460;</span><samp>
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage1\romantest1.py", line 108, in testZero
-    self.assertRaises(roman1.OutOfRangeError, roman1.to_roman, 0)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError    </samp><span>&#x2461;</span><samp>
-----------------------------------------------------------------------
-Ran 12 tests in 0.040s             </samp><span>&#x2462;</span><samp>
-
-FAILED (failures=10, errors=2)     </samp><span>&#x2463;</span></pre>
-<h2 id="roman.stage2">14.2. <code>roman.py</code>, stage 2</h2>
-<p>Now that you have the framework of the <code>roman</code> module laid out, it's time to start writing code and passing test cases.
-<div class=example><h3 id="roman.stage2.example">Example 14.3. <code>roman2.py</code></h3>
-<p>This file is available in <code>py/roman/stage2/</code> in the examples directory.
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-"""Convert to and from Roman numerals"""
-
-#Define exceptions
-class RomanError(Exception): pass
-class OutOfRangeError(RomanError): pass
-class NotIntegerError(RomanError): pass
-class InvalidRomanNumeralError(RomanError): pass
-
-#Define digit mapping
-romanNumeralMap = (('M',  1000), <span>&#x2460;</span>
- ('CM', 900),
- ('D',  500),
- ('CD', 400),
- ('C',  100),
- ('XC', 90),
- ('L',  50),
- ('XL', 40),
- ('X',  10),
- ('IX', 9),
- ('V',  5),
- ('IV', 4),
- ('I',  1))
-
-def to_roman(n):
-    """convert integer to Roman numeral"""
-    result = ""
-    for numeral, integer in romanNumeralMap:
-        while n >= integer:      <span>&#x2461;</span>
-            result += numeral
-            n -= integer
-    return result
-
-def from_roman(s):
-    """convert Roman numeral to integer"""
-    pass
-</code></pre>
-<ol>
-<li><var>romanNumeralMap</var> is a tuple of tuples which defines three things:
-<div class=orderedlist>
-<ol>
-<li>The character representations of the most basic Roman numerals. Note that this is not just the single-character Roman numerals;
-   you're also defining two-character pairs like <code>CM</code> (&#8220;one hundred less than one thousand&#8221;); this will make the <code>to_roman()</code> code simpler later.
-
-<li>The order of the Roman numerals. They are listed in descending value order, from <code>M</code> all the way down to <code>I</code>.
-
-<li>The value of each Roman numeral. Each inner tuple is a pair of <code>(<var>numeral</var>, <var>value</var>)</code>.
-
-</ol>
-<li>Here's where your rich data structure pays off, because you don't need any special logic to handle the subtraction rule. 
-            To convert to Roman numerals, you simply iterate through <var>romanNumeralMap</var> looking for the largest integer value less than or equal to the input. Once found, you add the Roman numeral representation
-            to the end of the output, subtract the corresponding integer value from the input, lather, rinse, repeat.
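<p>The greedy loop described above can be sketched in Python 3 as follows. This is an illustrative variant rather than the book's <code>roman2.py</code>: the function name <code>to_roman_traced()</code> and the list of recorded steps are assumptions added so that the subtraction sequence can be inspected without a <code>print</code> inside the loop.

```python
# Numeral/value pairs in descending order, including the two-character
# subtractive pairs, as in the mapping described above.
ROMAN_NUMERAL_MAP = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))


def to_roman_traced(n):
    """Convert n to a Roman numeral and record each greedy step.

    Returns (numeral_string, steps), where steps is a list of
    (integer_subtracted, numeral_appended) tuples.
    """
    result = ""
    steps = []
    for numeral, integer in ROMAN_NUMERAL_MAP:
        # Keep taking the current value for as long as it still fits.
        while n >= integer:
            result += numeral
            n -= integer
            steps.append((integer, numeral))
    return result, steps


numeral, steps = to_roman_traced(1424)
print(numeral)
for integer, piece in steps:
    print('subtracting', integer, 'from input, adding', piece, 'to output')
```

<p>Because <code>CM</code>, <code>CD</code>, and the other pairs sit in the table at their own values, the loop never needs a special case for the subtraction rule.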
-<div class=example><h3>Example 14.4. How <code>to_roman()</code> works</h3>
-<p>If you're not clear how <code>to_roman()</code> works, add a <code>print</code> statement to the end of the <code>while</code> loop:<pre><code>
-        while n >= integer:
-            result += numeral
-            n -= integer
-            print 'subtracting', integer, 'from input, adding', numeral, 'to output'</code></pre><pre class=screen>
-<samp class=p>>>> </samp><kbd>import roman2</kbd>
-<samp class=p>>>> </samp><kbd>roman2.to_roman(1424)</kbd>
-<samp>subtracting 1000 from input, adding M to output
-subtracting 400 from input, adding CD to output
-subtracting 10 from input, adding X to output
-subtracting 10 from input, adding X to output
-subtracting 4 from input, adding IV to output
-'MCDXXIV'</samp>
-</pre><p>So <code>to_roman()</code> appears to work, at least in this manual spot check. But will it pass the unit tests? Well, no, not entirely.
-<div class=example><h3>Example 14.5. Output of <code>romantest2.py</code> against <code>roman2.py</code></h3>
-<p>Remember to run <code>romantest2.py</code> with the <code>-v</code> command-line flag to enable verbose mode.
-<pre class=screen><samp>from_roman should only accept uppercase input ... FAIL
-to_roman should always return uppercase ... ok</samp><span>&#x2460;</span><samp>
-from_roman should fail with malformed antecedents ... FAIL
-from_roman should fail with repeated pairs of numerals ... FAIL
-from_roman should fail with too many repeated numerals ... FAIL
-from_roman should give known result with known input ... FAIL
-to_roman should give known result with known input ... ok       </samp><span>&#x2461;</span><samp>
-from_roman(to_roman(n))==n for all n ... FAIL
-to_roman should fail with non-integer input ... FAIL            </samp><span>&#x2462;</span><samp>
-to_roman should fail with negative input ... FAIL
-to_roman should fail with large input ... FAIL
-to_roman should fail with 0 input ... FAIL</samp></pre>
-<ol>
-<li><code>to_roman()</code> does, in fact, always return uppercase, because <var>romanNumeralMap</var> defines the Roman numeral representations as uppercase. So this test passes already.
-<li>Here's the big news: this version of the <code>to_roman()</code> function passes the <a href="#roman.testtoromanknownvalues.example" title="Example 13.2. testToRomanKnownValues">known values test</a>. Remember, it's not comprehensive, but it does put the function through its paces with a variety of good inputs, including
-            inputs that produce every single-character Roman numeral, the largest possible input (<code>3999</code>), and the input that produces the longest possible Roman numeral (<code>3888</code>). At this point, you can be reasonably confident that the function works for any good input value you could throw at it.
-<li>However, the function does not &#8220;work&#8221; for bad values; it fails every single <a href="#roman.tobadinput.example" title="Example 13.3. Testing bad input to to_roman">bad input test</a>. That makes sense, because you didn't include any checks for bad input. Those test cases look for specific exceptions to
-            be raised (via <code>assertRaises</code>), and you're never raising them. You'll do that in the next stage.
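<p>To see why a function with no input checks fails every <code>assertRaises</code> test, consider this Python 3 sketch. It is not <code>romantest2.py</code>: the two conversion functions, the factory <code>make_case()</code>, and the helper <code>failures_for()</code> are simplified stand-ins assumed for illustration.

```python
import io
import unittest

ROMAN_NUMERAL_MAP = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))


class OutOfRangeError(Exception):
    pass


def to_roman_unchecked(n):
    """Greedy conversion with no validation (the stage 2 situation)."""
    result = ""
    for numeral, integer in ROMAN_NUMERAL_MAP:
        while n >= integer:
            result += numeral
            n -= integer
    return result


def to_roman_checked(n):
    """Same conversion, but out-of-range input raises OutOfRangeError."""
    if not (0 < n < 4000):
        raise OutOfRangeError("number out of range (must be 1..3999)")
    return to_roman_unchecked(n)


def make_case(func):
    """Build a TestCase class that checks func against bad input."""
    class BadInput(unittest.TestCase):
        def test_zero(self):
            self.assertRaises(OutOfRangeError, func, 0)

        def test_negative(self):
            self.assertRaises(OutOfRangeError, func, -1)

        def test_too_large(self):
            self.assertRaises(OutOfRangeError, func, 4000)
    return BadInput


def failures_for(func):
    """Run the bad-input tests quietly and return (failures, errors)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_case(func))
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    result = runner.run(suite)
    return len(result.failures), len(result.errors)


unchecked_counts = failures_for(to_roman_unchecked)
checked_counts = failures_for(to_roman_checked)
print(unchecked_counts, checked_counts)
```

<p>The unchecked function returns a string for every bad value instead of raising, so <code>assertRaises</code> reports each test as a failure rather than an error.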
-<p>Here's the rest of the output of the unit test, listing the details of all the failures. You're down to 10.
-<pre class=screen><samp>
-======================================================================
-FAIL: from_roman should only accept uppercase input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 156, in testFromRomanCase
-    roman2.from_roman, numeral.lower())
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with malformed antecedents
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 133, in testMalformedAntecedent
-    self.assertRaises(roman2.InvalidRomanNumeralError, roman2.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with repeated pairs of numerals
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 127, in testRepeatedPairs
-    self.assertRaises(roman2.InvalidRomanNumeralError, roman2.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with too many repeated numerals
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 122, in testTooManyRepeatedNumerals
-    self.assertRaises(roman2.InvalidRomanNumeralError, roman2.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should give known result with known input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 99, in testFromRomanKnownValues
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</samp><samp>
-======================================================================
-FAIL: from_roman(to_roman(n))==n for all n
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 141, in testSanity
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</samp><samp>
-======================================================================
-FAIL: to_roman should fail with non-integer input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 116, in testNonInteger
-    self.assertRaises(roman2.NotIntegerError, roman2.to_roman, 0.5)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: NotIntegerError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with negative input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 112, in testNegative
-    self.assertRaises(roman2.OutOfRangeError, roman2.to_roman, -1)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with large input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 104, in testTooLarge
-    self.assertRaises(roman2.OutOfRangeError, roman2.to_roman, 4000)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError</samp><samp>
-======================================================================
-FAIL: to_roman should fail with 0 input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage2\romantest2.py", line 108, in testZero
-    self.assertRaises(roman2.OutOfRangeError, roman2.to_roman, 0)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: OutOfRangeError</samp><samp>
-----------------------------------------------------------------------
-Ran 12 tests in 0.320s
-
-FAILED (failures=10)</samp></pre><h2 id="roman.stage3">14.3. <code>roman.py</code>, stage 3</h2>
-<p>Now that <code>to_roman()</code> behaves correctly with good input (integers from <code>1</code> to <code>3999</code>), it's time to make it behave correctly with bad input (everything else).
-<div class=example><h3>Example 14.6. <code>roman3.py</code></h3>
-<p>This file is available in <code>py/roman/stage3/</code> in the examples directory.
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-"""Convert to and from Roman numerals"""
-
-#Define exceptions
-class RomanError(Exception): pass
-class OutOfRangeError(RomanError): pass
-class NotIntegerError(RomanError): pass
-class InvalidRomanNumeralError(RomanError): pass
-
-#Define digit mapping
-romanNumeralMap = (('M',  1000),
- ('CM', 900),
- ('D',  500),
- ('CD', 400),
- ('C',  100),
- ('XC', 90),
- ('L',  50),
- ('XL', 40),
- ('X',  10),
- ('IX', 9),
- ('V',  5),
- ('IV', 4),
- ('I',  1))
-
-def to_roman(n):
-    """convert integer to Roman numeral"""
-    if not (0 &lt; n &lt; 4000):         <span>&#x2460;</span>
-        raise OutOfRangeError, "number out of range (must be 1..3999)" <span>&#x2461;</span>
-    if int(n) &lt;> n:                <span>&#x2462;</span>
-        raise NotIntegerError, "non-integers can not be converted"
-
-    result = ""  <span>&#x2463;</span>
-    for numeral, integer in romanNumeralMap:
-        while n >= integer:
-            result += numeral
-            n -= integer
-    return result
-
-def from_roman(s):
-    """convert Roman numeral to integer"""
-    pass
-</code></pre>
-<ol>
-<li>This is a nice Pythonic shortcut: multiple comparisons at once. This is equivalent to <code>if not ((0 &lt; n) and (n &lt; 4000))</code>, but it's much easier to read. This is the range check, and it should catch inputs that are too large, negative, or zero.
-<li>You raise exceptions yourself with the <code>raise</code> statement. You can raise any of the built-in exceptions, or you can raise any of your custom exceptions that you've defined.
-             The second parameter, the error message, is optional; if given, it is displayed in the traceback that is printed if the exception
-            is never handled.
-<li>This is the non-integer check. Non-integers cannot be converted to Roman numerals.
-<li>The rest of the function is unchanged.
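<p>For readers working in Python 3, where <code>raise SomeError, "message"</code> and the <code>&lt;></code> operator no longer exist, the same two checks can be sketched as below. This is an assumed translation for illustration, not the book's <code>roman3.py</code>; the helper <code>outcome()</code> is hypothetical and only collects results for display.

```python
class RomanError(Exception):
    pass


class OutOfRangeError(RomanError):
    pass


class NotIntegerError(RomanError):
    pass


ROMAN_NUMERAL_MAP = (('M', 1000), ('CM', 900), ('D', 500), ('CD', 400),
                     ('C', 100), ('XC', 90), ('L', 50), ('XL', 40),
                     ('X', 10), ('IX', 9), ('V', 5), ('IV', 4), ('I', 1))


def to_roman(n):
    """Convert an integer in 1..3999 to a Roman numeral."""
    # Chained comparison: same meaning as (0 < n) and (n < 4000).
    if not (0 < n < 4000):
        raise OutOfRangeError("number out of range (must be 1..3999)")
    # Python 3 spells inequality as != and passes the message as an argument.
    if int(n) != n:
        raise NotIntegerError("non-integers can not be converted")

    result = ""
    for numeral, integer in ROMAN_NUMERAL_MAP:
        while n >= integer:
            result += numeral
            n -= integer
    return result


def outcome(value):
    """Return the numeral, or the name of the custom exception raised."""
    try:
        return to_roman(value)
    except RomanError as error:
        return type(error).__name__


outcomes = [outcome(value) for value in (1424, 0, -1, 4000, 1.5)]
print(outcomes)
```

<p>Note the order of the checks: <code>1.5</code> passes the range test and is rejected by the integer test, while <code>4000</code> never reaches the integer test at all.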
-<div class=example><h3>Example 14.7. Watching <code>to_roman()</code> handle bad input</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import roman3</kbd>
-<samp class=p>>>> </samp><kbd>roman3.to_roman(4000)</kbd>
-<samp class=traceback>Traceback (most recent call last):
-  File "&lt;interactive input>", line 1, in ?
-  File "roman3.py", line 27, in to_roman
-    raise OutOfRangeError, "number out of range (must be 1..3999)"
-OutOfRangeError: number out of range (must be 1..3999)</samp>
-<samp class=p>>>> </samp><kbd>roman3.to_roman(1.5)</kbd>
-<samp class=traceback>Traceback (most recent call last):
-  File "&lt;interactive input>", line 1, in ?
-  File "roman3.py", line 29, in to_roman
-    raise NotIntegerError, "non-integers can not be converted"
-NotIntegerError: non-integers can not be converted</samp>
-</pre><div class=example><h3>Example 14.8. Output of <code>romantest3.py</code> against <code>roman3.py</code></h3><pre class=screen><samp>from_roman should only accept uppercase input ... FAIL
-to_roman should always return uppercase ... ok
-from_roman should fail with malformed antecedents ... FAIL
-from_roman should fail with repeated pairs of numerals ... FAIL
-from_roman should fail with too many repeated numerals ... FAIL
-from_roman should give known result with known input ... FAIL
-to_roman should give known result with known input ... ok </samp><span>&#x2460;</span><samp>
-from_roman(to_roman(n))==n for all n ... FAIL
-to_roman should fail with non-integer input ... ok        </samp><span>&#x2461;</span><samp>
-to_roman should fail with negative input ... ok           </samp><span>&#x2462;</span><samp>
-to_roman should fail with large input ... ok
-to_roman should fail with 0 input ... ok</samp></pre>
-<ol>
-<li><code>to_roman()</code> still passes the <a href="#roman.testtoromanknownvalues.example" title="Example 13.2. testToRomanKnownValues">known values test</a>, which is comforting. All the tests that passed in <a href="#roman.stage2" title="14.2. roman.py, stage 2">stage 2</a> still pass, so the latest code hasn't broken anything.
-<li>More exciting is the fact that all of the <a href="#roman.tobadinput.example" title="Example 13.3. Testing bad input to to_roman">bad input tests</a> now pass. This test, <code>testNonInteger</code>, passes because of the <code>int(n) &lt;> n</code> check. When a non-integer is passed to <code>to_roman()</code>, the <code>int(n) &lt;> n</code> check notices it and raises the <code>NotIntegerError</code> exception, which is what <code>testNonInteger</code> is looking for.
-<li>This test, <code>testNegative</code>, passes because of the <code>not (0 &lt; n &lt; 4000)</code> check, which raises an <code>OutOfRangeError</code> exception, which is what <code>testNegative</code> is looking for.
-<pre class=screen><samp>
-======================================================================
-FAIL: from_roman should only accept uppercase input
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 156, in testFromRomanCase
-    roman3.from_roman, numeral.lower())
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with malformed antecedents
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 133, in testMalformedAntecedent
-    self.assertRaises(roman3.InvalidRomanNumeralError, roman3.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</samp><samp>
-======================================================================
-FAIL: from_roman should fail with repeated pairs of numerals
-----------------------------------------------------------------------
-</samp><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 127, in testRepeatedPairs
-    self.assertRaises(roman3.InvalidRomanNumeralError, roman3.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-======================================================================
-FAIL: from_roman should fail with too many repeated numerals
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 122, in testTooManyRepeatedNumerals
-    self.assertRaises(roman3.InvalidRomanNumeralError, roman3.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-======================================================================
-FAIL: from_roman should give known result with known input
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 99, in testFromRomanKnownValues
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</span><samp>
-======================================================================
-FAIL: from_roman(to_roman(n))==n for all n
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage3\romantest3.py", line 141, in testSanity
-    self.assertEqual(integer, result)
-  File "c:\python21\lib\unittest.py", line 273, in failUnlessEqual
-    raise self.failureException, (msg or '%s != %s' % (first, second))
-AssertionError: 1 != None</span><samp>
-----------------------------------------------------------------------
-Ran 12 tests in 0.401s
-
-FAILED (failures=6)</span> <span>&#x2460;</span></pre>
-<ol>
-<li>You're down to 6 failures, and all of them involve <code>from_roman()</code>: the known values test, the three separate bad input tests, the case check, and the sanity check. That means that <code>to_roman()</code> has passed all the tests it can pass by itself. (It's involved in the sanity check, but that also requires that <code>from_roman()</code> be written, which it isn't yet.)  Which means that you must stop coding <code>to_roman()</code> now. No tweaking, no twiddling, no extra checks &#8220;just in case&#8221;. Stop. Now. Back away from the keyboard.
-<table class=note border="0" summary="">
-
-<td rowspan="2" align="center" valign="top" width="1%"><img src="images/note.png" alt="Note" title="" width="24" height="24"><td colspan="2" align="left" valign="top" width="99%">The most important thing that comprehensive unit testing can tell you is when to stop coding. When all the unit tests for
-      a function pass, stop coding the function. When all the unit tests for an entire module pass, stop coding the module.
-<h2 id="roman.stage4">14.4. <code>roman.py</code>, stage 4</h2>
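-<p>Now that <code>to_roman()</code> is done, it's time to start coding <code>from_roman()</code>. Thanks to the rich data structure that maps individual Roman numerals to integer values, this is no more difficult than
-   the <code>to_roman()</code> function.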
-<div class=example><h3>Example 14.9. <code>roman4.py</code></h3>
-<p>This file is available in <code>py/roman/stage4/</code> in the examples directory.
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-"""Convert to and from Roman numerals"""
-
-#Define exceptions
-class RomanError(Exception): pass
-class OutOfRangeError(RomanError): pass
-class NotIntegerError(RomanError): pass
-class InvalidRomanNumeralError(RomanError): pass
-
-#Define digit mapping
-romanNumeralMap = (('M',  1000),
- ('CM', 900),
- ('D',  500),
- ('CD', 400),
- ('C',  100),
- ('XC', 90),
- ('L',  50),
- ('XL', 40),
- ('X',  10),
- ('IX', 9),
- ('V',  5),
- ('IV', 4),
- ('I',  1))
-
-# to_roman function omitted for clarity (it hasn't changed)
-
-def from_roman(s):
-    """convert Roman numeral to integer"""
-    result = 0
-    index = 0
-    for numeral, integer in romanNumeralMap:
-        while s[index:index+len(numeral)] == numeral: <span>&#x2460;</span>
-            result += integer
-            index += len(numeral)
-    return result
-</pre>
-<ol>
-<li>The pattern here is the same as <a href="#roman.stage2.example" title="Example 14.3. roman2.py"><code>to_roman()</code></a>. You iterate through your Roman numeral data structure (a tuple of tuples), and instead of matching the highest integer
-            values as often as possible, you match the &#8220;highest&#8221; Roman numeral character strings as often as possible.
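For readers following along in Python 3, the greedy matching described above can be sketched as follows. This is a minimal port of the stage 4 logic, not the book's own listing: the constant name `ROMAN_NUMERAL_MAP` is an assumption made here, and, like stage 4 itself, the function does no input validation.

```python
# Hypothetical Python 3 port of the stage 4 from_roman(); names are illustrative.
ROMAN_NUMERAL_MAP = (
    ("M", 1000), ("CM", 900), ("D", 500), ("CD", 400),
    ("C", 100), ("XC", 90), ("L", 50), ("XL", 40),
    ("X", 10), ("IX", 9), ("V", 5), ("IV", 4), ("I", 1),
)


def from_roman(s):
    """Convert a Roman numeral string to an integer (no validation yet)."""
    result = 0
    index = 0
    for numeral, integer in ROMAN_NUMERAL_MAP:
        # Consume this numeral as often as it appears at the current position.
        while s[index:index + len(numeral)] == numeral:
            result += integer
            index += len(numeral)
    return result


print(from_roman("MCMLXXII"))  # 1972
print(from_roman("mcm"))       # 0: lowercase input matches nothing
```

The second call shows why the bad input tests still fail at this stage: input that matches nothing in the map falls through every loop and quietly returns 0 instead of raising an exception.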
-<div class=example><h3>Example 14.10. How <code>from_roman()</code> works</h3>
-<p>If you're not clear how <code>from_roman()</code> works, add a <code>print</code> statement to the end of the <code>while</code> loop:<pre><code>
-        while s[index:index+len(numeral)] == numeral:
-            result += integer
-            index += len(numeral)
-            print 'found', numeral, 'of length', len(numeral), ', adding', integer</pre><pre class=screen>
-<samp class=p>>>> </samp><kbd>import roman4</kbd>
-<samp class=p>>>> </samp><kbd>roman4.from_roman('MCMLXXII')</kbd>
-<samp>found M of length 1 , adding 1000
-found CM of length 2 , adding 900
-found L of length 1 , adding 50
-found X of length 1 , adding 10
-found X of length 1 , adding 10
-found I of length 1 , adding 1
-found I of length 1 , adding 1
-1972</span></pre><div class=example><h3>Example 14.11. Output of <code>romantest4.py</code> against <code>roman4.py</code></h3><pre class=screen><samp>from_roman should only accept uppercase input ... FAIL
-to_roman should always return uppercase ... ok
-from_roman should fail with malformed antecedents ... FAIL
-from_roman should fail with repeated pairs of numerals ... FAIL
-from_roman should fail with too many repeated numerals ... FAIL
-from_roman should give known result with known input ... ok </span><span>&#x2460;</span><samp>
-to_roman should give known result with known input ... ok
-from_roman(to_roman(n))==n for all n ... ok</span><span>&#x2461;</span><samp>
-to_roman should fail with non-integer input ... ok
-to_roman should fail with negative input ... ok
-to_roman should fail with large input ... ok
-to_roman should fail with 0 input ... ok</span></pre>
-<ol>
-<li>Two pieces of exciting news here. The first is that <code>from_roman()</code> works for good input, at least for all the <a href="#roman.testtoromanknownvalues.example" title="Example 13.2. testToRomanKnownValues">known values</a> you test.
-<li>The second is that the <a href="#roman.sanity.example" title="Example 13.5. Testing to_roman against from_roman">sanity check</a> also passed. Combined with the known values tests, you can be reasonably sure that both <code>to_roman()</code> and <code>from_roman()</code> work properly for all possible good values. (This is not guaranteed; it is theoretically possible that <code>to_roman()</code> has a bug that produces the wrong Roman numeral for some particular set of inputs, <em>and</em> that <code>from_roman()</code> has a reciprocal bug that produces the same wrong integer values for exactly that set of Roman numerals that <code>to_roman()</code> generated incorrectly. Depending on your application and your requirements, this possibility may bother you; if so, write
-            more comprehensive test cases until it doesn't bother you.)
-<pre class=screen><samp>
-======================================================================
-FAIL: from_roman should only accept uppercase input
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage4\romantest4.py", line 156, in testFromRomanCase
-    roman4.from_roman, numeral.lower())
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-======================================================================
-FAIL: from_roman should fail with malformed antecedents
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage4\romantest4.py", line 133, in testMalformedAntecedent
-    self.assertRaises(roman4.InvalidRomanNumeralError, roman4.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-======================================================================
-FAIL: from_roman should fail with repeated pairs of numerals
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage4\romantest4.py", line 127, in testRepeatedPairs
-    self.assertRaises(roman4.InvalidRomanNumeralError, roman4.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-======================================================================
-FAIL: from_roman should fail with too many repeated numerals
-----------------------------------------------------------------------
-</span><samp class=traceback>Traceback (most recent call last):
-  File "C:\docbook\dip\py\roman\stage4\romantest4.py", line 122, in testTooManyRepeatedNumerals
-    self.assertRaises(roman4.InvalidRomanNumeralError, roman4.from_roman, s)
-  File "c:\python21\lib\unittest.py", line 266, in failUnlessRaises
-    raise self.failureException, excName
-AssertionError: InvalidRomanNumeralError</span><samp>
-----------------------------------------------------------------------
-Ran 12 tests in 1.222s
-
-FAILED (failures=4)</span></pre><h2 id="roman.stage5">14.5. <code>roman.py</code>, stage 5</h2>
-<div class=example><h3>Example 14.12. <code>roman5.py</code></h3>
-<p>This file is available in <code>py/roman/stage5/</code> in the examples directory.
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-"""Convert to and from Roman numerals"""
-import re
-
-#Define exceptions
-class RomanError(Exception): pass
-class OutOfRangeError(RomanError): pass
-class NotIntegerError(RomanError): pass
-class InvalidRomanNumeralError(RomanError): pass
-
-#Define digit mapping
-romanNumeralMap = (('M',  1000),
- ('CM', 900),
- ('D',  500),
- ('CD', 400),
- ('C',  100),
- ('XC', 90),
- ('L',  50),
- ('XL', 40),
- ('X',  10),
- ('IX', 9),
- ('V',  5),
- ('IV', 4),
- ('I',  1))
-
-def to_roman(n):
-    """convert integer to Roman numeral"""
-    if not (0 &lt; n &lt; 4000):
-        raise OutOfRangeError, "number out of range (must be 1..3999)"
-    if int(n) &lt;> n:
-        raise NotIntegerError, "non-integers can not be converted"
-
-    result = ""
-    for numeral, integer in romanNumeralMap:
-        while n >= integer:
-            result += numeral
-            n -= integer
-    return result
-
-#Define pattern to detect valid Roman numerals
-romanNumeralPattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$' <span>&#x2460;</span>
-
-def from_roman(s):
-    """convert Roman numeral to integer"""
-    if not re.search(romanNumeralPattern, s):<span>&#x2461;</span>
-        raise InvalidRomanNumeralError, 'Invalid Roman numeral: %s' % s
-
-    result = 0
-    index = 0
-    for numeral, integer in romanNumeralMap:
-        while s[index:index+len(numeral)] == numeral:
-            result += integer
-            index += len(numeral)
-    return result
-</pre>
-<ol>
-<li>This is just a continuation of the pattern discussed in <a href="#re.roman" title="7.3. Case Study: Roman Numerals">Section 7.3, &#8220;Case Study: Roman Numerals&#8221;</a>. The tens place is either <code>XC</code> (<code>90</code>), <code>XL</code> (<code>40</code>), or an optional <code>L</code> followed by 0 to 3 optional <code>X</code> characters. The ones place is either <code>IX</code> (<code>9</code>), <code>IV</code> (<code>4</code>), or an optional <code>V</code> followed by 0 to 3 optional <code>I</code> characters.
-<li>Having encoded all that logic into a regular expression, the code to check for invalid Roman numerals becomes trivial. If
-<code>re.search</code> returns an object, then the regular expression matched and the input is valid; otherwise, the input is invalid.
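The validation step can be tried in isolation. The sketch below uses the same pattern as the listing, written for Python 3 (note the call-style `raise`); the helper names `is_valid_roman` and `check_roman` are hypothetical and exist only for this illustration.

```python
import re

# Same pattern as in the listing above, compiled once for reuse.
ROMAN_NUMERAL_PATTERN = re.compile(
    r"^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$"
)


class InvalidRomanNumeralError(ValueError):
    """Raised when a string is not a well-formed Roman numeral."""


def is_valid_roman(s):
    # search() returns a match object on success and None on failure.
    return ROMAN_NUMERAL_PATTERN.search(s) is not None


def check_roman(s):
    # Python 3 spelling of the raise statement used in the listing.
    if not is_valid_roman(s):
        raise InvalidRomanNumeralError("Invalid Roman numeral: {}".format(s))
    return s


for candidate in ("MCMLXXII", "MCMC", "IIII", "mcmlxxii"):
    print(candidate, is_valid_roman(candidate))
```

Only the first candidate is accepted: `MCMC` is a malformed antecedent, `IIII` has too many repeated numerals, and the lowercase string fails because the pattern is case-sensitive.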
-<p>At this point, you are allowed to be skeptical that that big ugly regular expression could possibly catch all the types of
-invalid Roman numerals. But don't take my word for it, look at the results:
-<div class=example><h3>Example 14.13. Output of <code>romantest5.py</code> against <code>roman5.py</code></h3><pre class=screen><samp>
-from_roman should only accept uppercase input ... ok          </span><span>&#x2460;</span><samp>
-to_roman should always return uppercase ... ok
-from_roman should fail with malformed antecedents ... ok      </span><span>&#x2461;</span><samp>
-from_roman should fail with repeated pairs of numerals ... ok </span><span>&#x2462;</span><samp>
-from_roman should fail with too many repeated numerals ... ok
-from_roman should give known result with known input ... ok
-to_roman should give known result with known input ... ok
-from_roman(to_roman(n))==n for all n ... ok
-to_roman should fail with non-integer input ... ok
-to_roman should fail with negative input ... ok
-to_roman should fail with large input ... ok
-to_roman should fail with 0 input ... ok
-
-----------------------------------------------------------------------
-Ran 12 tests in 2.864s
-
-OK     </span><span>&#x2463;</span></pre>
-<ol>
-<li>One thing I didn't mention about regular expressions is that, by default, they are case-sensitive. Since the regular expression
-<var>romanNumeralPattern</var> was expressed in uppercase characters, the <code>re.search</code> check will reject any input that isn't completely uppercase. So the uppercase input test passes.
-<li>More importantly, the bad input tests pass. For instance, the malformed antecedents test checks cases like <code>MCMC</code>. As you've seen, this does not match the regular expression, so <code>from_roman()</code> raises an <code>InvalidRomanNumeralError</code> exception, which is what the malformed antecedents test case is looking for, so the test passes.
-<li>In fact, all the bad input tests pass. This regular expression catches everything you could think of when you made your test
-            cases.
-<li><table class=note border="0" summary="">
-
-<td rowspan="2" align="center" valign="top" width="1%"><img src="images/note.png" alt="Note" title="" width="24" height="24"><td colspan="2" align="left" valign="top" width="99%">When all of your tests pass, stop coding.
-
-
-
-
-
-[functional programming stuff was here]
-
-
-
-
-
 <p>The following is a complete Python program that acts as a cheap and simple regression testing framework. It takes unit tests that you've written for individual
 modules, collects them all into one big test suite, and runs them all at once. I actually use this script as part of the
 build process for this book; I have unit tests for several of the example programs (not just the <code>roman.py</code> module featured in <a href="#roman" title="Chapter 13. Unit Testing">Chapter 13, <i>Unit Testing</i></a>), and the first thing my automated build script does is run this program to make sure all my examples still work. If this
@@ -1762,621 +916,3 @@ if __name__ == "__main__":
 <p><sup>[<a name="ftn.d0e35697" href="#d0e35697">7</a>] </sup>Technically, the second argument to <code>filter</code> can be any sequence, including lists, tuples, and custom classes that act like lists by defining the <code>__getitem__</code> special method. If possible, <code>filter</code> will return the same datatype as you give it, so filtering a list returns a list, but filtering a tuple returns a tuple.
 <div class=footnote>
 <p><sup>[<a name="ftn.d0e36079" href="#d0e36079">8</a>] </sup>Again, I should point out that <code>map</code> can take a list, a tuple, or any object that acts like a sequence. See previous footnote about <code>filter</code>.
-
-
-
-
-
-
-
-
-
-
-<div class=chapter>
-<h2 id="soundex">Chapter 18. Performance Tuning</h2>
-<p>Performance tuning is a many-splendored thing. Just because Python is an interpreted language doesn't mean you shouldn't worry about code optimization. But don't worry about it <em>too</em> much.
-<h2 id="soundex.divein">18.1. Diving in</h2>
-<p>There are so many pitfalls involved in optimizing your code, it's hard to know where to start.
-<p>Let's start here: <em>are you sure you need to do it at all?</em>  Is your code really so bad?  Is it worth the time to tune it?  Over the lifetime of your application, how much time is going
-to be spent running that code, compared to the time spent waiting for a remote database server, or waiting for user input?
-<p>Second, <em>are you sure you're done coding?</em>  Premature optimization is like spreading frosting on a half-baked cake. You spend hours or days (or more) optimizing your
-code for performance, only to discover it doesn't do what you need it to do. That's time down the drain.
-<p>This is not to say that code optimization is worthless, but you need to look at the whole system and decide whether it's the
-best use of your time. Every minute you spend optimizing code is a minute you're not spending adding new features, or writing
-documentation, or playing with your kids, or writing unit tests.
-<p>Oh yes, unit tests. It should go without saying that you need a complete set of unit tests before you begin performance tuning.
-The last thing you need is to introduce new bugs while fiddling with your algorithms.
-<p>With these caveats in place, let's look at some techniques for optimizing Python code. The code in question is an implementation of the Soundex algorithm. Soundex was a method used in the early 20th century
-for categorizing surnames in the United States census. It grouped similar-sounding names together, so even if a name was
-misspelled, researchers had a chance of finding it. Soundex is still used today for much the same reason, although of course
-we use computerized database servers now. Most database servers include a Soundex function.
-<p>There are several subtle variations of the Soundex algorithm. This is the one used in this chapter:
-<div class=orderedlist>
-<ol>
-<li>Keep the first letter of the name as-is.
-<li>Convert the remaining letters to digits, according to a specific table:
-<div class=itemizedlist>
-<ul>
-<li>B, F, P, and V become 1.
-<li>C, G, J, K, Q, S, X, and Z become 2.
-<li>D and T become 3.
-<li>L becomes 4.
-<li>M and N become 5.
-<li>R becomes 6.
-<li>All other letters become 9.
-</ul>
-
-<li>Remove consecutive duplicates.
-<li>Remove all 9s altogether.
-<li>If the result is shorter than four characters (the first letter plus three digits), pad the result with trailing zeros.
-<li>If the result is longer than four characters, discard everything after the fourth character.
-</ol>
-<p>For example, my name, <code>Pilgrim</code>, becomes P942695. That has no consecutive duplicates, so nothing to do there. Then you remove the 9s, leaving P4265. That's
-too long, so you discard the excess character, leaving P426.
-<p>Another example: <code>Woo</code> becomes W99, which becomes W9, which becomes W, which gets padded with zeros to become W000.
-<p>Here's a first attempt at a Soundex function:
-<div class=example><h3>Example 18.1. <code>soundex/stage1/soundex1a.py</code></h3>
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre><code>
-import string, re
-
-charToSoundex = {"A": "9",
-                 "B": "1",
-                 "C": "2",
-                 "D": "3",
-                 "E": "9",
-                 "F": "1",
-                 "G": "2",
-                 "H": "9",
-                 "I": "9",
-                 "J": "2",
-                 "K": "2",
-                 "L": "4",
-                 "M": "5",
-                 "N": "5",
-                 "O": "9",
-                 "P": "1",
-                 "Q": "2",
-                 "R": "6",
-                 "S": "2",
-                 "T": "3",
-                 "U": "9",
-                 "V": "1",
-                 "W": "9",
-                 "X": "2",
-                 "Y": "9",
-                 "Z": "2"}
-
-def soundex(source):
-    "convert string to Soundex equivalent"
-
-    # Soundex requirements:
-    # source string must be at least 1 character
-    # and must consist entirely of letters
-    allChars = string.uppercase + string.lowercase
-    if not re.search('^[%s]+$' % allChars, source):
-        return "0000"
-
-    # Soundex algorithm:
-    # 1. make first character uppercase
-    source = source[0].upper() + source[1:]
-    
-    # 2. translate all other characters to Soundex digits
-    digits = source[0]
-    for s in source[1:]:
-        s = s.upper()
-        digits += charToSoundex[s]
-
-    # 3. remove consecutive duplicates
-    digits2 = digits[0]
-    for d in digits[1:]:
-        if digits2[-1] != d:
-            digits2 += d
-        
-    # 4. remove all "9"s
-    digits3 = re.sub('9', '', digits2)
-    
-    # 5. pad end with "0"s to 4 characters
-    while len(digits3) &lt; 4:
-        digits3 += "0"
-        
-    # 6. return first 4 characters
-    return digits3[:4]
-
-if __name__ == '__main__':
-    from timeit import Timer
-    names = ('Woo', 'Pilgrim', 'Flingjingwaller')
-    for name in names:
-        statement = "soundex('%s')" % name
-        t = Timer(statement, "from __main__ import soundex")
-        print name.ljust(15), soundex(name), min(t.repeat())
-</pre><div class=itemizedlist>
-<h3>Further Reading on Soundex</h3>
-<ul>
-<li><a href="http://www.avotaynu.com/soundex.html">Soundexing and Genealogy</a> gives a chronology of the evolution of the Soundex and its regional variations.
-
-</ul>
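The first-attempt listing above is written for Python 2 (`string.uppercase`, the `print` statement). A rough Python 3 equivalent of the same six steps might look like the sketch below. The table-building loop and the use of `re.fullmatch`, `str.replace`, and `str.ljust` are choices made here for brevity, not part of the original script, and the timing harness is left out.

```python
import re

# Letter-to-digit table from the algorithm description; "9" marks letters
# that are dropped in a later step.
CHAR_TO_SOUNDEX = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6"),
                       ("AEHIOUWY", "9")):
    for letter in letters:
        CHAR_TO_SOUNDEX[letter] = digit


def soundex(source):
    """Convert a string to its Soundex equivalent (Python 3 sketch)."""
    # The input must be a non-empty string of ASCII letters.
    if not re.fullmatch(r"[A-Za-z]+", source):
        return "0000"

    # 1-2. Keep the first letter (uppercased), translate the rest to digits.
    source = source.upper()
    digits = source[0] + "".join(CHAR_TO_SOUNDEX[c] for c in source[1:])

    # 3. Remove consecutive duplicates.
    deduped = digits[0]
    for d in digits[1:]:
        if deduped[-1] != d:
            deduped += d

    # 4. Remove all "9"s.
    stripped = deduped.replace("9", "")

    # 5-6. Pad with trailing zeros, then keep the first four characters.
    return stripped.ljust(4, "0")[:4]


for name in ("Woo", "Pilgrim", "Flingjingwaller"):
    print(name.ljust(15), soundex(name))
```

The three names are the same ones the original script times, so the codes can be compared directly against the output shown later in the chapter.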
-<h2 id="soundex.timeit">18.2. Using the <code>timeit</code> Module</h2>
-<p>The most important thing you need to know about optimizing Python code is that you shouldn't write your own timing function.
-<p>Timing short pieces of code is incredibly complex. How much processor time is your computer devoting to running this code?
-Are there things running in the background?  Are you sure?  Every modern computer has background processes running, some all
-the time, some intermittently. Cron jobs fire off at consistent intervals; background services occasionally &#8220;wake up&#8221; to do useful things like check for new mail, connect to instant messaging servers, check for application updates, scan for
-viruses, check whether a disk has been inserted into your CD drive in the last 100 nanoseconds, and so on. Before you start
-your timing tests, turn everything off and disconnect from the network. Then turn off all the things you forgot to turn off
-the first time, then turn off the service that's incessantly checking whether the network has come back yet, then ...
-<p>And then there's the matter of the variations introduced by the timing framework itself. Does the Python interpreter cache method name lookups?  Does it cache code block compilations?  Regular expressions?  Will your code have
-side effects if run more than once?  Don't forget that you're dealing with small fractions of a second, so small mistakes
-in your timing framework will irreparably skew your results.
-<p>The Python community has a saying: &#8220;Python comes with batteries included.&#8221;  Don't write your own timing framework. Python 2.3 comes with a perfectly good one called <code>timeit</code>.
-<div class=example><h3>Example 18.2. Introducing <code>timeit</code></h3>
-<p>If you have not already done so, you can <a href="http://diveintopython3.org/download/diveintopython3-examples-5.4.zip" title="Download example scripts">download this and other examples</a> used in this book.
-<pre class=screen>
-<samp class=p>>>> </samp><kbd>import timeit</kbd>
-<samp class=p>>>> </samp><kbd>t = timeit.Timer("soundex.soundex('Pilgrim')",</kbd>
-<samp class=p>...    </samp>"import soundex")   <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>t.timeit()</kbd>              <span>&#x2461;</span>
-8.21683733547
-<samp class=p>>>> </samp><kbd>t.repeat(3, 2000000)</kbd>    <span>&#x2462;</span>
-[16.48319309109, 16.46128984923, 16.44203948912]
-</pre>
-<ol>
-<li>The <code>timeit</code> module defines one class, <code>Timer</code>, which takes two arguments. Both arguments are strings. The first argument is the statement you wish to time; in this case,
-            you are timing a call to the Soundex function within the <code>soundex</code> module with an argument of <code>'Pilgrim'</code>. The second argument to the <code>Timer</code> class is the import statement that sets up the environment for the statement. Internally, <code>timeit</code> sets up an isolated virtual environment, manually executes the setup statement (importing the <code>soundex</code> module), then manually compiles and executes the timed statement (calling the Soundex function).
-<li>Once you have the <code>Timer</code> object, the easiest thing to do is call <code>timeit()</code>, which calls your function 1 million times and returns the number of seconds it took to do it.
-<li>The other major method of the <code>Timer</code> object is <code>repeat()</code>, which takes two optional arguments. The first argument is the number of times to repeat the entire test, and the second
-            argument is the number of times to call the timed statement within each test. They default
-            to <code>3</code> and <code>1000000</code> respectively. The <code>repeat()</code> method returns a list of the times each test cycle took, in seconds.
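The same calls can be tried without the `soundex` module at hand. In the Python 3 sketch below, `count_vowels` is a hypothetical stand-in for whatever function you want to time, and the `globals` argument (available in recent Python 3 releases) is used in place of an import-style setup string. The iteration counts are kept small so the example finishes quickly.

```python
import timeit


def count_vowels(word):
    """Hypothetical stand-in for the function being timed."""
    return sum(1 for c in word.lower() if c in "aeiou")


# The statement is still a string; globals tells timeit where to find the names.
t = timeit.Timer("count_vowels('Pilgrim')",
                 globals={"count_vowels": count_vowels})

# One test cycle of 10,000 calls; returns elapsed seconds as a float.
single = t.timeit(number=10000)

# Three test cycles of 10,000 calls each; returns a list of three floats.
runs = t.repeat(repeat=3, number=10000)

print(single)
print(runs)
```

Passing `repeat` and `number` explicitly is a deliberate choice here: the default repeat count is not the same in every Python release, so spelling both values out keeps results comparable across versions.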
-<blockquote class="note FIXME">
-<p><span>&#x261E;</span>You can use the <code>timeit</code> module on the command line to test an existing Python program, without modifying the code. See <a href="http://docs.python.org/lib/node396.html">http://docs.python.org/lib/node396.html</a> for documentation on the command-line flags.
-<p>Note that <code>repeat()</code> returns a list of times. The times will almost never be identical, due to slight variations in how much processor time the
-Python interpreter is getting (and those pesky background processes that you can't get rid of). Your first thought might be to
-say &#8220;Let's take the average and call that The True Number.&#8221;
-<p>In fact, that's almost certainly wrong. The tests that took longer didn't take longer because of variations in your code
-or in the Python interpreter; they took longer because of those pesky background processes, or other factors outside of the Python interpreter that you can't fully eliminate. If the different timing results differ by more than a few percent, you still
-have too much variability to trust the results. Otherwise, take the minimum time and discard the rest.
-<p>Python has a handy <code>min</code> function that takes a list and returns the smallest value:
-<pre class=screen>
-<samp class=p>>>> </samp><kbd>min(t.repeat(3, 1000000))</kbd>
-8.22203948912
-</pre><blockquote class="note FIXME">
-<p><span>&#x261E;</span>The <code>timeit</code> module only works if you already know what piece of code you need to optimize. If you have a larger Python program and don't know where your performance problems are, check out <a href="http://docs.python.org/lib/module-hotshot.html">the <code>hotshot</code> module.</a><h2 id="soundex.stage1">18.3. Optimizing Regular Expressions</h2>
-<p>The first thing the Soundex function checks is whether the input is a non-empty string of letters. What's the best way to
-   do this?
-<p>If you answered &#8220;regular expressions&#8221;, go sit in the corner and contemplate your bad instincts. Regular expressions are almost never the right answer; they should
-be avoided whenever possible. Not only for performance reasons, but simply because they're difficult to debug and maintain.
-Also for performance reasons.
-<p>This code fragment from <code>soundex/stage1/soundex1a.py</code> checks whether the function argument <var>source</var> is a word made entirely of letters, with at least one letter (not the empty string):
-<pre><code>
-    allChars = string.uppercase + string.lowercase
-    if not re.search('^[%s]+$' % allChars, source):
-        return "0000"
-</pre><p>How does <code>soundex1a.py</code> perform?  For convenience, the <code>__main__</code> section of the script contains this code that calls the <code>timeit</code> module, sets up a timing test with three different names, tests each name three times, and displays the minimum time for
-each:
-<pre><code>
-if __name__ == '__main__':
-    from timeit import Timer
-    names = ('Woo', 'Pilgrim', 'Flingjingwaller')
-    for name in names:
-        statement = "soundex('%s')" % name
-        t = Timer(statement, "from __main__ import soundex")
-        print name.ljust(15), soundex(name), min(t.repeat())
-</pre><p>So how does <code>soundex1a.py</code> perform with this regular expression?
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1a.py
-<samp>Woo             W000 19.3356647283
-Pilgrim         P426 24.0772053431
-Flingjingwaller F452 35.0463220884</span>
-</pre><p>As you might expect, the algorithm takes significantly longer when called with longer names. There will be a few things we
-can do to narrow that gap (make the function take less relative time for longer input), but the nature of the algorithm dictates
-that it will never run in constant time.
-<p>The other thing to keep in mind is that we are testing a representative sample of names. <code>Woo</code> is a kind of trivial case, in that it gets shortened down to a single letter and then padded with zeros. <code>Pilgrim</code> is a normal case, of average length and a mixture of significant and ignored letters. <code>Flingjingwaller</code> is extraordinarily long and contains consecutive duplicates. Other tests might also be helpful, but this hits a good range
-of different cases.
-<p>So what about that regular expression?  Well, it's inefficient. Since the expression is testing for ranges of characters
-(<code>A-Z</code> in uppercase, and <code>a-z</code> in lowercase), we can use a shorthand regular expression syntax. Here is <code>soundex/stage1/soundex1b.py</code>:
-<pre><code>
-    if not re.search('^[A-Za-z]+$', source):
-        return "0000"
-</pre><p><code>timeit</code> says <code>soundex1b.py</code> is slightly faster than <code>soundex1a.py</code>, but nothing to get terribly excited about:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1b.py
-<samp>Woo             W000 17.1361133887
-Pilgrim         P426 21.8201693232
-Flingjingwaller F452 32.7262294509</span>
-</pre><p>We saw in <a href="#roman.refactoring" title="15.3. Refactoring">Section 15.3, &#8220;Refactoring&#8221;</a> that regular expressions can be compiled and reused for faster results. Since this regular expression never changes across
-function calls, we can compile it once and use the compiled version. Here is <code>soundex/stage1/soundex1c.py</code>:
-<pre><code>
-isOnlyChars = re.compile('^[A-Za-z]+$').search
-def soundex(source):
-    if not isOnlyChars(source):
-        return "0000"
-</pre><p>Using a compiled regular expression in <code>soundex1c.py</code> is significantly faster:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1c.py
-<samp>Woo             W000 14.5348347346
-Pilgrim         P426 19.2784703084
-Flingjingwaller F452 30.0893873383</samp>
-</pre><p>But is this the wrong path?  The logic here is simple: the input <var>source</var> needs to be non-empty, and it needs to be composed entirely of letters. Wouldn't it be faster to write a loop checking each
-character, and do away with regular expressions altogether?
-<p>Here is <code>soundex/stage1/soundex1d.py</code>:
-<pre><code>
-    if not source:
-        return "0000"
-    for c in source:
-        if not ('A' &lt;= c &lt;= 'Z') and not ('a' &lt;= c &lt;= 'z'):
-            return "0000"
-</pre><p>It turns out that this technique in <code>soundex1d.py</code> is <em>not</em> faster than using a compiled regular expression (although it is faster than using a non-compiled regular expression):
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1d.py
-<samp>Woo             W000 15.4065058548
-Pilgrim         P426 22.2753567842
-Flingjingwaller F452 37.5845122774</samp>
-</pre><p>Why isn't <code>soundex1d.py</code> faster?  The answer lies in the interpreted nature of Python. The regular expression engine is written in C, and compiled to run natively on your computer. On the other hand, this
-loop is written in Python, and runs through the Python interpreter. Even though the loop is relatively simple, it's not simple enough to make up for the overhead of being interpreted.
-Regular expressions are never the right answer... except when they are.
-<p>It turns out that Python offers an obscure string method. You can be excused for not knowing about it, since it's never been mentioned in this book.
-The method is called <code>isalpha()</code>, and it checks whether a string contains only letters.
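-<p>A small sketch of how <code>isalpha()</code> behaves, assuming Python 3. The wrapper name <code>is_valid_name</code> is made up for illustration. Note that an empty string is not alphabetic, and that in Python 3 the method also accepts non-ASCII letters, which the <code>[A-Za-z]</code> regular expression does not.

```python
def is_valid_name(source):
    # Hypothetical helper: True only for a non-empty string made of letters.
    # "".isalpha() is already False, so bool(source) is here for clarity only.
    return bool(source) and source.isalpha()


print(is_valid_name("Pilgrim"))    # True
print(is_valid_name(""))           # False: the empty string has no letters
print(is_valid_name("R2D2"))       # False: digits are not letters
print(is_valid_name("Zo\u00eb"))   # True in Python 3: non-ASCII letters count
```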
-<p>This is <code>soundex/stage1/soundex1e.py</code>:
-<pre><code>
-    if (not source) or (not source.isalpha()):
-        return "0000"
-</pre><p>How much did we gain by using this specific method in <code>soundex1e.py</code>?  Quite a bit.
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1e.py
-<samp>Woo             W000 13.5069504644
-Pilgrim         P426 18.2199394057
-Flingjingwaller F452 28.9975225902</samp>
-</pre><div class=example><h3>Example 18.3. Best Result So Far: <code>soundex/stage1/soundex1e.py</code></h3><pre><code>
-import string, re
-
-charToSoundex = {"A": "9",
-                 "B": "1",
-                 "C": "2",
-                 "D": "3",
-                 "E": "9",
-                 "F": "1",
-                 "G": "2",
-                 "H": "9",
-                 "I": "9",
-                 "J": "2",
-                 "K": "2",
-                 "L": "4",
-                 "M": "5",
-                 "N": "5",
-                 "O": "9",
-                 "P": "1",
-                 "Q": "2",
-                 "R": "6",
-                 "S": "2",
-                 "T": "3",
-                 "U": "9",
-                 "V": "1",
-                 "W": "9",
-                 "X": "2",
-                 "Y": "9",
-                 "Z": "2"}
-
-def soundex(source):
-    if (not source) or (not source.isalpha()):
-        return "0000"
-    source = source[0].upper() + source[1:]
-    digits = source[0]
-    for s in source[1:]:
-        s = s.upper()
-        digits += charToSoundex[s]
-    digits2 = digits[0]
-    for d in digits[1:]:
-        if digits2[-1] != d:
-            digits2 += d
-    digits3 = re.sub('9', '', digits2)
-    while len(digits3) &lt; 4:
-        digits3 += "0"
-    return digits3[:4]
-
-if __name__ == '__main__':
-    from timeit import Timer
-    names = ('Woo', 'Pilgrim', 'Flingjingwaller')
-    for name in names:
-        statement = "soundex('%s')" % name
-        t = Timer(statement, "from __main__ import soundex")
-        print name.ljust(15), soundex(name), min(t.repeat())
-</pre><h2 id="soundex.stage2">18.4. Optimizing Dictionary Lookups</h2>
-<p>The second step of the Soundex algorithm is to convert characters to digits in a specific pattern. What's the best way to
-   do this?
-<p>The most obvious solution is to define a dictionary with individual characters as keys and their corresponding digits as values,
-and do dictionary lookups on each character. This is what we have in <code>soundex/stage1/soundex1c.py</code> (the version this section builds on):
-<pre><code>
-charToSoundex = {"A": "9",
-                 "B": "1",
-                 "C": "2",
-                 "D": "3",
-                 "E": "9",
-                 "F": "1",
-                 "G": "2",
-                 "H": "9",
-                 "I": "9",
-                 "J": "2",
-                 "K": "2",
-                 "L": "4",
-                 "M": "5",
-                 "N": "5",
-                 "O": "9",
-                 "P": "1",
-                 "Q": "2",
-                 "R": "6",
-                 "S": "2",
-                 "T": "3",
-                 "U": "9",
-                 "V": "1",
-                 "W": "9",
-                 "X": "2",
-                 "Y": "9",
-                 "Z": "2"}
-
-def soundex(source):
-    # ... input check omitted for brevity ...
-    source = source[0].upper() + source[1:]
-    digits = source[0]
-    for s in source[1:]:
-        s = s.upper()
-        digits += charToSoundex[s]
-</pre><p>You timed <code>soundex1c.py</code> already; this is how it performs:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage1></samp>python soundex1c.py
-<samp>Woo             W000 14.5341678901
-Pilgrim         P426 19.2650071448
-Flingjingwaller F452 30.1003563302</samp>
-</pre><p>This code is straightforward, but is it the best solution?  Calling <code>upper()</code> on each individual character seems inefficient; it would probably be better to call <code>upper()</code> once on the entire string.
-<p>Then there's the matter of incrementally building the <var>digits</var> string. Incrementally building strings like this is horribly inefficient; internally, the Python interpreter needs to create a new string each time through the loop, then discard the old one.
-<p>Python is good at lists, though. It can treat a string as a list of characters automatically. And lists are easy to combine into
-strings again, using the string method <code>join()</code>.
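-<p>To make the two approaches concrete, here is a minimal sketch, assuming Python 3. It builds the same string once by repeated concatenation and once by collecting pieces in a list and joining them. The variable names are illustrative only, and nothing here says which is faster; that is what the timings are for.

```python
source = "Pilgrim"

# A string can be iterated character by character, like a list.
chars = list(source.upper())
print(chars)  # ['P', 'I', 'L', 'G', 'R', 'I', 'M']

# Approach 1: grow a string one character at a time.
incremental = ""
for c in chars:
    incremental += c

# Approach 2: hand the whole list to join() in a single call.
joined = "".join(chars)

print(incremental == joined)  # True
```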
-<p>Here is <code>soundex/stage2/soundex2a.py</code>, which converts letters to digits by using <code>map</code> and <code>lambda</code>:
-<pre><code>
-def soundex(source):
-    # ...
-    source = source.upper()
-    digits = source[0] + "".join(map(lambda c: charToSoundex[c], source[1:]))
-</pre><p>Surprisingly, <code>soundex2a.py</code> is not faster:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage2></samp>python soundex2a.py
-<samp>Woo             W000 15.0097526362
-Pilgrim         P426 19.254806407
-Flingjingwaller F452 29.3790847719</samp>
-</pre><p>The overhead of the anonymous <code>lambda</code> function kills any performance you gain by dealing with the string as a list of characters.
-<p><code>soundex/stage2/soundex2b.py</code> uses a list comprehension instead of <code>map</code> and <code>lambda</code>:
-<pre><code>
-    source = source.upper()
-    digits = source[0] + "".join([charToSoundex[c] for c in source[1:]])
-</pre><p>Using a list comprehension in <code>soundex2b.py</code> is faster than using <code>map</code> and <code>lambda</code> in <code>soundex2a.py</code>, but still not faster than the original code (incrementally building a string in <code>soundex1c.py</code>):
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage2></samp>python soundex2b.py
-<samp>Woo             W000 13.4221324219
-Pilgrim         P426 16.4901234654
-Flingjingwaller F452 25.8186157738</samp>
-</pre><p>It's time for a radically different approach. Dictionary lookups are a general purpose tool. Dictionary keys can be any
-length string (or many other data types), but in this case we are only dealing with single-character keys <em>and</em> single-character values. It turns out that Python has a specialized function for handling exactly this situation: the <code>string.maketrans</code> function.
-<p>This is <code>soundex/stage2/soundex2c.py</code>:
-<pre><code>
-allChar = string.uppercase + string.lowercase
-charToSoundex = string.maketrans(allChar, "91239129922455912623919292" * 2)
-def soundex(source):
-    # ...
-    digits = source[0].upper() + source[1:].translate(charToSoundex)
-</pre><p>What the heck is going on here?  <code>string.maketrans</code> creates a translation matrix between two strings: the first argument and the second argument. In this case, the first argument
-is the string <code>ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz</code>, and the second argument is the string <code>9123912992245591262391929291239129922455912623919292</code>. See the pattern?  It's the same conversion pattern we were setting up longhand with a dictionary. A maps to 9, B maps
-to 1, C maps to 2, and so forth. But it's not a dictionary; it's a specialized data structure that you can access using the
-string method <code>translate</code>, which translates each character into the corresponding digit, according to the matrix defined by <code>string.maketrans</code>.
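-<p>The listing above is Python 2. As a sketch of the same idea in Python 3, where <code>string.uppercase</code> and <code>string.lowercase</code> became <code>string.ascii_uppercase</code> and <code>string.ascii_lowercase</code> and the table is built with the static method <code>str.maketrans</code>, the translation might look like this. The variable names are illustrative.

```python
import string

letters = string.ascii_uppercase + string.ascii_lowercase
# One digit per letter A-Z, repeated so lowercase letters map the same way.
table = str.maketrans(letters, "91239129922455912623919292" * 2)

# In Python 3 the table is a dict keyed by code point.
print(table[ord("A")] == ord("9"))   # True: A maps to 9
print(table[ord("b")] == ord("1"))   # True: b maps to 1

# translate() applies the whole table in a single call.
print("ilgrim".translate(table))     # 942695
```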
-<p><code>timeit</code> shows that <code>soundex2c.py</code> is significantly faster than defining a dictionary and looping through the input and building the output incrementally:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage2></samp>python soundex2c.py
-<samp>Woo             W000 11.437645008
-Pilgrim         P426 13.2825062962
-Flingjingwaller F452 18.5570110168</samp>
-</pre><p>You're not going to get much better than that. Python has a specialized function that does exactly what you want to do; use it and move on.
-<div class=example><h3>Example 18.4. Best Result So Far: <code>soundex/stage2/soundex2c.py</code></h3><pre><code>
-import string, re
-
-allChar = string.uppercase + string.lowercase
-charToSoundex = string.maketrans(allChar, "91239129922455912623919292" * 2)
-isOnlyChars = re.compile('^[A-Za-z]+$').search
-
-def soundex(source):
-    if not isOnlyChars(source):
-        return "0000"
-    digits = source[0].upper() + source[1:].translate(charToSoundex)
-    digits2 = digits[0]
-    for d in digits[1:]:
-        if digits2[-1] != d:
-            digits2 += d
-    digits3 = re.sub('9', '', digits2)
-    while len(digits3) &lt; 4:
-        digits3 += "0"
-    return digits3[:4]
-
-if __name__ == '__main__':
-    from timeit import Timer
-    names = ('Woo', 'Pilgrim', 'Flingjingwaller')
-    for name in names:
-        statement = "soundex('%s')" % name
-        t = Timer(statement, "from __main__ import soundex")
-        print name.ljust(15), soundex(name), min(t.repeat())
-</pre><h2 id="soundex.stage3">18.5. Optimizing List Operations</h2>
-<p>The third step in the Soundex algorithm is eliminating consecutive duplicate digits. What's the best way to do this?
-<p>Here's the code we have so far, in <code>soundex/stage2/soundex2c.py</code>:
-<pre><code>
-    digits2 = digits[0]
-    for d in digits[1:]:
-        if digits2[-1] != d:
-            digits2 += d
-</pre><p>Here are the performance results for <code>soundex2c.py</code>:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage2></samp>python soundex2c.py
-<samp>Woo             W000 12.6070768771
-Pilgrim         P426 14.4033353401
-Flingjingwaller F452 19.7774882003</samp>
-</pre><p>The first thing to consider is whether it's efficient to check <var>digits2[-1]</var> each time through the loop. Are list indexes expensive?  Would we be better off maintaining the last digit in a separate
-variable, and checking that instead?
-<p>To answer this question, here is <code>soundex/stage3/soundex3a.py</code>:
-<pre><code>
-    digits2 = ''
-    last_digit = ''
-    for d in digits:
-        if d != last_digit:
-            digits2 += d
-            last_digit = d
-</pre><p><code>soundex3a.py</code> does not run any faster than <code>soundex2c.py</code>, and may even be slightly slower (although it's not enough of a difference to say for sure):
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage3></samp>python soundex3a.py
-<samp>Woo             W000 11.5346048171
-Pilgrim         P426 13.3950636184
-Flingjingwaller F452 18.6108927252</samp>
-</pre><p>Why isn't <code>soundex3a.py</code> faster?  It turns out that list indexes in Python are extremely efficient. Repeatedly accessing <var>digits2[-1]</var> is no problem at all. On the other hand, manually maintaining the last seen digit in a separate variable means we have <em>two</em> variable assignments for each digit we're storing, which wipes out any small gains we might have gotten from eliminating
-the list lookup.
-<p>Let's try something radically different. If it's possible to treat a string as a list of characters, it should be possible
-to use a list comprehension to iterate through the list. The problem is, the code needs access to the previous character
-in the list, and that's not easy to do with a straightforward list comprehension.
-<p>However, it is possible to create a list of index numbers using the built-in <code>range()</code> function, and use those index numbers to progressively search through the list and pull out each character that is different
-from the previous character. That will give you a list of characters, and you can use the string method <code>join()</code> to reconstruct a string from that.
-<p>Here is <code>soundex/stage3/soundex3b.py</code>:
-<pre><code>
-    digits2 = "".join([digits[i] for i in range(len(digits))
-     if i == 0 or digits[i-1] != digits[i]])
-</pre><p>Is this faster?  In a word, no.
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage3></samp>python soundex3b.py
-<samp>Woo             W000 14.2245271396
-Pilgrim         P426 17.8337165757
-Flingjingwaller F452 25.9954005327</samp>
-</pre><p>It's possible that the techniques so far have been too &#8220;string-centric&#8221;. Python can convert a string into a list of characters with a single command: <code>list('abc')</code> returns <code>['a', 'b', 'c']</code>. Furthermore, lists can be <em>modified in place</em> very quickly. Instead of incrementally building a new list (or string) out of the source string, why not move elements around
-within a single list?
-<p>Here is <code>soundex/stage3/soundex3c.py</code>, which modifies a list in place to remove consecutive duplicate elements:
-<pre><code>
-    digits = list(source[0].upper() + source[1:].translate(charToSoundex))
-    i=0
-    for item in digits:
-        if item==digits[i]: continue
-        i+=1
-        digits[i]=item
-    del digits[i+1:]
-    digits2 = "".join(digits)
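-</pre><p>Is this faster than <code>soundex3a.py</code> or <code>soundex3b.py</code>?  Not in any way that helps: in the run below it beats <code>soundex3b.py</code>, but it is still slower than <code>soundex3a.py</code> and the original loop: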
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage3></samp>python soundex3c.py
-<samp>Woo             W000 14.1662554878
-Pilgrim         P426 16.0397885765
-Flingjingwaller F452 22.1789341942</samp>
-</pre><p>We haven't made any progress here at all, except to try and rule out several &#8220;clever&#8221; techniques. The fastest code we've seen so far was the original, most straightforward method (<code>soundex2c.py</code>). Sometimes it doesn't pay to be clever.
-<div class=example><h3>Example 18.5. Best Result So Far: <code>soundex/stage2/soundex2c.py</code></h3><pre><code>
-import string, re
-
-allChar = string.uppercase + string.lowercase
-charToSoundex = string.maketrans(allChar, "91239129922455912623919292" * 2)
-isOnlyChars = re.compile('^[A-Za-z]+$').search
-
-def soundex(source):
-    if not isOnlyChars(source):
-        return "0000"
-    digits = source[0].upper() + source[1:].translate(charToSoundex)
-    digits2 = digits[0]
-    for d in digits[1:]:
-        if digits2[-1] != d:
-            digits2 += d
-    digits3 = re.sub('9', '', digits2)
-    while len(digits3) &lt; 4:
-        digits3 += "0"
-    return digits3[:4]
-
-if __name__ == '__main__':
-    from timeit import Timer
-    names = ('Woo', 'Pilgrim', 'Flingjingwaller')
-    for name in names:
-        statement = "soundex('%s')" % name
-        t = Timer(statement, "from __main__ import soundex")
-        print name.ljust(15), soundex(name), min(t.repeat())
-</pre><h2 id="soundex.stage4">18.6. Optimizing String Manipulation</h2>
-<p>The final step of the Soundex algorithm is padding short results with zeros, and truncating long results. What is the best
-   way to do this?
-<p>This is what we have so far, taken from <code>soundex/stage2/soundex2c.py</code>:
-<pre><code>
-    digits3 = re.sub('9', '', digits2)
-    while len(digits3) &lt; 4:
-        digits3 += "0"
-    return digits3[:4]
-</pre><p>These are the results for <code>soundex2c.py</code>:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage2></samp>python soundex2c.py
-<samp>Woo             W000 12.6070768771
-Pilgrim         P426 14.4033353401
-Flingjingwaller F452 19.7774882003</samp>
-</pre><p>The first thing to consider is replacing that regular expression with a loop. This code is from <code>soundex/stage4/soundex4a.py</code>:
-<pre><code>
-    digits3 = ''
-    for d in digits2:
-        if d != '9':
-            digits3 += d
-</pre><p>Is <code>soundex4a.py</code> faster?  Yes it is:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage4></samp>python soundex4a.py
-<samp>Woo             W000 6.62865531792
-Pilgrim         P426 9.02247576158
-Flingjingwaller F452 13.6328416042</samp>
-</pre><p>But wait a minute. A loop to remove characters from a string?  We can use a simple string method for that. Here's <code>soundex/stage4/soundex4b.py</code>:
-<pre><code>
-    digits3 = digits2.replace('9', '')
-</pre><p>Is <code>soundex4b.py</code> faster?  That's an interesting question. It depends on the input:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage4></samp>python soundex4b.py
-<samp>Woo             W000 6.75477414029
-Pilgrim         P426 7.56652144337
-Flingjingwaller F452 10.8727729362</samp>
-</pre><p>The string method in <code>soundex4b.py</code> is faster than the loop for most names, but it's actually slightly slower than <code>soundex4a.py</code> in the trivial case (of a very short name). Performance optimizations aren't always uniform; tuning that makes one case
-faster can sometimes make other cases slower. In this case, the majority of cases will benefit from the change, so let's
-leave it at that, but the principle is an important one to remember.
-<p>Last but not least, let's examine the final two steps of the algorithm: padding short results with zeros, and truncating long
-results to four characters. The code you see in <code>soundex4b.py</code> does just that, but it's horribly inefficient. Take a look at <code>soundex/stage4/soundex4c.py</code> to see why:
-<pre><code>
-    digits3 += '000'
-    return digits3[:4]
-</pre><p>Why do we need a <code>while</code> loop to pad out the result?  We know in advance that we're going to truncate the result to four characters, and we know that
-we already have at least one character (the initial letter, which is passed unchanged from the original <var>source</var> variable). That means we can simply add three zeros to the output, then truncate it. Don't get stuck in a rut over the
-exact wording of the problem; looking at the problem slightly differently can lead to a simpler solution.
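-<p>A quick sanity check of that reasoning, written as a sketch in Python 3. The two function names are hypothetical. The shortcut relies on the assumption stated above that the input already holds at least one character; for an empty string the two versions would differ.

```python
def pad_with_loop(digits3):
    # The original approach: append zeros one at a time, then truncate.
    while len(digits3) < 4:
        digits3 += "0"
    return digits3[:4]


def pad_then_truncate(digits3):
    # The shortcut: always append three zeros, then keep four characters.
    # Only equivalent when digits3 has at least one character.
    return (digits3 + "000")[:4]


for sample in ("W", "P4", "P42", "P4265", "F4525246"):
    print(sample.ljust(10), pad_with_loop(sample), pad_then_truncate(sample))
```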
-<p>How much speed do we gain in <code>soundex4c.py</code> by dropping the <code>while</code> loop?  It's significant:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage4></samp>python soundex4c.py
-<samp>Woo             W000 4.89129791636
-Pilgrim         P426 7.30642134685
-Flingjingwaller F452 10.689832367</samp>
-</pre><p>Finally, there is still one more thing you can do to these three lines of code to make them faster: you can combine them into
-one line. Take a look at <code>soundex/stage4/soundex4d.py</code>:
-<pre><code>
-    return (digits2.replace('9', '') + '000')[:4]
-</pre><p>Putting all this code on one line in <code>soundex4d.py</code> is barely faster than <code>soundex4c.py</code>:
-<pre class=screen>
-<samp class=p>C:\samples\soundex\stage4></samp>python soundex4d.py
-<samp>Woo             W000 4.93624105857
-Pilgrim         P426 7.19747593619
-Flingjingwaller F452 10.5490700634</samp>
-</pre><p>It is also significantly less readable, and for not much performance gain. Is that worth it?  I hope you have good comments.
-Performance isn't everything. Your optimization efforts must always be balanced against threats to your program's readability
-and maintainability.
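-<p>Putting the pieces from each stage together, the whole function might look roughly like this. This is a sketch ported to Python 3, not one of the files in the <code>soundex</code> sample directory: the underscore-prefixed names are my own, <code>str.maketrans</code> stands in for <code>string.maketrans</code>, and the input check uses <code>fullmatch</code> rather than <code>search</code> with anchors. It has not been timed.

```python
import re
import string

_LETTERS = string.ascii_uppercase + string.ascii_lowercase
# Stage 2: translation table, one digit per letter ("9" marks ignored letters).
_CHAR_TO_SOUNDEX = str.maketrans(_LETTERS, "91239129922455912623919292" * 2)
# Stage 1: compiled once, reused on every call. fullmatch() is an assumption
# of this sketch; it requires the whole string to be ASCII letters.
_is_only_chars = re.compile(r"[A-Za-z]+").fullmatch


def soundex(source):
    if not _is_only_chars(source):
        return "0000"
    # Keep the first letter, convert the rest to digits in one call.
    digits = source[0].upper() + source[1:].translate(_CHAR_TO_SOUNDEX)
    # Stage 3: the plain loop that drops consecutive duplicates.
    digits2 = digits[0]
    for d in digits[1:]:
        if digits2[-1] != d:
            digits2 += d
    # Stage 4: strip the placeholder 9s, pad with zeros, truncate to four.
    return (digits2.replace("9", "") + "000")[:4]


if __name__ == "__main__":
    for name in ("Woo", "Pilgrim", "Flingjingwaller"):
        print(name.ljust(15), soundex(name))
```

-<p>The three sample names produce <code>W000</code>, <code>P426</code>, and <code>F452</code>, matching the codes shown in the timing output throughout the chapter.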
-<h2 id="soundex.summary">18.7. Summary</h2>
-<p>This chapter has illustrated several important aspects of performance tuning in Python, and performance tuning in general.
-<div class=itemizedlist>
-<ul>
-<li>If you need to choose between regular expressions and writing a loop, choose regular expressions. The regular expression
-      engine is compiled in C and runs natively on your computer; your loop is written in Python and runs through the Python interpreter.
-
-<li>If you need to choose between regular expressions and string methods, choose string methods. Both are compiled in C, so choose
-      the simpler one.
-
-<li>General-purpose dictionary lookups are fast, but specialty functions such as <code>string.maketrans</code> and string methods such as <code>isalpha()</code> are faster. If Python has a custom-tailored function for you, use it.
-
-<li>Don't be too clever. Sometimes the most obvious algorithm is also the fastest.
-<li>Don't sweat it too much. Performance isn't everything.
-</ul>
-<p>I can't emphasize that last point strongly enough. Over the course of this chapter, you made this function three times faster
-and saved 20 seconds over 1 million function calls. Great. Now think: over the course of those million function calls, how
-many seconds will your surrounding application wait for a database connection?  Or wait for disk I/O?  Or wait for user input?
-Don't spend too much time over-optimizing one algorithm, or you'll ignore obvious improvements somewhere else. Develop an
-instinct for the sort of code that Python runs well, correct obvious blunders if you find them, and leave the rest alone.
-</body>
-</html>
diff --git a/diveintopython3.org b/diveintopython3.org
index 5f7e69d..3332e63 100755
--- a/diveintopython3.org
+++ b/diveintopython3.org
@@ -2,10 +2,10 @@
 * Your First Python Program
 ** TODO mention why from module import * is only allowed at module level
 * Native Datatypes
-** TODO section (chapter?) on comprehensions
-*** TODO list comprehensions
-*** TODO set comprehensions
-*** TODO dictionary comprehensions
+* TODO Comprehensions
+** List comprehensions
+** Set comprehensions
+** Dictionary comprehensions
 * Strings
 * Regular Expressions
 * Closures & Generators
@@ -13,9 +13,7 @@
 * DONE 2nd draft Advanced Iterators
   SCHEDULED: <2009-07-15 Wed> CLOSED: [2009-07-15 Wed 20:57]
 * TODO 2nd draft Unit Testing
-* TODO 1st draft Advanced Unit Testing
 * TODO 2nd draft Refactoring
-* TODO 1st draft Advanced Classes
 * DONE 1st draft Files
   SCHEDULED: <2009-07-16 Thu> CLOSED: [2009-07-19 Sun 15:26]
 ** Reading from text files