mirror of
https://github.com/kennethreitz/dive-into-python3.git
synced 2026-06-05 23:10:17 +00:00
more stubbing
This commit is contained in:
+189
-12
@@ -20,30 +20,203 @@ body{counter-reset:h1 3}
|
||||
</blockquote>
|
||||
<p id=toc>
|
||||
<h2 id=divingin>Diving In</h2>
|
||||
<p class=f>FIXME
|
||||
<p class=f>This chapter will teach you about list comprehensions, dictionary comprehensions, and set comprehensions: three related concepts centered around one very powerful technique. But first, I want to take a little detour into two modules that will help you navigate your local file system.
|
||||
|
||||
<h2 id=os>The <code>os</code> module</h2>
|
||||
|
||||
<p>Python 3 comes with a module called <code>os</code>, which stands for “operating system.” The <a href=http://docs.python.org/3.1/library/os.html><code>os</code> module</a> contains a plethora of functions to get information on — and in some cases, to manipulate — local directories, files, processes, and environment variables. Python does its best to offer a unified <abbr>API</abbr> across <a href=installing-python.html>all supported operating systems</a> so your programs can run on any computer with as little platform-specific code as possible.
|
||||
|
||||
<h3 id=getcwd>The Current Working Directory</h3>
|
||||
|
||||
<p>When you’re just getting started with Python, you’re going to spend a lot of time in <a href=installing-python.html#idle>the Python Shell</a>. Throughout this book, you will see examples that go like this:
|
||||
|
||||
<ol>
|
||||
<li>Import one of the modules in the <a href=examples/><code>examples</code> folder</a>
|
||||
<li>Call a function in that module
|
||||
<li>Explain the result
|
||||
</ol>
|
||||
|
||||
<p>If you don’t know about the current working directory, step 1 will probably fail with an <code>ImportError</code>. Why? Because Python will look for the example module in <a href=your-first-python-program.html#importsearchpath>the import search path</a>, but it won’t find it because the <code>examples</code> folder isn’t one of the directories in the search path. To get past this, you can do one of two things:
|
||||
|
||||
<ol>
|
||||
<li>Add the <code>examples</code> folder to the import search path
|
||||
<li>Change the current working directory to the <code>examples</code> folder
|
||||
</ol>
|
||||
|
||||
<p>The current working directory is an invisible property that Python holds in memory at all times. There is always a current working directory, whether you’re in the Python Shell, running your own Python script from the command line, or running a Python <abbr>CGI</abbr> script on a web server somewhere.
|
||||
|
||||
<p>The <code>os</code> module contains two functions to deal with the current working directory.
|
||||
|
||||
<pre class=screen>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>import os</kbd> <span class=u>①</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(os.getcwd())</kbd> <span class=u>②</span></a>
|
||||
<samp class=pp>C:\Python31</samp>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>os.chdir('/Users/pilgrim/diveintopython3/examples')</kbd> <span class=u>③</span></a>
|
||||
<a><samp class=p>>>> </samp><kbd class=pp>print(os.getcwd())</kbd> <span class=u>④</span></a>
|
||||
<samp class=pp>C:\Users\pilgrim\diveintopython3\examples</samp></pre>
|
||||
<ol>
|
||||
<li>When you run the graphical Python Shell, the current working directory starts as the directory where the Python Shell executable is. On Windows, this depends on where you installed Python; the default directory is <code>c:\Python31</code>. If you run the Python Shell from the command line, the current working directory starts as the directory you were in when you ran <code>python3</code>.
|
||||
<li>FIXME
|
||||
<li>FIXME
|
||||
<li>FIXME
|
||||
</ol>
|
||||
|
||||
<h3 id=ospath>The <code>os.path</code> module</h3>
|
||||
|
||||
<p>FIXME The <code>os.path</code> module has several functions for manipulating files and directories. Here, we're looking at handling pathnames and listing the contents of a directory.
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>import os</kbd>
|
||||
<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap\\", "mahadeva.mp3")</kbd> <span>①</span> <span>②</span>
|
||||
'c:\\music\\ap\\mahadeva.mp3'
|
||||
<samp class=p>>>> </samp><kbd>os.path.join("c:\\music\\ap", "mahadeva.mp3")</kbd> <span>③</span>
|
||||
'c:\\music\\ap\\mahadeva.mp3'
|
||||
<samp class=p>>>> </samp><kbd>os.path.expanduser("~")</kbd> <span>④</span>
|
||||
'c:\\Documents and Settings\\mpilgrim\\My Documents'
|
||||
<samp class=p>>>> </samp><kbd>os.path.join(os.path.expanduser("~"), "Python")</kbd> <span>⑤</span>
|
||||
'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'</pre>
|
||||
<ol>
|
||||
<li><code>os.path</code> is a reference to a module -- which module depends on your platform. Just as <a href="#crossplatform.example" title="Example 6.2. Supporting Platform-Specific Functionality"><code>getpass</code></a> encapsulates differences between platforms by setting <var>getpass</var> to a platform-specific function, <code>os</code> encapsulates differences between platforms by setting <var>path</var> to a platform-specific module.
|
||||
<li>The <code>join</code> function of <code>os.path</code> constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing
|
||||
with pathnames on Windows is annoying because the backslash character must be escaped.)
|
||||
<li>In this slightly less trivial case, <code>join</code> will add an extra backslash to the pathname before joining it to the filename. I was overjoyed when I discovered this, since
|
||||
<code>addSlashIfNecessary</code> is one of the stupid little functions I always need to write when building up my toolbox in a new language. <em>Do not</em> write this stupid little function in Python; smart people have already taken care of it for you.
|
||||
<li><code>expanduser</code> will expand a pathname that uses <code>~</code> to represent the current user's home directory. This works on any platform where users have a home directory, like Windows,
|
||||
<abbr>UNIX</abbr>, and Mac OS X; it has no effect on Mac OS.
|
||||
<li>Combining these techniques, you can easily construct pathnames for directories and files under the user's home directory.
|
||||
</ol>
|
||||
|
||||
<p>FIXME
|
||||
|
||||
<pre class=screen><samp class=p>>>> </samp><kbd>os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd> <span>①</span>
|
||||
('c:\\music\\ap', 'mahadeva.mp3')
|
||||
<samp class=p>>>> </samp><kbd>(filepath, filename) = os.path.split("c:\\music\\ap\\mahadeva.mp3")</kbd> <span>②</span>
|
||||
<samp class=p>>>> </samp><kbd>filepath</kbd> <span>③</span>
|
||||
'c:\\music\\ap'
|
||||
<samp class=p>>>> </samp><kbd>filename</kbd> <span>④</span>
|
||||
'mahadeva.mp3'
|
||||
<samp class=p>>>> </samp><kbd>(shortname, extension) = os.path.splitext(filename)</kbd> <span>⑤</span>
|
||||
<samp class=p>>>> </samp><kbd>shortname</kbd>
|
||||
'mahadeva'
|
||||
<samp class=p>>>> </samp><kbd>extension</kbd>
|
||||
'.mp3'</pre>
|
||||
<ol>
|
||||
<li>The <code>split</code> function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use
|
||||
<a href="#odbchelper.multiassign" title="3.4.2. Assigning Multiple Values at Once">multi-variable assignment</a> to return multiple values from a function? Well, <code>split</code> is such a function.
|
||||
<li>You assign the return value of the <code>split</code> function into a tuple of two variables. Each variable receives the value of the corresponding element of the returned tuple.
|
||||
<li>The first variable, <var>filepath</var>, receives the value of the first element of the tuple returned from <code>split</code>, the file path.
|
||||
<li>The second variable, <var>filename</var>, receives the value of the second element of the tuple returned from <code>split</code>, the filename.
|
||||
<li><code>os.path</code> also contains a function <code>splitext</code>, which splits a filename and returns a tuple containing the filename and the file extension. You use the same technique
|
||||
to assign each of them to separate variables.
|
||||
</ol>
|
||||
|
||||
<p>FIXME
|
||||
|
||||
<pre class=screen><samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd> <span>①</span>
|
||||
<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
|
||||
'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3',
|
||||
'spinning.mp3']</samp>
|
||||
<samp class=p>>>> </samp><kbd>dirname = "c:\\"</kbd>
|
||||
<samp class=p>>>> </samp><kbd>os.listdir(dirname)</kbd> <span>②</span>
|
||||
<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'cygwin',
|
||||
'docbook', 'Documents and Settings', 'Incoming', 'Inetpub', 'IO.SYS',
|
||||
'MSDOS.SYS', 'Music', 'NTDETECT.COM', 'ntldr', 'pagefile.sys',
|
||||
'Program Files', 'Python20', 'RECYCLER',
|
||||
'System Volume Information', 'TEMP', 'WINNT']</samp>
|
||||
<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
|
||||
<samp class=p>... </samp>if os.path.isfile(os.path.join(dirname, f))] <span>③</span>
|
||||
<samp>['AUTOEXEC.BAT', 'boot.ini', 'CONFIG.SYS', 'IO.SYS', 'MSDOS.SYS',
|
||||
'NTDETECT.COM', 'ntldr', 'pagefile.sys']</samp>
|
||||
<samp class=p>>>> </samp><kbd>[f for f in os.listdir(dirname)</kbd>
|
||||
<samp class=p>... </samp>if os.path.isdir(os.path.join(dirname, f))] <span>④</span>
|
||||
<samp>['cygwin', 'docbook', 'Documents and Settings', 'Incoming',
|
||||
'Inetpub', 'Music', 'Program Files', 'Python20', 'RECYCLER',
|
||||
'System Volume Information', 'TEMP', 'WINNT']</samp></pre>
|
||||
<ol>
|
||||
<li>The <code>listdir</code> function takes a pathname and returns a list of the contents of the directory.
|
||||
<li><code>listdir</code> returns both files and folders, with no indication of which is which.
|
||||
<li>You can use <a href="#apihelper.filter" title="4.5. Filtering Lists">list filtering</a> and the <code>isfile</code> function of the <code>os.path</code> module to separate the files from the folders. <code>isfile</code> takes a pathname and returns 1 if the path represents a file, and 0 otherwise. Here you're using <code><code>os.path</code>.<code>join</code></code> to ensure a full pathname, but <code>isfile</code> also works with a partial path, relative to the current working directory. You can use <code>os.getcwd()</code> to get the current working directory.
|
||||
<li><code>os.path</code> also has a <code>isdir</code> function which returns 1 if the path represents a directory, and 0 otherwise. You can use this to get a list of the subdirectories
|
||||
within a directory.
|
||||
</ol>
|
||||
|
||||
<h2 id=glob>The <code>glob</code> module</h2>
|
||||
|
||||
<p>FIXME
|
||||
|
||||
<pre><code>def listDirectory(directory, fileExtList):
|
||||
"get list of file info objects for files of particular extensions"
|
||||
fileList = [os.path.normcase(f)
|
||||
for f in os.listdir(directory)] <span>①</span> <span>②</span>
|
||||
fileList = [os.path.join(directory, f)
|
||||
for f in fileList
|
||||
if os.path.splitext(f)[1] in fileExtList] <span>③</span> <span>④</span> <span>⑤</span></code></pre>
|
||||
<ol>
|
||||
<li><code>os.listdir(directory)</code> returns a list of all the files and folders in <var>directory</var>.
|
||||
<li>Iterating through the list with <var>f</var>, you use <code>os.path.normcase(f)</code> to normalize the case according to operating system defaults. <code>normcase</code> is a useful little function that compensates for case-insensitive operating systems that think that <code>mahadeva.mp3</code> and <code>mahadeva.MP3</code> are the same file. For instance, on Windows and Mac OS, <code>normcase</code> will convert the entire filename to lowercase; on <abbr>UNIX</abbr>-compatible systems, it will return the filename unchanged.
|
||||
<li>Iterating through the normalized list with <var>f</var> again, you use <code>os.path.splitext(f)</code> to split each filename into name and extension.
|
||||
<li>For each file, you see if the extension is in the list of file extensions you care about (<var>fileExtList</var>, which was passed to the <code>listDirectory</code> function).
|
||||
<li>For each file you care about, you use <code>os.path.join(directory, f)</code> to construct the full pathname of the file, and return a list of the full pathnames.
|
||||
</ol>
|
||||
|
||||
<blockquote class=note>
|
||||
<p><span class=u>☞</span>Whenever possible, you should use the functions in <code>os</code> and <code>os.path</code> for file, directory, and path manipulations. These modules are wrappers for platform-specific modules, so functions like <code>os.path.split()</code> work on <abbr>UNIX</abbr>, Windows, Mac OS X, and any other platform supported by Python.
|
||||
</blockquote>
|
||||
|
||||
<p>There is one other way to get the contents of a directory. It's very powerful, and it uses the sort of wildcards that you may already be familiar with from working on the command line.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>os.listdir("c:\\music\\_singles\\")</kbd> <span>①</span>
|
||||
<samp>['a_time_long_forgotten_con.mp3', 'hellraiser.mp3',
|
||||
'kairo.mp3', 'long_way_home1.mp3', 'sidewinder.mp3',
|
||||
'spinning.mp3']</samp>
|
||||
<samp class=p>>>> </samp><kbd>import glob</kbd>
|
||||
<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\*.mp3')</kbd> <span>②</span>
|
||||
<samp>['c:\\music\\_singles\\a_time_long_forgotten_con.mp3',
|
||||
'c:\\music\\_singles\\hellraiser.mp3',
|
||||
'c:\\music\\_singles\\kairo.mp3',
|
||||
'c:\\music\\_singles\\long_way_home1.mp3',
|
||||
'c:\\music\\_singles\\sidewinder.mp3',
|
||||
'c:\\music\\_singles\\spinning.mp3']</samp>
|
||||
<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\_singles\\s*.mp3')</kbd> <span>③</span>
|
||||
<samp>['c:\\music\\_singles\\sidewinder.mp3',
|
||||
'c:\\music\\_singles\\spinning.mp3']</samp>
|
||||
<samp class=p>>>> </samp><kbd>glob.glob('c:\\music\\*\\*.mp3')</kbd><span>④</span>
|
||||
</pre>
|
||||
<ol>
|
||||
<li>As you saw earlier, <code>os.listdir</code> simply takes a directory path and lists all files and directories in that directory.
|
||||
<li>The <code>glob</code> module, on the other hand, takes a wildcard and returns the full path of all files and directories matching the wildcard.
|
||||
Here the wildcard is a directory path plus "*.mp3", which will match all <code>.mp3</code> files. Note that each element of the returned list already includes the full path of the file.
|
||||
<li>If you want to find all the files in a specific directory that start with "s" and end with ".mp3", you can do that too.
|
||||
<li>Now consider this scenario: you have a <code>music</code> directory, with several subdirectories within it, with <code>.mp3</code> files within each subdirectory. You can get a list of all of those with a single call to <code>glob</code>, by using two wildcards at once. One wildcard is the <code>"*.mp3"</code> (to match <code>.mp3</code> files), and one wildcard is <em>within the directory path itself</em>, to match any subdirectory within <code>c:\music</code>. That's a crazy amount of power packed into one deceptively simple-looking function!
|
||||
</ol>
|
||||
|
||||
<h2 id=list-comprehensions>List Comprehensions</h2>
|
||||
|
||||
<p>FIXME
|
||||
<!--
|
||||
<p>One of the most powerful features of Python is the list comprehension, which provides a compact way of mapping a list into another list by applying a function to each
|
||||
of the elements of the list.
|
||||
<div class=example><h3>Example 3.24. Introducing List Comprehensions</h3><pre class=screen><samp class=p>>>> </samp><kbd>li = [1, 9, 8, 4]</kbd>
|
||||
<samp class=p>>>> </samp><kbd>[elem*2 for elem in li]</kbd> <span>①</span>
|
||||
|
||||
<pre class=screen><samp class=p>>>> </samp><kbd>li = [1, 9, 8, 4]</kbd>
|
||||
<samp class=p>>>> </samp><kbd>[elem * 2 for elem in li]</kbd> <span>①</span>
|
||||
[2, 18, 16, 8]
|
||||
<samp class=p>>>> </samp><kbd>li</kbd> <span>②</span>
|
||||
[1, 9, 8, 4]
|
||||
<samp class=p>>>> </samp><kbd>li = [elem*2 for elem in li]</kbd> <span>③</span>
|
||||
<samp class=p>>>> </samp><kbd>li = [elem * 2 for elem in li]</kbd> <span>③</span>
|
||||
<samp class=p>>>> </samp><kbd>li</kbd>
|
||||
[2, 18, 16, 8]</pre>
|
||||
<ol>
|
||||
<li>To make sense of this, look at it from right to left. <var>li</var> is the list you're mapping. Python loops through <var>li</var> one element at a time, temporarily assigning the value of each element to the variable <var>elem</var>. Python then applies the function <code><var>elem</var>*2</code> and appends that result to the returned list.
|
||||
<li>Note that list comprehensions do not change the original list.
|
||||
<li>It is safe to assign the result of a list comprehension to the variable that you're mapping. Python constructs the new list in memory, and when the list comprehension is complete, it assigns the result to the variable.
|
||||
</ol>
|
||||
|
||||
<p>Here are the list comprehensions in the <code>buildConnectionString</code> function that you declared in <a href="#odbchelper">Chapter 2</a>:<pre><code>
|
||||
["%s=%s" % (k, v) for k, v in params.items()]</pre><p>First, notice that you're calling the <code>items</code> function of the <var>params</var> dictionary. This function returns a list of tuples of all the data in the dictionary.
|
||||
<div class=example><h3 id="odbchelper.items">Example 3.25. The <code>keys</code>, <code>values</code>, and <code>items</code> Functions</h3><pre class=screen><samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
|
||||
<p>FIXME Here are the list comprehensions in the <code>buildConnectionString</code> function that you declared in <a href="#odbchelper">Chapter 2</a>:
|
||||
|
||||
<pre><code>["%s=%s" % (k, v) for k, v in params.items()]</code></pre>
|
||||
|
||||
<p>First, notice that you're calling the <code>items</code> function of the <var>params</var> dictionary. This function returns a list of tuples of all the data in the dictionary.
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
|
||||
<samp class=p>>>> </samp><kbd>params.keys()</kbd> <span>①</span>
|
||||
['server', 'uid', 'database', 'pwd']
|
||||
<samp class=p>>>> </samp><kbd>params.values()</kbd> <span>②</span>
|
||||
@@ -55,9 +228,13 @@ body{counter-reset:h1 3}
|
||||
(remember that elements in a dictionary are unordered), but it is a list.
|
||||
<li>The <code>values</code> method returns a list of all the values. The list is in the same order as the list returned by <code>keys</code>, so <code>params.values()[n] == params[params.keys()[n]]</code> for all values of <var>n</var>.
|
||||
<li>The <code>items</code> method returns a list of tuples of the form <code>(<var>key</var>, <var>value</var>)</code>. The list contains all the data in the dictionary.
|
||||
</ol>
|
||||
|
||||
<p>Now let's see what <code>buildConnectionString</code> does. It takes a list, <code><var>params</var>.<code>items</code>()</code>, and maps it to a new list by applying string formatting to each element. The new list will have the same number of elements
|
||||
as <code><var>params</var>.<code>items</code>()</code>, but each element in the new list will be a string that contains both a key and its associated value from the <var>params</var> dictionary.
|
||||
<div class=example><h3>Example 3.26. List Comprehensions in <code>buildConnectionString</code>, Step by Step</h3><pre class=screen><samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
|
||||
|
||||
<pre class=screen>
|
||||
<samp class=p>>>> </samp><kbd>params = {"server":"mpilgrim", "database":"master", "uid":"sa", "pwd":"secret"}</kbd>
|
||||
<samp class=p>>>> </samp><kbd>params.items()</kbd>
|
||||
[('server', 'mpilgrim'), ('uid', 'sa'), ('database', 'master'), ('pwd', 'secret')]
|
||||
<samp class=p>>>> </samp><kbd>[k for k, v in params.items()]</kbd> <span>①</span>
|
||||
@@ -71,7 +248,7 @@ as <code><var>params</var>.<code>items</code>()</code>, but each element in the
|
||||
<li>Here you're doing the same thing, but ignoring the value of <var>k</var>, so this list comprehension ends up being equivalent to <code><var>params</var>.<code>values</code>()</code>.
|
||||
<li>Combining the previous two examples with some simple <a href="#odbchelper.stringformatting" title="3.5. Formatting Strings">string formatting</a>, you get a list of strings that include both the key and value of each element of the dictionary. This looks suspiciously
|
||||
like the <a href="#odbchelper.output">output</a> of the program. All that remains is to join the elements in this list into a single string.
|
||||
-->
|
||||
</ol>
|
||||
|
||||
<p class=a>⁂
|
||||
|
||||
@@ -89,7 +266,7 @@ as <code><var>params</var>.<code>items</code>()</code>, but each element in the
|
||||
|
||||
<h2 id=furtherreading>Further Reading</h2>
|
||||
<ul>
|
||||
<li>FIXME
|
||||
<li><a href=http://docs.python.org/3.1/library/os.html><code>os</code> module</a>
|
||||
</ul>
|
||||
<p class=v><a href=native-datatypes.html rel=prev title='back to “Native Datatypes”'><span class=u>☜</span></a> <a href=strings.html rel=next title='onward to “Strings”'><span class=u>☞</span></a>
|
||||
<p class=c>© 2001–9 <a href=about.html>Mark Pilgrim</a>
|
||||
|
||||
+4
-6
@@ -2,10 +2,10 @@
|
||||
* Your First Python Program
|
||||
** TODO mention why from module import * is only allowed at module level
|
||||
* Native Datatypes
|
||||
** TODO section (chapter?) on comprehensions
|
||||
*** TODO list comprehensions
|
||||
*** TODO set comprehensions
|
||||
*** TODO dictionary comprehensions
|
||||
* TODO Comprehensions
|
||||
** List comprehensions
|
||||
** Set comprehensions
|
||||
** Dictionary comprehensions
|
||||
* Strings
|
||||
* Regular Expressions
|
||||
* Closures & Generators
|
||||
@@ -13,9 +13,7 @@
|
||||
* DONE 2nd draft Advanced Iterators
|
||||
SCHEDULED: <2009-07-15 Wed> CLOSED: [2009-07-15 Wed 20:57]
|
||||
* TODO 2nd draft Unit Testing
|
||||
* TODO 1st draft Advanced Unit Testing
|
||||
* TODO 2nd draft Refactoring
|
||||
* TODO 1st draft Advanced Classes
|
||||
* DONE 1st draft Files
|
||||
SCHEDULED: <2009-07-16 Thu> CLOSED: [2009-07-19 Sun 15:26]
|
||||
** Reading from text files
|
||||
|
||||
Reference in New Issue
Block a user