diff --git a/dip2 b/dip2
index 98e8cb6..40f0ffd 100644
--- a/dip2
+++ b/dip2
@@ -432,7 +432,7 @@ Type "help", "copyright", "credits", or "license" for more information.
>>> [press Ctrl+D to exit]
[root@localhost root]# which python2.3 ③
/usr/bin/python2.3
-
(x, y, z) is a tuple of three variables. Assigning one to the other assigns each of the values of v to each of the variables, in order.
This has all sorts of uses. I often want to assign names to a range of values. In C, you would use enum and manually list each constant and its associated value, which seems especially tedious when the values are consecutive.
@@ -645,7 +645,7 @@ NameError: There is no variable named 'x'
>>> TUESDAY
1
>>> SUNDAY
-6
range function returns a list of integers. In its simplest form, it takes an upper limit and returns a zero-based list counting
up to but not including the upper limit. (If you like, you can pass other parameters to specify a base other than 0 and a step other than 1. You can print range.__doc__ for details.)
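As a minimal sketch of the idiom described above, here is multi-variable assignment combined with range. It is written so that it also runs on a current Python 3 interpreter (an assumption; the text uses Python 2.3), where range returns a lazy sequence instead of a list. That difference does not affect the unpacking.

```python
# Assign consecutive integer values to a series of names in one statement.
# range(7) counts from 0 up to but not including 7; unpacking assigns in order.
(MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY) = range(7)

print(MONDAY)   # 0
print(TUESDAY)  # 1
print(SUNDAY)   # 6
```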
@@ -671,7 +671,7 @@ NameError: There is no variable named 'x'
[1, 9, 8, 4]
>>> li = [elem*2 for elem in li] ③
>>> li
-[2, 18, 16, 8]elem*2 and appends that result to the returned list.
keys method of a dictionary returns a list of all the keys. The list is not in the order in which the dictionary was defined
(remember that elements in a dictionary are unordered), but it is a list.
@@ -701,7 +701,7 @@ as params.items(), but each element in the
>>> [v for k, v in params.items()] ②
['mpilgrim', 'sa', 'master', 'secret']
>>> ["%s=%s" % (k, v) for k, v in params.items()] ③
-['server=mpilgrim', 'uid=sa', 'database=master', 'pwd=secret']params.items() list. This is another use of multi-variable assignment. The first element of params.items() is ('server', 'mpilgrim'), so in the first iteration of the list comprehension, k will get 'server' and v will get 'mpilgrim'. In this case, you're ignoring the value of v and only including the value of k in the returned list, so this list comprehension ends up being equivalent to params.keys().
params.values().
@@ -774,7 +774,7 @@ def info(object, spacing=10, collapse=1): ① ②
for method in methodList])
if __name__ == "__main__": ④ ⑤
- print info.__doc__info. According to its function declaration, it takes three parameters: object, spacing, and collapse. The last two are actually optional parameters, as you'll see shortly.
info function has a multi-line docstring that succinctly describes the function's purpose. Note that no return value is mentioned; this function will be used solely
@@ -818,7 +818,7 @@ Python, arguments can be specified by name, in any order.
info(odbchelper) ①
info(odbchelper, 12) ②
info(odbchelper, collapse=0) ③
-info(spacing=15, object=odbchelper) ④10 and collapse gets its default value of 1.
1.
@@ -852,7 +852,7 @@ time, you'll call functions the “normal” way, but you always have th
<type 'module'>
>>> import types ④
>>> type(odbchelper) == types.ModuleType
-Truetype takes anything -- and I mean anything -- and returns its datatype. Integers, strings, lists, dictionaries, tuples, functions,
classes, modules, even types are acceptable.
@@ -873,7 +873,7 @@ Truestr to work, because almost every language has a function to convert an integer to a string.
str works on any object of any type. Here it works on a list which you've constructed in bits and pieces.
@@ -891,7 +891,7 @@ Truedir(li) returns a list of all the methods of a list. Note that the returned list contains the names of the methods as strings, not
the methods themselves.
@@ -915,7 +915,7 @@ True
intervening occurrences of sep. The default separator is a
single space.
- (joinfields and join are synonymous)string module are deprecated (although many people still use the join function), but the module contains a lot of useful constants like this string.punctuation, which contains all the standard punctuation characters.
string.join is a function that joins a list of strings.
@@ -966,7 +966,7 @@ IOError I/O operation failed.
>>> getattr((), "pop") ⑤
Traceback (innermost last):
File "<interactive input>", line 1, in ?
-AttributeError: 'tuple' object has no attribute 'pop'pop method of the list. Note that this is not calling the pop method; that would be li.pop(). This is the method itself.
pop method, but this time, the method name is specified as a string argument to the getattr function. getattr is an incredibly useful built-in function that returns any attribute of any object. In this case, the object is a list,
@@ -991,7 +991,7 @@ AttributeError: 'tuple' object has no attribute 'pop'buildConnectionString function in the odbchelper module, which you studied in Chapter 2, Your First Python Program. (The hex address you see is specific to my machine; your output will be different.)
getattr, you can get the same reference to the same function. In general, getattr(object, "attribute") is equivalent to object.attribute. If object is a module, then attribute can be anything defined in the module: a function, class, or global variable.
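To make the equivalence between getattr(object, "attribute") and object.attribute concrete, here is a small sketch. It uses the standard math module and a throwaway list rather than odbchelper, and assumes a current Python 3 interpreter.

```python
import math

# Looking up an attribute by name returns the very same object
# that dotted access returns.
same_function = getattr(math, "sqrt") is math.sqrt
print(same_function)  # True

# The same holds for methods: fetch pop by name, then call it.
li = ["Larry", "Curly"]
last = getattr(li, "pop")()   # equivalent to li.pop()
print(last)  # Curly
print(li)    # ['Larry']
```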
@@ -1009,7 +1009,7 @@ import statsout
def output(data, format="text"): ①
output_function = getattr(statsout, "output_%s" % format) ②
return output_function(data) ③
-output function takes one required argument, data, and one optional argument, format. If format is not specified, it defaults to text, and you will end up calling the plain text output function.
statsout module. This allows you to easily extend the program later to support other output formats, without changing this dispatch
@@ -1025,7 +1025,7 @@ import statsout
def output(data, format="text"):
output_function = getattr(statsout, "output_%s" % format, statsout.output_text)
return output_function(data) ①
-getattr. The third argument is a default value that is returned if the attribute or method specified by the second argument wasn't
found.
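The dispatcher can be tried out without a real statsout module. In this sketch, the statsout class and its two output functions are hypothetical stand-ins invented for illustration; only the getattr call with its third (default) argument mirrors the example.

```python
class statsout:
    """Hypothetical stand-in for the statsout module (not from the text)."""

    @staticmethod
    def output_text(data):
        return "text: %s" % data

    @staticmethod
    def output_html(data):
        return "<p>%s</p>" % data


def output(data, format="text"):
    # Look up output_<format>; if no such attribute exists,
    # getattr returns the default, statsout.output_text.
    output_function = getattr(statsout, "output_%s" % format, statsout.output_text)
    return output_function(data)


print(output("42"))          # text: 42
print(output("42", "html"))  # <p>42</p>
print(output("42", "pdf"))   # text: 42 (no output_pdf, so the default is used)
```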
@@ -1042,7 +1042,7 @@ so they are never put through the mapping expression and are not included in the
>>> [elem for elem in li if elem != "b"] ②
['a', 'mpilgrim', 'foo', 'c', 'd', 'd']
>>> [elem for elem in li if li.count(elem) == 1] ③
-['a', 'mpilgrim', 'foo', 'c']pop method of a list) and user-defined (like the buildCon
>>> '' and 'b' ②
''
>>> 'a' and 'b' and 'c' ③
-'c'
+'c'
- When using
and, values are evaluated in a boolean context from left to right. 0, '', [], (), {}, and None are false in a boolean context; everything else is true. Well, almost everything. By default, instances of classes are
true in a boolean context, but you can define special methods in your class to make an instance evaluate to false. You'll
@@ -1092,7 +1092,7 @@ the pop method of a list) and user-defined (like the buildCon
... print "in sidefx()"
... return 1
>>> 'a' or sidefx() ④
-'a'
+'a'
- When using
or, values are evaluated in a boolean context from left to right, just like and. If any value is true, or returns that value immediately. In this case, 'a' is the first true value.
or evaluates '', which is false, then 'b', which is true, and returns 'b'.
@@ -1107,7 +1107,7 @@ the pop method of a list) and user-defined (like the buildCon
'first'
>>> 0 and a or b ②
'second'
-
+
- This syntax looks similar to the
bool ? a : b expression in C. The entire expression is evaluated from left to right, so the and is evaluated first. 1 and 'first' evaluates to 'first', then 'first' or 'second' evaluates to 'first'.
0 and 'first' evaluates to 0, and then 0 or 'second' evaluates to 'second'.
@@ -1116,7 +1116,7 @@ the pop method of a list) and user-defined (like the buildCon
Example 4.18. When the and-or Trick Fails
>>> a = ""
>>> b = "second"
>>> 1 and a or b ①
-'second'
+'second'
- Since a is an empty string, which Python considers false in a boolean context,
1 and '' evaluates to '', and then '' or 'second' evaluates to 'second'. Oops! That's not what you wanted.
The and-or trick, bool and a or b, will not work like the C expression bool ? a : b when a is false in a boolean context.
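Here is a short sketch of the trick and its failure mode side by side. The helper name and_or is made up for this illustration.

```python
def and_or(condition, a, b):
    # The and-or trick: intended to behave like C's  condition ? a : b
    return condition and a or b


print(and_or(1, "first", "second"))   # first
print(and_or(0, "first", "second"))   # second

# The failure: a is false in a boolean context, so "or" falls through to b.
print(repr(and_or(1, "", "second")))  # 'second'
```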
@@ -1124,7 +1124,7 @@ the pop method of a list) and user-defined (like the buildCon
Example 4.19. Using the and-or Trick Safely
>>> a = ""
>>> b = "second"
>>> (1 and [a] or [b])[0] ①
-''
+''
- Since
[a] is a non-empty list, it is never false. Even if a is 0 or '' or some other false value, the list [a] is true because it has one element.
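The safe form can be wrapped up as a function, as in this sketch (the name choose is hypothetical). For comparison it also uses the conditional expression a if condition else b, which was added to the language in Python 2.5, after the version used in the text.

```python
def choose(condition, a, b):
    # Wrap both values in one-element lists so the middle operand is
    # always true, then unwrap the winner with [0].
    return (condition and [a] or [b])[0]


print(repr(choose(1, "", "second")))  # ''
print(repr(choose(0, "", "second")))  # 'second'

# On Python 2.5 and later, a conditional expression gives the same result.
a, b = "", "second"
print(repr(a if 1 else b))            # ''
```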
By now, this trick may seem like more trouble than it's worth. You could, after all, accomplish the same thing with an if statement, so why go through all this fuss? Well, in many cases, you are choosing between two constant values, so you can
@@ -1147,7 +1147,7 @@ the pop method of a list) and user-defined (like the buildCon
>>> g(3)
6
>>> (lambda x: x*2)(3) ②
-6
+6
- This is a
lambda function that accomplishes the same thing as the normal function above it. Note the abbreviated syntax here: there are no
parentheses around the argument list, and the return keyword is missing (it is implied, since the entire function can only be one expression). Also, the function has no name,
@@ -1173,7 +1173,7 @@ a test
>>> print s.split() ②
['this', 'is', 'a', 'test']
>>> print " ".join(s.split()) ③
-'this is a test'
+'this is a test'
- This is a multiline string, defined by escape characters instead of triple quotes.
\n is a newline, and \t is a tab character.
split without any arguments splits on whitespace. So three spaces, a newline, and a tab character are all the same.
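A quick sketch of the split-then-join idiom for normalizing whitespace, using the same string as the example:

```python
s = "this   is\na\ttest"

# split() with no arguments splits on any run of whitespace.
words = s.split()
print(words)       # ['this', 'is', 'a', 'test']

# Joining with a single space collapses all the whitespace.
normalized = " ".join(words)
print(normalized)  # this is a test
```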
@@ -1216,7 +1216,7 @@ for method in methodList shows that this is a
>>> print getattr(object, method).__doc__ ④
Build a connection string from a dictionary of parameters.
- Returns string.
+ Returns string.
- In the
info function, object is the object you're getting help on, passed in as an argument.
- As you're looping through methodList, method is the name of the current method.
@@ -1231,7 +1231,7 @@ for method in methodList shows that this is a
>>> str(foo.__doc__) ③
'None'
-
+
- You can easily define a function that has no
docstring, so its __doc__ attribute is None. Confusingly, if you evaluate the __doc__ attribute directly, the Python IDE prints nothing at all, which makes sense if you think about it, but is still unhelpful.
- You can verify that the value of the
__doc__ attribute is actually None by comparing it directly.
@@ -1245,7 +1245,7 @@ True
>>> s.ljust(30) ①
'buildConnectionString '
>>> s.ljust(20) ②
-'buildConnectionString'
+'buildConnectionString'
ljust pads the string with spaces to the given length. This is what the info function uses to make two columns of output and line up all the docstrings in the second column.
- If the given length is smaller than the length of the string,
ljust will simply return the string unchanged. It never truncates the string.
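A small sketch of both behaviors of ljust. The string is 21 characters long, so padding to 30 adds nine spaces and padding to 20 changes nothing.

```python
s = "buildConnectionString"

padded = s.ljust(30)      # padded with spaces to 30 characters
unchanged = s.ljust(20)   # shorter than the string, so returned as is

print(len(s))             # 21
print(len(padded))        # 30
print(unchanged == s)     # True
```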
@@ -1254,7 +1254,7 @@ True
>>> print "\n".join(li) ①
a
b
-c
+c
- This is also a useful debugging trick when you're working with lists. And in Python, you're always working with lists.
That's the last piece of the puzzle. You should now understand this code.
@@ -1391,7 +1391,7 @@ def listDirectory(directory, fileExtList):
if __name__ == "__main__":
for info in listDirectory("/music/_singles/", [".mp3"]): ①
print "\n".join(["%s=%s" % (k, v) for k, v in info.items()])
- print
+ print
- This program's output depends on the files on your hard drive. To get meaningful output, you'll need to change the directory
path to point to a directory of MP3 files on your own machine.
@@ -1466,7 +1466,7 @@ can import individual items or use
from module import *
NameError: There is no variable named 'FunctionType'
>>> from types import FunctionType ③
>>> FunctionType ④
-<type 'function'>
+<type 'function'>
- The
types module contains no methods; it just has attributes for each Python object type. Note that the attribute, FunctionType, must be qualified by the module name, types.
FunctionType by itself has not been defined in this namespace; it exists only in the context of types.
@@ -1505,7 +1505,7 @@ NameError: There is no variable named 'FunctionType'
Example 5.4. Defining the FileInfo Class
from UserDict import UserDict
-class FileInfo(UserDict): ①
+class FileInfo(UserDict): ①
- In Python, the ancestor of a class is simply listed in parentheses immediately after the class name. So the
FileInfo class is inherited from the UserDict class (which was imported from the UserDict module). UserDict is a class that acts like a dictionary, allowing you to essentially subclass the dictionary datatype and add your own behavior.
(There are similar classes UserList and UserString which allow you to subclass lists and strings.) There is a bit of black magic behind this, which you will demystify later
@@ -1517,84 +1517,14 @@ class FileInfo(UserDict): ①
Python supports multiple inheritance. In the parentheses following the class name, you can list as many ancestor classes as you
like, separated by commas.
5.3.1. Initializing and Coding Classes
-This example shows the initialization of the FileInfo class using the __init__ method.
-
Example 5.5. Initializing the FileInfo Class
-class FileInfo(UserDict):
- "store file metadata" ①
- def __init__(self, filename=None): ② ③ ④
-
-- Classes can (and should) have
docstrings too, just like modules and functions.
- __init__ is called immediately after an instance of the class is created. It would be tempting but incorrect to call this the constructor
- of the class. It's tempting, because it looks like a constructor (by convention, __init__ is the first method defined for the class), acts like one (it's the first piece of code executed in a newly created instance
- of the class), and even sounds like one (“init” certainly suggests a constructor-ish nature). Incorrect, because the object has already been constructed by the time __init__ is called, and you already have a valid reference to the new instance of the class. But __init__ is the closest thing you're going to get to a constructor in Python, and it fills much the same role.
-- The first argument of every class method, including
__init__, is always a reference to the current instance of the class. By convention, this argument is always named self. In the __init__ method, self refers to the newly created object; in other class methods, it refers to the instance whose method was called. Although
- you need to specify self explicitly when defining the method, you do not specify it when calling the method; Python will add it for you automatically.
- __init__ methods can take any number of arguments, and just like functions, the arguments can be defined with default values, making
- them optional to the caller. In this case, filename has a default value of None, which is the Python null value.
-
-
By convention, the first argument of any Python class method (the reference to the current instance) is called self. This argument fills the role of the reserved word this in C++ or Java, but self is not a reserved word in Python, merely a naming convention. Nonetheless, please don't call it anything but self; this is a very strong convention.
-Example 5.6. Coding the FileInfo Class
-class FileInfo(UserDict):
- "store file metadata"
- def __init__(self, filename=None):
- UserDict.__init__(self) ①
- self["name"] = filename ②
-③
-
-- Some pseudo-object-oriented languages like Powerbuilder have a concept of “extending” constructors and other events, where the ancestor's method is called automatically before the descendant's method is executed.
- Python does not do this; you must always explicitly call the appropriate method in the ancestor class.
-
- I told you that this class acts like a dictionary, and here is the first sign of it. You're assigning the argument filename as the value of this object's
name key.
- - Note that the
__init__ method never returns a value.
-5.3.2. Knowing When to Use self and __init__
-When defining your class methods, you must explicitly list self as the first argument for each method, including __init__. When you call a method of an ancestor class from within your class, you must include the self argument. But when you call your class method from outside, you do not specify anything for the self argument; you skip it entirely, and Python automatically adds the instance reference for you. I am aware that this is confusing at first; it's not really inconsistent,
- but it may appear inconsistent because it relies on a distinction (between bound and unbound methods) that you don't know
- about yet.
-
Whew. I realize that's a lot to absorb, but you'll get the hang of it. All Python classes work the same way, so once you learn one, you've learned them all. If you forget everything else, remember this
- one thing, because I promise it will trip you up:
-
-
__init__ methods are optional, but when you define one, you must remember to explicitly call the ancestor's __init__ method (if it defines one). This is more generally true: whenever a descendant wants to extend the behavior of the ancestor,
- the descendant method must explicitly call the ancestor method at the proper time, with the proper arguments.
-
-Further Reading on Python Classes
-
-- Learning to Program has a gentler introduction to classes.
-
-
- How to Think Like a Computer Scientist shows how to use classes to model compound datatypes.
-
-
- Python Tutorial has an in-depth look at classes, namespaces, and inheritance.
-
-
- Python Knowledge Base answers common questions about classes.
-
-
-5.4. Instantiating Classes
-Instantiating classes in Python is straightforward. To instantiate a class, simply call the class as if it were a function, passing the arguments that the
-__init__ method defines. The return value will be the newly created object.
-
Example 5.7. Creating a FileInfo Instance
>>> import fileinfo
->>> f = fileinfo.FileInfo("/music/_singles/kairo.mp3") ①
->>> f.__class__ ②
-<class fileinfo.FileInfo at 010EC204>
->>> f.__doc__ ③
-'store file metadata'
->>> f ④
-{'name': '/music/_singles/kairo.mp3'}
-
-- You are creating an instance of the
FileInfo class (defined in the fileinfo module) and assigning the newly created instance to the variable f. You are passing one parameter, /music/_singles/kairo.mp3, which will end up as the filename argument in FileInfo's __init__ method.
- - Every class instance has a built-in attribute,
__class__, which is the object's class. (Note that the representation of this includes the physical address of the instance on my
- machine; your representation will be different.) Java programmers may be familiar with the Class class, which contains methods like getName and getSuperclass to get metadata information about an object. In Python, this kind of metadata is available directly on the object itself through attributes like __class__, __name__, and __bases__.
- - You can access the instance's
docstring just as with a function or a module. All instances of a class share the same docstring.
- - Remember when the
__init__ method assigned its filename argument to self["name"]? Well, here's the result. The arguments you pass when you create the class instance get sent right along to the __init__ method (along with the object reference, self, which Python adds for free).
-
-
-
In Python, simply call a class as if it were a function to create a new instance of the class. There is no explicit new operator like C++ or Java.
-5.4.1. Garbage Collection
If creating new instances is easy, destroying them is even easier. In general, there is no need to explicitly free instances,
because they are freed automatically when the variables assigned to them go out of scope. Memory leaks are rare in Python.
Example 5.8. Trying to Implement a Memory Leak
>>> def leakmem():
... f = fileinfo.FileInfo('/music/_singles/kairo.mp3') ①
...
>>> for i in range(100):
-... leakmem() ②
+... leakmem() ②
- Every time the
leakmem function is called, you are creating an instance of FileInfo and assigning it to the variable f, which is a local variable within the function. Then the function ends without ever freeing f, so you would expect a memory leak, but you would be wrong. When the function ends, the local variable f goes out of scope. At this point, there are no longer any references to the newly created instance of FileInfo (since you never assigned it to anything other than f), so Python destroys the instance for us.
- No matter how many times you call the
leakmem function, it will never leak memory, because every time, Python will destroy the newly created FileInfo instance before returning from leakmem.
@@ -1623,14 +1553,14 @@ class UserDict: ①
def __init__(self, dict=None): ②
self.data = {} ③
if dict is not None: self.update(dict) ④ ⑤
-
+
- Note that
UserDict is a base class, not inherited from any other class.
- This is the
__init__ method that you overrode in the FileInfo class. Note that the argument list in this ancestor class is different from that of the descendant. That's okay; each subclass can have
its own set of arguments, as long as it calls the ancestor with the correct arguments. Here the ancestor class has a way
to define initial values (by passing a dictionary in the dict argument) which FileInfo does not use.
- Python supports data attributes (called “instance variables” in Java and Powerbuilder, and “member variables” in C++). Data attributes are pieces of data held by a specific instance of a class. In this case, each instance of
UserDict will have a data attribute data. To reference this attribute from code outside the class, you qualify it with the instance name, instance.data, in the same way that you qualify a function with its module name. To reference a data attribute from within the class,
- you use self as the qualifier. By convention, all data attributes are initialized to reasonable values in the __init__ method. However, this is not required, since data attributes, like local variables, spring into existence when they are first assigned a value.
+ you use self as the qualifier. By convention, all data attributes are initialized to reasonable values in the __init__ method. However, this is not required, since data attributes, like local variables, spring into existence when they are first assigned a value.
- The
update method is a dictionary duplicator: it copies all the keys and values from one dictionary to another. This does not clear the target dictionary first; if the target dictionary already has some keys, the ones from the source dictionary will
be overwritten, but others will be left untouched. Think of update as a merge function, not a copy function.
- This is a syntax you may not have seen before (I haven't used it in the examples in this book). It's an
if statement, but instead of having an indented block starting on the next line, there is just a single statement on the same
@@ -1661,14 +1591,14 @@ class UserDict: ①
def keys(self): return self.data.keys() ⑤
def items(self): return self.data.items()
def values(self): return self.data.values()
-
+
-clear is a normal class method; it is publicly available to be called by anyone at any time. Notice that clear, like all class methods, has self as its first argument. (Remember that you don't include self when you call the method; it's something that Python adds for you.) Also note the basic technique of this wrapper class: store a real dictionary (data) as a data attribute, define all the methods that a real dictionary has, and have each class method redirect to the corresponding
+clear is a normal class method; it is publicly available to be called by anyone at any time. Notice that clear, like all class methods, has self as its first argument. (Remember that you don't include self when you call the method; it's something that Python adds for you.) Also note the basic technique of this wrapper class: store a real dictionary (data) as a data attribute, define all the methods that a real dictionary has, and have each class method redirect to the corresponding
method on the real dictionary. (In case you'd forgotten, a dictionary's clear method deletes all of its keys and their associated values.)
- The
copy method of a real dictionary returns a new dictionary that is an exact duplicate of the original (all the same key-value pairs).
- But UserDict can't simply redirect to self.data.copy, because that method returns a real dictionary, and what you want is to return a new instance that is the same class as self.
- - You use the
__class__ attribute to see if self is a UserDict; if so, you're golden, because you know how to copy a UserDict: just create a new UserDict and give it the real dictionary that you've squirreled away in self.data. Then you immediately return the new UserDict you don't even get to the import copy on the next line.
- - If
self.__class__ is not UserDict, then self must be some subclass of UserDict (like maybe FileInfo), in which case life gets trickier. UserDict doesn't know how to make an exact copy of one of its descendants; there could, for instance, be other data attributes defined
+ But UserDict can't simply redirect to self.data.copy, because that method returns a real dictionary, and what you want is to return a new instance that is the same class as self.
+ - You use the
__class__ attribute to see if self is a UserDict; if so, you're golden, because you know how to copy a UserDict: just create a new UserDict and give it the real dictionary that you've squirreled away in self.data. Then you immediately return the new UserDict; you don't even get to the import copy on the next line.
+ - If
self.__class__ is not UserDict, then self must be some subclass of UserDict (like maybe FileInfo), in which case life gets trickier. UserDict doesn't know how to make an exact copy of one of its descendants; there could, for instance, be other data attributes defined
in the subclass, so you would need to iterate through them and make sure to copy all of them. Luckily, Python comes with a module to do exactly this, and it's called copy. I won't go into the details here (though it's a wicked cool module, if you're ever inclined to dive into it on your own).
Suffice it to say that copy can copy arbitrary Python objects, and that's how you're using it here.
- The rest of the methods are straightforward, redirecting the calls to the built-in methods on self.data.
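The wrapper technique and the two branches of copy can be sketched in miniature. DictWrapper and Tagged are hypothetical names for this illustration, not the real UserDict source. Unlike the code in the text, this sketch also gives the copied subclass instance its own dictionary, since copy.copy makes a shallow copy that would otherwise share it.

```python
import copy


class DictWrapper:
    """Simplified, hypothetical wrapper in the style of UserDict."""

    def __init__(self, dict=None):
        self.data = {}                      # the real dictionary
        if dict is not None:
            self.update(dict)

    def update(self, dict):
        self.data.update(dict)              # redirect to the real dictionary

    def clear(self):
        self.data.clear()

    def keys(self):
        return self.data.keys()

    def copy(self):
        if self.__class__ is DictWrapper:
            # Exactly this class: build a new wrapper from the stored data.
            return DictWrapper(self.data)
        # A subclass may define extra data attributes, so let the copy
        # module duplicate the whole object (a shallow copy).
        duplicate = copy.copy(self)
        duplicate.data = dict(self.data)    # sketch-only: avoid sharing data
        return duplicate


class Tagged(DictWrapper):
    """Hypothetical subclass with an extra data attribute."""

    def __init__(self, dict=None, tag=""):
        DictWrapper.__init__(self, dict)
        self.tag = tag


t = Tagged({"a": 1}, tag="demo")
c = t.copy()
print(c.__class__ is Tagged)  # True
print(c.tag)                  # demo
print(c.data)                 # {'a': 1}
```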
@@ -1682,7 +1612,7 @@ class FileInfo(dict):①
"store file metadata"
def __init__(self, filename=None): ②
self["name"] = filename
-
+
- The first difference is that you don't need to import the
UserDict module, since dict is a built-in datatype and is always available. The second is that you are inheriting from dict directly, instead of from UserDict.UserDict.
- The third difference is subtle but important. Because of the way
UserDict works internally, it requires you to manually call its __init__ method to properly initialize its internal data structures. dict does not work like this; it is not a wrapper, and it requires no explicit initialization.
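A brief usage sketch of the dict-based class, assuming a Python version where built-in types can be subclassed (2.2 or later):

```python
class FileInfo(dict):
    "store file metadata"
    def __init__(self, filename=None):
        # No call to an ancestor __init__ is needed here;
        # the dict is already usable.
        self["name"] = filename


f = FileInfo("/music/_singles/kairo.mp3")
print(f["name"])            # /music/_singles/kairo.mp3
print(isinstance(f, dict))  # True
```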
@@ -1706,7 +1636,7 @@ provide a way to map non-method-calling syntax into method calls.
>>> f.__getitem__("name") ①
'/music/_singles/kairo.mp3'
>>> f["name"] ②
-'/music/_singles/kairo.mp3'
+'/music/_singles/kairo.mp3'
- The
__getitem__ special method looks simple enough. Like the normal methods clear, keys, and values, it just redirects to the dictionary to return its value. But how does it get called? Well, you can call __getitem__ directly, but in practice you wouldn't actually do that; I'm just doing it here to show you how it works. The right way
to use __getitem__ is to get Python to call it for you.
@@ -1720,7 +1650,7 @@ provide a way to map non-method-calling syntax into method calls.
{'name':'/music/_singles/kairo.mp3', 'genre':31}
>>> f["genre"] = 32 ②
>>> f
-{'name':'/music/_singles/kairo.mp3', 'genre':32}
+{'name':'/music/_singles/kairo.mp3', 'genre':32}
- Like the
__getitem__ method, __setitem__ simply redirects to the real dictionary self.data to do its work. And like __getitem__, you wouldn't ordinarily call it directly like this; Python calls __setitem__ for you when you use the right syntax.
- This looks like regular dictionary syntax, except of course that f is really an instance of a class that's trying very hard to masquerade as a dictionary, and
__setitem__ is an essential part of that masquerade. This line of code actually calls f.__setitem__("genre", 32) under the covers.
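To watch Python call the special methods for you, here is a sketch with a hypothetical dict subclass, LoggingDict, that records each call. It is not part of fileinfo.

```python
class LoggingDict(dict):
    """Hypothetical dict subclass that records which special methods run."""

    def __init__(self):
        dict.__init__(self)
        self.calls = []

    def __setitem__(self, key, item):
        self.calls.append(("set", key))
        dict.__setitem__(self, key, item)    # explicitly call the ancestor

    def __getitem__(self, key):
        self.calls.append(("get", key))
        return dict.__getitem__(self, key)


f = LoggingDict()
f["genre"] = 32        # Python calls f.__setitem__("genre", 32)
value = f["genre"]     # Python calls f.__getitem__("genre")

print(value)    # 32
print(f.calls)  # [('set', 'genre'), ('get', 'genre')]
```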
@@ -1734,7 +1664,7 @@ provide a way to map non-method-calling syntax into method calls.
def __setitem__(self, key, item): ①
if key == "name" and item: ②
self.__parse(item) ③
- FileInfo.__setitem__(self, key, item) ④
+ FileInfo.__setitem__(self, key, item) ④
- Notice that this
__setitem__ method is defined exactly the same way as the ancestor method. This is important, since Python will be calling the method for you, and it expects it to be defined with a certain number of arguments. (Technically speaking,
the names of the arguments don't matter; only the number of arguments is important.)
@@ -1758,7 +1688,7 @@ provide a way to map non-method-calling syntax into method calls.
>>> mp3file
{'album': '', 'artist': 'The Cynic Project', 'genre': 18, 'title': 'Sidewinder',
'name': '/music/_singles/sidewinder.mp3', 'year': '2000',
-'comment': 'http://mp3.com/cynicproject'}
+'comment': 'http://mp3.com/cynicproject'}
- First, you create an instance of
MP3FileInfo, without passing it a filename. (You can get away with this because the filename argument of the __init__ method is optional.) Since MP3FileInfo has no __init__ method of its own, Python walks up the ancestor tree and finds the __init__ method of FileInfo. This __init__ method manually calls the __init__ method of UserDict and then sets the name key to filename, which is None, since you didn't pass a filename. Thus, mp3file initially looks like a dictionary with one key, name, whose value is None.
@@ -1777,7 +1707,7 @@ provide a way to map non-method-calling syntax into method calls.
else:
return cmp(self.data, dict)
def __len__(self): return len(self.data) ③
- def __delitem__(self, key): del self.data[key] ④
+ def __delitem__(self, key): del self.data[key] ④
__repr__ is a special method that is called when you call repr(instance). The repr function is a built-in function that returns a string representation of an object. It works on any object, not just class
instances. You're already intimately familiar with repr and you don't even know it. In the interactive window, when you type just a variable name and press the ENTER key, Python uses repr to display the variable's value. Go create a dictionary d with some data and then print repr(d) to see for yourself.
@@ -1833,7 +1763,7 @@ class MP3FileInfo(FileInfo):
'artist': (33, 63, <function stripnulls at 0260C8D4>),
'year': (93, 97, <function stripnulls at 0260C8D4>),
'comment': (97, 126, <function stripnulls at 0260C8D4>),
-'album': (63, 93, <function stripnulls at 0260C8D4>)}
+'album': (63, 93, <function stripnulls at 0260C8D4>)}
MP3FileInfo is the class itself, not any particular instance of the class.
- tagDataMap is a class attribute: literally, an attribute of the class. It is available before creating any instances of the class.
@@ -1865,7 +1795,7 @@ class MP3FileInfo(FileInfo):
>>> c.count
2
>>> counter.count
-2
+2
- count is a class attribute of the
counter class.
__class__ is a built-in attribute of every class instance (of every class). It is a reference to the class that self is an instance of (in this case, the counter class).
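A minimal sketch of a class attribute shared by every instance, reconstructed to match the output shown above. The class body is an assumption based on that output.

```python
class counter:
    count = 0                      # class attribute, shared by all instances

    def __init__(self):
        # Update the attribute on the class, not on this one instance.
        self.__class__.count += 1


print(counter.count)  # 0 (available before any instance exists)
c = counter()
d = counter()
print(c.count)        # 2
print(counter.count)  # 2
```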
@@ -1896,7 +1826,7 @@ call it directly (even from outside the fileinfo module) if you had
>>> m.__parse("/music/_singles/kairo.mp3") ①
Traceback (innermost last):
File "<interactive input>", line 1, in ?
-AttributeError: 'MP3FileInfo' instance has no attribute '__parse'
+AttributeError: 'MP3FileInfo' instance has no attribute '__parse'
- If you try to call a private method, Python will raise a slightly misleading exception, saying that the method does not exist. Of course it does exist, but it's private,
so it's not accessible outside the class. Strictly speaking, private methods are accessible outside their class, just not easily accessible. Nothing in Python is truly private; internally, the names of private methods and attributes are mangled and unmangled on the fly to make them
@@ -1963,7 +1893,7 @@ IOError: [Errno 2] No such file or directory: '/notthere'
... print "The file does not exist, exiting gracefully"
... print "This line will always print" ④
The file does not exist, exiting gracefully
-This line will always print
+This line will always print
- Using the built-in
open function, you can try to open a file for reading (more on open in the next section). But the file doesn't exist, so this raises the IOError exception. Since you haven't provided any explicit check for an IOError exception, Python just prints out some debugging information about what happened and then gives up.
- You're trying to open the same non-existent file, but this time you're doing it within a
try...except block.
@@ -1999,7 +1929,7 @@ exceptions, errors occur immediately, and you can handle them in a standard way
else:
getpass = win_getpass
else:
- getpass = unix_getpass
+ getpass = unix_getpass
termios is a UNIX-specific module that provides low-level control over the input terminal. If this module is not available (because it's not
on your system, or your system doesn't support it), the import fails and Python raises an ImportError, which you catch.
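The same pattern, catching ImportError to detect a missing platform-specific module, can be sketched on its own. The helper describe_terminal_support is hypothetical and exists only to show which branch was taken.

```python
try:
    import termios            # UNIX-only; typically missing on Windows
except ImportError:
    termios = None            # remember that the import failed


def describe_terminal_support():
    """Hypothetical helper: report which branch was taken."""
    if termios is None:
        return "termios unavailable, use a fallback"
    return "termios available"


message = describe_terminal_support()
print(message)
```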
@@ -2031,7 +1961,7 @@ exceptions, errors occur immediately, and you can handle them in a standard way
>>> f.mode ③
'rb'
>>> f.name ④
-'/music/_singles/kairo.mp3'
+'/music/_singles/kairo.mp3'
- The
open function can take up to three parameters: a filename, a mode, and a buffering parameter. Only the first one, the filename,
is required; the other two are optional. If not specified, the file is opened for reading in text mode. Here you are opening the file for reading in binary mode.
@@ -2054,7 +1984,7 @@ exceptions, errors occur immediately, and you can handle them in a standard way
'TAGKAIRO****THE BEST GOA ***DJ MARY-JANE***
Rave Mix 2000http://mp3.com/DJMARYJANE \037'
>>> f.tell() ⑤
-7543037
+7543037
- A file object maintains state about the file it has open. The
tell method of a file object tells you your current position in the open file. Since you haven't done anything with this file
yet, the current position is 0, which is the beginning of the file.
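To see the position tracking without a real MP3 file at hand, here is a small sketch that builds a throwaway 200-byte file first. The file contents and the variable names are assumptions made for illustration; seek and tell are the standard file object methods.

```python
import os
import tempfile

# Build a throwaway 200-byte file so the example needs no real MP3.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as out:
    out.write(b"x" * 72 + b"TAG" + b"y" * 125)

f = open(path, "rb")
start = f.tell()        # nothing read yet, so the position is 0
f.seek(-128, 2)         # 2 means "relative to the end of the file"
tag_start = f.tell()    # 200 - 128 = 72
tagdata = f.read(128)
end = f.tell()          # reading moved the position to the end
f.close()
os.remove(path)
print(start, tag_start, end)
```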
@@ -2091,7 +2021,7 @@ ValueError: I/O operation on closed file
Traceback (innermost last):
File "<interactive input>", line 1, in ?
ValueError: I/O operation on closed file
->>> f.close() ⑤
+>>> f.close() ⑤
- The closed attribute of a file object indicates whether the object has a file open or not. In this case, the file is still open (closed is
False).
- To close a file, call the
close method of the file object. This frees the lock (if any) that you were holding on the file, flushes buffered writes (if any)
@@ -2115,7 +2045,7 @@ ValueError: I/O operation on closed file
.
.
except IOError: ⑥
- pass
+ pass
- Because opening and reading files is risky and may raise an exception, all of this code is wrapped in a
try...except block. (Hey, isn't standardized indentation great? This is where you start to appreciate it.)
- The
open function may raise an IOError. (Maybe the file doesn't exist.)
@@ -2145,7 +2075,7 @@ test succeeded
>>> logfile.close()
>>> print file('test.log').read() ⑤
test succeededline 2
-
+
- You start boldly by either creating the new file
test.log or overwriting the existing file, and opening it for writing. (The second parameter "w" means open the file for writing.) Yes, that's all as dangerous as it sounds. I hope you didn't care about the previous
contents of that file, because it's gone now.
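The difference between the "w" and "a" modes can be checked with a short sketch. It writes test.log into a temporary directory rather than the current one, which is an assumption made here so that the example cleans up after itself.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "test.log")

logfile = open(path, "w")     # creates the file, or truncates an existing one
logfile.write("test succeeded")
logfile.close()

logfile = open(path, "a")     # "a" appends instead of overwriting
logfile.write("line 2")       # write adds no newline of its own
logfile.close()

with open(path) as f:
    contents = f.read()

again = open(path, "w")       # reopening with "w" discards the old contents
again.close()
with open(path) as f:
    after_truncate = f.read()

os.remove(path)
os.rmdir(workdir)
print(repr(contents), repr(after_truncate))
```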
@@ -2180,7 +2110,7 @@ e
>>> print "\n".join(li) ③
a
b
-e
+e
- The syntax for a
for loop is similar to list comprehensions. li is a list, and s will take the value of each element in turn, starting from the first element.
- Like an
if statement or any other indented block, a for loop can have any number of lines of code in it.
@@ -2202,7 +2132,7 @@ b
c
d
e
-
+
- As you saw in Example 3.20, “Assigning Consecutive Values”,
range produces a list of integers, which you then loop through. I know it looks a bit odd, but it is occasionally (and I stress
occasionally) useful to have a counter loop.
@@ -2225,7 +2155,7 @@ OS=Windows_NT
COMPUTERNAME=MPILGRIM
USERNAME=mpilgrim
-[...snip...]
+[...snip...]
- os.environ is a dictionary of the environment variables defined on your system. In Windows, these are your user and system variables
accessible from MS-DOS. In UNIX, they are the variables exported in your shell's startup scripts. In Mac OS, there is no concept of environment variables, so this dictionary is empty.
@@ -2246,7 +2176,7 @@ USERNAME=mpilgrim
.
if tagdata[:3] == "TAG":
for tag, (start, end, parseFunc) in self.tagDataMap.items(): ②
- self[tag] = parseFunc(tagdata[start:end]) ③
+ self[tag] = parseFunc(tagdata[start:end]) ③
- tagDataMap is a class attribute that defines the tags you're looking for in an MP3 file. Tags are stored in fixed-length fields. Once you read the last 128 bytes of the file, bytes 3 through 32 of those
are always the song title, 33 through 62 are always the artist name, 63 through 92 are the album name, and so forth. Note
@@ -2269,7 +2199,7 @@ __builtin__
site
signal
UserDict
-stat
+stat
- The
sys module contains system-level information, such as the version of Python you're running (sys.version or sys.version_info), and system-level options such as the maximum allowed recursion depth (sys.getrecursionlimit() and sys.setrecursionlimit()).
sys.modules is a dictionary containing all the modules that have ever been imported since Python was started; the key is the module name, the value is the module object. Note that this is more than just the modules your program has imported. Python preloads some modules on startup, and if you're using a Python IDE, sys.modules contains all the modules imported by all the programs you've run within the IDE.
@@ -2293,7 +2223,7 @@ stat
>>> fileinfo
<module 'fileinfo' from 'fileinfo.pyc'>
>>> sys.modules["fileinfo"] ②
-<module 'fileinfo' from 'fileinfo.pyc'>
+<module 'fileinfo' from 'fileinfo.pyc'>
- As new modules are imported, they are added to
sys.modules. This explains why importing the same module twice is very fast: Python has already loaded and cached the module in sys.modules, so importing the second time is simply a dictionary lookup.
- Given the name (as a string) of any previously-imported module, you can get a reference to the module itself through the
sys.modules dictionary.
@@ -2302,7 +2232,7 @@ stat
>>> MP3FileInfo.__module__ ①
'fileinfo'
>>> sys.modules[MP3FileInfo.__module__] ②
-<module 'fileinfo' from 'fileinfo.pyc'>
+<module 'fileinfo' from 'fileinfo.pyc'>
- Every Python class has a built-in class attribute
__module__, which is the name of the module in which the class is defined.
- Combining this with the
sys.modules dictionary, you can get a reference to the module in which a class is defined.
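As a rough illustration with a standard library class instead of MP3FileInfo, the lookup works like this. The choice of string.Template and the name "NoSuchClass" are assumptions made purely for the example.

```python
import string
import sys

module_name = string.Template.__module__   # name of the defining module
module = sys.modules[module_name]          # the module object itself

# Once you hold the module, a class can be fetched from it by name.
found = getattr(module, "Template", None)
missing = getattr(module, "NoSuchClass", None)   # deliberately absent name
print(module_name, found is string.Template, missing)
```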
@@ -2311,7 +2241,7 @@ stat
def getFileInfoClass(filename, module=sys.modules[FileInfo.__module__]): ①
"get file info class from filename extension"
subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:] ②
- return hasattr(module, subclass) and getattr(module, subclass) or FileInfo ③
+ return hasattr(module, subclass) and getattr(module, subclass) or FileInfo ③
- This is a function with two arguments; filename is required, but module is optional and defaults to the module that contains the
FileInfo class. This looks inefficient, because you might expect Python to evaluate the sys.modules expression every time the function is called. In fact, Python evaluates default expressions only once, when the def statement is executed; for a module-level function, that is the first time the module is imported. As you'll see later, you never call this
function with a module argument, so module serves as a function-level constant.
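The once-only evaluation of default expressions is easy to observe directly. In this sketch, compute_default and show are hypothetical names; compute_default records each time it runs so the count can be inspected afterwards.

```python
calls = []


def compute_default():
    """Hypothetical helper that records every time it runs."""
    calls.append("evaluated")
    return len(calls)


def show(value=compute_default()):   # the default expression runs here, once
    return value


first = show()
second = show()
explicit = show(99)
print(first, second, explicit, len(calls))
```

However many times show is called, compute_default has run exactly once, which is why a default like sys.modules[FileInfo.__module__] costs nothing per call.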
@@ -2338,7 +2268,7 @@ stat
>>> os.path.expanduser("~") ④
'c:\\Documents and Settings\\mpilgrim\\My Documents'
>>> os.path.join(os.path.expanduser("~"), "Python") ⑤
-'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'
+'c:\\Documents and Settings\\mpilgrim\\My Documents\\Python'
os.path is a reference to a module -- which module depends on your platform. Just as getpass encapsulates differences between platforms by setting getpass to a platform-specific function, os encapsulates differences between platforms by setting path to a platform-specific module.
- The
join function of os.path constructs a pathname out of one or more partial pathnames. In this case, it simply concatenates strings. (Note that dealing
@@ -2359,7 +2289,7 @@ stat
>>> shortname
'mahadeva'
>>> extension
-'.mp3'
+'.mp3'
- The
split function splits a full pathname and returns a tuple containing the path and filename. Remember when I said you could use
multi-variable assignment to return multiple values from a function? Well, split is such a function.
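A compact sketch of both functions used with multi-variable assignment follows. The forward-slash path is an assumption chosen so that the example behaves the same on Windows and UNIX.

```python
import os

filepath = "/music/ap/mahadeva.mp3"

# split returns a (directory, filename) tuple, unpacked into two names.
dirname, filename = os.path.split(filepath)

# splitext returns a (name, extension) tuple in the same way.
shortname, extension = os.path.splitext(filename)
print(dirname, filename, shortname, extension)
```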
@@ -2387,7 +2317,7 @@ stat
... if os.path.isdir(os.path.join(dirname, f))] ④
['cygwin', 'docbook', 'Documents and Settings', 'Incoming',
'Inetpub', 'Music', 'Program Files', 'Python20', 'RECYCLER',
-'System Volume Information', 'TEMP', 'WINNT']
+'System Volume Information', 'TEMP', 'WINNT']
- The
listdir function takes a pathname and returns a list of the contents of the directory.
listdir returns both files and folders, with no indication of which is which.
@@ -2401,7 +2331,7 @@ def listDirectory(directory, fileExtList):
for f in os.listdir(directory)] ① ②
fileList = [os.path.join(directory, f)
for f in fileList
- if os.path.splitext(f)[1] in fileExtList] ③ ④ ⑤
+ if os.path.splitext(f)[1] in fileExtList] ③ ④ ⑤
os.listdir(directory) returns a list of all the files and folders in directory.
- Iterating through the list with f, you use
os.path.normcase(f) to normalize the case according to operating system defaults. normcase is a useful little function that compensates for case-insensitive operating systems that think that mahadeva.mp3 and mahadeva.MP3 are the same file. For instance, on Windows and Mac OS, normcase will convert the entire filename to lowercase; on UNIX-compatible systems, it will return the filename unchanged.
@@ -2431,7 +2361,7 @@ may already be familiar with from working on the command line.
['c:\\music\\_singles\\sidewinder.mp3',
'c:\\music\\_singles\\spinning.mp3']
>>> glob.glob('c:\\music\\*\\*.mp3')④
-
+
- As you saw earlier,
os.listdir simply takes a directory path and lists all files and directories in that directory.
- The
glob module, on the other hand, takes a wildcard and returns the full path of all files and directories matching the wildcard.
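The contrast between the two can be sketched against a temporary directory. The file names below are made up for the example, and the results are sorted because neither function promises a particular order.

```python
import glob
import os
import tempfile

root = tempfile.mkdtemp()
for name in ("a_time_long_forgotten.mp3", "spinning.mp3", "notes.txt"):
    open(os.path.join(root, name), "w").close()

# listdir: bare names of everything in the directory.
listed = sorted(os.listdir(root))

# glob: full paths, limited to what the wildcard matches.
matched = sorted(glob.glob(os.path.join(root, "*.mp3")))
matched_names = [os.path.basename(p) for p in matched]

for name in listed:
    os.remove(os.path.join(root, name))
os.rmdir(root)
print(listed)
print(matched_names)
```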
@@ -2461,7 +2391,7 @@ def listDirectory(directory, fileExtList): ①
"get file info class from filename extension"
subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:] ④
return hasattr(module, subclass) and getattr(module, subclass) or FileInfo ⑤
- return [getFileInfoClass(f)(f) for f in fileList] ⑥
+ return [getFileInfoClass(f)(f) for f in fileList] ⑥
listDirectory is the main attraction of this entire module. It takes a directory (like c:\music\_singles\ in my case) and a list of interesting file extensions (like ['.mp3']), and it returns a list of class instances that act like dictionaries that contain metadata about each interesting file in
that directory. And it does it in just a few straightforward lines of code.
@@ -2925,7 +2855,7 @@ data: '\n '
<td width='99%' align='right'><hr size='1' noshade></td></tr>
<tr><td class='tagline' colspan='2'>Python for experienced programmers</td></tr>
-[...snip...]
+[...snip...]
- The
urllib module is part of the standard Python library. It contains functions for getting information about and actually retrieving data from Internet-based URLs (mainly web pages).
- The simplest use of
urllib is to retrieve the entire text of a web page using the urlopen function. Opening a URL is similar to opening a file. The return value of urlopen is a file-like object, which has some of the same methods as a file object.
@@ -2945,7 +2875,7 @@ class URLLister(SGMLParser):
def start_a(self, attrs): ②
href = [v for k, v in attrs if k=='href'] ③ ④
if href:
- self.urls.extend(href)
+ self.urls.extend(href)
reset is called by the __init__ method of SGMLParser, and it can also be called manually once an instance of the parser has been created. So if you need to do any initialization,
do it in reset, not in __init__, so that it will be re-initialized properly when someone re-uses a parser instance.
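The sgmllib module is no longer available in Python 3, so the following sketch swaps in html.parser.HTMLParser, whose __init__ also calls reset. It is an adaptation for illustration, not the book's URLLister, and the HTML strings are made up.

```python
from html.parser import HTMLParser


class URLLister(HTMLParser):
    def reset(self):
        # HTMLParser.__init__ calls reset, so this also runs at creation time.
        HTMLParser.reset(self)
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = [v for k, v in attrs if k == "href"]
            if href:
                self.urls.extend(href)


parser = URLLister()
parser.feed('<a href="index.html">Home</a> <a name="top">x</a> '
            '<a href="toc.html">TOC</a>')
parser.close()
first_pass = list(parser.urls)

parser.reset()                  # re-use the same instance
after_reset = list(parser.urls)
parser.feed('<a href="download.html">Download</a>')
parser.close()
second_pass = list(parser.urls)
print(first_pass, after_reset, second_pass)
```

Because the list is created in reset rather than in __init__, calling reset on a used parser clears the collected URLs before the next document is fed in.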
@@ -2974,7 +2904,7 @@ download/diveintopython3-xml-5.0.zip
download/diveintopython3-common-5.0.zip
-... rest of output omitted for brevity ...
+... rest of output omitted for brevity ...
- Call the
feed method, defined in SGMLParser, to get HTML into the parser.
[1] It takes a string, which is what usock.read() returns.
@@ -3017,7 +2947,7 @@ class BaseHTMLProcessor(SGMLParser):
self.pieces.append("<?%(text)s>" % locals())
def handle_decl(self, text):
- self.pieces.append("<!%(text)s>" % locals())
+ self.pieces.append("<!%(text)s>" % locals())