more HTTP chapter

2026-06-05 23:10:17 +00:00 · 2009-05-31 22:12:37 -07:00
parent 421c370591
commit 49b763e3f5
1 changed files with 37 additions and 587 deletions
@@ -253,599 +253,49 @@ Content-Type: application/xml</samp>
 <p>But wait, it gets worse! To see just how inefficient this code is, let&#8217;s request the same feed a second time.

 <pre class=screen>
-FIXME
-</pre>
+# continued from the <a href=#whats-on-the-wire>previous example</a>
+<samp class=p>>>> </samp><kbd>response2 = urlopen('http://diveintopython3.org/examples/feed.xml')</kbd>
+<samp>send: b'GET /examples/feed.xml HTTP/1.1
+Host: diveintopython3.org
+Accept-Encoding: identity
+User-Agent: Python-urllib/3.0
+Connection: close'
+reply: 'HTTP/1.1 200 OK'
+&hellip;further debugging information omitted&hellip;</samp></pre>

-<!--
-<p class=a>&#x2042;
+<p>Notice anything peculiar about this request? It hasn&#8217;t changed! It&#8217;s exactly the same as the first request. No sign of <a href=#last-modified><code>If-Modified-Since</code> headers</a>. No sign of <a href=#etags><code>If-None-Match</code> headers</a>. No respect for the caching headers. Still no compression.

-<h2 id="oa.useragent">11.5. Setting the <code>User-Agent</code></h2>
-<p>The first step to improving your <abbr>HTTP</abbr> web services client is to identify yourself properly with a <code>User-Agent</code>. To do that, you need to move beyond the basic <code>urllib</code> and dive into <code>urllib2</code>.
-<div class=example><h3>Example 11.4. Introducing <code>urllib2</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import httplib</kbd>
-<samp class=p>>>> </samp><kbd>httplib.HTTPConnection.debuglevel = 1</kbd>           <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>import urllib2</kbd>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request('http://diveintomark.org/xml/atom.xml')</kbd> <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener()</kbd>                 <span>&#x2462;</span>
-<samp class=p>>>> </samp><kbd>feeddata = opener.open(request).read()</kbd>          <span>&#x2463;</span>
-connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Wed, 14 Apr 2004 23:23:12 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Content-Type: application/atom+xml
-header: Last-Modified: Wed, 14 Apr 2004 22:14:38 GMT
-header: ETag: "e8284-68e0-4de30f80"
-header: Accept-Ranges: bytes
-header: Content-Length: 26848
-header: Connection: close
-</pre>
+<p>And what happens when you do the same thing twice? You get the same response. Twice.
+
+<pre class=screen>
+# continued from the previous example
+<a><samp class=p>>>> </samp><kbd>print(response2.headers.as_string())</kbd>     <span>&#x2460;</span></a>
+<samp>Date: Mon, 01 Jun 2009 03:58:00 GMT
+Server: Apache
+Last-Modified: Sun, 31 May 2009 22:51:11 GMT
+ETag: "bfe-255ef5c0"
+Accept-Ranges: bytes
+Content-Length: 3070
+Cache-Control: max-age=86400
+Expires: Tue, 02 Jun 2009 03:58:00 GMT
+Vary: Accept-Encoding
+Connection: close
+Content-Type: application/xml</samp>
+<samp class=p>>>> </samp><kbd>data2 = response2.read()</kbd>
+<a><samp class=p>>>> </samp><kbd>len(data2)</kbd>                               <span>&#x2461;</span></a>
+<samp>3070</samp>
+<a><samp class=p>>>> </samp><kbd>data2 == data</kbd>                            <span>&#x2462;</span></a>
+<samp>True</samp></pre>
 <ol>
-<li>If you still have your Python <abbr>IDE</abbr> open from the previous section&#8217;s example, you can skip this, but this turns on <a href="#oa.debug" title="11.4. Debugging HTTP web services"><abbr>HTTP</abbr> debugging</a> so you can see what you&#8217;re actually sending over the wire, and what gets sent back.
-<li>Fetching an <abbr>HTTP</abbr> resource with <code>urllib2</code> is a three-step process, for good reasons that will become clear shortly. The first step is to create a <code>Request</code> object, which takes the <abbr>URL</abbr> of the resource you&#8217;ll eventually get around to retrieving. Note that this step doesn&#8217;t actually
-            retrieve anything yet.
-<li>The second step is to build a <abbr>URL</abbr> opener. This can take any number of handlers, which control how responses are handled.
-             But you can also build an opener without any custom handlers, which is what you&#8217;re doing here. You&#8217;ll see how to define
-            and use custom handlers later in this chapter when you explore redirects.
-<li>The final step is to tell the opener to open the <abbr>URL</abbr>, using the <code>Request</code> object you created. As you can see from all the debugging information that gets printed, this step actually retrieves the
-            resource and stores the returned data in <var>feeddata</var>.
-<div class=example><h3>Example 11.5. Adding headers with the <code>Request</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>request</kbd>            <span>&#x2460;</span>
-&lt;urllib2.Request instance at 0x00250AA8>
-<samp class=p>>>> </samp><kbd>request.get_full_url()</kbd>
-http://diveintomark.org/xml/atom.xml
-<samp class=p>>>> </samp><kbd>request.add_header('User-Agent',</kbd>
-<samp class=p>...    </samp><kbd>'OpenAnything/1.0 +http://diveintopython3.org/')</kbd>    <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>feeddata = opener.open(request).read()</kbd>                 <span>&#x2462;</span>
-connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: OpenAnything/1.0 +http://diveintopython3.org/   <span>&#x2463;</span>
-'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Wed, 14 Apr 2004 23:45:17 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Content-Type: application/atom+xml
-header: Last-Modified: Wed, 14 Apr 2004 22:14:38 GMT
-header: ETag: "e8284-68e0-4de30f80"
-header: Accept-Ranges: bytes
-header: Content-Length: 26848
-header: Connection: close
-</pre>
-<ol>
-<li>You&#8217;re continuing from the previous example; you&#8217;ve already created a <code>Request</code> object with the <abbr>URL</abbr> you want to access.
-<li>Using the <code>add_header</code> method on the <code>Request</code> object, you can add arbitrary <abbr>HTTP</abbr> headers to the request. The first argument is the header, the second is the value you&#8217;re
-            providing for that header. Convention dictates that a <code>User-Agent</code> should be in this specific format: an application name, followed by a slash, followed by a version number. The rest is free-form,
-            and you&#8217;ll see a lot of variations in the wild, but somewhere it should include a <abbr>URL</abbr> of your application. The <code>User-Agent</code> is usually logged by the server along with other details of your request, and including a <abbr>URL</abbr> of your application allows
-            server administrators looking through their access logs to contact you if something is wrong.
-<li>The <var>opener</var> object you created before can be reused too, and it will retrieve the same feed again, but with your custom <code>User-Agent</code> header.
-<li>And here&#8217;s you sending your custom <code>User-Agent</code>, in place of the generic one that Python sends by default. If you look closely, you&#8217;ll notice that you defined a <code>User-Agent</code> header, but you actually sent a <code>User-agent</code> header. See the difference?  <code>urllib2</code> changed the case so that only the first letter was capitalized. It doesn&#8217;t really matter; <abbr>HTTP</abbr> specifies that header field
-            names are completely case-insensitive.
+<li>The server is still sending the same array of &#8220;smart&#8221; headers: <code>Cache-Control</code> and <code>Expires</code> to allow caching, <code>Last-Modified</code> and <code>ETag</code> to enable &#8220;not-modified&#8221; tracking. Even the <code>Vary: Accept-Encoding</code> header hints that the server would support compression, if only you would bloody well ask for it. But you&#8217;re not listening.
+<li>Once again, fetching this data downloads the whole 3070 bytes&hellip;
+<li>&hellip;the exact same 3070 bytes you downloaded last time.
+</ol>
+
+<p><abbr>HTTP</abbr> is designed to work better than this. <code>urllib</code> speaks <abbr>HTTP</abbr> like I speak Spanish &mdash; enough to get by in a jam, but not enough to hold a conversation. <abbr>HTTP</abbr> is a conversation. It&#8217;s time to upgrade to a library that speaks <abbr>HTTP</abbr> fluently.

 <p class=a>&#x2042;

-<h2 id="oa.etags">11.6. Handling <code>Last-Modified</code> and <code>ETag</code></h2>
-<p>Now that you know how to add custom <abbr>HTTP</abbr> headers to your web service requests, let&#8217;s look at adding support for <code>Last-Modified</code> and <code>ETag</code> headers.
-<p>These examples show the output with debugging turned off. If you still have it turned on from the previous section, you can
-turn it off by setting <code>httplib.HTTPConnection.debuglevel = 0</code>. Or you can just leave debugging on, if that helps you.
-<div class=example><h3 id="oa.etags.example.1">Example 11.6. Testing <code>Last-Modified</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import urllib2</kbd>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request('http://diveintomark.org/xml/atom.xml')</kbd>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener()</kbd>
-<samp class=p>>>> </samp><kbd>firstdatastream = opener.open(request)</kbd>
-<samp class=p>>>> </samp><kbd>firstdatastream.headers.dict</kbd>     <span>&#x2460;</span>
-<samp>{'date': 'Thu, 15 Apr 2004 20:42:41 GMT', 
- 'server': 'Apache/2.0.49 (Debian GNU/Linux)', 
- 'content-type': 'application/atom+xml',
- 'last-modified': 'Thu, 15 Apr 2004 19:45:21 GMT', 
- 'etag': '"e842a-3e53-55d97640"',
- 'content-length': '15955', 
- 'accept-ranges': 'bytes', 
- 'connection': 'close'}</samp>
-<samp class=p>>>> </samp><kbd>request.add_header('If-Modified-Since',</kbd>
-<samp class=p>...    </samp>firstdatastream.headers.get('Last-Modified'))  <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>seconddatastream = opener.open(request)</kbd>            <span>&#x2462;</span>
-<samp class=traceback>Traceback (most recent call last):
-  File "&lt;stdin>", line 1, in ?
-  File "c:\python23\lib\urllib2.py", line 326, in open
-    '_open', req)
-  File "c:\python23\lib\urllib2.py", line 306, in _call_chain
-    result = func(*args)
-  File "c:\python23\lib\urllib2.py", line 901, in http_open
-    return self.do_open(httplib.HTTP, req)
-  File "c:\python23\lib\urllib2.py", line 895, in do_open
-    return self.parent.error('http', req, fp, code, msg, hdrs)
-  File "c:\python23\lib\urllib2.py", line 352, in error
-    return self._call_chain(*args)
-  File "c:\python23\lib\urllib2.py", line 306, in _call_chain
-    result = func(*args)
-  File "c:\python23\lib\urllib2.py", line 412, in http_error_default
-    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
-urllib2.HTTPError: HTTP Error 304: Not Modified</samp>
-</pre>
-<ol>
-<li>Remember all those <abbr>HTTP</abbr> headers you saw printed out when you turned on debugging?  This is how you can get access to them
-            programmatically: <var>firstdatastream.headers</var> is <a href="#fileinfo.userdict" title="5.5. Exploring UserDict: A Wrapper Class">an object that acts like a dictionary</a> and allows you to get any of the individual headers returned from the <abbr>HTTP</abbr> server.
-<li>On the second request, you add the <code>If-Modified-Since</code> header with the last-modified date from the first request. If the data hasn&#8217;t changed, the server should return a <code>304</code> status code.
-<li>Sure enough, the data hasn&#8217;t changed. You can see from the traceback that <code>urllib2</code> throws a special exception, <code>HTTPError</code>, in response to the <code>304</code> status code. This is a little unusual, and not entirely helpful. After all, it&#8217;s not an error; you specifically asked the
-            server not to send you any data if it hadn&#8217;t changed, and the data didn&#8217;t change, so the server told you it wasn&#8217;t sending
-            you any data. That&#8217;s not an error; that&#8217;s exactly what you were hoping for.
-<p><code>urllib2</code> also raises an <code>HTTPError</code> exception for conditions that you would think of as errors, such as <code>404</code> (page not found). In fact, it will raise <code>HTTPError</code> for <em>any</em> status code other than <code>200</code> (OK), <code>301</code> (permanent redirect), or <code>302</code> (temporary redirect). It would be more helpful for your purposes to capture the status code and simply return it, without
-throwing an exception. To do that, you&#8217;ll need to define a custom <abbr>URL</abbr> handler.
-<div class=example><h3>Example 11.7. Defining URL handlers</h3>
-<p>This custom <abbr>URL</abbr> handler is part of <code>openanything.py</code>.
-<pre><code>
-class DefaultErrorHandler(urllib2.HTTPDefaultErrorHandler):    <span>&#x2460;</span>
-    def http_error_default(self, req, fp, code, msg, headers): <span>&#x2461;</span>
-        result = urllib2.HTTPError(         
-            req.get_full_url(), code, msg, headers, fp)       
-        result.status = code                 <span>&#x2462;</span>
-        return result     
-</code></pre>
-<ol>
-<li><code>urllib2</code> is designed around <abbr>URL</abbr> handlers. Each handler is just a class that can define any number of methods. When something happens
-            &mdash; like an <abbr>HTTP</abbr> error, or even a <code>304</code> code &mdash; <code>urllib2</code> introspects into the list of defined handlers for a method that can handle it. You used a similar introspection in <a href="#kgp" title="Chapter 9. XML Processing">Chapter 9, <i>XML Processing</i></a> to define handlers for different node types, but <code>urllib2</code> is more flexible, and introspects over as many handlers as are defined for the current request.
-<li><code>urllib2</code> searches through the defined handlers and calls the <code>http_error_default</code> method when it encounters a <code>304</code> status code from the server. By defining a custom error handler, you can prevent <code>urllib2</code> from raising an exception. Instead, you create the <code>HTTPError</code> object, but return it instead of raising it.
-<li>This is the key part: before returning, you save the status code returned by the <abbr>HTTP</abbr> server. This will allow you easy access
-            to it from the calling program.
-<div class=example><h3>Example 11.8. Using custom URL handlers</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>request.headers</kbd>         <span>&#x2460;</span>
-{'If-modified-since': 'Thu, 15 Apr 2004 19:45:21 GMT'}
-<samp class=p>>>> </samp><kbd>import openanything</kbd>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener(</kbd>
-<samp class=p>...    </samp>openanything.DefaultErrorHandler())   <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>seconddatastream = opener.open(request)</kbd>
-<samp class=p>>>> </samp><kbd>seconddatastream.status</kbd> <span>&#x2462;</span>
-304
-<samp class=p>>>> </samp><kbd>seconddatastream.read()</kbd> <span>&#x2463;</span>
-''
-</pre>
-<ol>
-<li>You&#8217;re continuing the previous example, so the <code>Request</code> object is already set up, and you&#8217;ve already added the <code>If-Modified-Since</code> header.
-<li>This is the key: now that you&#8217;ve defined your custom <abbr>URL</abbr> handler, you need to tell <code>urllib2</code> to use it. Remember how I said that <code>urllib2</code> broke up the process of accessing an <abbr>HTTP</abbr> resource into three steps, and for good reason?  This is why building the <abbr>URL</abbr> opener
-            is its own step, because you can build it with your own custom <abbr>URL</abbr> handlers that override <code>urllib2</code>&#8217;s default behavior.
-<li>Now you can quietly open the resource, and what you get back is an object that, along with the usual headers (use <var>seconddatastream.headers.dict</var> to access them), also contains the <abbr>HTTP</abbr> status code. In this case, as you expected, the status is <code>304</code>, meaning this data hasn&#8217;t changed since the last time you asked for it.
-<li>Note that when the server sends back a <code>304</code> status code, it doesn&#8217;t re-send the data. That&#8217;s the whole point: to save bandwidth by not re-downloading data that hasn&#8217;t
-            changed. So if you actually want that data, you&#8217;ll need to cache it locally the first time you get it.
-<p>Handling <code>ETag</code> works much the same way, but instead of checking for <code>Last-Modified</code> and sending <code>If-Modified-Since</code>, you check for <code>ETag</code> and send <code>If-None-Match</code>. Let&#8217;s start with a fresh <abbr>IDE</abbr> session.
-<div class=example><h3 id="oa.etags.example">Example 11.9. Supporting <code>ETag</code>/<code>If-None-Match</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import urllib2, openanything</kbd>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request('http://diveintomark.org/xml/atom.xml')</kbd>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener(</kbd>
-<samp class=p>...    </samp>openanything.DefaultErrorHandler())
-<samp class=p>>>> </samp><kbd>firstdatastream = opener.open(request)</kbd>
-<samp class=p>>>> </samp><kbd>firstdatastream.headers.get('ETag')</kbd>        <span>&#x2460;</span>
-'"e842a-3e53-55d97640"'
-<samp class=p>>>> </samp><kbd>firstdata = firstdatastream.read()</kbd>
-<samp class=p>>>> </samp><kbd>print firstdata</kbd>          <span>&#x2461;</span>
-<samp>&lt;?xml version="1.0" encoding="iso-8859-1"?>
-&lt;feed version="0.3"
-  xmlns="http://purl.org/atom/ns#"
-  xmlns:dc="http://purl.org/dc/elements/1.1/"
-  xml:lang="en">
-  &lt;title mode="escaped">dive into mark&lt;/title>
-  &lt;link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
-  &hellip;
-</samp>
-<samp class=p>>>> </samp><kbd>request.add_header('If-None-Match',</kbd>
-<samp class=p>...    </samp>firstdatastream.headers.get('ETag'))   <span>&#x2462;</span>
-<samp class=p>>>> </samp><kbd>seconddatastream = opener.open(request)</kbd>
-<samp class=p>>>> </samp><kbd>seconddatastream.status</kbd>  <span>&#x2463;</span>
-304
-<samp class=p>>>> </samp><kbd>seconddatastream.read()</kbd>  <span>&#x2464;</span>
-''
-</pre>
-<ol>
-<li>Using the <var>firstdatastream.headers</var> pseudo-dictionary, you can get the <code>ETag</code> returned from the server. (What happens if the server didn&#8217;t send back an <code>ETag</code>?  Then this line would return <code>None</code>.)
-<li>OK, you got the data.
-<li>Now set up the second call by setting the <code>If-None-Match</code> header to the <code>ETag</code> you got from the first call.
-<li>The second call succeeds quietly (without throwing an exception), and once again you see that the server has sent back a <code>304</code> status code. Based on the <code>ETag</code> you sent the second time, it knows that the data hasn&#8217;t changed.
-<li>Regardless of whether the <code>304</code> is triggered by <code>Last-Modified</code> date checking or <code>ETag</code> hash matching, you&#8217;ll never get the data along with the <code>304</code>. That&#8217;s the whole point.
-<table id="tip.etag.vs.lastmodified" class=note border="0" summary="">
-
-<td rowspan="2" align="center" valign="top" width="1%"><img src="images/note.png" alt="Note" title="" width="24" height="24"><td colspan="2" align="left" valign="top" width="99%">In these examples, the <abbr>HTTP</abbr> server has supported both <code>Last-Modified</code> and <code>ETag</code> headers, but not all servers do. As a web services client, you should be prepared to support both, but you must code defensively
-      in case a server only supports one or the other, or neither.
-
-<p class=a>&#x2042;
-
-<h2 id="oa.redirect">11.7. Handling redirects</h2>
-<p>You can support permanent and temporary redirects using a different kind of custom <abbr>URL</abbr> handler.
-<p>First, let&#8217;s see why a redirect handler is necessary in the first place.
-<div class=example><h3>Example 11.10. Accessing web services without a redirect handler</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import urllib2, httplib</kbd>
-<samp class=p>>>> </samp><kbd>httplib.HTTPConnection.debuglevel = 1</kbd>           <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request(</kbd>
-<samp class=p>...    </samp>'http://diveintomark.org/redir/example301.xml') <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener()</kbd>
-<samp class=p>>>> </samp><kbd>f = opener.open(request)</kbd>
-<samp>connect: (diveintomark.org, 80)
-send: '
-GET /redir/example301.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 301 Moved Permanently\r\n'</samp>             <span>&#x2462;</span>
-<samp>header: Date: Thu, 15 Apr 2004 22:06:25 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Location: http://diveintomark.org/xml/atom.xml</samp>  <span>&#x2463;</span>
-<samp>header: Content-Length: 338
-header: Connection: close
-header: Content-Type: text/html; charset=iso-8859-1
-connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0</samp>            <span>&#x2464;</span>
-<samp>Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Thu, 15 Apr 2004 22:06:25 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Last-Modified: Thu, 15 Apr 2004 19:45:21 GMT
-header: ETag: "e842a-3e53-55d97640"
-header: Accept-Ranges: bytes
-header: Content-Length: 15955
-header: Connection: close
-header: Content-Type: application/atom+xml</samp>
-<samp class=p>>>> </samp><kbd>f.url</kbd>           <span>&#x2465;</span>
-'http://diveintomark.org/xml/atom.xml'
-<samp class=p>>>> </samp><kbd>f.headers.dict</kbd>
-<samp>{'content-length': '15955', 
-'accept-ranges': 'bytes', 
-'server': 'Apache/2.0.49 (Debian GNU/Linux)', 
-'last-modified': 'Thu, 15 Apr 2004 19:45:21 GMT', 
-'connection': 'close', 
-'etag': '"e842a-3e53-55d97640"', 
-'date': 'Thu, 15 Apr 2004 22:06:25 GMT', 
-'content-type': 'application/atom+xml'}</samp>
-<samp class=p>>>> </samp><kbd>f.status</kbd>
-<samp class=traceback>Traceback (most recent call last):
-  File "&lt;stdin>", line 1, in ?
-AttributeError: addinfourl instance has no attribute 'status'</samp>
-</pre>
-<ol>
-<li>You&#8217;ll be better able to see what&#8217;s happening if you turn on debugging.
-<li>This is a <abbr>URL</abbr> which I have set up to permanently redirect to my Atom feed at <code>http://diveintomark.org/xml/atom.xml</code>.
-<li>Sure enough, when you try to download the data at that address, the server sends back a <code>301</code> status code, telling you that the resource has moved permanently.
-<li>The server also sends back a <code>Location</code> header that gives the new address of this data.
-<li><code>urllib2</code> notices the redirect status code and automatically tries to retrieve the data at the new location specified in the <code>Location</code> header.
-<li>The object you get back from the <var>opener</var> contains the new permanent address and all the headers returned from the second request (retrieved from the new permanent
-            address). But the status code is missing, so you have no way of knowing programmatically whether this redirect was temporary
-            or permanent. And that matters very much: if it was a temporary redirect, then you should continue to ask for the data at
-            the old location. But if it was a permanent redirect (as this was), you should ask for the data at the new location from
-            now on.
-<p>This is suboptimal, but easy to fix. <code>urllib2</code> doesn&#8217;t behave exactly as you want it to when it encounters a <code>301</code> or <code>302</code>, so let&#8217;s override its behavior. How?  With a custom <abbr>URL</abbr> handler, <a href="#oa.etags" title="11.6. Handling Last-Modified and ETag">just like you did to handle <code>304</code> codes</a>.
-<div class=example><h3>Example 11.11. Defining the redirect handler</h3>
-<p>This class is defined in <code>openanything.py</code>.
-<pre><code>
-class SmartRedirectHandler(urllib2.HTTPRedirectHandler):     <span>&#x2460;</span>
-    def http_error_301(self, req, fp, code, msg, headers):  
-        result = urllib2.HTTPRedirectHandler.http_error_301( <span>&#x2461;</span>
-            self, req, fp, code, msg, headers)              
-        result.status = code               <span>&#x2462;</span>
-        return result   
-
-    def http_error_302(self, req, fp, code, msg, headers):   <span>&#x2463;</span>
-        result = urllib2.HTTPRedirectHandler.http_error_302(
-            self, req, fp, code, msg, headers)              
-        result.status = code              
-        return result   
-</code></pre>
-<ol>
-<li>Redirect behavior is defined in <code>urllib2</code> in a class called <code>HTTPRedirectHandler</code>. You don&#8217;t want to completely override the behavior; you just want to extend it a little, so you&#8217;ll subclass <code>HTTPRedirectHandler</code> and call the ancestor class to do all the hard work.
-<li>When it encounters a <code>301</code> status code from the server, <code>urllib2</code> will search through its handlers and call the <code>http_error_301</code> method.  The first thing ours does is just call the <code>http_error_301</code> method in the ancestor, which handles the grunt work of looking for the <code>Location</code> header and following the redirect to the new address.
-<li>Here&#8217;s the key: before you return, you store the status code (<code>301</code>), so that the calling program can access it later.
-<li>Temporary redirects (status code <code>302</code>) work the same way: override the <code>http_error_302</code> method, call the ancestor, and save the status code before returning.
-<p>So what has this bought us?  You can now build a <abbr>URL</abbr> opener with the custom redirect handler, and it will still automatically
-follow redirects, but now it will also expose the redirect status code.
-<div class=example><h3>Example 11.12. Using the redirect handler to detect permanent redirects</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request('http://diveintomark.org/redir/example301.xml')</kbd>
-<samp class=p>>>> </samp><kbd>import openanything, httplib</kbd>
-<samp class=p>>>> </samp><kbd>httplib.HTTPConnection.debuglevel = 1</kbd>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener(</kbd>
-<samp class=p>...    </samp>openanything.SmartRedirectHandler())           <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>f = opener.open(request)</kbd>
-<samp>connect: (diveintomark.org, 80)
-send: 'GET /redir/example301.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 301 Moved Permanently\r\n'</samp>            <span>&#x2461;</span>
-<samp>header: Date: Thu, 15 Apr 2004 22:13:21 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Location: http://diveintomark.org/xml/atom.xml
-header: Content-Length: 338
-header: Connection: close
-header: Content-Type: text/html; charset=iso-8859-1
-connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Thu, 15 Apr 2004 22:13:21 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Last-Modified: Thu, 15 Apr 2004 19:45:21 GMT
-header: ETag: "e842a-3e53-55d97640"
-header: Accept-Ranges: bytes
-header: Content-Length: 15955
-header: Connection: close
-header: Content-Type: application/atom+xml
-</samp>
-<samp class=p>>>> </samp><kbd>f.status</kbd>       <span>&#x2462;</span>
-301
-<samp class=p>>>> </samp><kbd>f.url</kbd>
-'http://diveintomark.org/xml/atom.xml'
-</pre>
-<ol>
-<li>First, build a <abbr>URL</abbr> opener with the redirect handler you just defined.
-<li>You sent off a request, and you got a <code>301</code> status code in response. At this point, the <code>http_error_301</code> method gets called. You call the ancestor method, which follows the redirect and sends a request at the new location (<code>http://diveintomark.org/xml/atom.xml</code>).
-<li>This is the payoff: now, not only do you have access to the new <abbr>URL</abbr>, but you have access to the redirect status code, so you
-            can tell that this was a permanent redirect. The next time you request this data, you should request it from the new location
-            (<code>http://diveintomark.org/xml/atom.xml</code>, as specified in <var>f.url</var>). If you had stored the location in a configuration file or a database, you need to update that so you don&#8217;t keep pounding
-            the server with requests at the old address. It&#8217;s time to update your address book.
-<p>The same redirect handler can also tell you that you <em>shouldn&#8217;t</em> update your address book.
-<div class=example><h3>Example 11.13. Using the redirect handler to detect temporary redirects</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request(</kbd>
-<samp class=p>...    </samp>'http://diveintomark.org/redir/example302.xml')   <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>f = opener.open(request)</kbd>
-<samp>connect: (diveintomark.org, 80)
-send: '
-GET /redir/example302.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 302 Found\r\n'</samp>         <span>&#x2461;</span>
-<samp>header: Date: Thu, 15 Apr 2004 22:18:21 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Location: http://diveintomark.org/xml/atom.xml
-header: Content-Length: 314
-header: Connection: close
-header: Content-Type: text/html; charset=iso-8859-1
-connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0</samp>              <span>&#x2462;</span>
-<samp>Host: diveintomark.org
-User-agent: Python-urllib/2.1
-'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Thu, 15 Apr 2004 22:18:21 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Last-Modified: Thu, 15 Apr 2004 19:45:21 GMT
-header: ETag: "e842a-3e53-55d97640"
-header: Accept-Ranges: bytes
-header: Content-Length: 15955
-header: Connection: close
-header: Content-Type: application/atom+xml</samp>
-<samp class=p>>>> </samp><kbd>f.status</kbd>          <span>&#x2463;</span>
-302
-<samp class=p>>>> </samp><kbd>f.url</kbd>
-http://diveintomark.org/xml/atom.xml
-</pre>
-<ol>
-<li>This is a sample <abbr>URL</abbr> I&#8217;ve set up that is configured to tell clients to <em>temporarily</em> redirect to <code>http://diveintomark.org/xml/atom.xml</code>.
-<li>The server sends back a <code>302</code> status code, indicating a temporary redirect. The temporary new location of the data is given in the <code>Location</code> header.
-<li><code>urllib2</code> calls your <code>http_error_302</code> method, which calls the ancestor method of the same name in <code>urllib2.HTTPRedirectHandler</code>, which follows the redirect to the new location. Then your <code>http_error_302</code> method stores the status code (<code>302</code>) so the calling application can get it later.
-<li>And here you are, having successfully followed the redirect to <code>http://diveintomark.org/xml/atom.xml</code>. <var>f.status</var> tells you that this was a temporary redirect, which means that you should continue to request data from the original address
-            (<code>http://diveintomark.org/redir/example302.xml</code>). Maybe it will redirect next time too, but maybe not. Maybe it will redirect to a different address. It&#8217;s not for you
-            to say. The server said this redirect was only temporary, so you should respect that. And now you&#8217;re exposing enough information
-            that the calling application can respect that.
-
-<p class=a>&#x2042;
-
-<h2 id="oa.gzip">11.8. Handling compressed data</h2>
-<p>The last important <abbr>HTTP</abbr> feature you want to support is compression. Many web services have the ability to send data compressed,
-   which can cut down the amount of data sent over the wire by 60% or more. This is especially true of <abbr>XML</abbr> web services, since
-   <abbr>XML</abbr> data compresses very well.
-<p>Servers won&#8217;t give you compressed data unless you tell them you can handle it.
-<div class=example><h3>Example 11.14. Telling the server you would like compressed data</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import urllib2, httplib</kbd>
-<samp class=p>>>> </samp><kbd>httplib.HTTPConnection.debuglevel = 1</kbd>
-<samp class=p>>>> </samp><kbd>request = urllib2.Request('http://diveintomark.org/xml/atom.xml')</kbd>
-<samp class=p>>>> </samp><kbd>request.add_header('Accept-encoding', 'gzip')</kbd>        <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>opener = urllib2.build_opener()</kbd>
-<samp class=p>>>> </samp><kbd>f = opener.open(request)</kbd>
-<samp>connect: (diveintomark.org, 80)
-send: '
-GET /xml/atom.xml HTTP/1.0
-Host: diveintomark.org
-User-agent: Python-urllib/2.1
-Accept-encoding: gzip</samp><span>&#x2461;</span>
-<samp>'
-reply: 'HTTP/1.1 200 OK\r\n'
-header: Date: Thu, 15 Apr 2004 22:24:39 GMT
-header: Server: Apache/2.0.49 (Debian GNU/Linux)
-header: Last-Modified: Thu, 15 Apr 2004 19:45:21 GMT
-header: ETag: "e842a-3e53-55d97640"
-header: Accept-Ranges: bytes
-header: Vary: Accept-Encoding
-header: Content-Encoding: gzip</samp>         <span>&#x2462;</span>
-header: Content-Length: 6289           <span>&#x2463;</span>
-<samp>header: Connection: close
-header: Content-Type: application/atom+xml</samp>
-</pre>
-<ol>
-<li>This is the key: once you&#8217;ve created your <code>Request</code> object, add an <code>Accept-encoding</code> header to tell the server you can accept gzip-encoded data. <code>gzip</code> is the name of the compression algorithm you&#8217;re using. In theory there could be other compression algorithms, but <code>gzip</code> is by far the most widely supported among web servers.
-<li>There&#8217;s your header going across the wire.
-<li>And here&#8217;s what the server sends back: the <code>Content-Encoding: gzip</code> header means that the data you&#8217;re about to receive has been gzip-compressed.
-<li>The <code>Content-Length</code> header is the length of the compressed data, not the uncompressed data. As you&#8217;ll see in a minute, the actual length of
-            the uncompressed data was 15955, so gzip compression cut your bandwidth by over 60%!
-<div class=example><h3>Example 11.15. Decompressing the data</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>compresseddata = f.read()</kbd>            <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>len(compresseddata)</kbd>
-6289
-<samp class=p>>>> </samp><kbd>import StringIO</kbd>
-<samp class=p>>>> </samp><kbd>compressedstream = StringIO.StringIO(compresseddata)</kbd>   <span>&#x2461;</span>
-<samp class=p>>>> </samp><kbd>import gzip</kbd>
-<samp class=p>>>> </samp><kbd>gzipper = gzip.GzipFile(fileobj=compressedstream)</kbd>      <span>&#x2462;</span>
-<samp class=p>>>> </samp><kbd>data = gzipper.read()</kbd>                <span>&#x2463;</span>
-<samp class=p>>>> </samp><kbd>print data</kbd>         <span>&#x2464;</span>
-<samp>&lt;?xml version="1.0" encoding="iso-8859-1"?>
-&lt;feed version="0.3"
-  xmlns="http://purl.org/atom/ns#"
-  xmlns:dc="http://purl.org/dc/elements/1.1/"
-  xml:lang="en">
-  &lt;title mode="escaped">dive into mark&lt;/title>
-  &lt;link rel="alternate" type="text/html" href="http://diveintomark.org/"/>
-  &hellip;
-</samp>
-<samp class=p>>>> </samp><kbd>len(data)</kbd>
-15955
-</pre>
-<ol>
-<li>Continuing from the previous example, <var>f</var> is the file-like object returned from the <abbr>URL</abbr> opener. Using its <code>read()</code> method would ordinarily get you the uncompressed data, but since this data has been gzip-compressed, this is just the first
-            step towards getting the data you really want.
-<li>OK, this step is a bit of a messy workaround. Python has a <code>gzip</code> module, which reads (and writes) gzip-compressed files on disk. But you don&#8217;t have a file on disk; you have a gzip-compressed
-            buffer in memory, and you don&#8217;t want to write out a temporary file just so you can uncompress it. So what you&#8217;re going to
-            do is create a file-like object out of the in-memory data (<var>compresseddata</var>), using the <code>StringIO</code> module. You first saw the <code>StringIO</code> module in <a href="#kgp.openanything.stringio.example" title="Example 10.4. Introducing StringIO">the previous chapter</a>, but now you&#8217;ve found another use for it.
-<li>Now you can create an instance of <code>GzipFile</code>, and tell it that its &#8220;file&#8221; is the file-like object <var>compressedstream</var>.
-<li>This is the line that does all the actual work: &#8220;reading&#8221; from <code>GzipFile</code> will decompress the data. Strange?  Yes, but it makes sense in a twisted kind of way. <var>gzipper</var> is a file-like object which represents a gzip-compressed file. That &#8220;file&#8221; is not a real file on disk, though; <var>gzipper</var> is really just &#8220;reading&#8221; from the file-like object you created with <code>StringIO</code> to wrap the compressed data, which is only in memory in the variable <var>compresseddata</var>. And where did that compressed data come from?  You originally downloaded it from a remote <abbr>HTTP</abbr> server by &#8220;reading&#8221; from the file-like object you built with <code>urllib2.build_opener</code>. And amazingly, this all just works. Every step in the chain has no idea that the previous step is faking it.
-<li>Look ma, real data. (15955 bytes of it, in fact.)<p>&#8220;But wait!&#8221; I hear you cry. &#8220;This could be even easier!&#8221;  I know what you&#8217;re thinking. You&#8217;re thinking that <var>opener.open</var> returns a file-like object, so why not cut out the <code>StringIO</code> middleman and just pass <var>f</var> directly to <code>GzipFile</code>?  OK, maybe you weren&#8217;t thinking that, but don&#8217;t worry about it, because it doesn&#8217;t work.
-<div class=example><h3>Example 11.16. Decompressing the data directly from the server</h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>f = opener.open(request)</kbd><span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>f.headers.get('Content-Encoding')</kbd>         <span>&#x2461;</span>
-'gzip'
-<samp class=p>>>> </samp><kbd>data = gzip.GzipFile(fileobj=f).read()</kbd>    <span>&#x2462;</span>
-<samp class=traceback>Traceback (most recent call last):
-  File "&lt;stdin>", line 1, in ?
-  File "c:\python23\lib\gzip.py", line 217, in read
-    self._read(readsize)
-  File "c:\python23\lib\gzip.py", line 252, in _read
-    pos = self.fileobj.tell()   # Save current position
-AttributeError: addinfourl instance has no attribute 'tell'</samp>
-</pre>
-<ol>
-<li>Continuing from the previous example, you already have a <code>Request</code> object set up with an <code>Accept-encoding: gzip</code> header.
-<li>Simply opening the request will get you the headers (though not download any data yet). As you can see from the returned
-<code>Content-Encoding</code> header, this data has been sent gzip-compressed.
-<li>Since <code>opener.open</code> returns a file-like object, and you know from the headers that when you read it, you&#8217;re going to get gzip-compressed data,
-            why not simply pass that file-like object directly to <code>GzipFile</code>?  As you &#8220;read&#8221; from the <code>GzipFile</code> instance, it will &#8220;read&#8221; compressed data from the remote <abbr>HTTP</abbr> server and decompress it on the fly. It&#8217;s a good idea, but unfortunately it doesn&#8217;t
-            work. Because of the way gzip compression works, <code>GzipFile</code> needs to save its position and move forwards and backwards through the compressed file. This doesn&#8217;t work when the &#8220;file&#8221; is a stream of bytes coming from a remote server; all you can do with it is retrieve bytes one at a time, not move back and
-            forth through the data stream. So the inelegant hack of using <code>StringIO</code> is the best solution: download the compressed data, create a file-like object out of it with <code>StringIO</code>, and then decompress the data from that.
-
-<p class=a>&#x2042;
-
-<h2 id="oa.alltogether">11.9. Putting it all together</h2>
-<p>You&#8217;ve seen all the pieces for building an intelligent <abbr>HTTP</abbr> web services client. Now let&#8217;s see how they all fit together.
-<div class=example><h3>Example 11.17. The <code>openAnything</code> function</h3>
-<p>This function is defined in <code>openanything.py</code>.
-<pre><code>
-def openAnything(source, etag=None, lastmodified=None, agent=USER_AGENT):
-    # non-HTTP code omitted for brevity
-    if urlparse.urlparse(source)[0] == 'http':   <span>&#x2460;</span>
-        # open URL with urllib2                 
-        request = urllib2.Request(source)       
-        request.add_header('User-Agent', agent)  <span>&#x2461;</span>
-        if etag:              
-            request.add_header('If-None-Match', etag)              <span>&#x2462;</span>
-        if lastmodified:      
-            request.add_header('If-Modified-Since', lastmodified)  <span>&#x2463;</span>
-        request.add_header('Accept-encoding', 'gzip')              <span>&#x2464;</span>
-        opener = urllib2.build_opener(SmartRedirectHandler(), DefaultErrorHandler()) <span>&#x2465;</span>
-        return opener.open(request)              <span>&#x2466;</span>
-</pre>
-<ol>
-<li><code>urlparse</code> is a handy utility module for, you guessed it, parsing <abbr>URL</abbr>s. Its primary function, also called <code>urlparse</code>, takes a <abbr>URL</abbr> and splits it into a tuple of (scheme, domain, path, params, query string parameters, and fragment identifier).
-             Of these, the only thing you care about is the scheme, to make sure that you&#8217;re dealing with an <abbr>HTTP</abbr> <abbr>URL</abbr> (which <code>urllib2</code> can handle).
-<li>You identify yourself to the <abbr>HTTP</abbr> server with the <code>User-Agent</code> passed in by the calling function. If no <code>User-Agent</code> was specified, you use a default one defined earlier in the <code>openanything.py</code> module. You never use the default one defined by <code>urllib2</code>.
-<li>If an <code>ETag</code> hash was given, send it in the <code>If-None-Match</code> header.
-<li>If a last-modified date was given, send it in the <code>If-Modified-Since</code> header.
-<li>Tell the server you would like compressed data if possible.
-<li>Build a <abbr>URL</abbr> opener that uses <em>both</em> of the custom <abbr>URL</abbr> handlers: <code>SmartRedirectHandler</code> for handling <code>301</code> and <code>302</code> redirects, and <code>DefaultErrorHandler</code> for handling <code>304</code>, <code>404</code>, and other error conditions gracefully.
-<li>That&#8217;s it!  Open the <abbr>URL</abbr> and return a file-like object to the caller.
-<div class=example><h3>Example 11.18. The <code>fetch</code> function</h3>
-<p>This function is defined in <code>openanything.py</code>.
-<pre><code>
-def fetch(source, etag=None, last_modified=None, agent=USER_AGENT):  
-    '''Fetch data and metadata from a URL, file, stream, or string'''
-    result = {}
-    f = openAnything(source, etag, last_modified, agent)              <span>&#x2460;</span>
-    result['data'] = f.read()     <span>&#x2461;</span>
-    if hasattr(f, 'headers'):    
-        # save ETag, if the server sent one        
-        result['etag'] = f.headers.get('ETag')      <span>&#x2462;</span>
-        # save Last-Modified header, if the server sent one          
-        result['lastmodified'] = f.headers.get('Last-Modified')       <span>&#x2463;</span>
-        if f.headers.get('content-encoding', '') == 'gzip':           <span>&#x2464;</span>
-            # data came back gzip-compressed, decompress it          
-            result['data'] = gzip.GzipFile(fileobj=StringIO(result['data'])).read()
-    if hasattr(f, 'url'):         <span>&#x2465;</span>
-        result['url'] = f.url    
-        result['status'] = 200   
-    if hasattr(f, 'status'):      <span>&#x2466;</span>
-        result['status'] = f.status                
-    f.close()  
-    return result                
-</pre>
-<ol>
-<li>First, you call the <code>openAnything</code> function with a <abbr>URL</abbr>, <code>ETag</code> hash, <code>Last-Modified</code> date, and <code>User-Agent</code>.
-<li>Read the actual data returned from the server. This may be compressed; if so, you&#8217;ll decompress it later.
-<li>Save the <code>ETag</code> hash returned from the server, so the calling application can pass it back to you next time, and you can pass it on to <code>openAnything</code>, which can stick it in the <code>If-None-Match</code> header and send it to the remote server.
-<li>Save the <code>Last-Modified</code> date too.
-<li>If the server says that it sent compressed data, decompress it.
-<li>If you got a <abbr>URL</abbr> back from the server, save it, and assume that the status code is <code>200</code> until you find out otherwise.
-<li>If one of the custom <abbr>URL</abbr> handlers captured a status code, then save that too.
-<div class=example><h3>Example 11.19. Using <code>openanything.py</code></h3><pre class=screen>
-<samp class=p>>>> </samp><kbd>import openanything</kbd>
-<samp class=p>>>> </samp><kbd>useragent = 'MyHTTPWebServicesApp/1.0'</kbd>
-<samp class=p>>>> </samp><kbd>url = 'http://diveintopython3.org/redir/example301.xml'</kbd>
-<samp class=p>>>> </samp><kbd>params = openanything.fetch(url, agent=useragent)</kbd>              <span>&#x2460;</span>
-<samp class=p>>>> </samp><kbd>params</kbd>   <span>&#x2461;</span>
-<samp>{'url': 'http://diveintomark.org/xml/atom.xml', 
-'lastmodified': 'Thu, 15 Apr 2004 19:45:21 GMT', 
-'etag': '"e842a-3e53-55d97640"', 
-'status': 301,
-'data': '&lt;?xml version="1.0" encoding="iso-8859-1"?>
-&lt;feed version="0.3"
-&hellip;
-'}</samp>
-<samp class=p>>>> </samp><kbd>if params['status'] == 301:</kbd><span>&#x2462;</span>
-<samp class=p>...    </samp><kbd>url = params['url']</kbd>
-<samp class=p>>>> </samp><kbd>newparams = openanything.fetch(</kbd>
-<samp class=p>...    </samp><kbd>url, params['etag'], params['lastmodified'], useragent)</kbd>    <span>&#x2463;</span>
-<samp class=p>>>> </samp><kbd>newparams</kbd>
-<samp>{'url': 'http://diveintomark.org/xml/atom.xml', 
-'lastmodified': None, 
-'etag': '"e842a-3e53-55d97640"', 
-'status': 304,
-'data': ''}</samp>  <span>&#x2464;</span>
-</pre>
-<ol>
-<li>The very first time you fetch a resource, you don&#8217;t have an <code>ETag</code> hash or <code>Last-Modified</code> date, so you&#8217;ll leave those out. (They&#8217;re <a href="#apihelper.optional" title="4.2. Using Optional and Named Arguments">optional parameters</a>.)
-<li>What you get back is a dictionary of several useful headers, the <abbr>HTTP</abbr> status code, and the actual data returned from the server.
-             <code>openanything</code> handles the gzip compression internally; you don&#8217;t care about that at this level.
-<li>If you ever get a <code>301</code> status code, that&#8217;s a permanent redirect, and you need to update your <abbr>URL</abbr> to the new address.
-<li>The second time you fetch the same resource, you have all sorts of information to pass back: a (possibly updated) <abbr>URL</abbr>, the
-<code>ETag</code> from the last time, the <code>Last-Modified</code> date from the last time, and of course your <code>User-Agent</code>.
-<li>What you get back is again a dictionary, but the data hasn&#8217;t changed, so all you got was a <code>304</code> status code and no data.
-
-<p class=a>&#x2042;
-
-<h2 id="oa.summary">11.10. Summary</h2>
-<p>The <code>openanything.py</code> module and its functions should now make perfect sense.
-<p>There are 5 important features of <abbr>HTTP</abbr> web services that every client should support:
-<div class=itemizedlist>
-<ul>
-<li>Identifying your application <a href="#oa.useragent" title="11.5. Setting the User-Agent">by setting a proper <code>User-Agent</code></a>.
-
-<li>Handling <a href="#oa.redirect" title="11.7. Handling redirects">permanent redirects properly</a>.
-
-<li>Supporting <a href="#oa.etags" title="11.6. Handling Last-Modified and ETag"><code>Last-Modified</code> date checking</a> to avoid re-downloading data that hasn&#8217;t changed.
-
-<li>Supporting <a href="#oa.etags.example" title="Example 11.9. Supporting ETag/If-None-Match"><code>ETag</code> hashes</a> to avoid re-downloading data that hasn&#8217;t changed.
-
-<li>Supporting <a href="#oa.gzip" title="11.8. Handling compressed data">gzip compression</a> to reduce bandwidth even when data <em>has</em> changed.
-
-</ul>
-
-<p class=a>&#x2042;
-->
-
 <h2 id=beyond-get>Beyond GET</h2>

 <p>FIXME