From 49ec7ca6433cca4f3c8a8af9e263c33d6203bdc5 Mon Sep 17 00:00:00 2001
From: Mark Pilgrim
Date: Tue, 9 Jun 2009 01:23:19 -0400
Subject: [PATCH] finished httplib2-redirects section

---
 http-web-services.html   |   94 ++++++++++++++++++++++++----------------
 regular-expressions.html |    2 +-
 2 files changed, 57 insertions(+), 39 deletions(-)

diff --git a/http-web-services.html b/http-web-services.html
index 537392b..f85c98f 100644
--- a/http-web-services.html
+++ b/http-web-services.html
@@ -527,28 +527,32 @@ reply: 'HTTP/1.1 200 OK'

HTTP defines two kinds of redirects: temporary and permanent. There’s nothing special to do with temporary redirects except follow them, which httplib2 does automatically.

+>>> import httplib2
+>>> h = httplib2.Http('.cache')
 >>> response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')  
 connect: (diveintopython3.org, 80)
-send: b'GET /examples/feed-302.xml HTTP/1.1                                         
+send: b'GET /examples/feed-302.xml HTTP/1.1                                            
 Host: diveintopython3.org
 accept-encoding: deflate, gzip
 user-agent: Python-httplib2/$Rev: 259 $'
-reply: 'HTTP/1.1 302 Found'                                                         
-send: b'GET /examples/feed.xml HTTP/1.1                                             
+reply: 'HTTP/1.1 302 Found'                                                            
+send: b'GET /examples/feed.xml HTTP/1.1                                                
 Host: diveintopython3.org
 accept-encoding: deflate, gzip
 user-agent: Python-httplib2/$Rev: 259 $'
 reply: 'HTTP/1.1 200 OK'
-  1.
-  2.
-  3.
-  4.
+  1. There is no feed at this URL. I’ve set up my server to issue a temporary redirect to the correct address.
+  2. There’s the request.
+  3. And there’s the response: 302 Found. Not shown here, this response also includes a Location header that points to the real URL.
+  4. httplib2 immediately turns around and “follows” the redirect by issuing another request for the URL given in the Location header: http://diveintopython3.org/examples/feed.xml

+“Following” a redirect is nothing more than this example shows. httplib2 sends a request for the URL you asked for. The server comes back with a response that says “No no, look over there instead.” httplib2 sends another request for the new URL.
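
The request/redirect/request cycle described above can be sketched in a few lines. This is a minimal illustration, not httplib2's actual code: `FAKE_SERVER`, `send_request()`, and the `example.org` URLs are hypothetical stand-ins so the loop can run without a network connection.

```python
# Hypothetical stand-in for a web server: URL -> (status, headers, body).
FAKE_SERVER = {
    'http://example.org/feed-302.xml':
        (302, {'location': 'http://example.org/feed.xml'}, b''),
    'http://example.org/feed.xml':
        (200, {'content-type': 'application/xml'}, b'<feed/>'),
}

REDIRECT_STATUSES = {301, 302, 303, 307}


def send_request(url):
    """Pretend to send one GET request; return (status, headers, body)."""
    return FAKE_SERVER[url]


def request(url, max_redirects=5):
    """Request url, following Location headers until a final response."""
    hops = []  # every (url, status) pair seen along the way
    for _ in range(max_redirects + 1):
        status, headers, body = send_request(url)
        hops.append((url, status))
        if status in REDIRECT_STATUSES and 'location' in headers:
            url = headers['location']  # "No no, look over there instead."
            continue
        return status, url, body, hops
    raise RuntimeError('too many redirects')


status, final_url, body, hops = request('http://example.org/feed-302.xml')
print(status, final_url)
for hop in hops:
    print(hop)
```

The caller makes one call and sees only the final response; the intermediate 302 shows up only in the `hops` list, much as it shows up only in the debugging output in the session above.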

 # continued from the previous example
->>> print(dict(response.items()))  
+>>> print(dict(response.items()))                                     
 {'status': '200',
  'content-length': '3070',
  'content-location': 'http://diveintopython3.org/examples/feed.xml',  
@@ -560,60 +564,74 @@ reply: 'HTTP/1.1 200 OK'
  'connection': 'close',
  '-content-encoding': 'gzip',
  'etag': '"bfe-4cbbf5c0"',
- 'cache-control': 'max-age=86400',
+ 'cache-control': 'max-age=86400',
  'date': 'Wed, 03 Jun 2009 02:21:41 GMT',
  'content-type': 'application/xml'}
-  1.
-  2.
-  3.
+  1. The response you get back from this single call to the request() method is the response from the final URL.
+  2. httplib2 adds the final URL to the response dictionary, as content-location. This is not a header that came from the server; it’s specific to httplib2.
+  3. Apropos of nothing, this feed is compressed.
+  4. And cacheable. (This is important, as you’ll see in the next example.)

+What happens if you request the same URL again?

 # continued from the previous example
->>> response, content = h.request('http://diveintopython3.org/examples/feed-302.xml')  
+>>> response2, content2 = h.request('http://diveintopython3.org/examples/feed-302.xml')  
 connect: (diveintopython3.org, 80)
-send: b'GET /examples/feed-302.xml HTTP/1.1                                 
+send: b'GET /examples/feed-302.xml HTTP/1.1                                              
 Host: diveintopython3.org
 accept-encoding: deflate, gzip
 user-agent: Python-httplib2/$Rev: 259 $'
-reply: 'HTTP/1.1 302 Found'                                                 
+reply: 'HTTP/1.1 302 Found'
+>>> content2 == content
+True
-  1.
-  2.
-  3.
+  1. Same URL, same httplib2.Http object (and therefore the same cache).
+  2. The 302 response was not cached, so httplib2 sends another request for the same URL.
+  3. Once again, the server responds with a 302. But notice what didn’t happen: there wasn’t ever a second request for the final URL, http://diveintopython3.org/examples/feed.xml. That response was cached (remember the Cache-Control header that you saw in the previous example). Once httplib2 received the 302 Found code, it checked its cache before issuing another request. The cache contained a fresh copy of http://diveintopython3.org/examples/feed.xml, so there was no need to re-request it.
+  4. By the time the request() method returns, it has read the feed data from the cache and returned it. Of course, it’s the same as the data you received last time.

+In other words, you don’t have to do anything special for temporary redirects. httplib2 will follow them automatically, and the fact that one URL redirects to another has no bearing on httplib2’s support for compression, caching, ETags, or any of the other features of HTTP.
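
To make the interaction between temporary redirects and caching concrete, here is a toy model of the behavior just described. It is a sketch under loud assumptions: `TinyClient`, its `sent` log, the simplified `'max-age'` header key, and the `example.org` URLs are all hypothetical, and real cache freshness rules are more involved than a single number.

```python
import time


class TinyClient:
    """Toy HTTP client: follows 302s and caches responses with a max-age."""

    def __init__(self, server):
        self.server = server  # hypothetical: URL -> (status, headers, body)
        self.cache = {}       # URL -> (expiry timestamp, body)
        self.sent = []        # URLs that actually went "over the network"

    def request(self, url, max_redirects=5):
        for _ in range(max_redirects + 1):
            entry = self.cache.get(url)
            if entry is not None and entry[0] > time.time():
                return 200, entry[1], True   # fresh cached copy, no network
            self.sent.append(url)
            status, headers, body = self.server[url]
            if status == 302:
                url = headers['location']    # the 302 itself is not stored
                continue
            max_age = headers.get('max-age', 0)
            if max_age > 0:
                self.cache[url] = (time.time() + max_age, body)
            return status, body, False
        raise RuntimeError('too many redirects')


OLD = 'http://example.org/feed-302.xml'
NEW = 'http://example.org/feed.xml'
server = {
    OLD: (302, {'location': NEW}, b''),
    NEW: (200, {'max-age': 86400}, b'<feed/>'),
}

client = TinyClient(server)
first = client.request(OLD)
second = client.request(OLD)
print(client.sent)   # the old URL is requested twice, the new URL once
print(second[2])     # whether the second body came from the cache
```

The second call still asks the server about the old URL, because a temporary redirect may change at any time, but the body comes from the cache entry stored under the final URL.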

+Permanent redirects are just as simple.

->>> response, content = h.request('http://diveintopython3.org/examples/feed-301.xml')
+# continued from the previous example
+>>> response, content = h.request('http://diveintopython3.org/examples/feed-301.xml')  
 connect: (diveintopython3.org, 80)
 send: b'GET /examples/feed-301.xml HTTP/1.1
 Host: diveintopython3.org
 accept-encoding: deflate, gzip
 user-agent: Python-httplib2/$Rev: 259 $'
-reply: 'HTTP/1.1 301 Moved Permanently'
->>> print(dict(response.items()))
-{'status': '200',
- 'content-length': '3070',
- 'content-location': 'http://diveintopython3.org/examples/feed.xml',
- 'accept-ranges': 'bytes',
- 'expires': 'Thu, 04 Jun 2009 02:21:41 GMT',
- 'vary': 'Accept-Encoding',
- 'server': 'Apache',
- 'last-modified': 'Wed, 03 Jun 2009 02:20:15 GMT',
- 'connection': 'close',
- '-content-encoding': 'gzip',
- 'etag': '"bfe-4cbbf5c0"',
- 'cache-control': 'max-age=86400',
- 'date': 'Wed, 03 Jun 2009 02:21:41 GMT',
- 'content-type': 'application/xml'}
->>> response2, content2 = h.request('http://diveintopython3.org/examples/feed-301.xml')
->>> response2.fromcache
+reply: 'HTTP/1.1 301 Moved Permanently'                                                
+>>> response.fromcache                                                                 
 True
-  1. FIXME
+  1. Once again, this URL doesn’t really exist. I’ve set up my server to issue a permanent redirect to http://diveintopython3.org/examples/feed.xml.
+  2. And here it is: status code 301. But again, notice what didn’t happen: there was no request to the redirect URL. Why not? Because it’s already cached locally.
+  3. httplib2 “followed” the redirect right into its cache.

+But wait! There’s more!

+# continued from the previous example
+>>> response2, content2 = h.request('http://diveintopython3.org/examples/feed-301.xml')  
+>>> response2.fromcache
+True
+>>> content2 == content
+True
+
+
+  1. Here’s the difference between temporary and permanent redirects: once httplib2 follows a permanent redirect, all further requests for that URL will transparently be rewritten to the target URL without hitting the network for the original URL. Remember, debugging is still turned on, yet there is no output of network activity whatsoever.
+  2. Yep, this response was retrieved from the local cache.
+  3. Yep, you got the entire feed (from the cache).
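
The difference between the two kinds of redirect can be modeled as one extra lookup table. The sketch below is a simplified illustration, not httplib2's implementation: `RememberingClient`, its `moved` table, and the `example.org` URLs are hypothetical. It also leaves out the body cache, so the target URL is still requested each time; in the session above the feed was cached as well, which is why there was no network activity at all.

```python
class RememberingClient:
    """Toy client that rewrites URLs it has seen permanently redirected."""

    def __init__(self, server):
        self.server = server  # hypothetical: URL -> (status, headers, body)
        self.moved = {}       # old URL -> new URL, learned from 301 responses
        self.sent = []        # URLs that actually went "over the network"

    def request(self, url, max_redirects=5):
        for _ in range(max_redirects + 1):
            # Rewrite the URL before touching the network at all.
            url = self.moved.get(url, url)
            self.sent.append(url)
            status, headers, body = self.server[url]
            if status == 301:
                self.moved[url] = headers['location']  # remember for next time
                continue
            return status, url, body
        raise RuntimeError('too many redirects')


OLD = 'http://example.org/feed-301.xml'
NEW = 'http://example.org/feed.xml'
server = {
    OLD: (301, {'location': NEW}, b''),
    NEW: (200, {}, b'<feed/>'),
}

client = RememberingClient(server)
first = client.request(OLD)
second = client.request(OLD)
print(client.sent)   # the old URL appears only once
```

After the first call, a request for the old URL never reaches the server again; the client goes straight to the target.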

+HTTP. It works.

Beyond HTTP GET

diff --git a/regular-expressions.html b/regular-expressions.html
index 45ab117..91801c8 100644
--- a/regular-expressions.html
+++ b/regular-expressions.html
@@ -272,7 +272,7 @@ body{counter-reset:h1 4}
 <_sre.SRE_Match object at 0x008EEB48>
 >>> re.search(pattern, 'MCMLXXXIX', re.VERBOSE)
 <_sre.SRE_Match object at 0x008EEB48>
->>> re.search(pattern, 'MMMDCCCLXXXVIII', re.VERBOSE)
+>>> re.search(pattern, 'MMMDCCCLXXXVIII', re.VERBOSE)
 <_sre.SRE_Match object at 0x008EEB48>
 >>> re.search(pattern, 'M')