diff --git a/http-web-services.html b/http-web-services.html
index 39a0076..6400848 100644
--- a/http-web-services.html
+++ b/http-web-services.html
@@ -52,7 +52,7 @@ mark{display:inline}

Caching

-

The most important thing to understand about any type of web service is that network access is incredibly expensive. I don’t mean “dollars and cents” expensive (although bandwidth ain’t free). I mean that it takes an extraordinary long time to open a connection, send a request, and retrieve a response from a remote server. Even the fastest broadband connection slower than your local network, which in turn is slower than your local disk. +

The most important thing to understand about any type of web service is that network access is incredibly expensive. I don’t mean “dollars and cents” expensive (although bandwidth ain’t free). I mean that it takes an extraordinarily long time to open a connection, send a request, and retrieve a response from a remote server. Even on the fastest broadband connection, latency (the time it takes to send a request and start retrieving data in a response) can still be higher than you anticipated. A router misbehaves, a packet is dropped, an intermediate proxy is under attack — there’s never a dull moment on the public internet, and there may be nothing you can do about it.

HTTP is designed with caching in mind. There is an entire class of devices (called “caching proxies”) whose only job is to sit between you and the rest of the world and minimize network access. Your company or ISP almost certainly maintains caching proxies, even if you’re unaware of them. They work because caching is built into the HTTP protocol.

@@ -171,16 +171,6 @@ Cache-Control: max-age=31536000, public

httplib2 handles permanent redirects for you. Not only will it tell you that a permanent redirect occurred, it will keep track of them locally and automatically rewrite redirected URLs before requesting them. -

How Not To Fetch Data Over HTTP

@@ -229,7 +219,7 @@ reply: 'HTTP/1.1 200 OK'
  • The first line specifies the HTTP verb you’re using, and the path of the resource (minus the domain name).
  • The second line specifies the domain name from which we’re requesting this feed.
  • The third line specifies the compression algorithms that the client supports. As I mentioned earlier, urllib.request does not support compression by default. -
  • The fourth line specifies the name of the library that is making the request. By default, this is Python-urllib plus a version number. Both urllib.request and httplib2 support changing the user agent; you’ll see how to do this later in this chapter. [FIXME really?] +
  • The fourth line specifies the name of the library that is making the request. By default, this is Python-urllib plus a version number. Both urllib.request and httplib2 support changing the user agent, simply by adding a User-Agent header to the request (which will override the default value).
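As a rough illustration of the user agent point above, here is a minimal sketch using urllib.request. The function name build_request, the example.com URL, and the agent string are all made up for this example; only urllib.request.Request and its headers argument come from the standard library. Nothing is sent over the network here, the request object is only constructed and inspected.

```python
import urllib.request


def build_request(url, user_agent):
    """Return a Request whose User-Agent header replaces the default."""
    # Headers set on the Request take precedence over the opener's
    # default "Python-urllib/x.y" value when the request is sent.
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


if __name__ == '__main__':
    # Hypothetical URL and agent string, for illustration only.
    request = build_request('http://example.com/feed.xml', 'my-feed-reader/1.0')
    # urllib.request stores header names via str.capitalize(),
    # so the lookup key is 'User-agent', not 'User-Agent'.
    print(request.get_header('User-agent'))
```

Passing the resulting object to urllib.request.urlopen() would then send the custom value in place of the default one.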

    Now let’s look at what the server sent back in its response.