Merge pull request #772 from Lukasa/develop

First pass at documenting encodings and RFC compliance.
Kenneth Reitz
2012-08-10 09:05:08 -07:00
2 changed files with 40 additions and 5 deletions
@@ -343,6 +343,31 @@ To use HTTP Basic Auth with your proxy, use the `http://user:password@host/` syn
"http": "http://user:pass@10.10.1.10:3128/",
}
Compliance
----------
Requests is intended to be compliant with all relevant specifications and
RFCs where that compliance will not cause difficulties for users. This
attention to the specification can lead to some behaviour that may seem
unusual to those not familiar with the relevant specification.
Encodings
^^^^^^^^^
When you receive a response, Requests guesses which encoding to use for
decoding the body when you access the ``Response.text`` property. Requests
first checks for an encoding in the HTTP headers and, if none is present,
uses `chardet <http://pypi.python.org/pypi/chardet>`_ to attempt to guess
the encoding.
The only time Requests will not do this is if no explicit charset is present
in the HTTP headers **and** the ``Content-Type`` header contains ``text``. In
this situation,
`RFC 2616 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_
specifies that the default charset must be ``ISO-8859-1``. Requests follows
the specification in this case. If you require a different encoding, you can
manually set the ``Response.encoding`` property, or use the raw bytes in
``Response.content``.
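The header rules described above can be sketched in plain Python. This is a minimal illustration, not Requests' actual implementation; the function name ``charset_from_content_type`` is hypothetical, and a return value of ``None`` stands for "no answer from the headers, fall back to detection".

```python
def charset_from_content_type(content_type):
    """Pick an encoding from a Content-Type header value (illustrative only)."""
    if not content_type:
        return None

    # Split "text/html; charset=utf-8" into the media type and its parameters.
    media_type, _, params = content_type.partition(";")

    # An explicit charset parameter always wins.
    for param in params.split(";"):
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset" and value.strip():
            return value.strip().strip("'\"")

    # No charset, but a text type: RFC 2616 says the default is ISO-8859-1.
    if "text" in media_type.lower():
        return "ISO-8859-1"

    # Anything else: the headers give no answer.
    return None


print(charset_from_content_type("text/html; charset=utf-8"))
print(charset_from_content_type("text/html"))
print(charset_from_content_type("application/json"))
```

The second call shows the case that tends to surprise people: a ``text/html`` response without a charset is treated as ``ISO-8859-1`` even if the body is really UTF-8.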
HTTP Verbs
----------
@@ -86,12 +86,22 @@ again::
Requests will automatically decode content from the server. Most Unicode
charsets are seamlessly decoded.
When you make a request, ``r.encoding`` is set, based on the HTTP headers.
Requests will use that encoding when you access ``r.text``. If ``r.encoding``
is ``None``, Requests will make an extremely educated guess of the encoding
of the response body. You can manually set ``r.encoding`` to any encoding
you'd like, and that charset will be used.
When you make a request, Requests makes educated guesses about the encoding of
the response based on the HTTP headers. The text encoding guessed by Requests
is used when you access ``r.text``. You can find out what encoding Requests is
using, and change it, using the ``r.encoding`` property::

    >>> r.encoding
    'utf-8'
    >>> r.encoding = 'ISO-8859-1'

If you change the encoding, Requests will use the new value of ``r.encoding``
whenever you access ``r.text``.
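To see why the choice of encoding matters, here is a small standalone sketch that decodes the same bytes two ways. It uses only ``bytes.decode`` rather than a live response, so the ``body`` value is an invented stand-in for what a server might send.

```python
# The bytes a server might send for the word "café", encoded as UTF-8.
body = "caf\u00e9".encode("utf-8")

# Decoding with the encoding the server actually used gives the right text.
as_utf8 = body.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 never fails, but each byte becomes
# its own character, so the two-byte "é" turns into two wrong characters.
as_latin1 = body.decode("ISO-8859-1")

print(as_utf8)
print(as_latin1)
```

If ``r.text`` looks garbled in this way, setting ``r.encoding`` to the correct value before accessing ``r.text`` is the usual fix.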
Requests will also use custom encodings in the event that you need them. If
you have created your own encoding and registered it with the ``codecs``
module, you can simply use the codec name as the value of ``r.encoding`` and
Requests will handle the decoding for you.
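As a rough sketch of what registering your own encoding looks like, the example below adds a toy codec through the standard ``codecs`` module. The codec name ``reversed_latin1`` and the helper functions are made up for illustration; only ``codecs.register`` and ``codecs.CodecInfo`` are real standard library APIs.

```python
import codecs


def _decode(data, errors="strict"):
    # Toy scheme: the bytes are Latin-1 text stored in reverse order.
    raw = bytes(data)
    return raw[::-1].decode("latin-1", errors), len(raw)


def _encode(text, errors="strict"):
    return text.encode("latin-1", errors)[::-1], len(text)


def _search(name):
    # Called by the codecs machinery with the normalised codec name.
    if name == "reversed_latin1":
        return codecs.CodecInfo(encode=_encode, decode=_decode,
                                name="reversed_latin1")
    return None


codecs.register(_search)

# Once registered, the name works anywhere a codec name is accepted.
body = b"olleh"
text = body.decode("reversed_latin1")
print(text)
```

With the codec registered, assigning ``r.encoding = 'reversed_latin1'`` should, per the paragraph above, make ``r.text`` decode the response body through it.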
Binary Response Content
-----------------------