mirror of
https://github.com/kennethreitz/requests.git
synced 2026-06-05 22:50:18 +00:00
Merge pull request #772 from Lukasa/develop
First pass at documenting encodings and RFC compliance.
@@ -343,6 +343,31 @@ To use HTTP Basic Auth with your proxy, use the `http://user:password@host/` syn
         "http": "http://user:pass@10.10.1.10:3128/",
     }

+Compliance
+----------
+
+Requests is intended to be compliant with all relevant specifications and
+RFCs where that compliance will not cause difficulties for users. This
+attention to the specification can lead to some behaviour that may seem
+unusual to those not familiar with the relevant specification.
+
+Encodings
+^^^^^^^^^
+
+When you receive a response, Requests makes a guess at the encoding to use for
+decoding the response when you access the ``Response.text`` attribute. Requests
+will first check for an encoding in the HTTP header, and if none is present,
+will use `chardet <http://pypi.python.org/pypi/chardet>`_ to attempt to guess
+the encoding.
+
+The only time Requests will not do this is if no explicit charset is present
+in the HTTP headers **and** the ``Content-Type`` header contains ``text``. In
+this situation,
+`RFC 2616 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7.1>`_
+specifies that the default charset must be ``ISO-8859-1``. Requests follows
+the specification in this case. If you require a different encoding, you can
+manually set the ``Response.encoding`` property, or use the raw
+``Response.content``.
+
 HTTP Verbs
 ----------

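As a rough illustration of the encoding rule documented in the hunk above, the sketch below picks a charset from a ``Content-Type`` value: an explicit ``charset`` parameter wins, a ``text`` media type without one falls back to ``ISO-8859-1``, and anything else returns ``None`` to signal that detection (for example with chardet) would be needed. The function name ``encoding_from_content_type`` is hypothetical and the code uses only the standard library; it is not the Requests implementation.

```python
def encoding_from_content_type(content_type):
    """Pick a charset from a Content-Type header value (illustrative only)."""
    # Split "text/html; charset=utf-8" into the media type and its parameters.
    media_type, *params = [part.strip() for part in content_type.split(";")]

    # An explicit charset parameter always wins.
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip("'\"")

    # No charset given: RFC 2616 section 3.7.1 defaults text types to ISO-8859-1.
    if "text" in media_type.lower():
        return "ISO-8859-1"

    # Not a text type: the caller has to detect the encoding some other way.
    return None


# A UTF-8 body served as "text/html" with no charset is decoded as ISO-8859-1,
# which garbles non-ASCII characters until the encoding is overridden.
body = "caf\u00e9".encode("utf-8")        # the word "cafe" with an acute accent
guessed = encoding_from_content_type("text/html")
garbled = body.decode(guessed)            # decoded with the RFC default
fixed = body.decode("utf-8")              # decoded with the real encoding
```

With Requests itself, the corresponding fix is the one the documentation text describes: set ``Response.encoding`` to the real encoding before reading ``Response.text``, or work with the raw bytes in ``Response.content``.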
@@ -86,12 +86,22 @@ again::
 Requests will automatically decode content from the server. Most unicode
 charsets are seamlessly decoded.

-When you make a request, ``r.encoding`` is set, based on the HTTP headers.
-Requests will use that encoding when you access ``r.text``. If ``r.encoding``
-is ``None``, Requests will make an extremely educated guess of the encoding
-of the response body. You can manually set ``r.encoding`` to any encoding
-you'd like, and that charset will be used.
+When you make a request, Requests makes educated guesses about the encoding of
+the response based on the HTTP headers. The text encoding guessed by Requests
+is used when you access ``r.text``. You can find out what encoding Requests is
+using, and change it, using the ``r.encoding`` property::
+
+    >>> r.encoding
+    'utf-8'
+    >>> r.encoding = 'ISO-8859-1'
+
+If you change the encoding, Requests will use the new value of ``r.encoding``
+whenever you access ``r.text``.
+
+Requests will also use custom encodings in the event that you need them. If
+you have created your own encoding and registered it with the ``codecs``
+module, you can simply use the codec name as the value of ``r.encoding`` and
+Requests will handle the decoding for you.

 Binary Response Content
 -----------------------
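The custom-encoding paragraph in the hunk above relies on Python's ``codecs`` registry. A minimal sketch of registering a codec follows. The codec name ``reversed_utf8`` and its behaviour are made up for illustration, and the last lines decode a byte string directly, on the assumption that Requests passes the value of ``r.encoding`` to the same standard decoding machinery.

```python
import codecs

CODEC_NAME = "reversed_utf8"  # hypothetical codec name, not a real encoding


def _encode(text, errors="strict"):
    # Reverse the text, then encode it as UTF-8.
    return text[::-1].encode("utf-8", errors), len(text)


def _decode(data, errors="strict"):
    # Decode as UTF-8, then undo the reversal.
    raw = bytes(data)
    return raw.decode("utf-8", errors)[::-1], len(raw)


def _search(name):
    # Search functions receive the normalised (lower-case) codec name.
    if name == CODEC_NAME:
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=CODEC_NAME)
    return None


# Make the codec available to str.encode() and bytes.decode().
codecs.register(_search)

body = "hello".encode(CODEC_NAME)   # stands in for raw response bytes
text = body.decode(CODEC_NAME)      # the decoding a custom codec name would select
```

Once a codec is registered this way, its name could be assigned to ``r.encoding`` as the documentation text describes.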