Default the encoding of "text" media subtypes to "ISO-8859-1"

Ref. RFC2616 (HyperText Transfer Protocol), section 3.7.1 (Canonicalization and Text Defaults).
Johannes Gorset
2012-01-21 11:01:45 +01:00
parent 5bff8e362f
commit a0ae2e6c7b
+3
@@ -276,6 +276,9 @@ def get_encoding_from_headers(headers):
     if 'charset' in params:
         return params['charset'].strip("'\"")
 
+    if 'text' in content_type:
+        return 'ISO-8859-1'
+
 
 def unicode_from_html(content):
     """Attempts to decode an HTML string into unicode.