Switch LGPL'd chardet for MIT licensed charset_normalizer (#5797)

Although using the (non-vendored) chardet library is fine for requests
itself, the story is a lot less clear for downstream projects when an
LGPL dependency is involved, particularly ones that might like to bundle
requests (and thus chardet) into a single binary -- think something
similar to what docker-compose is doing. By including an LGPL'd module
it is no longer clear if the resulting artefact must also be LGPL'd.

By changing out this dependency for one under MIT we remove all
license ambiguity.

As an "escape hatch", the code will use chardet first if it is
installed, but we no longer depend upon it directly. There is also a
new extra, `requests[use_chardet_on_py3]`, that pulls it in. This
should minimize the impact on users and give them a fallback if
charset_normalizer turns out to be not as good. (In my non-exhaustive
tests it detects the same encoding as chardet in every case I threw at
it.)
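
A minimal sketch of the "prefer chardet, fall back to
charset_normalizer" behaviour described above. `load_detector` is a
hypothetical helper used only for illustration, not a name from the
requests code base, and the sketch assumes both libraries expose a
chardet-style `detect(bytes)` function that returns a dict:

```python
import importlib


def load_detector(candidates=("chardet", "charset_normalizer")):
    """Return the first importable module from `candidates`, or None.

    Hypothetical helper: the order of `candidates` encodes the
    preference, so chardet wins whenever it is installed.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


if __name__ == "__main__":
    detector = load_detector()
    if detector is None:
        print("no character detection library installed")
    else:
        # Assumes a chardet-compatible detect() on either module.
        result = detector.detect(b"hello world")
        print(detector.__name__, result["encoding"])
```

Keeping the preference order in one tuple means the fallback can be
changed or extended without touching the calling code.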

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>

Ash Berlin-Taylor
2021-07-07 00:55:02 +01:00
committed by GitHub
parent 33d448eb21
commit 2ed84f55b2
10 changed files with 118 additions and 26 deletions
+3 -1
@@ -41,7 +41,8 @@ if sys.argv[-1] == 'publish':
 packages = ['requests']
 requires = [
-    'chardet>=3.0.2,<5',
+    'charset_normalizer~=2.0.0; python_version >= "3"',
+    'chardet>=3.0.2,<5; python_version < "3"',
     'idna>=2.5,<3',
     'urllib3>=1.21.1,<1.27',
     'certifi>=2017.4.17'
@@ -103,6 +104,7 @@ setup(
         'security': ['pyOpenSSL >= 0.14', 'cryptography>=1.3.4'],
         'socks': ['PySocks>=1.5.6, !=1.5.7'],
         'socks:sys_platform == "win32" and python_version == "2.7"': ['win_inet_pton'],
+        'use_chardet_on_py3': ['chardet>=3.0.2,<5']
     },
     project_urls={
         'Documentation': 'https://requests.readthedocs.io',