Commit Graph

5 Commits

Author SHA1 Message Date
Martijn Pieters e26ccb34eb Fix the smoke test in the face of UTF-16 surrogate pairs.
If the random data starts with a UTF-16 BOM *and* the next two bytes encode a code unit in the `\ud800`-`\udfff` surrogate range, decoding can fail. The chance is small, but it is still possible.

Extend it to check the UTF-8 error as well. The goal is to test that the guesser was *mostly* correct, and to verify that in the cases where it wasn't, the failure was to be expected. Most of all, it checks that the function doesn't buckle under wildly unexpected data.
2012-10-26 12:15:27 +02:00
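
The failure mode described in this commit message can be reproduced with a minimal Python 3 sketch. The helper name `try_decode_utf16` and the sample byte strings are illustrative assumptions, not code from the repository.

```python
def try_decode_utf16(data: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-16.

    Hypothetical helper for illustration; not part of the project.
    """
    try:
        return data.decode("utf-16")
    except UnicodeDecodeError:
        return None


BOM_LE = b"\xff\xfe"  # little-endian UTF-16 byte order mark

# BOM followed by 0xD800 (a high surrogate) with no low surrogate after it.
lone_high = BOM_LE + b"\x00\xd8"

# BOM followed by a high surrogate and then "A", which is not a low surrogate.
broken_pair = BOM_LE + b"\x00\xd8" + "A".encode("utf-16-le")

# BOM followed by ordinary text decodes fine.
valid = BOM_LE + "hi".encode("utf-16-le")

print(try_decode_utf16(lone_high))
print(try_decode_utf16(broken_pair))
print(try_decode_utf16(valid))
```

A smoke test that feeds random bytes to a decoder has to treat this `UnicodeDecodeError` as an expected outcome instead of a test failure.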
Martijn Pieters be01a35ef1 Better not call it chr, rename to byteschr. 2012-10-25 18:27:21 +02:00
Martijn Pieters a4be9a2578 Redefine the unichr and bytes-variant of chr at module level.
Needed to appease Travis; its Python 2.6 and 2.7 builds are weird and the `__builtins__` dict does not follow CPython conventions.
2012-10-25 18:22:07 +02:00
Martijn Pieters 9832bd8917 Correct a copy-and-paste mistake: set a correct docstring for the unit test class. 2012-10-25 17:56:19 +02:00
Martijn Pieters 4decc7986e Use a JSON-specific encoding detection when no encoding has been specified.
JSON *must* be encoded using UTF-8, UTF-16 or UTF-32 (see the [RFC][1]); detect the encoding based on the fact that JSON always starts with 2 ASCII characters.

[1]: http://tools.ietf.org/html/rfc4627#section-3
2012-10-25 17:43:52 +02:00
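
The detection idea in this commit message follows the null-byte table in section 3 of RFC 4627. Below is a minimal sketch of that table in Python 3, not the code from this commit: the function name `guess_json_encoding` is hypothetical, BOM handling is left out, and falling back to UTF-8 for inputs shorter than four bytes is an assumption.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess a JSON document's encoding from its first four bytes.

    Because a JSON text (per RFC 4627) starts with two ASCII characters,
    the positions of the null bytes reveal the encoding width and byte order.
    Hypothetical helper for illustration only.
    """
    head = data[:4]
    if len(head) < 4:
        # Assumption: too short to carry a UTF-16/UTF-32 pattern.
        return "utf-8"

    nulls = tuple(byte == 0 for byte in head)
    if nulls == (True, True, True, False):    # 00 00 00 xx
        return "utf-32-be"
    if nulls == (False, True, True, True):    # xx 00 00 00
        return "utf-32-le"
    if nulls == (True, False, True, False):   # 00 xx 00 xx
        return "utf-16-be"
    if nulls == (False, True, False, True):   # xx 00 xx 00
        return "utf-16-le"
    return "utf-8"                            # xx xx xx xx


sample = '{"a": 1}'
for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    print(codec, "->", guess_json_encoding(sample.encode(codec)))
```

The UTF-32 patterns are checked before the UTF-16 ones only for readability; the four null-byte patterns are mutually exclusive, so the order does not change the result.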