diff --git a/serializing.html b/serializing.html index 3d4ebdf..fb1c8cb 100644 --- a/serializing.html +++ b/serializing.html @@ -297,21 +297,25 @@ def protocol_version(file_object):

FIXME - discussion of pickling class instances, stateful objects, __getstate__ and __setstate__, links to http://docs.python.org/3.1/library/pickle.html#pickle-inst and http://docs.python.org/3.1/library/pickle.html#pickle-state +

Serializing Python Objects to be Read by Other Languages

-

The data format used by the pickle module is Python-specific. It makes no attempt to be compatible with other programming languages. If cross-language compatibility is one of your requirements, you need to look at other serialization formats. +

The data format used by the pickle module is Python-specific. It makes no attempt to be compatible with other programming languages. If cross-language compatibility is one of your requirements, you need to look at other serialization formats. One such format is JSON. “JSON” stands for “JavaScript Object Notation,” but don’t let the name fool you — JSON is explicitly designed to be usable across multiple programming languages. -

One format that is designed to be used by multiple programming languages is JSON. +

Python 3 includes a json module in the standard library. Like the pickle module, the json module has functions for serializing data structures, storing the serialized data on disk, loading serialized data from disk, and unserializing the data back into a new Python object. But there are some important differences, too. First of all, the JSON data format is text-based, not binary. RFC 4627 defines the JSON format and how different types of data must be encoded as text. For example, a boolean value is stored as either the five-character string 'false' or the four-character string 'true'. All JSON values are case-sensitive. -

FIXME - pickle format is python-specific; JSON format is designed to be cross-language (in fact, it was originally designed for JavaScript, hence the name); differences with pickle format (table or list); json module implements dumping and loading JSON-formatted data structures; JSON format is string-based (and always encoded as UTF-8 where bytes are required); compact vs. pretty-printing; JSONEncoder; JSONDecoder; iterencode +

Second, as with any text-based format, there is the issue of whitespace. JSON allows arbitrary amounts of whitespace (spaces, tabs, carriage returns, and line feeds) between values. This whitespace is “insignificant,” which means that JSON encoders can add as much or as little whitespace as they like, and JSON decoders are required to ignore the whitespace between values. This allows you to “pretty-print” your JSON data, nicely nesting values within values at different indentation levels so you can read it in a standard browser or text editor. Python’s json module has options for pretty-printing during encoding. + +

Third, there’s the perennial problem of character encoding. JSON encodes values as plain text, but as you know, there ain’t no such thing as “plain text.” JSON must be stored in a Unicode encoding (UTF-32, UTF-16, or the default, UTF-8), and section 3 of RFC 4627 defines how to tell which encoding is being used.

Mapping of Python Datatypes to JSON

-

FIXME +

Since JSON is not Python-specific, there are some mismatches in its coverage of Python datatypes. Some of them are simply naming differences, but there is one important datatype that is completely missing. See if you can spot it:
Notes @@ -343,10 +347,12 @@ def protocol_version(file_object): None
-

FIXME +

Did you notice what was missing? Bytes! JSON has no support for bytes objects or byte arrays.

Serializing Datatypes Unsupported by JSON

+

Even if JSON has no built-in support for bytes, that doesn’t mean you can’t serialize bytes objects. The json module provides extensibility hooks for encoding and decoding unknown datatypes. (By “unknown,” I mean “not defined in JSON.” Obviously the json module knows about byte arrays, but it’s constrained by the limitations of the JSON specification.) If you want to encode bytes or other datatypes that JSON doesn’t support natively, you need to provide custom encoders and decoders for those types. +

 >>> shell                                                 
 1