more links about bytes/strings, mention packaging chapter

Mark Pilgrim
2009-08-03 11:36:00 -07:00
parent 10a56a41e2
commit fd560efcff
+13 -4
@@ -27,16 +27,25 @@ h3:before{content:''}
<p><a href=case-study-porting-chardet-to-python-3.html>Case Study: Porting <code>chardet</code> to Python 3</a> documents my (ultimately successful) effort to port a non-trivial library from Python 2 to Python 3. It may help you; it may not. There&#8217;s a fairly steep learning curve, since you need to kind of understand the library first, so you can understand why it broke and how I fixed it. A lot of the breakage centers around strings. Speaking of which&hellip;
<p>Strings. Whew. Where to start. Python 2 had &#8220;strings&#8221; and &#8220;Unicode strings.&#8221; Python 3 has &#8220;bytes&#8221; and &#8220;strings.&#8221; That is, all strings are now Unicode strings, and if you want to deal with a bag of bytes, you use the new <code>bytes</code> type. Python 3 will <em>never</em> implicitly convert between strings and bytes, so if you&#8217;re not sure which one you have at any given moment, your code will almost certainly break. Read <a href=strings.html>the Strings chapter</a> for more details. Bytes vs. strings comes up again in <a href=files.html>the Files chapter</a>, and again in <a href=http-web-services.html>the <abbr>HTTP</abbr> web services chapter</a>, and again in <a href=case-study-porting-chardet-to-python-3.html>the aforementioned case study</a>. It will come up again and again in your code, too. Trust me.
<p>Strings. Whew. Where to start. Python 2 had &#8220;strings&#8221; and &#8220;Unicode strings.&#8221; Python 3 has &#8220;bytes&#8221; and &#8220;strings.&#8221; That is, all strings are now Unicode strings, and if you want to deal with a bag of bytes, you use the new <code>bytes</code> type. Python 3 will <em>never</em> implicitly convert between strings and bytes, so if you&#8217;re not sure which one you have at any given moment, your code will almost certainly break. Read <a href=strings.html>the Strings chapter</a> for more details.
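<p>A minimal sketch of what &#8220;never implicitly convert&#8221; means in practice. The sample word and the <code>try_concat</code> helper are made up for illustration; only the built-in <code>str</code> and <code>bytes</code> types and their <code>encode()</code> and <code>decode()</code> methods come from Python itself.

```python
# Text (str) and raw data (bytes) are distinct types in Python 3.
text = "caf\u00e9"                  # str: 4 Unicode characters
data = text.encode("utf-8")         # bytes: 5 bytes, the accented "e" takes two


def try_concat(s, b):
    """Hypothetical helper: try to mix str and bytes, report what happens."""
    try:
        return s + b
    except TypeError as exc:
        return "TypeError: " + str(exc)


mixed = try_concat(text, data)      # fails: no implicit conversion
same = ("abc" == b"abc")            # comparing str to bytes is simply False
round_trip = data.decode("utf-8")   # an explicit decode gets the str back

print(len(text), len(data))
print(mixed)
print(same, round_trip == text)
```

<p>The conversion always has to be spelled out, and it always has to name an encoding (or rely on the UTF-8 default of <code>encode()</code> and <code>decode()</code>).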
<p>Bytes vs. strings comes up again and again throughout the book.
<ul>
<li>In <a href=files.html>Files</a>, you&#8217;ll learn the difference between reading files in &#8220;binary&#8221; and &#8220;text&#8221; mode. Reading (and writing!) files in text mode involves an <code>encoding</code> parameter, and if you leave it out, Python falls back to a platform-dependent default. Some text file methods count characters, but other methods count bytes. If your code assumes that one character == one byte, it <em>will</em> break on multi-byte characters.
<li>In <a href=http-web-services.html><abbr>HTTP</abbr> Web Services</a>, the <code>httplib2</code> module fetches headers and data over <abbr>HTTP</abbr>. <abbr>HTTP</abbr> headers are returned as strings, but the <abbr>HTTP</abbr> body is returned as bytes.
<li>In <a href=serializing.html>Serializing Python Objects</a>, you&#8217;ll learn why the <code>pickle</code> module in Python 3 defines a new data format that is backward-incompatible with Python 2. (Hint: it&#8217;s because of bytes and strings.) The chapter also covers <abbr>JSON</abbr>, which doesn&#8217;t support the <code>bytes</code> type at all. I&#8217;ll show you how to hack around that.
<li>In <a href=case-study-porting-chardet-to-python-3.html>Case study: porting <code>chardet</code> to Python 3</a>, it&#8217;s just a bloody mess of bytes and strings everywhere.
</ul>
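<p>As a rough illustration of the text-mode versus binary-mode point in the list above, the sketch below writes one short string to a temporary file and reads it back both ways. The file name and sample text are assumptions chosen for the example.

```python
import os
import tempfile

text = "na\u00efve"   # 5 characters; the "i" with diaeresis needs two bytes in UTF-8

# Hypothetical scratch file, created in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Text mode: pass the encoding explicitly instead of trusting the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, encoding="utf-8") as f:
    as_text = f.read()     # text mode returns str, measured in characters

with open(path, "rb") as f:
    as_bytes = f.read()    # binary mode returns bytes, measured in bytes

print(type(as_text).__name__, len(as_text))
print(type(as_bytes).__name__, len(as_bytes))
```

<p>Same file, two different lengths. Code that treats those two numbers as interchangeable is the code that breaks on multi-byte characters.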
<p>Even if you don&#8217;t care about Unicode (oh but you will), you&#8217;ll want to read about <a href=strings.html#formatting-strings>string formatting in Python 3</a>, which is completely different from Python 2.
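<p>For a taste of the newer style, here is a small sketch using the <code>format()</code> method of strings. The file name and size are invented values.

```python
# Replacement fields can be addressed by position or by name.
by_position = "{0} is {1} bytes".format("feed.xml", 1234567)

# A format specifier after the colon controls presentation;
# "," asks for a thousands separator.
by_name = "{name} is {size:,} bytes".format(name="feed.xml", size=1234567)

print(by_position)
print(by_name)
```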
<p>Iterators are everywhere in Python 3, and I understand them a lot better than I did five years ago when I wrote &#8220;Dive Into Python&#8221;. You need to understand them too, because lots of functions that used to return lists in Python 2 will now return iterators in Python 3. At a minimum, you should read <a href=iterators.html#a-fibonacci-iterator>the second half of the Iterators chapter</a> and <a href=advanced-iterators.html#generator-expressions>the second half of the Advanced Iterators chapter</a>.
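<p>A quick sketch of the practical consequence, using the built-in <code>map()</code> and <code>zip()</code> functions, both of which return iterators in Python 3. The sample data is arbitrary.

```python
squares = map(lambda n: n * n, [1, 2, 3])

is_list = isinstance(squares, list)   # False: map() returns an iterator, not a list

first_pass = list(squares)            # consuming the iterator yields the values
second_pass = list(squares)           # the iterator is now exhausted

pairs = zip("ab", [1, 2])
first_pair = next(pairs)              # iterators hand out items one at a time

print(is_list)
print(first_pass, second_pass)
print(first_pair)
```

<p>If you need to index into the result or loop over it twice, wrap the call in <code>list()</code> once and keep the list.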
<p>By popular request, I&#8217;ve added an appendix on <a href=special-method-names.html>Special Method Names</a>, which is kind of like <a href=http://www.python.org/doc/3.0/reference/datamodel.html#special-method-names>the Python docs &#8220;Data Model&#8221; chapter</a> but with more snark.
<p>By popular request, I&#8217;ve added an appendix on <a href=special-method-names.html>Special Method Names</a>, which is kind of like <a href=http://www.python.org/doc/3.1/reference/datamodel.html#special-method-names>the Python docs &#8220;Data Model&#8221; chapter</a> but with more snark.
<p>When I was writing &#8220;Dive Into Python&#8221;, all of the available XML libraries sucked. Then Fredrik Lundh wrote <a href=http://effbot.org/zone/element-index.htm>ElementTree</a>, which doesn&#8217;t suck at all. The Python gods wisely <a href=http://docs.python.org/3.0/library/xml.etree.elementtree.html>incorporated ElementTree into the standard library</a>, and now it forms the basis for <a href=xml.html>my new XML chapter</a>. The old ways of parsing XML are still around, but you should avoid them, because they suck!
<p>When I was writing &#8220;Dive Into Python&#8221;, all of the available XML libraries sucked. Then Fredrik Lundh wrote <a href=http://effbot.org/zone/element-index.htm>ElementTree</a>, which doesn&#8217;t suck at all. The Python gods wisely <a href=http://docs.python.org/3.1/library/xml.etree.elementtree.html>incorporated ElementTree into the standard library</a>, and now it forms the basis for <a href=xml.html>my new XML chapter</a>. The old ways of parsing XML are still around, but you should avoid them, because they suck!
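<p>To give a flavor of the ElementTree <abbr>API</abbr>, here is a minimal sketch that parses a tiny, made-up <abbr>XML</abbr> document from a string. The element names are assumptions for the example and are not taken from the XML chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, just big enough to have some structure.
document = (
    "<feed>"
    "<entry><title>First</title></entry>"
    "<entry><title>Second</title></entry>"
    "</feed>"
)

root = ET.fromstring(document)        # parse a string into an Element tree

# findall() returns the matching direct children;
# findtext() returns the text of the first matching child.
titles = [entry.findtext("title") for entry in root.findall("entry")]

print(root.tag)
print(titles)
```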
<p>That&#8217;s it for now; the book&#8217;s not finished yet! The file I/O subsystem is totally different now; I hope to write about that soon.
<p>Also new in Python&nbsp;&mdash;&nbsp;not in the language but in the community&nbsp;&mdash;&nbsp;is the emergence of code repositories like <a href=http://www.pypi.org/>The Python Package Index</a> (PyPI). Python comes with utilities to package your code in standard formats and distribute those packages on PyPI. Read <a href=packaging.html>Packaging Python Libraries</a> for details.
<p class=c>&copy; 2001&ndash;9 <a href=about.html>Mark Pilgrim</a>