From ad73df0eba773293550d771185735ab5096b0e9f Mon Sep 17 00:00:00 2001 From: Mark Pilgrim Date: Thu, 30 Jul 2009 01:00:43 -0400 Subject: [PATCH] added chardet examples to #divingin and #structure, finished #setuppy section --- packaging.html | 165 ++++++++++++++++++++++++++++--------------------- 1 file changed, 93 insertions(+), 72 deletions(-) diff --git a/packaging.html b/packaging.html index b5a3eba..fa8bd1e 100644 --- a/packaging.html +++ b/packaging.html @@ -5,7 +5,7 @@ @@ -24,81 +24,56 @@ body{counter-reset:h1 16}

Python 3 comes with a packaging framework called Distutils. Distutils is many things: a build tool (for you), an installation tool (for your users), a package metadata format (for search engines), and more. It integrates with the Python Package Index (“PyPI”), a central repository for open source Python libraries. -

All of these facets of Distutils center around the setup script, traditionally called setup.py. In fact, you’ve already seen a Distutils setup script: you used one to install httplib2 in the “HTTP Web Services” chapter. +

All of these facets of Distutils center around the setup script, traditionally called setup.py. In fact, you’ve already seen several Distutils setup scripts in this book: you used one to install httplib2 in “HTTP Web Services,” and another to install chardet in “Case Study: Porting chardet to Python 3.” -

In this chapter, you’ll learn how the setup script for httplib2 works and step through the process of releasing your own Python software. +

In this chapter, you’ll learn how the setup script for chardet works and step through the process of releasing your own Python software. -

# httplib2's setup.py
+
# chardet's setup.py
 from distutils.core import setup
-VERSION = '0.5.0'
-setup(name='httplib2',
-      version=VERSION, 
-      author='Joe Gregorio',
-      author_email='joe@example.com',
-      url='http://code.google.com/p/httplib2/',
-      download_url='http://httplib2.googlecode.com/files/httplib2-python3-{}.tar.gz'.format(VERSION),
-      description='A comprehensive HTTP client library.',
-      license='MIT',
-      packages=['httplib2'],
-      classifiers=[
-          'Development Status :: 4 - Beta',
-          'Environment :: Web Environment',
-          'Intended Audience :: Developers',
-          'License :: OSI Approved :: MIT License',
-          'Operating System :: OS Independent',
-          'Programming Language :: Python',
-          'Programming Language :: Python :: 3',
-          'Topic :: Internet :: WWW/HTTP',
-          'Topic :: Software Development :: Libraries :: Python Modules',
-      ],
-        long_description="""
+setup(
+    name = "chardet",
+    packages = ["chardet"],
+    version = "1.0.1",
+    description = "Universal encoding detector",
+    author = "Mark Pilgrim",
+    author_email = "mark@diveintomark.org",
+    url = "http://chardet.feedparser.org/",
+    download_url = "http://chardet.feedparser.org/download/python3-chardet-1.0.1.tgz",
+    keywords = ["encoding", "i18n", "xml"],
+    classifiers = [
+        "Programming Language :: Python",
+        "Programming Language :: Python :: 3",
+        "Development Status :: 4 - Beta",
+        "Environment :: Other Environment",
+        "Intended Audience :: Developers",
+        "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
+        "Operating System :: OS Independent",
+        "Topic :: Software Development :: Libraries :: Python Modules",
+        "Topic :: Text Processing :: Linguistic",
+        ],
+    long_description = """\
+Universal character encoding detector
+-------------------------------------
 
-A comprehensive HTTP client library, ``httplib2`` supports many features left out of other HTTP libraries.
+Detects
+ - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
+ - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
+ - EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)
+ - EUC-KR, ISO-2022-KR (Korean)
+ - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
+ - ISO-8859-2, windows-1250 (Hungarian)
+ - ISO-8859-5, windows-1251 (Bulgarian)
+ - windows-1252 (English)
+ - ISO-8859-7, windows-1253 (Greek)
+ - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
+ - TIS-620 (Thai)
 
-**HTTP and HTTPS**
-  HTTPS support is only available if the socket module was compiled with SSL support. 
- 
-
-**Keep-Alive**
-  Supports HTTP 1.1 Keep-Alive, keeping the socket open and performing multiple requests over the same connection if possible. 
-
-
-**Authentication**
-  The following three types of HTTP Authentication are supported. These can be used over both HTTP and HTTPS.
-
-  * Digest
-  * Basic
-  * WSSE
-
-**Caching**
-  The module can optionally operate with a private cache that understands the Cache-Control: 
-  header and uses both the ETag and Last-Modified cache validators. Both file system
-  and memcached based caches are supported.
-
-
-**All Methods**
-  The module can handle any HTTP request method, not just GET and POST.
-
-
-**Redirects**
-  Automatically follows 3XX redirects on GETs.
-
-
-**Compression**
-  Handles both 'deflate' and 'gzip' types of compression.
-
-
-**Lost update support**
-  Automatically adds back ETags into PUT requests to resources we have already cached. This implements Section 3.2 of Detecting the Lost Update Problem Using Unreserved Checkout
-
-
-**Unit Tested**
-  A large and growing set of unit tests.
-        """
+This version requires Python 3 or later; a Python 2 version is available separately.
+"""
 )
-

httplib2 is open source, but there’s no requirement that you release your own Python libraries under any particular license. The process described in this chapter will work for any Python software, regardless of license. +

chardet and httplib2 are open source, but there’s no requirement that you release your own Python libraries under any particular license. The process described in this chapter will work for any Python software, regardless of license.

⁂ @@ -143,7 +118,31 @@ A comprehensive HTTP client library, ``httplib2`` supports many features left ou

  • If your Python software is a single .py file, you should put it in the root directory along with your “read me” file and your setup script. If it’s a multi-file module (i.e. a directory with a main __init__.py script), like httplib2, you should put the entire directory here. Yes, that means you’ll have an httplib2/ directory within an httplib2/ directory. Trust me, that’s not a problem. In fact, any other arrangement would be a problem. -

    Depending on the license you chose, you may include the license text within your .py files themselves, or you may have a separate file containing license text, or both. GPL-licensed programs generally include a file called COPYING that includes the entire text of the GPL. If you have a separate license file, it should go in the root directory along with your “read me” file and your setup script. +

    The chardet directory looks slightly different. Instead of a “read me” file, it has HTML-formatted documentation in a docs/ directory. Also, in keeping with the convention for (L)GPL-licensed software, it has a separate file called COPYING, which contains the complete text of the LGPL. + +

    +chardet/
    +|
    ++--COPYING
    +|
    ++--setup.py
    +|
    ++--docs/
    +|  |
    +|  +--index.html
    +|  |
    +|  +--usage.html
    +|  |
    +|  +--...
    +|
    ++--chardet/
    +   |
    +   +--__init__.py
    +   |
    +   +--big5freq.py
    +   |
    +   +--...
    +

    Writing Your Setup Script

    @@ -155,7 +154,29 @@ A comprehensive HTTP client library, ``httplib2`` supports many features left ou

    This imports the setup() function, which is the main entry point into Distutils. 95% of all Distutils setup scripts consist of a single call to setup() and nothing else. (I totally just made up that statistic, but if your Distutils setup script is doing more than calling the Distutils setup() function, you should have a good reason.) -

    ...FIXME...required setup() parameters, optional but recommended setup() parameters, always use named parameters, etc. +

    The setup() function can take dozens of parameters. For the sanity of everyone involved, you must use named arguments for every parameter. This is not merely a convention; it’s a hard requirement. Your setup script will crash if you try to call the setup() function with non-named arguments. + +
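The named-argument requirement follows from the shape of the function: distutils.core.setup() is declared to accept keyword arguments only. The sketch below uses a hypothetical stand-in with that same shape (it is not the real Distutils function, and it does nothing but return its arguments) to show what happens when a positional call is attempted.

```python
def setup(**attrs):
    """Hypothetical stand-in with the same keyword-only shape as distutils.core.setup()."""
    return attrs


# Named arguments are accepted and collected into a dictionary.
metadata = setup(name="chardet", version="1.0.1")

# Positional arguments are rejected by Python itself, before any of the
# function body runs, because the signature has no positional parameters.
try:
    setup("chardet", "1.0.1")
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None
```

Because the failure is a plain TypeError raised at call time, it shows up the first time the setup script is run, not later during a build or upload.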

    The following named arguments are required: + +

      +
    • name, the name of the package. +
    • version, the version number of the package. +
    • author, your full name. +
    • author_email, your email address. +
    • url, the home page of your project. This can be your PyPI package page if you don’t have a separate project website. +
    + +

    Although not required, I recommend that you also include the following: + +

      +
    • description, a one-line summary of the project. +
    • long_description, a multi-line string in reStructuredText format. PyPI converts this to HTML and displays it on your package page. +
    • classifiers, a list of specially-formatted strings described in the next section. +
    + +
    +

    Setup script metadata is defined in PEP 314. +
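As a rough illustration of the two lists above, the following sketch checks a metadata dictionary for missing arguments before it is handed to setup(). The helper name missing_arguments and the REQUIRED and RECOMMENDED tuples are assumptions made for this example; they are not part of Distutils, and the values are copied from the chardet setup script shown earlier.

```python
# Argument names taken from the two lists above; the tuple names are illustrative.
REQUIRED = ("name", "version", "author", "author_email", "url")
RECOMMENDED = ("description", "long_description", "classifiers")


def missing_arguments(metadata, wanted):
    """Return the names in `wanted` that are absent or empty in `metadata`."""
    return [key for key in wanted if not metadata.get(key)]


# A deliberately incomplete set of arguments, using chardet's values.
metadata = {
    "name": "chardet",
    "version": "1.0.1",
    "author": "Mark Pilgrim",
    "author_email": "mark@diveintomark.org",
    "url": "http://chardet.feedparser.org/",
    "description": "Universal encoding detector",
}

missing_required = missing_arguments(metadata, REQUIRED)
missing_recommended = missing_arguments(metadata, RECOMMENDED)
```

Once both lists come back empty, the dictionary could be passed along as setup(**metadata), which keeps every argument named.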

    Classifying Your Package

    @@ -197,7 +218,7 @@ A comprehensive HTTP client library, ``httplib2`` supports many features left ou

    Examples of Good Package Classifiers

    -

    By way of example, here are the classifiers for Django, a production-ready, cross-platform, BSD-licensed content management system that runs on your web server. (Django is not yet compatible with Python 3, so the Programming Language :: Python :: Python 3 classifier is not listed.) +

    By way of example, here are the classifiers for Django, a production-ready, cross-platform, BSD-licensed content management system that runs on your web server. (Django is not yet compatible with Python 3, so the Programming Language :: Python :: 3 classifier is not listed.)

    Programming Language :: Python
     License :: OSI Approved :: BSD License
    @@ -214,7 +235,7 @@ Topic :: Software Development :: Libraries :: Python Modules

    Here are the classifiers for chardet, the character encoding detection library covered in Case Study: Porting chardet to Python 3. chardet is beta quality, cross-platform, Python 3-compatible, LGPL-licensed, and intended for developers to integrate into their own products.

    Programming Language :: Python
    -Programming Language :: Python :: Python 3
    +Programming Language :: Python :: 3
     License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
     Operating System :: OS Independent
     Development Status :: 4 - Beta