Difficulty level: ♦♦♦♦♢

Packaging Python Libraries

You’ll find the shame is like the pain; you only feel it once.
— Marquise de Merteuil, Dangerous Liaisons

 

Diving In

So you want to release a Python script, library, framework, or application. Excellent. The world needs more Python code.

Python 3 comes with a packaging framework called Distutils. Distutils is many things: a build tool (for you), an installation tool (for your users), a package metadata format (for search engines), and more. It integrates with the Python Package Index (“PyPI”), a central repository for open source Python libraries.

All of these facets of Distutils center around the setup script, traditionally called setup.py. In fact, you’ve already seen several Distutils setup scripts in this book: you used one to install httplib2 in “HTTP Web Services,” and another to install chardet in “Case Study: Porting chardet to Python 3.”

In this chapter, you’ll learn how the setup script for chardet works and step through the process of releasing your own Python software.

# chardet's setup.py
from distutils.core import setup
setup(
    name = "chardet",
    packages = ["chardet"],
    version = "1.0.1",
    description = "Universal encoding detector",
    author = "Mark Pilgrim",
    author_email = "mark@diveintomark.org",
    url = "http://chardet.feedparser.org/",
    download_url = "http://chardet.feedparser.org/download/python3-chardet-1.0.1.tgz",
    keywords = ["encoding", "i18n", "xml"],
    classifiers = [
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Development Status :: 4 - Beta",
        "Environment :: Other Environment",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
        "Operating System :: OS Independent",
        "Topic :: Software Development :: Libraries :: Python Modules",
        "Topic :: Text Processing :: Linguistic",
        ],
    long_description = """\
Universal character encoding detector
-------------------------------------

Detects
 - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
 - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
 - EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)
 - EUC-KR, ISO-2022-KR (Korean)
 - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
 - ISO-8859-2, windows-1250 (Hungarian)
 - ISO-8859-5, windows-1251 (Bulgarian)
 - windows-1252 (English)
 - ISO-8859-7, windows-1253 (Greek)
 - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
 - TIS-620 (Thai)

This version requires Python 3 or later; a Python 2 version is available separately.
"""
)

chardet and httplib2 are open source, but there’s no requirement that you release your own Python libraries under any particular license. The process described in this chapter will work for any Python software, regardless of license.

Things Distutils Can’t Do For You

Releasing your first Python package is a daunting process. (Releasing your second one is a little easier.) Distutils tries to automate as much of it as possible, but there are some things you simply must do yourself.

Directory Structure

To start packaging your Python software, you need to get your files and directories in order. The httplib2 directory looks like this:

httplib2/                 (1)
|
+--README.txt             (2)
|
+--setup.py               (3)
|
+--httplib2/              (4)
   |
   +--__init__.py
   |
   +--iri2uri.py

  1. Make a root directory to hold everything. Give it the same name as your Python module.
  2. To accommodate Windows users, your “read me” file should include a .txt extension, and it should use Windows-style line endings (a carriage return followed by a line feed). Just because you use a fancy text editor that runs from the command line and includes its own macro language, that doesn’t mean you need to make life difficult for your users. (Your users use Notepad. Sad but true.) Even if you’re on Linux or Mac OS X, your fancy text editor undoubtedly has an option to save files with Windows-style line endings.
  3. Your Distutils setup script should be named setup.py unless you have a good reason not to. You do not have a good reason not to.
  4. If your Python software is a single .py file, you should put it in the root directory along with your “read me” file and your setup script. httplib2, however, is not a single .py file; it’s a multi-file module. That’s OK! Just put the httplib2 directory in the root directory, so you have an __init__.py file within an httplib2/ directory within the httplib2/ root directory. That’s not a problem; in fact, it will simplify your packaging process.
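
If your editor makes the line-ending conversion awkward, a few lines of Python can do it instead. This is only a sketch, not part of Distutils: the function names are made up for illustration, and it assumes the “read me” file is UTF-8 text.

```python
from pathlib import Path


def to_windows_newlines(text: str) -> str:
    """Return text with every line ending normalized to CRLF."""
    # Normalize existing CRLF and lone CR to LF first, so a file that was
    # already converted does not end up with doubled carriage returns.
    unix = text.replace("\r\n", "\n").replace("\r", "\n")
    return unix.replace("\n", "\r\n")


def convert_file(path: str) -> None:
    """Rewrite a UTF-8 text file in place with Windows-style line endings."""
    p = Path(path)
    # Work with bytes so Python does not translate newlines on its own.
    text = p.read_bytes().decode("utf-8")
    p.write_bytes(to_windows_newlines(text).encode("utf-8"))
```

Calling convert_file("README.txt") from the project’s root directory would rewrite the file in place. Running it twice is harmless, because the text is normalized before it is converted.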

The chardet directory looks slightly different. Like httplib2, it’s a multi-file module, so there’s a chardet/ directory within the chardet/ root directory. In addition to the README.txt file, chardet has HTML-formatted documentation in the docs/ directory. The docs/ directory contains several .html files and an images/ subdirectory, which contains several .png and .gif files. (This will be important later.) Also, in keeping with the convention for (L)GPL-licensed software, it has a separate file called COPYING.txt which contains the complete text of the LGPL.


chardet/
|
+--COPYING.txt
|
+--setup.py
|
+--README.txt
|
+--docs/
|  |
|  +--index.html
|  |
|  +--usage.html
|  |
|  +--images/ ...
|
+--chardet/
   |
   +--__init__.py
   |
   +--big5freq.py
   |
   +--...
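
Both directory trees follow the same basic shape: a “read me” file, a setup script, and a package directory with an __init__.py file. Here is a rough sketch of a sanity check for that shape. The function names are hypothetical, and the check covers only the conventions described in this chapter, nothing that Distutils itself enforces.

```python
import tempfile
from pathlib import Path


def layout_problems(root: str, package: str) -> list:
    """Return a list of human-readable problems with a project layout."""
    root_path = Path(root)
    problems = []
    for required in ("README.txt", "setup.py"):
        if not (root_path / required).is_file():
            problems.append("missing " + required)
    if not (root_path / package / "__init__.py").is_file():
        problems.append("missing " + package + "/__init__.py")
    return problems


def make_demo_project(package: str) -> str:
    """Create a throwaway project skeleton and return its root directory."""
    root = Path(tempfile.mkdtemp()) / package
    (root / package).mkdir(parents=True)
    for name in ("README.txt", "setup.py", package + "/__init__.py"):
        (root / name).write_text("")
    return str(root)
```

An empty list from layout_problems() means the basic files are in place. It says nothing about extras like chardet’s docs/ directory or COPYING.txt.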

Writing Your Setup Script

The Distutils setup script is a Python script. In theory, it can do anything Python can do. In practice, it should do as little as possible, in as standard a way as possible. Setup scripts should be boring. The more exotic your installation process is, the more exotic your bug reports will be.

The first line of every Distutils setup script is always the same:

from distutils.core import setup

This imports the setup() function, which is the main entry point into Distutils. 95% of all Distutils setup scripts consist of a single call to setup() and nothing else. (I totally just made up that statistic, but if your Distutils setup script is doing more than calling the Distutils setup() function, you should have a good reason.)
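
If you want to hold your own setup script to that standard, one possible approach is to inspect it without running it. The sketch below uses Python’s ast module to test whether a script contains nothing but imports and a single call to setup(). The function name is invented, and the definition of “boring” is deliberately strict; even a module docstring fails the test.

```python
import ast


def is_boring_setup_script(source: str) -> bool:
    """True if the script only imports things and calls setup() exactly once."""
    tree = ast.parse(source)  # parses the text; nothing is imported or executed
    setup_calls = 0
    for node in tree.body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            continue
        is_setup_call = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Call)
            and isinstance(node.value.func, ast.Name)
            and node.value.func.id == "setup"
        )
        if not is_setup_call:
            # Any other top-level statement counts as "exotic".
            return False
        setup_calls += 1
    return setup_calls == 1
```

You could feed it the text of a setup script with is_boring_setup_script(open("setup.py").read()).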

The setup() function can take dozens of parameters. For the sanity of everyone involved, you must use named arguments for every parameter. This is not merely a convention; it’s a hard requirement. Your setup script will crash if you try to call the setup() function with non-named arguments.
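
To see why positional arguments fail, consider a function that declares nothing but a catch-all for named arguments. The setup() below is a hypothetical stand-in written for this example, not the real Distutils function, though as far as I can tell the real one is declared in the same style.

```python
def setup(**attrs):
    """Hypothetical stand-in that accepts named arguments only."""
    return attrs


def call_positionally():
    """Try a positional call and return the error message Python raises."""
    try:
        setup("chardet", "1.0.2")
    except TypeError as error:
        return str(error)
    return None


if __name__ == "__main__":
    print(setup(name="chardet", version="1.0.2"))
    print(call_positionally())
```

The named call returns a dictionary of the arguments; the positional call raises a TypeError before the function body ever runs.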

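The following named arguments are required:

 * name, the name of the package.
 * version, the version number of the package.
 * author, your full name.
 * author_email, your email address.
 * url, the home page of your project.

Although not required, I recommend that you also include the following in your setup script:

 * description, a one-line summary of the project.
 * long_description, a multi-line string that describes the project in more detail.
 * classifiers, a list of specially formatted strings described in the next section.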

Setup script metadata is defined in PEP 314.

Now let’s look at the chardet setup script. It has all of these required and recommended parameters, plus one I haven’t mentioned yet: packages.

from distutils.core import setup
setup(
    name = 'chardet',
    packages = ['chardet'],
    version = '1.0.2',
    description = 'Universal encoding detector',
    author = 'Mark Pilgrim',
    ...
)

The packages parameter highlights an unfortunate vocabulary overlap in the distribution process. We’ve been talking about the “package” as the thing you’re building (and potentially listing in The Python “Package” Index). But that’s not what this packages parameter refers to. It refers to the fact that the chardet module is a multi-file module, sometimes known as… a “package.” The packages parameter tells Distutils to include the chardet/ directory, its __init__.py file, and all the other .py files that constitute the chardet module. That’s kind of important; all this happy talk about documentation and metadata is irrelevant if you forget to include the actual code!
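
As a rough illustration of what one entry in the packages list stands for, the sketch below lists the .py files that sit directly inside a package directory. The function name is made up, and this is a simplification of what Distutils does when it builds. It assumes, as I understand Distutils to work, that subdirectories are not picked up automatically and need their own entries in the list.

```python
import tempfile
from pathlib import Path


def modules_in_package(root: str, package: str) -> list:
    """List the .py files directly inside root/package, sorted by name."""
    package_dir = Path(root) / package
    if not (package_dir / "__init__.py").is_file():
        raise ValueError(package + " has no __init__.py, so it is not a package")
    # Only the top level of the directory is scanned; a subpackage would
    # need its own entry in the packages list.
    return sorted(p.name for p in package_dir.glob("*.py"))


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    demo = Path(root) / "chardet"
    demo.mkdir()
    for name in ("__init__.py", "big5freq.py", "notes.txt"):
        (demo / name).write_text("")
    print(modules_in_package(root, "chardet"))
```

The notes.txt file is left out of the result, which is the point: anything that isn’t a .py file in a listed package has to be included some other way.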

Classifying Your Package

The Python Package Index (“PyPI”) contains thousands of Python libraries. Proper classification metadata will allow people to find yours more easily.

The classifiers parameter to the Distutils setup() function is a list of strings. These strings are not freeform. Every classifier string should come from the master list of classifiers published on PyPI.

The Python Package Index lets you browse packages by classifier. You can even select multiple classifiers to narrow your search. Classifiers are not invisible metadata that you can just ignore! They’re quite visible and very useful.
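
Because classifier strings are not freeform, a quick shape check can catch typos before you upload anything. The sketch below is hypothetical: KNOWN_CATEGORIES is a partial, hand-copied set of top-level categories, not the master list, so treat a clean result as “looks plausible” rather than “is valid.”

```python
# Partial list of top-level categories, copied by hand for this example.
KNOWN_CATEGORIES = {
    "Development Status", "Environment", "Framework", "Intended Audience",
    "License", "Natural Language", "Operating System",
    "Programming Language", "Topic",
}


def classifier_problem(classifier: str):
    """Return a description of what looks wrong, or None if nothing does."""
    parts = classifier.split(" :: ")
    # Stray spaces or empty parts mean the separator was not exactly ' :: '.
    if any(not part or part != part.strip() for part in parts):
        return "parts must be separated by exactly ' :: '"
    if parts[0] not in KNOWN_CATEGORIES:
        return "unknown top-level category: " + parts[0]
    return None


if __name__ == "__main__":
    for text in ("Programming Language :: Python :: 3", "Language :: Python"):
        print(text, "->", classifier_problem(text))
```

The real test of a classifier is still whether it appears, character for character, in the list on PyPI.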

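You should always include at least these four classifiers:

 * Programming Language :: Python
 * Programming Language :: Python :: 3, if your package runs on Python 3
 * A License classifier, such as License :: OSI Approved :: MIT License
 * An Operating System classifier, such as Operating System :: OS Independent

I strongly recommend that you also include the following classifications:

 * Development Status
 * Environment
 * Intended Audience
 * Topic
 * Framework, if your package is tied to a particular framework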

Examples of Good Package Classifiers

By way of example, here are the classifiers for Django, a production-ready, cross-platform, BSD-licensed web application framework that runs on your web server. (Django is not yet compatible with Python 3, so the Programming Language :: Python :: 3 classifier is not listed.)

Programming Language :: Python
License :: OSI Approved :: BSD License
Operating System :: OS Independent
Development Status :: 5 - Production/Stable
Environment :: Web Environment
Framework :: Django
Intended Audience :: Developers
Topic :: Internet :: WWW/HTTP
Topic :: Internet :: WWW/HTTP :: Dynamic Content
Topic :: Internet :: WWW/HTTP :: WSGI
Topic :: Software Development :: Libraries :: Python Modules

Here are the classifiers for chardet, the character encoding detection library covered in Case Study: Porting chardet to Python 3. chardet is beta quality, cross-platform, Python 3-compatible, LGPL-licensed, and intended for developers to integrate into their own products.

Programming Language :: Python
Programming Language :: Python :: 3
License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Operating System :: OS Independent
Development Status :: 4 - Beta
Environment :: Other Environment
Intended Audience :: Developers
Topic :: Text Processing :: Linguistic
Topic :: Software Development :: Libraries :: Python Modules

And here are the classifiers for httplib2, the HTTP module I mentioned at the beginning of this chapter. httplib2 is beta quality, cross-platform, MIT-licensed, and intended for Python developers.

Programming Language :: Python
Programming Language :: Python :: 3
License :: OSI Approved :: MIT License
Operating System :: OS Independent
Development Status :: 4 - Beta
Environment :: Web Environment
Intended Audience :: Developers
Topic :: Internet :: WWW/HTTP
Topic :: Software Development :: Libraries :: Python Modules
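
To make the “browse by classifier” idea concrete, here is a toy model of it. The data is abridged from the three lists above, and browse() is an invented function that keeps only the packages carrying every classifier you ask for. It illustrates the idea and does not describe how PyPI is implemented.

```python
# Classifier sets abridged from the Django, chardet, and httplib2 lists.
PACKAGES = {
    "Django": {
        "Programming Language :: Python",
        "License :: OSI Approved :: BSD License",
        "Operating System :: OS Independent",
        "Development Status :: 5 - Production/Stable",
        "Environment :: Web Environment",
    },
    "chardet": {
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent",
        "Development Status :: 4 - Beta",
        "Environment :: Other Environment",
    },
    "httplib2": {
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Development Status :: 4 - Beta",
        "Environment :: Web Environment",
    },
}


def browse(*wanted):
    """Return the names of packages that carry every requested classifier."""
    required = set(wanted)
    return sorted(name for name, found in PACKAGES.items() if required <= found)


if __name__ == "__main__":
    print(browse("Programming Language :: Python :: 3"))
    print(browse("Programming Language :: Python :: 3",
                 "Environment :: Web Environment"))
```

Each extra classifier narrows the result. A package that leaves out Programming Language :: Python :: 3 never shows up in a Python 3 search, no matter how well it actually works there.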

Specifying Additional Files With A Manifest

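By default, Distutils includes only a handful of files in your release: README.txt, setup.py, and the .py files that belong to the packages listed in the packages parameter. That covers everything in httplib2. But chardet also needs to ship its COPYING.txt license file and the entire docs/ directory, with its .html, .png, and .gif files. To include these additional files, create a manifest file called MANIFEST.in in the project’s root directory, next to README.txt and setup.py.

include COPYING.txt
recursive-include docs *.html *.png *.gif

The first line includes a single file. The second line includes every .html, .png, and .gif file anywhere under the docs/ directory, including the images/ subdirectory.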
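
If the recursive-include line feels opaque, it may help to see the same selection written out by hand. The sketch below walks a directory tree and keeps the files whose names match any of the given patterns. The function name is invented, and this only approximates the rule for illustration; Distutils has its own implementation.

```python
import fnmatch
import os


def recursive_include(root, directory, patterns):
    """Return paths under root/directory whose file names match any pattern."""
    matches = []
    for folder, _subdirs, files in os.walk(os.path.join(root, directory)):
        for name in files:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                full = os.path.join(folder, name)
                # Report paths relative to the project root, with forward slashes.
                matches.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(matches)
```

Calling recursive_include(".", "docs", ["*.html", "*.png", "*.gif"]) from the chardet root directory would list the documentation pages and images, and skip anything else that happens to live under docs/.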

Checking Your Setup Script for Errors

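Before you build anything, run the Distutils check command to validate your setup script. It warns you if any required metadata is missing: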

c:\Users\pilgrim\chardet> c:\python31\python.exe setup.py check
running check
warning: check: missing required meta-data: version

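Once the setup script includes the missing version parameter, the check command runs without any warnings: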

c:\Users\pilgrim\chardet> c:\python31\python.exe setup.py check
running check
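
The idea behind a metadata check is simple enough to sketch in a few lines. The version below is a simplified, hypothetical imitation: REQUIRED_FIELDS is my assumption about which fields count as required, and the real check command applies more rules than this.

```python
# Assumed set of required fields; the real command has its own rules.
REQUIRED_FIELDS = ("name", "version", "author", "author_email", "url")


def missing_metadata(metadata: dict) -> list:
    """Return the required field names that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


def check(metadata: dict) -> str:
    """Return a warning line for incomplete metadata, or an empty string."""
    missing = missing_metadata(metadata)
    if missing:
        return "warning: missing required meta-data: " + ", ".join(missing)
    return ""


if __name__ == "__main__":
    incomplete = {
        "name": "chardet",
        "author": "Mark Pilgrim",
        "author_email": "mark@diveintomark.org",
        "url": "http://chardet.feedparser.org/",
    }
    print(check(incomplete))
```

With the version field left out, as in the first run above, the sketch reports version as the only missing field.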

Creating Source Distributions

archive formats?
FIXME need to redo this now that we have a MANIFEST.in file
c:\Users\pilgrim\chardet> c:\python31\python.exe setup.py sdist
running sdist
running check
reading manifest file 'MANIFEST'
creating chardet-1.0.2
creating chardet-1.0.2\chardet
copying files to chardet-1.0.2...
copying setup.py -> chardet-1.0.2
copying chardet\__init__.py -> chardet-1.0.2\chardet
copying chardet\big5freq.py -> chardet-1.0.2\chardet
...
copying chardet\universaldetector.py -> chardet-1.0.2\chardet
copying chardet\utf8prober.py -> chardet-1.0.2\chardet
creating dist
creating 'dist\chardet-1.0.2.zip' and adding 'chardet-1.0.2' to it
adding 'chardet-1.0.2\PKG-INFO'
adding 'chardet-1.0.2\setup.py'
adding 'chardet-1.0.2\chardet\big5freq.py'
adding 'chardet-1.0.2\chardet\big5prober.py'
...
adding 'chardet-1.0.2\chardet\universaldetector.py'
adding 'chardet-1.0.2\chardet\utf8prober.py'
adding 'chardet-1.0.2\chardet\__init__.py'
removing 'chardet-1.0.2' (and everything under it)

c:\Users\pilgrim\chardet> dir dist
 Volume in drive C has no label.
 Volume Serial Number is DED5-B4F8

 Directory of c:\Users\pilgrim\chardet\dist

07/30/2009  04:47 AM    <DIR>          .
07/30/2009  04:47 AM    <DIR>          ..
07/30/2009  04:47 AM           175,367 chardet-1.0.2.zip
               1 File(s)        175,367 bytes
               2 Dir(s)  62,235,222,016 bytes free
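
The output above shows the overall pattern: copy the files into a directory named after the release, zip that directory into dist\, then remove the temporary copy. Here is a rough sketch of the same pattern using only the standard shutil module. The function name is invented, and the sketch copies the whole project instead of consulting a manifest, so it is a simplification and not what the sdist command actually does.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def build_source_zip(project_dir: str, name: str, version: str) -> str:
    """Zip the project into dist/<name>-<version>.zip and return the archive path."""
    project = Path(project_dir).resolve()
    release = name + "-" + version
    staging = Path(tempfile.mkdtemp())
    # Copy the files into a staging folder named after the release, so that
    # everything in the archive sits under one top-level directory.
    shutil.copytree(project, staging / release,
                    ignore=shutil.ignore_patterns("dist", "build"))
    dist = project / "dist"
    dist.mkdir(exist_ok=True)
    archive = shutil.make_archive(str(dist / release), "zip",
                                  root_dir=str(staging), base_dir=release)
    # Remove the temporary copy, as the real command does at the end.
    shutil.rmtree(staging)
    return archive


def archive_contents(archive: str) -> list:
    """Return the sorted list of paths stored in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(zf.namelist())
```

Listing the archive with archive_contents() is a cheap way to confirm that the files you expected, and only those, made it into the release.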

Creating Binary Distributions

python3 setup.py bdist --help-formats
http://docs.python.org/3.1/distutils/builtdist.html#creating-windows-installers

Building A Windows Installer

FIXME probably need to redo this too
c:\Users\pilgrim\chardet> c:\python31\python.exe setup.py bdist_wininst
running bdist_wininst
running build
running build_py
installing to build\bdist.win32\wininst
running install_lib
creating build\bdist.win32\wininst
creating build\bdist.win32\wininst\PURELIB
creating build\bdist.win32\wininst\PURELIB\chardet
copying build\lib\chardet\big5freq.py -> build\bdist.win32\wininst\PURELIB\chardet
copying build\lib\chardet\big5prober.py -> build\bdist.win32\wininst\PURELIB\chardet
copying build\lib\chardet\chardistribution.py -> build\bdist.win32\wininst\PURELIB\chardet
...
copying build\lib\chardet\universaldetector.py -> build\bdist.win32\wininst\PURELIB\chardet
copying build\lib\chardet\utf8prober.py -> build\bdist.win32\wininst\PURELIB\chardet
copying build\lib\chardet\__init__.py -> build\bdist.win32\wininst\PURELIB\chardet
running install_egg_info
Writing build\bdist.win32\wininst\PURELIB\chardet-1.0.2-py3.1.egg-info
creating 'c:\users\pilgrim\appdata\local\temp\tmp58b9n5.zip' and adding '.' to it
adding 'PURELIB\chardet-1.0.2-py3.1.egg-info'
adding 'PURELIB\chardet\big5freq.py'
adding 'PURELIB\chardet\big5prober.py'
adding 'PURELIB\chardet\chardistribution.py'
...
adding 'PURELIB\chardet\universaldetector.py'
adding 'PURELIB\chardet\utf8prober.py'
adding 'PURELIB\chardet\__init__.py'
removing 'build\bdist.win32\wininst' (and everything under it)

c:\Users\pilgrim\chardet> dir dist
 Volume in drive C has no label.
 Volume Serial Number is DED5-B4F8

 Directory of c:\Users\pilgrim\chardet\dist

07/30/2009  03:52 AM    <DIR>          .
07/30/2009  03:52 AM    <DIR>          ..
07/30/2009  03:52 AM           371,236 chardet-1.0.2.win32.exe
               2 File(s)        546,603 bytes
               2 Dir(s)  62,235,119,616 bytes free
works on non-Windows (as long as the package is pure-Python)
UAC?

Building a Linux RPM Package

Adding Your Package to The Python Package Index

Registering Yourself

Registering Your Package

Uploading New Versions

@jessenoller sez:
 * distutils, how to make a setup.py (and include data files, tests, docs, etc for a project)
 * how to upload it to pypi properly (label it for python 3 for the love of pete)
 * how to build bdist/RPMs/DEBs.
 * If you can - and I've forgotten how much distutils supports of this, cover dependency management.
 * Oh, I almost forgot - cover the Per-User Site packages stuff, PEP 370

The Many Possible Futures of Python Packaging

Distutils is not the be-all and end-all of Python packaging, but as of this writing (August 2009), it’s the only packaging framework that works in Python 3. There are a number of other frameworks for Python 2; some focus on installation, others on testing and deployment. Some or all of these may end up being ported to Python 3 in the future.

These frameworks focus on installation:

These focus on testing and deployment:

Further Reading

On Distutils:

On other packaging frameworks:

© 2001–9 Mark Pilgrim