Difficulty level: ♦♦♦♦♢

Packaging Python Libraries

Diving In

So you want to release a Python script, library, framework, or application. Excellent. The world needs more Python code.

Python 3 comes with a packaging framework called Distutils. Distutils is many things: a build tool (for you), an installation tool (for your users), a package metadata format (for search engines), and more. It integrates with the Python Package Index (“PyPI”), a central repository for open source Python libraries.

All of these facets of Distutils center around the setup script, traditionally called setup.py. In fact, you’ve already seen several Distutils setup scripts in this book: you used one to install httplib2 in “HTTP Web Services,” and another to install chardet in “Case Study: Porting chardet to Python 3.”

In this chapter, you’ll learn how the setup script for chardet works and step through the process of releasing your own Python software.

# chardet's setup.py
from distutils.core import setup
setup(
    name = "chardet",
    packages = ["chardet"],
    version = "1.0.1",
    description = "Universal encoding detector",
    author = "Mark Pilgrim",
    author_email = "mark@diveintomark.org",
    url = "http://chardet.feedparser.org/",
    download_url = "http://chardet.feedparser.org/download/python3-chardet-1.0.1.tgz",
    keywords = ["encoding", "i18n", "xml"],
    classifiers = [
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Development Status :: 4 - Beta",
        "Environment :: Other Environment",
        "Intended Audience :: Developers",
        "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
        "Operating System :: OS Independent",
        "Topic :: Software Development :: Libraries :: Python Modules",
        "Topic :: Text Processing :: Linguistic",
        ],
    long_description = """\
Universal character encoding detector
-------------------------------------

Detects
 - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
 - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
 - EUC-JP, SHIFT_JIS, ISO-2022-JP (Japanese)
 - EUC-KR, ISO-2022-KR (Korean)
 - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
 - ISO-8859-2, windows-1250 (Hungarian)
 - ISO-8859-5, windows-1251 (Bulgarian)
 - windows-1252 (English)
 - ISO-8859-7, windows-1253 (Greek)
 - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
 - TIS-620 (Thai)

This version requires Python 3 or later; a Python 2 version is available separately.
"""
)

chardet and httplib2 are open source, but there’s no requirement that you release your own Python libraries under any particular license. The process described in this chapter will work for any Python software, regardless of license.

Things Distutils Can’t Do For You

Releasing your first Python package is a daunting process. (Releasing your second one is easier.) Distutils tries to automate as much of it as possible, but there are some things you simply must do yourself.

Directory Structure

To start packaging your Python software, you need to get your files and directories in order. The httplib2 directory looks like this:

httplib2/                 
|
+--README.txt             
|
+--setup.py               
|
+--httplib2/              
   |
   +--__init__.py
   |
   +--iri2uri.py

  1. Make a root directory to hold everything. Give it the same name as your Python module.
  2. To accommodate Windows users, your “read me” file should include a .txt extension, and it should use Windows-style carriage returns. Just because you use a fancy text editor that runs from the command line and includes its own macro language, that doesn’t mean you need to make life difficult for your users. (Your users use Notepad. Sad but true.) Even if you’re on Linux or Mac OS X, your fancy text editor undoubtedly has an option to save files with Windows-style carriage returns.
  3. Your Distutils setup script should be named setup.py unless you have a good reason not to. You do not have a good reason not to.
  4. If your Python software is a single .py file, you should put it in the root directory along with your “read me” file and your setup script. If it’s a multi-file module (i.e. a directory with a main __init__.py script), like httplib2, you should put the entire directory here. Yes, that means you’ll have an httplib2/ directory within an httplib2/ directory. Trust me, that’s not a problem. In fact, any other arrangement would be a problem.
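As a rough illustration of the layout described above, here is a minimal sketch that builds the same skeleton in a temporary directory. The function name make_skeleton and the module name mymodule are made up for this example; nothing here is part of Distutils. It assumes a multi-file module, and it writes the “read me” file with Windows-style line endings.

```python
import tempfile
from pathlib import Path


def make_skeleton(parent, name):
    """Create name/ containing README.txt, setup.py, and a name/ package."""
    root = Path(parent) / name
    package = root / name          # yes, name/ within name/
    package.mkdir(parents=True)

    # newline="\r\n" makes Python write every "\n" as a Windows-style "\r\n".
    with open(root / "README.txt", "w", newline="\r\n") as readme:
        readme.write("About this project\n")

    # Placeholder setup script; a real one goes on to call setup().
    (root / "setup.py").write_text("from distutils.core import setup\n")
    (package / "__init__.py").write_text("")
    return root


with tempfile.TemporaryDirectory() as parent:
    root = make_skeleton(parent, "mymodule")
    created = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))
    readme_bytes = (root / "README.txt").read_bytes()
```

A single-file module would skip the inner directory and put its one .py file next to setup.py instead.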

The chardet directory looks slightly different. Instead of a “read me” file, it has HTML-formatted documentation in a docs/ directory. Also, in keeping with the convention for (L)GPL-licensed software, it has a separate file called COPYING which contains the complete text of the LGPL.

chardet/
|
+--COPYING
|
+--setup.py
|
+--docs/
|  |
|  +--index.html
|  |
|  +--usage.html
|  |
|  +--...
|
+--chardet/
   |
   +--__init__.py
   |
   +--big5freq.py
   |
   +--...

Writing Your Setup Script

The Distutils setup script is a Python script. In theory, it can do anything Python can do. In practice, it should do as little as possible, in as standard a way as possible. Setup scripts should be boring. The more exotic your installation process is, the more exotic your bug reports will be.

The first line of every Distutils setup script is always the same:

from distutils.core import setup

This imports the setup() function, which is the main entry point into Distutils. 95% of all Distutils setup scripts consist of a single call to setup() and nothing else. (I totally just made up that statistic, but if your Distutils setup script is doing more than calling the Distutils setup() function, you should have a good reason.)

The setup() function can take dozens of parameters. For the sanity of everyone involved, you must use named arguments for every parameter. This is not merely a convention; it’s a hard requirement. Your setup script will crash if you try to call the setup() function with non-named arguments.
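To see why positional arguments fail, it may help to look at a stand-in with the same calling convention. The real distutils.core.setup() is declared as setup(**attrs); the function below is only a hypothetical imitation of that signature, not the real thing, so the example runs without touching Distutils.

```python
# Stand-in that mimics the signature of distutils.core.setup(),
# which is declared as setup(**attrs). It only echoes its arguments.
def setup(**attrs):
    return attrs


# Named arguments are collected into a dictionary.
metadata = setup(name="chardet", version="1.0.1")

# Positional arguments are rejected by Python itself,
# before the function body ever runs.
try:
    setup("chardet", "1.0.1")
except TypeError as error:
    positional_error = error
else:
    positional_error = None
```

Because the signature accepts nothing but keyword arguments, the failure is an ordinary Python TypeError rather than anything specific to packaging.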

The following named arguments are required:

Although not required, I recommend that you also include

Setup script metadata is defined in PEP 314.

Classifying Your Package

The Python Package Index (“PyPI”) contains thousands of Python libraries. Proper classification metadata will allow people to find yours more easily.

The PyPI classification system is based on SourceForge’s software map. Classifiers are strings, but they are not freeform.

All of your classifiers should come from this master list on PyPI.
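Since classifiers are not freeform, one way to catch mistakes early is to compare yours against the list before publishing. The sketch below is an assumption-laden illustration: KNOWN_CLASSIFIERS is a tiny excerpt copied from the examples in this chapter (the real list is far longer), find_unknown_classifiers is a made-up helper, and the comparison is a plain exact string match.

```python
# A small excerpt of valid classifiers, taken from this chapter's examples.
# The real master list on PyPI is much longer.
KNOWN_CLASSIFIERS = {
    "Programming Language :: Python",
    "Programming Language :: Python :: 3",
    "Development Status :: 4 - Beta",
    "Operating System :: OS Independent",
    "Intended Audience :: Developers",
    "Topic :: Software Development :: Libraries :: Python Modules",
}


def find_unknown_classifiers(classifiers, known=KNOWN_CLASSIFIERS):
    """Return the classifiers that do not exactly match a known one."""
    return [c for c in classifiers if c not in known]


candidates = [
    "Programming Language :: Python :: 3",
    "Development Status :: 4 - Beta",
    "Development Status :: Beta",       # missing the "4 - " prefix
    "programming language :: python",   # wrong capitalization
]
unknown = find_unknown_classifiers(candidates)
```

Here unknown ends up holding the last two candidates, which look plausible but do not match any entry in the excerpt.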

You should always include at least these four classifiers:

I strongly recommend that you also include the following classifications:

Examples of Good Package Classifiers

By way of example, here are the classifiers for Django, a production-ready, cross-platform, BSD-licensed content management system that runs on your web server. (Django is not yet compatible with Python 3, so the Programming Language :: Python :: 3 classifier is not listed.)

Programming Language :: Python
License :: OSI Approved :: BSD License
Operating System :: OS Independent
Development Status :: 5 - Production/Stable
Environment :: Web Environment
Framework :: Django
Intended Audience :: Developers
Topic :: Internet :: WWW/HTTP
Topic :: Internet :: WWW/HTTP :: Dynamic Content
Topic :: Internet :: WWW/HTTP :: WSGI
Topic :: Software Development :: Libraries :: Python Modules

Here are the classifiers for chardet, the character encoding detection library covered in Case Study: Porting chardet to Python 3. chardet is beta quality, cross-platform, Python 3-compatible, LGPL-licensed, and intended for developers to integrate into their own products.

Programming Language :: Python
Programming Language :: Python :: 3
License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Operating System :: OS Independent
Development Status :: 4 - Beta
Environment :: Other Environment
Intended Audience :: Developers
Topic :: Text Processing :: Linguistic
Topic :: Software Development :: Libraries :: Python Modules

And here are the classifiers for httplib2, the HTTP module I mentioned at the beginning of this chapter. httplib2 is beta quality, cross-platform, MIT-licensed, and intended for Python developers.

Programming Language :: Python
Programming Language :: Python :: 3
License :: OSI Approved :: MIT License
Operating System :: OS Independent
Development Status :: 4 - Beta
Environment :: Web Environment
Intended Audience :: Developers
Topic :: Internet :: WWW/HTTP
Topic :: Software Development :: Libraries :: Python Modules

Checking Your Setup Script for Errors

Creating Source Distributions

Creating Binary Distributions

Building a Windows Installer

Building a Linux RPM Package

Adding Your Package to the Python Package Index

Registering Yourself

Registering Your Package

Uploading New Versions

The Many Possible Futures of Python Packaging

Distutils is not the be-all and end-all of Python packaging, but as of this writing (August 2009), it’s the only packaging framework that works in Python 3. There are a number of other frameworks for Python 2; some focus on installation, others on testing and deployment. Some or all of these may end up being ported to Python 3 in the future.

These frameworks focus on installation:

These focus on testing and deployment:

Further Reading

On Distutils:

On other packaging frameworks:

© 2001–9 Mark Pilgrim