Your first Python program

Don’t bury your burden in saintly silence. You have a problem? Great. Rejoice, dive in, and investigate.
Ven. Henepola Gunararatana

  1. Diving in
  2. Declaring functions
  3. Documenting functions

Diving in

You know how other books go on and on about programming fundamentals and finally work up to building something useful? Let's skip all that. Here is a complete, working Python program. It probably makes absolutely no sense to you. Don't worry about that, because you're going to dissect it line by line. But read through it first and see what, if anything, you can make of it.

SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}

def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    """Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000

    Returns: string

    """
    if size < 0:
        raise ValueError('number must be non-negative')

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return "{0:.1f} {1}".format(size, suffix)

    raise ValueError('number too large')

if __name__ == "__main__":
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))

Now let's run this program on the command line. On Windows, it will look something like this:

c:\home\diveintopython3> c:\python30\python.exe humansize.py
1.0 TB
931.3 GiB

On Mac OS X or Linux, it would look something like this:

you@localhost:~$ python3 humansize.py
1.0 TB
931.3 GiB

Declaring functions

Python has functions like most other languages, but it does not have separate header files like C++ or interface/implementation sections like Pascal. When you need a function, just declare it, like this:

def approximate_size(size, a_kilobyte_is_1024_bytes=True):

Note that the keyword def starts the function declaration, followed by the function name, followed by the arguments in parentheses. Multiple arguments are separated with commas.

Also note that the function doesn't define a return datatype. Python functions do not specify the datatype of their return value; they don't even specify whether or not they return a value. (In fact, every Python function returns a value; if the function ever executes a return statement, it will return that value, otherwise it will return None, the Python null value.)

In some languages, functions (that return a value) start with function, and subroutines (that do not return a value) start with sub. There are no subroutines in Python. Everything is a function, all functions return a value (even if it's None), and all functions start with def.

The approximate_size function takes the two arguments — size and a_kilobyte_is_1024_bytes — but neither argument specifies a datatype. (As you might guess from the =True syntax, the second argument is a boolean. You'll learn what that syntax does in [FIXME xref].) In Python, variables are never explicitly typed. Python figures out what type a variable is and keeps track of it internally.

In Java, C++, and other statically-typed languages, you must specify the datatype of the function return value and each function argument. In Python, you never explicitly specify the datatype of anything. Based on what value you assign, Python keeps track of the datatype internally.

How Python's Datatypes Compare to Other Programming Languages

An erudite reader sent me this explanation of how Python compares to other programming languages:

statically typed language
A language in which types are fixed at compile time. Most statically typed languages enforce this by requiring you to declare all variables with their datatypes before using them. Java and C are statically typed languages.
dynamically typed language
A language in which types are discovered at execution time; the opposite of statically typed. JavaScript and Python are dynamically typed, because they figure out what type a variable is when you first assign it a value.
strongly typed language
A language in which types are always enforced. Java and Python are strongly typed. If you have an integer, you can't treat it like a string without explicitly converting it.
weakly typed language
A language in which types are “automagically” coerced to other types as needed; the opposite of strongly typed. PHP is weakly typed. In PHP, you can concatenate the string '12' and the integer 3 to get the string '123', then treat that as the integer 123, all without any explicit conversion. [FIXME double-check this]

So Python is both dynamically typed (because it doesn't use explicit datatype declarations) and strongly typed (because once a variable has a datatype, it actually matters).

If you have experience in other programming languages, this table may help you visualize how Python compares to them:

Statically typedDynamically typed
Weakly typedC, Objective-CJavaScript, Perl 5, PHP
Strongly typedPascal, JavaPython, Ruby

Documenting functions

You can document a Python function by giving it a docstring. In this program, the approximate_size function has a docstring:

def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    """Convert a file size to human-readable form.

    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000

    Returns: string

    """

Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including carriage returns and other quote characters. You can use them anywhere, but you'll see them most often used when defining a docstring.

Triple quotes are also an easy way to define a string with both single and double quotes, like qq/.../ in Perl 5.

Everything between the triple quotes is the function's docstring, which documents what the function does. A docstring, if it exists, must be the first thing defined in a function (that is, on the next line after the function declaration). You don't technically need to give your function a docstring, but you always should. I know you've heard this in every programming class you've ever taken, but Python gives you an added incentive: the docstring is available at runtime as an attribute of the function.

Many Python IDEs use the docstring to provide context-sensitive documentation, so that when you type a function name, its docstring appears as a tooltip. This can be incredibly helpful, but it's only as good as the docstrings you write.

Further reading

© 2001-4, 2009 ark Pilgrim, CC-BY-3.0