Native datatypes

Wonder is the foundation of all philosophy, research its progress, ignorance its end.
Michel de Montaigne

  1. Diving in
  2. Booleans
  3. Numbers
  4. Lists
  5. Sets
  6. Dictionaries
  7. None

Diving in

A short digression is in order. Put aside your first Python program for just a minute, and let's talk about datatypes. Every variable has a datatype, even though you don't declare it explicitly. Based on each variable's original assignment, Python figures out what type it is and keeps track of that internally.
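To see that bookkeeping in action, here is a minimal sketch using the built-in type() function. The name value is made up for this illustration; it does not come from your first Python program.

```python
# 'value' is a hypothetical name used only for this illustration.
value = 42
first_type = type(value)     # Python recorded that this is an int

value = 'forty-two'          # no declaration needed; the name now refers to a str
second_type = type(value)

print(first_type)
print(second_type)
```

Notice that the type belongs to the value, not to the name: assign something else, and type() reports something else.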

Python has many native datatypes. Here are the important ones:

  1. Booleans are either True or False.
  2. Numbers can be integers (1 and 2), floats (1.1 and 1.2), fractions (1/2 and 2/3), or even complex numbers (i, the square root of -1).
  3. Strings are sequences of Unicode characters, e.g. an HTML document.
  4. Bytes and byte arrays, e.g. a JPEG image file.
  5. Lists are ordered sequences of values.
  6. Sets are unordered bags of values.
  7. Dictionaries are unordered bags of key-value pairs.
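
If you'd like to see one value of each kind side by side, the sketch below builds a list of samples and asks each one for the name of its type. The sample values are illustrative only; the two bytes shown are simply the first two bytes of a typical JPEG file. Note that Python spells the imaginary unit j rather than i.

```python
from fractions import Fraction

# One illustrative value per datatype from the list above.
examples = [
    True,                 # boolean
    2,                    # integer
    1.1,                  # float
    Fraction(1, 2),       # fraction
    1j,                   # complex number (the square root of -1)
    'an HTML document',   # string
    b'\xff\xd8',          # bytes
    [1, 2, 3],            # list
    {1, 2, 3},            # set
    {'key': 'value'},     # dictionary
]

type_names = [type(example).__name__ for example in examples]
print(type_names)
```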

Of course, there are a lot more types than these seven. Everything is an object in Python, so there are types like module, function, class, method, file, and even compiled code. You've already seen some of these: modules have names, functions have docstrings, &c. You'll learn about classes in [FIXME xref] and files in [FIXME xref].

Strings and bytes are important enough — and complicated enough — that they get their own chapter. Let's look at the others first.

Booleans

Booleans are either true or false. Python has two constants, True and False, which can be used to assign boolean values directly. Expressions can also evaluate to a boolean value. In certain places (like if statements), Python expects an expression to evaluate to a boolean value. These places are called boolean contexts. You can use virtually any expression in a boolean context, and Python will try to determine its truth value. Different datatypes have different rules about which values are true or false in a boolean context. (This will make more sense once you see some concrete examples later in this chapter.)

For example, take this snippet from humansize.py:

if size < 0:
    raise ValueError('number must be non-negative')

size is an integer, 0 is an integer, and < is a numerical operator. The result of the expression size < 0 is always a boolean. You can test this yourself in the Python interactive shell:

>>> size = 1
>>> size < 0
False
>>> size = 0
>>> size < 0
False
>>> size = -1
>>> size < 0
True
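
Comparisons are not the only things that work in a boolean context. As a rough sketch of the "different datatypes have different rules" idea, the loop below puts a handful of sample values (chosen only for illustration) into an if expression and reports how each one is treated.

```python
# Sample values chosen for illustration; each is tested in a boolean context.
samples = [0, 1, 0.0, '', 'text', [], [0], {}, None]

for sample in samples:
    label = 'true' if sample else 'false'
    print(repr(sample), 'is', label, 'in a boolean context')

# Collect the values that were treated as false.
falsy = [sample for sample in samples if not sample]
```

Zero and empty containers come out false; everything else in this sample comes out true. Note that [0] is true: the list is not empty, even though the item inside it is false.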

Numbers

Numbers are awesome. There are so many to choose from. Python supports both integers and floating point numbers.

>>> type(1)                 
<class 'int'>
>>> 1 + 1                   
2
>>> 1 + 1.0                 
2.0
>>> type(2.0)
<class 'float'>
>>> 1.12345678901234567890  
1.1234567890123457
>>> type(1000000000000000)  
<class 'int'>
  1. Integers can be arbitrarily large.

Python 2 had separate types for int and long. The int datatype was limited by sys.maxint, which varied by platform but was usually 2**31-1 on 32-bit systems. Python 3 has just one integer type, which behaves mostly like the old long type from Python 2.
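
Here is a small sketch of what "arbitrarily large" means in practice. It uses sys.maxsize, which Python 3 still provides as the largest value a native machine-sized index can hold; it is not a limit on int. The variable names are made up for this example.

```python
import sys

big = 2 ** 100                # far larger than any 64-bit machine word
bigger = sys.maxsize + 1      # stepping past sys.maxsize just gives another int

print(big)
print(type(big))
print(type(bigger))
```

Both results are plain int objects. There is no overflow and no separate long type to switch to.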

Lists

FIXME

Sets

FIXME

Dictionaries

One of Python's most important datatypes is the dictionary, which defines one-to-one relationships between keys and values.

A dictionary in Python is like a hash in Perl 5. In Perl 5, variables that store hashes always start with a % character. In Python, variables can be named anything, and Python keeps track of the datatype internally.

Creating a dictionary is easy. The syntax is similar to sets, but instead of values, you have key-value pairs. Once you have a dictionary, you can look up values by their key.

>>> a_dict = {"server":"db.diveintopython3.org", "database":"mysql"}  
>>> a_dict
{'server': 'db.diveintopython3.org', 'database': 'mysql'}
>>> a_dict["server"]                                                  
'db.diveintopython3.org'
>>> a_dict["database"]                                                
'mysql'
>>> a_dict["db.diveintopython3.org"]                                  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'db.diveintopython3.org'
  1. First, you create a new dictionary with two elements and assign it to the variable a_dict. Each element is a key-value pair, and the whole set of elements is enclosed in curly braces.
  2. 'server' is a key, and its associated value, referenced by a_dict["server"], is 'db.diveintopython3.org'.
  3. 'database' is a key, and its associated value, referenced by a_dict["database"], is 'mysql'.
  4. You can get values by key, but you can't get keys by value. So a_dict["server"] is 'db.diveintopython3.org', but a_dict["db.diveintopython3.org"] raises an exception, because 'db.diveintopython3.org' is not a key.
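
If you aren't sure whether a key exists, you don't have to risk the exception. The sketch below shows three common ways to cope, using the same a_dict as above: the in operator, the get() method with a default value, and catching KeyError. The 'user' key and the 'nobody' default are made up for this illustration.

```python
a_dict = {'server': 'db.diveintopython3.org', 'database': 'mysql'}

# The in operator tests keys, never values.
has_server = 'server' in a_dict
has_value = 'db.diveintopython3.org' in a_dict

# get() returns a default instead of raising KeyError.
fallback = a_dict.get('user', 'nobody')

# Or let the lookup fail and handle the exception.
try:
    a_dict['user']
except KeyError as error:
    missing_key = error.args[0]

print(has_server, has_value, fallback, missing_key)
```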

Dictionaries do not have any predefined size limit. You can add new key-value pairs to a dictionary at any time, or you can modify the value of an existing key. Continuing from the previous example:

>>> a_dict
{'server': 'db.diveintopython3.org', 'database': 'mysql'}
>>> a_dict["database"] = "blog"  
>>> a_dict
{'server': 'db.diveintopython3.org', 'database': 'blog'}
>>> a_dict["user"] = "mark"      
>>> a_dict                       
{'server': 'db.diveintopython3.org', 'user': 'mark', 'database': 'blog'}
>>> a_dict["user"] = "dora"      
>>> a_dict
{'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}
>>> a_dict["User"] = "mark"      
>>> a_dict
{'User': 'mark', 'server': 'db.diveintopython3.org', 'user': 'dora', 'database': 'blog'}
  1. You cannot have duplicate keys in a dictionary. Assigning a value to an existing key will wipe out the old value.
  2. You can add new key-value pairs at any time. This syntax is identical to modifying existing values.
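  3. The new dictionary item (key 'user', value 'mark') appears to be in the middle. In fact, it was just a coincidence that the elements appeared to be in order in the first example; it is just as much a coincidence that they appear to be out of order now. (That is the behavior of the Python version used for this session. Since Python 3.7, dictionaries preserve insertion order, so a current interpreter would show the new item last.)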
  4. Assigning a value to an existing dictionary key simply replaces the old value with the new one.
  5. Will this change the value of the user key back to "mark"? No! Look at the key closely — that's a capital U in "User". Dictionary keys are case-sensitive, so this statement is creating a new key-value pair, not overwriting an existing one. It may look similar to you, but as far as Python is concerned, it's completely different.

Dictionaries aren't just for strings. Dictionary values can be any datatype, including integers, booleans, arbitrary objects, or even other dictionaries. And within a single dictionary, the values don't all need to be the same type; you can mix and match as needed. Dictionary keys are more restricted, but they can be strings, integers, and a few other types. You can also mix and match key datatypes within a dictionary.
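
What does "more restricted" mean for keys? Roughly, a key has to be hashable, which rules out mutable containers like lists. The sketch below mixes several key and value types in one dictionary and then tries a list as a key. The dictionary called mixed and everything in it are made up for this illustration.

```python
# A made-up dictionary mixing key types and value types.
mixed = {
    'name': 'suffixes',       # string key, string value
    1000: True,               # integer key, boolean value
    (1, 2): {'nested': 1},    # tuple key, dictionary value
}

try:
    mixed[[1, 2]] = 'a list as a key'    # lists are mutable, so they can't be keys
except TypeError as error:
    rejected = True
    print('TypeError:', error)
else:
    rejected = False

print(len(mixed))
```

The failed assignment leaves the dictionary untouched, so it still has three items.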

In fact, you've already seen a dictionary with non-string keys and values, in your first Python program.

SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}

Let's tear that apart in the interactive shell.

>>> SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
...             1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}
>>> len(SUFFIXES)      
2
>>> SUFFIXES[1000]     
('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB')
>>> SUFFIXES[1024]     
('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')
>>> SUFFIXES[1000][3]  
'TB'
  1. As with lists and sets, the len() function gives you the number of items in a dictionary.
  2. 1000 is a key in the SUFFIXES dictionary; its value is a tuple of eight items (eight strings, to be precise).
  3. Similarly, 1024 is a key in the SUFFIXES dictionary; its value is also a tuple of eight items.
  4. Since SUFFIXES[1000] is a tuple, you can address individual items in the tuple by their 0-based index.
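
Putting those two lookups together, here is a minimal sketch of a helper that picks a suffix from SUFFIXES. The function suffix_for and its parameters are hypothetical, invented for this illustration; this is not the code from your first Python program.

```python
SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}

def suffix_for(power, kilobyte_is_1024=True):
    """Return the suffix for multiple ** power bytes (hypothetical helper).

    power counts from 1, so it is shifted by one to get a 0-based tuple index.
    """
    multiple = 1024 if kilobyte_is_1024 else 1000
    return SUFFIXES[multiple][power - 1]

print(suffix_for(4))
print(suffix_for(4, kilobyte_is_1024=False))
```

The first lookup chooses a tuple by its dictionary key; the second chooses an item in that tuple by its index.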

None

None is a special constant in Python. It is a null value. None is not False; it is not 0; it is not an empty string. Comparing None to anything other than None will always return False.

None is the only null value. It has its own datatype (NoneType). You can assign None to any variable, but you cannot create other NoneType objects. All variables whose value is None are equal to each other.

>>> type(None)
<class 'NoneType'>
>>> None == False
False
>>> None == 0
False
>>> None == ''
False
>>> None == None
True
>>> x = None
>>> x == None
True
>>> y = None
>>> x == y
True
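
Because there is only one None object, code that checks for it usually uses the is operator rather than ==. The sketch below, built around a made-up function called describe, also shows that None is false in a boolean context even though it is not equal to False.

```python
def describe(value=None):
    """Hypothetical helper that separates 'no value' from 'a false value'."""
    if value is None:          # identity test against the single None object
        return 'no value given'
    if not value:              # 0, '', [] and friends end up here
        return 'value given, but false in a boolean context'
    return 'value given'

print(describe())
print(describe(0))
print(describe('text'))
print(bool(None))
```

Testing with is None first matters here: a plain "if not value" would lump None together with 0 and the empty string.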

© 2001-4, 2009 Mark Pilgrim, CC-BY-3.0