<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<meta charset="utf-8">
|
|
<title>Native datatypes - Dive into Python 3</title>
|
|
<link rel="stylesheet" type="text/css" href="dip3.css">
|
|
<link rel="shortcut icon" href="data:image/ico,">
|
|
<link rel="alternate" type="application/atom+xml" href="http://hg.diveintopython3.org/atom-log">
|
|
<style type="text/css">
|
|
body{counter-reset:h1 2}
|
|
</style>
|
|
</head>
|
|
<body>
|
|
<p class="skip"><a href="#divingin">skip to main content</a>
|
|
<form action="http://www.google.com/cse" id="search"><div><input type="hidden" name="cx" value="014021643941856155761:l5eihuescdw"><input type="hidden" name="ie" value="UTF-8"> <input name="q" size="31"> <input type="submit" name="root" value="Search"></div></form>
|
|
<p class="nav">You are here: <a href="/">Home</a> <span>‣</span> <a href="table-of-contents.html">Dive Into Python 3</a> <span>‣</span>
|
|
<h1>Native datatypes</h1>
|
|
<blockquote class="q">
|
|
<p><span>❝</span> Wonder is the foundation of all philosophy, research its progress, ignorance its end. <span>❞</span><br>— <cite>Michel de Montaigne</cite>
|
|
</blockquote>
|
|
<ol>
|
|
<li><a href="#divingin">Diving in</a>
|
|
<li><a href="#booleans">Booleans</a>
|
|
<li><a href="#numbers">Numbers</a>
|
|
<!--
|
|
<ol>
|
|
<li><a href="#integers">Integers</a>
|
|
<li><a href="#floats">Floating point numbers</a>
|
|
<li><a href="#fractions">Fractions</a>
|
|
<li><a href="#complexnumbers">Complex numbers</a>
|
|
<li><a href="#numberoperations">Common operations on numbers</a>
|
|
<li><a href="#math">The <code>math</code> module</a>
|
|
</ol>
|
|
-->
|
|
<li><a href="#lists">Lists</a>
|
|
<!--
|
|
<ol>
|
|
<li>Creating a new list
|
|
<li>Modifying a list
|
|
<li>Searching a list
|
|
<li>Deleting elements from a list
|
|
<li>Common operations on lists
|
|
</ol>
|
|
-->
|
|
<li><a href="#sets">Sets</a>
|
|
<!--
|
|
<ol>
|
|
<li>Creating a new set
|
|
<li>Modifying a set
|
|
<li>Deleting elements from a set
|
|
<li>Common operations on sets (union, intersection, and difference)
|
|
<li>Frozen sets
|
|
</ol>
|
|
-->
|
|
<li><a href="#dictionaries">Dictionaries</a>
|
|
<li><a href="#none"><code>None</code></a>
|
|
<li><a href="#furtherreading">Further reading</a>
|
|
</ol>
|
|
<h2 id="divingin">Diving in</h2>
|
|
<p class="fancy">A short digression is in order. Put aside <a href="your-first-python-program.html">your first Python program</a> for just a minute, and let's talk about datatypes. <a href="your-first-python-program.html#datatypes">Every variable has a datatype</a>, even though you don't declare it explicitly. Based on each variable's original assignment, Python figures out what type it is and keeps track of that internally.
|
|
<p>Python has many native datatypes. Here are the important ones:
|
|
<ol>
|
|
<li><b>Booleans</b> are either <code>True</code> or <code>False</code>.
|
|
<li><b>Numbers</b> can be integers (<code>1</code> and <code>2</code>), floats (<code>1.1</code> and <code>1.2</code>), fractions (<code>1/2</code> and <code>2/3</code>), or even complex numbers (<code><var>i</var></code>, the square root of <code>-1</code>).
|
|
<li><b>Strings</b> are sequences of Unicode characters, <i>e.g.</i> an <abbr>HTML</abbr> document.
|
|
<li><b>Bytes</b> and <b>byte arrays</b>, <i>e.g.</i> a <abbr>JPEG</abbr> image file.
|
|
<li><b>Lists</b> are ordered sequences of values.
|
|
<li><b>Sets</b> are unordered collections of unique values.
|
|
<li><b>Dictionaries</b> are collections of key-value pairs, where each key maps to one value.
|
|
</ol>
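<p>As a quick orientation, here is a minimal sketch that builds one sample value of each kind listed above and asks Python for its type. The sample values and the names <code>samples</code> and <code>type_names</code> are chosen only for illustration; fractions come from the standard <code>fractions</code> module.

```python
from fractions import Fraction

# One sample value for each kind of datatype named in the list above.
samples = [
    True,                      # boolean
    42,                        # integer
    1.1,                       # float
    Fraction(1, 2),            # fraction (one half)
    1j,                        # complex number: the square root of -1
    "hello",                   # string of Unicode characters
    b"\xff\xd8",               # bytes
    bytearray(b"\xff\xd8"),    # byte array (mutable)
    [1, 2, 3],                 # list
    {1, 2, 3},                 # set
    {"key": "value"},          # dictionary
]

# Python tracks each value's type internally; type() reveals it.
type_names = [type(value).__name__ for value in samples]
for value, name in zip(samples, type_names):
    print(name, repr(value))
```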
|
|
<p>Of course, there are a lot more types than these seven. <a href="your-first-python-program.html#everythingisanobject">Everything is an object</a> in Python, so there are types like <i>module</i>, <i>function</i>, <i>class</i>, <i>method</i>, <i>file</i>, and even <i>compiled code</i>. You've already seen some of these: <a href="your-first-python-program.html#runningscripts">modules have names</a>, <a href="your-first-python-program.html#docstrings">functions have <code>docstrings</code></a>, <i class="baa">&</i>c. You'll learn about classes in [FIXME xref] and files in [FIXME xref].
|
|
<p>Strings and bytes are important enough — and complicated enough — that they get their own chapter. Let's look at the others first.
|
|
<h2 id="booleans">Booleans</h2>
|
|
<p>Booleans are either true or false. Python has two constants, <code>True</code> and <code>False</code>, which can be used to assign boolean values directly. Expressions can also evaluate to a boolean value. In certain places (like <code>if</code> statements), Python expects an expression to evaluate to a boolean value. These places are called <i>boolean contexts</i>. You can use virtually any expression in a boolean context, and Python will try to determine its truth value. Different datatypes have different rules about which values are true or false in a boolean context. (This will make more sense once you see some concrete examples later in this chapter.)
|
|
<p>For example, take this snippet from <a href="your-first-python-program.html#divingin"><code>humansize.py</code></a>:
|
|
<pre><code>if size &lt; 0:
    raise ValueError('number must be non-negative')</code></pre>
|
|
<p><var>size</var> is an integer, <code>0</code> is an integer, and <code>&lt;</code> is a numerical operator. The result of the expression <code>size &lt; 0</code> is always a boolean. You can test this yourself in the Python interactive shell:
|
|
<pre class="screen">
<samp class="prompt">>>> </samp><kbd>size = 1</kbd>
<samp class="prompt">>>> </samp><kbd>size &lt; 0</kbd>
<samp>False</samp>
<samp class="prompt">>>> </samp><kbd>size = 0</kbd>
<samp class="prompt">>>> </samp><kbd>size &lt; 0</kbd>
<samp>False</samp>
<samp class="prompt">>>> </samp><kbd>size = -1</kbd>
<samp class="prompt">>>> </samp><kbd>size &lt; 0</kbd>
<samp>True</samp></pre>
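<p>The same idea can be sketched as a short script. The variable names <code>is_negative</code> and <code>message</code> are illustrative only. The call to <code>bool()</code> shows how Python would interpret a number in a boolean context: zero counts as false, and any other number counts as true.

```python
size = -1

# A comparison produces a real bool object that you can store like any other value.
is_negative = size < 0
print(type(is_negative).__name__)

# bool() reports how a value is interpreted in a boolean context.
print(bool(0), bool(1), bool(-1))

# An if statement evaluates its expression in a boolean context.
if is_negative:
    message = 'number must be non-negative'
else:
    message = 'ok'
print(message)
```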
|
|
<h2 id="numbers">Numbers</h2>
|
|
<p>Numbers are awesome. There are so many to choose from. Python supports both integers and floating point numbers. There's no type declaration to distinguish them; Python tells them apart by the presence or absence of a decimal point.
|
|
<pre class="screen">
<a><samp class="prompt">>>> </samp><kbd>type(1)</kbd> <span>①</span></a>
<samp>&lt;class 'int'&gt;</samp>
<a><samp class="prompt">>>> </samp><kbd>1 + 1</kbd> <span>②</span></a>
<samp>2</samp>
<a><samp class="prompt">>>> </samp><kbd>1 + 1.0</kbd> <span>③</span></a>
<samp>2.0</samp>
<samp class="prompt">>>> </samp><kbd>type(2.0)</kbd>
<samp>&lt;class 'float'&gt;</samp></pre>
|
|
<ol>
|
|
<li>You can use the <code>type()</code> function to check the type of any value or variable. As you might expect, <code>1</code> is an <code>int</code>.
|
|
<li>Adding an <code>int</code> to an <code>int</code> yields an <code>int</code>.
|
|
<li>Adding an <code>int</code> to a <code>float</code> yields a <code>float</code>. Python coerces the <code>int</code> into a <code>float</code> to perform the addition, then returns a <code>float</code> as the result.
|
|
</ol>
|
|
<p>As you just saw, some operators (like addition) will coerce integers to floating point numbers as needed. You can also coerce them yourself.
|
|
<pre class="screen">
<a><samp class="prompt">>>> </samp><kbd>float(2)</kbd> <span>①</span></a>
<samp>2.0</samp>
<a><samp class="prompt">>>> </samp><kbd>int(2.0)</kbd> <span>②</span></a>
<samp>2</samp>
<a><samp class="prompt">>>> </samp><kbd>int(2.5)</kbd> <span>③</span></a>
<samp>2</samp>
<a><samp class="prompt">>>> </samp><kbd>int(-2.5)</kbd> <span>④</span></a>
<samp>-2</samp>
<a><samp class="prompt">>>> </samp><kbd>1.12345678901234567890</kbd> <span>⑤</span></a>
<samp>1.1234567890123457</samp>
<a><samp class="prompt">>>> </samp><kbd>type(1000000000000000)</kbd> <span>⑥</span></a>
<samp>&lt;class 'int'&gt;</samp></pre>
|
|
<ol>
|
|
<li>You can explicitly coerce an <code>int</code> to a <code>float</code> by calling the <code>float()</code> function.
|
|
<li>Unsurprisingly, you can also coerce a <code>float</code> to an <code>int</code> by calling <code>int()</code>.
|
|
<li>The <code>int()</code> function will truncate, not round.
|
|
<li>The <code>int()</code> function truncates negative numbers towards <code>0</code>. It's a true truncate function, not a floor function.
|
|
<li>Floating point numbers are accurate to roughly 15 to 17 significant decimal digits. Digits beyond that precision are rounded away.
|
|
<li>Integers can be arbitrarily large.
|
|
</ol>
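<p>To make the difference between truncating and flooring concrete, here is a small sketch that compares <code>int()</code> with <code>math.floor()</code> from the standard library, and then contrasts the limited precision of a <code>float</code> with the unlimited size of an <code>int</code>. The variable names are illustrative only.

```python
import math

values = [2.5, -2.5]

# int() truncates toward zero; math.floor() always rounds down.
truncated = [int(v) for v in values]
floored = [math.floor(v) for v in values]
print(truncated)   # [2, -2]
print(floored)     # [2, -3]

# A float literal with more digits than a float can hold is stored as the
# nearest representable value, so the extra digits are lost.
stored = 1.12345678901234567890
print(stored)

# Integers, by contrast, keep every digit no matter how large they get.
big = 10 ** 30 + 1
print(big)
```

<p>The two functions agree for positive numbers and differ only for negative ones, which is exactly where the distinction matters.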
|
|
<blockquote class="note compare python2">
|
|
<p><span>☞</span>Python 2 had separate types for <code>int</code> and <code>long</code>. The <code>int</code> datatype was limited by <code>sys.maxint</code>, which varied by platform but was usually <code>2<sup>31</sup>-1</code> on 32-bit systems. Python 3 has just one integer type, which behaves mostly like the old <code>long</code> type from Python 2. See <a href="http://www.python.org/dev/peps/pep-0237">PEP 237</a> for details.
|
|
</blockquote>
|
|
<p>You can do all kinds of things with numbers.
|
|
<pre class="screen">
<a><samp class="prompt">>>> </samp><kbd>11 / 2</kbd> <span>①</span></a>
<samp>5.5</samp>
<a><samp class="prompt">>>> </samp><kbd>11 // 2</kbd> <span>②</span></a>
<samp>5</samp>
<a><samp class="prompt">>>> </samp><kbd>-11 // 2</kbd> <span>③</span></a>
<samp>-6</samp>
<a><samp class="prompt">>>> </samp><kbd>11.0 // 2</kbd> <span>④</span></a>
<samp>5.0</samp>
<a><samp class="prompt">>>> </samp><kbd>11 ** 2</kbd> <span>⑤</span></a>
<samp>121</samp>
<a><samp class="prompt">>>> </samp><kbd>11 % 2</kbd> <span>⑥</span></a>
<samp>1</samp>
</pre>
|
|
<ol>
|
|
<li>The <code>/</code> operator performs floating point division. It returns a <code>float</code> even if both the numerator and denominator are <code>int</code>s.
|
|
<li>The <code>//</code> operator performs floor division: it divides, then rounds the result down to the nearest integer. When the result is positive, you can think of it as truncating (not rounding) to <code>0</code> decimal places, but be careful with that.
<li>When integer-dividing negative numbers, the <code>//</code> operator still rounds down, which means away from zero, since <code>−6</code> is less than <code>−5</code>. That could trip you up if you were expecting it to truncate to <code>−5</code>.
<li>The <code>//</code> operator doesn't always return an integer. If either the numerator or denominator is a <code>float</code>, it will still round down to the nearest integer, but the actual return value will be a <code>float</code>.
|
|
<li>The <code>**</code> operator means “raised to the power of.” <code>11<sup>2</sup></code> is <code>121</code>.
|
|
<li>The <code>%</code> operator gives the remainder after performing integer division. <code>11</code> divided by <code>2</code> is <code>5</code> with a remainder of <code>1</code>, so the result here is <code>1</code>.
|
|
</ol>
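<p>The <code>//</code> and <code>%</code> operators fit together: for any pair of numbers, the quotient times the divisor plus the remainder gives back the original number. The sketch below checks that relationship with the built-in <code>divmod()</code> function, which returns both results at once. The sample pairs and the names <code>pairs</code> and <code>results</code> are illustrative only.

```python
pairs = [(11, 2), (-11, 2), (11, -2), (11.0, 2)]

for a, b in pairs:
    # divmod(a, b) returns the pair (a // b, a % b).
    quotient, remainder = divmod(a, b)
    assert quotient == a // b and remainder == a % b
    # Floor division and remainder always reassemble the original number.
    assert a == quotient * b + remainder
    print(a, b, quotient, remainder)

results = [divmod(a, b) for a, b in pairs]
```

<p>Because the quotient is rounded down rather than truncated, the remainder takes the sign of the divisor, which is why <code>-11 % 2</code> is <code>1</code> and not <code>-1</code>.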
|
|
<blockquote class="note compare python2">
|
|
<p><span>☞</span>In Python 2, the <code>/</code> operator usually meant integer division, but you could make it behave like floating point division by including a special directive in your code (<code>from __future__ import division</code>). In Python 3, the <code>/</code> operator always means floating point division. See <a href="http://www.python.org/dev/peps/pep-0238/">PEP 238</a> for details.
|
|
</blockquote>
|
|
<p>FIXME fractions, math module, numbers in a boolean context
|
|
<h2 id="lists">Lists</h2>
|
|
<p>FIXME
|
|
<h2 id="sets">Sets</h2>
|
|
<p>FIXME
|
|
<h2 id="dictionaries">Dictionaries</h2>
|
|
<p>One of Python's most important datatypes is the dictionary, which maps each key to exactly one value.
|
|
<blockquote class="note compare perl5">
|
|
<p><span>☞</span>A dictionary in Python is like a hash in Perl 5. In Perl 5, variables that store hashes always start with a <code>%</code> character. In Python, variables can be named anything, and Python keeps track of the datatype internally.
|
|
</blockquote>
|
|
<p>Creating a dictionary is easy. The syntax is similar to <a href="#sets">sets</a>, but instead of values, you have key-value pairs. Once you have a dictionary, you can look up values by their key.
|
|
<pre class="screen">
<a><samp class="prompt">>>> </samp><kbd>a_dict = {"server":"db.diveintopython3.org", "database":"mysql"}</kbd> <span>①</span></a>
<samp class="prompt">>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["server"]</kbd> <span>②</span></a>
<samp>'db.diveintopython3.org'</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["database"]</kbd> <span>③</span></a>
<samp>'mysql'</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["db.diveintopython3.org"]</kbd> <span>④</span></a>
<samp class="traceback">Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
KeyError: 'db.diveintopython3.org'</samp></pre>
|
|
<ol>
|
|
<li>First, you create a new dictionary with two elements and assign it to the variable <var>a_dict</var>. Each element is a key-value pair, and the whole set of elements is enclosed in curly braces.
|
|
<li><code>'server'</code> is a key, and its associated value, referenced by <code>a_dict["server"]</code>, is <code>'db.diveintopython3.org'</code>.
|
|
<li><code>'database'</code> is a key, and its associated value, referenced by <code>a_dict["database"]</code>, is <code>'mysql'</code>.
|
|
<li>You can get values by key, but you can't get keys by value. So <code>a_dict["server"]</code> is <code>'db.diveintopython3.org'</code>, but <code>a_dict["db.diveintopython3.org"]</code> raises an exception, because <code>'db.diveintopython3.org'</code> is not a key.
|
|
</ol>
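<p>If you are not sure a key exists, you have a few options besides letting the exception propagate. The following sketch shows the <code>in</code> operator, a <code>try</code>/<code>except</code> block, and the dictionary's <code>get()</code> method, and then a manual search for going from a value back to its key. The key <code>"port"</code> and the variable names are made up for illustration.

```python
a_dict = {"server": "db.diveintopython3.org", "database": "mysql"}

# Test for a key before indexing ...
has_server = "server" in a_dict

# ... or index anyway and catch the KeyError.
try:
    a_dict["db.diveintopython3.org"]
except KeyError as error:
    missing_key = error.args[0]

# get() returns a default instead of raising (None unless you pass one).
fallback = a_dict.get("port", "unknown")

# Going from a value back to its key means searching the items yourself.
keys_for_value = [key for key, value in a_dict.items() if value == "mysql"]

print(has_server, missing_key, fallback, keys_for_value)
```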
|
|
<p>Dictionaries do not have any predefined size limit. You can add new key-value pairs to a dictionary at any time, or you can modify the value of an existing key. Continuing from the previous example:
|
|
<pre class="screen">
<samp class="prompt">>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'mysql'}</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["database"] = "blog"</kbd> <span>①</span></a>
<samp class="prompt">>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog'}</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["user"] = "mark"</kbd> <span>②</span></a>
<a><samp class="prompt">>>> </samp><kbd>a_dict</kbd> <span>③</span></a>
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog', 'user': 'mark'}</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["user"] = "dora"</kbd> <span>④</span></a>
<samp class="prompt">>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog', 'user': 'dora'}</samp>
<a><samp class="prompt">>>> </samp><kbd>a_dict["User"] = "mark"</kbd> <span>⑤</span></a>
<samp class="prompt">>>> </samp><kbd>a_dict</kbd>
<samp>{'server': 'db.diveintopython3.org', 'database': 'blog', 'user': 'dora', 'User': 'mark'}</samp></pre>
<ol>
<li>You cannot have duplicate keys in a dictionary. Assigning a value to an existing key will wipe out the old value.
<li>You can add new key-value pairs at any time. This syntax is identical to modifying existing values.
<li>The new dictionary item (key <code>'user'</code>, value <code>'mark'</code>) appears at the end. Since Python 3.7, dictionaries preserve the order in which keys were inserted. In earlier versions of Python 3 the order was arbitrary, so the new item could just as easily have appeared in the middle.
<li>Assigning a value to an existing dictionary key simply replaces the old value with the new one.
<li>Will this change the value of the <code>user</code> key back to <code>"mark"</code>? No! Look at the key closely — that's a capital <kbd>U</kbd> in <kbd>"User"</kbd>. Dictionary keys are case-sensitive, so this statement is creating a new key-value pair, not overwriting an existing one. It may look similar to you, but as far as Python is concerned, it's completely different.
</ol>
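<p>If you want <code>"user"</code> and <code>"User"</code> to refer to the same entry, one option is to normalize every key before storing it. The sketch below assumes lowercase is the convention you want. The function <code>set_option()</code> is a hypothetical helper written for this example, not something Python provides.

```python
def set_option(options, key, value):
    """Hypothetical helper: store value under the lowercased form of key."""
    options[key.lower()] = value


a_dict = {"server": "db.diveintopython3.org", "database": "blog"}
set_option(a_dict, "user", "dora")
set_option(a_dict, "User", "mark")   # overwrites 'user' instead of adding 'User'
print(a_dict)
```

<p>The dictionary itself is still case-sensitive; the helper just makes sure only one spelling of each key ever reaches it.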
|
|
<p>Dictionaries aren't just for strings. Dictionary values can be any datatype, including integers, booleans, arbitrary objects, or even other dictionaries. And within a single dictionary, the values don't all need to be the same type; you can mix and match as needed. Dictionary keys are more restricted, but they can be strings, integers, and a few other types. You can also mix and match key datatypes within a dictionary.
|
|
<p>In fact, you've already seen a dictionary with non-string keys and values, in <a href="your-first-python-program.html#divingin">your first Python program</a>.
|
|
<pre><code>SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}</code></pre>
|
|
<p>Let's tear that apart in the interactive shell.
|
|
<pre class="screen">
<samp class="prompt">>>> </samp><kbd>SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),</kbd>
<samp class="prompt">... </samp><kbd>            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}</kbd>
<a><samp class="prompt">>>> </samp><kbd>len(SUFFIXES)</kbd> <span>①</span></a>
<samp>2</samp>
<a><samp class="prompt">>>> </samp><kbd>SUFFIXES[1000]</kbd> <span>②</span></a>
<samp>('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB')</samp>
<a><samp class="prompt">>>> </samp><kbd>SUFFIXES[1024]</kbd> <span>③</span></a>
<samp>('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')</samp>
<a><samp class="prompt">>>> </samp><kbd>SUFFIXES[1000][3]</kbd> <span>④</span></a>
<samp>'TB'</samp></pre>
|
|
<ol>
|
|
<li>As with <a href="#lists">lists</a> and <a href="#sets">sets</a>, the <code>len()</code> function gives you the number of items in a dictionary.
|
|
<li><code>1000</code> is a key in the <code>SUFFIXES</code> dictionary; its value is a tuple of eight items (eight strings, to be precise).
|
|
<li>Similarly, <code>1024</code> is a key in the <code>SUFFIXES</code> dictionary; its value is also a tuple of eight items.
|
|
<li>Since <code>SUFFIXES[1000]</code> is a tuple, you can address individual items in the tuple by their 0-based index.
|
|
</ol>
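<p>Here is a small sketch that puts the two-step lookup to work and then probes which kinds of keys a dictionary accepts. The helper <code>suffix_for()</code> and the dictionary <code>mixed</code> are hypothetical names invented for this example. The underlying rule is that a key must be hashable, which in practice rules out mutable values such as lists.

```python
SUFFIXES = {1000: ('KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'),
            1024: ('KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB')}


def suffix_for(multiple, power):
    """Hypothetical helper: suffix for multiple ** power bytes (power starts at 1)."""
    # First look up the tuple by key, then index into it (0-based).
    return SUFFIXES[multiple][power - 1]


print(suffix_for(1000, 4))
print(suffix_for(1024, 1))

# Keys of different types can share one dictionary, as long as each is hashable.
mixed = {1000: 'decimal', 'binary': 1024, (2, 10): 'a tuple key'}

# A list is mutable and therefore not hashable, so it cannot be a key.
list_key_rejected = False
try:
    mixed[[2, 10]] = 'a list key'
except TypeError:
    list_key_rejected = True
print(list_key_rejected)
```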
|
|
<h2 id="none"><code>None</code></h2>
|
|
<p><code>None</code> is a special constant in Python. It is a null value. <code>None</code> is not <code>False</code>; it is not <code>0</code>; it is not an empty string. Comparing <code>None</code> to anything other than <code>None</code> will always return <code>False</code>.
|
|
<p><code>None</code> is the only null value. It has its own datatype (<code>NoneType</code>). You can assign <code>None</code> to any variable, but you cannot create other <code>NoneType</code> objects. All variables whose value is <code>None</code> are equal to each other.
|
|
<pre class="screen">
<samp class="prompt">>>> </samp><kbd>type(None)</kbd>
<samp>&lt;class 'NoneType'&gt;</samp>
<samp class="prompt">>>> </samp><kbd>None == False</kbd>
<samp>False</samp>
<samp class="prompt">>>> </samp><kbd>None == 0</kbd>
<samp>False</samp>
<samp class="prompt">>>> </samp><kbd>None == ''</kbd>
<samp>False</samp>
<samp class="prompt">>>> </samp><kbd>None == None</kbd>
<samp>True</samp>
<samp class="prompt">>>> </samp><kbd>x = None</kbd>
<samp class="prompt">>>> </samp><kbd>x == None</kbd>
<samp>True</samp>
<samp class="prompt">>>> </samp><kbd>y = None</kbd>
<samp class="prompt">>>> </samp><kbd>x == y</kbd>
<samp>True</samp>
</pre>
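<p>In practice you will often meet <code>None</code> as the result of a lookup that found nothing. The sketch below uses a hypothetical helper, <code>find_suffixes()</code>, built on the dictionary's <code>get()</code> method. Because there is only one <code>None</code> object, the usual way to test for it is the identity operator <code>is</code> rather than <code>==</code>.

```python
def find_suffixes(suffixes, multiple):
    """Hypothetical helper: return the tuple for multiple, or None if absent."""
    return suffixes.get(multiple)


SUFFIXES = {1000: ('KB', 'MB'), 1024: ('KiB', 'MiB')}

found = find_suffixes(SUFFIXES, 1000)
missing = find_suffixes(SUFFIXES, 512)

# There is exactly one None object, so an identity test is enough.
print(found is None)      # False
print(missing is None)    # True

# None is a value of its own type; it is not the same thing as False.
print(type(missing).__name__)
print(missing is False)   # False
```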
|
|
<h2 id="furtherreading">Further reading</h2>
|
|
<ul>
|
|
<li>fractions
|
|
<li>math module
|
|
<li>PEP 237
|
|
<li>PEP 238
|
|
<li>links to appendix
|
|
<li>...etc...
|
|
</ul>
|
|
<p class="c">© 2001-4, 2009 <span>ℳ</span>ark Pilgrim, <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">CC-BY-SA-3.0</a>
|
|
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
|
|
<script type="text/javascript" src="dip3.js"></script>
|
|
</body>
|
|
</html>
|