diff --git a/strings.html b/strings.html
index ba172cc..c802a39 100644
--- a/strings.html
+++ b/strings.html
@@ -51,7 +51,7 @@ My alphabet starts where your alphabet ends! ❞
Enter Unicode.
+Enter Unicode.
Unicode is a system designed to represent every character from every language. Unicode represents each letter, character, or ideograph as a 4-byte number. Each number represents a unique character used in at least one of the world’s languages. (Not all the numbers are used, but more than 65535 of them are, so 2 bytes wouldn’t be sufficient.) Characters that are used in multiple languages generally have the same number, unless there is a good etymological reason not to. Regardless, there is exactly 1 number per character, and exactly 1 character per number. Every number always means just one thing; there are no “modes” to keep track of. U+0041 is always 'A', even if your language doesn’t have an 'A' in it.
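The one-number-per-character idea can be poked at directly from Python 3. This is only a small sketch using the built-in `ord()` and `chr()` functions; the variable names are made up for illustration.

```python
# Each character maps to exactly one number (a code point), and back again.
code_point = ord('A')                      # the number behind 'A'
as_hex = 'U+{0:04X}'.format(code_point)    # written in the usual U+XXXX form
round_trip = chr(code_point)               # the number maps back to 'A'

print(as_hex)       # U+0041
print(round_trip)   # A

# Some code points are larger than 65535, so 2 bytes could not hold them all.
beyond_two_bytes = ord('\U0001F600')
print(beyond_two_bytes > 65535)   # True
```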
@@ -93,9 +93,9 @@ My alphabet starts where your alphabet ends! ❞
'深入 Python 3'
') or double quotes (").
-len() function returns the length of the string, i.e. the number of characters. This is the same function you use to find the length of a list. A string is like a list of characters.
+len() function returns the length of the string, i.e. the number of characters. This is the same function you use to find the length of a list. A string is like a list of characters.
+ operator.
++ operator.
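A short sketch of the points above, assuming nothing beyond built-in string operations (the variable names are illustrative only):

```python
s = '深入 Python 3'

length = len(s)            # counts characters, not bytes
first = s[0]               # indexing works as it does for a list
combined = s + ' rocks'    # + concatenates and builds a new string

print(length)    # 11
print(first)     # 深
print(combined)  # 深入 Python 3 rocks
```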
⁂
@@ -138,7 +138,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
Python 3 supports formatting values into strings. Although this can include very complicated expressions, the most basic usage is to insert a value into a string with single placeholder.
+Python 3 supports formatting values into strings. Although this can include very complicated expressions, the most basic usage is to insert a value into a string with a single placeholder.
>>> username = 'mark'
@@ -147,7 +147,7 @@ def approximate_size(size, a_kilobyte_is_1024_bytes=True):
"mark's password is PapayaWhip"
{0} and {1} are replacement fields, which are replaced by the arguments passed to the format() method.
+{0} and {1} are replacement fields, which are replaced by the arguments passed to the format() method.
{1} is replaced with the second argument passed to the format() method, which is suffix. But what is {0:.1f}? It’s two things: {0}, which you recognize, and :.1f, which you don’t. The second half (including and after the colon) defines the format specifier, which further refines how the replaced variable should be formatted.
-☞Format specifiers allow you to munge the replacement text in a variety of useful ways, like the printf() function in C. You can add zero- or space-padding, align strings, control decimal precision, and even convert numbers to hexadecimal.
+☞Format specifiers allow you to munge the replacement text in a variety of useful ways, like the printf() function in C. You can add zero- or space-padding, align strings, control decimal precision, and even convert numbers to hexadecimal.
Within a replacement field, a colon (:) marks the start of the format specifier. The format specifier “.1” means “round to the nearest tenth” (i.e. display only one digit after the decimal point). The format specifier “f” means “fixed-point number” (as opposed to exponential notation or some other decimal representation). Thus, given a size of 698.24 and suffix of 'GB', the formatted string would be '698.2 GB', because 698.24 gets rounded to one decimal place, then the suffix is appended after the number.
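Here is a small sketch of replacement fields and format specifiers in action. The sample values and variable names are assumptions chosen for illustration; only the built-in `str.format()` method is used.

```python
size = 698.24
suffix = 'GB'

# {0:.1f} formats the first argument as fixed-point with one decimal place;
# {1} inserts the second argument unchanged.
formatted = '{0:.1f} {1}'.format(size, suffix)
print(formatted)   # 698.2 GB

# A few of the other things a format specifier can do.
zero_padded = '{0:08.2f}'.format(3.14159)   # zero-pad to a width of 8
as_hex = '{0:x}'.format(255)                # convert to hexadecimal
right_aligned = '{0:>6}'.format('ab')       # right-align in 6 columns

print(zero_padded)     # 00003.14
print(as_hex)          # ff
print(right_aligned)   #     ab
```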
@@ -242,8 +242,8 @@ experience of years.
>>> s.lower().count('f') ④
6
splitlines() method takes one multi-line string and returns a list of strings, one for each line of the original. Note that the carriage returns at the end of each line are not included.
+splitlines() method takes one multiline string and returns a list of strings, one for each line of the original. Note that the carriage returns at the end of each line are not included.
lower() method converts the entire string to lowercase. (Similarly, the upper() method converts a string to uppercase.)
count() method counts the number of occurrences of a substring. Yes, there really are six “f”s in that sentence!
split() string method takes one argument, a delimiter, and split a string into a list of strings based on the delimiter. Here, the delimiter is an ampersand character, but it could be anything.
+split() string method takes one argument, a delimiter, and splits a string into a list of strings based on the delimiter. Here, the delimiter is an ampersand character, but it could be anything.
'key=value=foo'.split('='), we would end up with a three-item list ['key', 'value', 'foo'].)
dict() function.
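The string methods described above can be chained into a rough query-string parser. This is a sketch, not a robust parser; the sample strings and variable names are invented for illustration.

```python
text = 'First line\nSecond line'
lines = text.splitlines()                      # line endings are dropped
f_count = 'Fluffy Fox'.lower().count('f')      # lowercase first, then count

query = 'user=pilgrim&database=master&password=PapayaWhip'
pairs = query.split('&')                       # split on the ampersand
# The second argument limits the number of splits, so a value that
# itself contains '=' stays in one piece.
key_values = [pair.split('=', 1) for pair in pairs]
as_dict = dict(key_values)                     # list of [key, value] pairs -> dict

print(lines)             # ['First line', 'Second line']
print(f_count)           # 4
print(as_dict['user'])   # pilgrim
print('key=value=foo'.split('=', 1))   # ['key', 'value=foo']
```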
Bytes are bytes; characters are an abstraction. An immutable sequence of Unicode characters is called a string. An immutable sequence of numbers-between-0-and-255 is called a bytes object.
+Bytes are bytes; characters are an abstraction. An immutable sequence of Unicode characters is called a string. An immutable sequence of numbers-between-0-and-255 is called a bytes object.
>>> by = b'abcd\x65' ①
@@ -294,7 +294,7 @@ experience of years.
  File "<stdin>", line 1, in <module>
TypeError: 'bytes' object does not support item assignment
bytes object, use the b'' “byte literal” syntax. Each byte within the byte literal can be an ASCII character or an encoded hexadecimal number from \x00 to \xff (0–255).
+bytes object, use the b'' “byte literal” syntax. Each byte within the byte literal can be an ASCII character or an encoded hexadecimal number from \x00 to \xff (0–255).
bytes object is bytes.
bytes object with the built-in len() function.
+ operator to concatenate bytes objects. The result is a new bytes object.
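The notes above can be tried out in a few lines. This is a minimal sketch using only built-in operations on `bytes`; the variable names other than `by` are illustrative.

```python
by = b'abcd\x65'             # \x65 is the byte for ASCII 'e'
same = (by == b'abcde')      # both literals spell the same five bytes
length = len(by)             # number of bytes
first_byte = by[0]           # indexing yields an integer from 0 to 255
longer = by + b'\xff'        # + returns a new bytes object

# bytes objects are immutable, so item assignment raises TypeError.
try:
    by[0] = 102
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(same, length, first_byte)   # True 5 97
print(len(longer))                # 6
print(assignment_failed)          # True
```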
@@ -336,11 +336,11 @@ TypeError: Can't convert 'bytes' object to str implicitly
1
And here is the link between strings and bytes: bytes objects have a decode() method that takes a character encoding and returns a string, and strings have an encode() method that takes a character encoding and returns a bytes object. In the previous example, the decoding was relatively straightforward — converting a sequence of bytes n the ASCII encoding into a string of characters. But the same process works with any encoding that supports the characters of the string — even legacy (non-Unicode) encodings.
+And here is the link between strings and bytes: bytes objects have a decode() method that takes a character encoding and returns a string, and strings have an encode() method that takes a character encoding and returns a bytes object. In the previous example, the decoding was relatively straightforward — converting a sequence of bytes in the ASCII encoding into a string of characters. But the same process works with any encoding that supports the characters of the string — even legacy (non-Unicode) encodings.
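A quick sketch of that round trip, assuming UTF-8 and Big5 as the two encodings (Big5 is picked here only as one example of a legacy encoding; the variable names are illustrative):

```python
a_string = '深入 Python'

utf8_bytes = a_string.encode('utf-8')    # str -> bytes
big5_bytes = a_string.encode('big5')     # same characters, different bytes

char_count = len(a_string)               # 9 characters
utf8_len = len(utf8_bytes)               # 13 bytes: 3 per Chinese character

# Decoding with the matching encoding returns the original string.
from_utf8 = utf8_bytes.decode('utf-8')
from_big5 = big5_bytes.decode('big5')

# An encoding that lacks the characters cannot represent the string.
try:
    a_string.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(char_count, utf8_len)                            # 9 13
print(from_utf8 == a_string, from_big5 == a_string)    # True True
print(ascii_ok)                                        # False
```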
>>> a_string = '深入 Python' ①
@@ -381,7 +381,7 @@ TypeError: Can't convert 'bytes' object to str implicitly
Python 3 assumes that your source code — i.e. each .py file — is encoded in UTF-8.
-☞In Python 2, the default encoding for .py files was ASCII. In Python 3, the default encoding is UTF-8.
+☞In Python 2, the default encoding for .py files was ASCII. In Python 3, the default encoding is UTF-8.
If you would like to use a different encoding within your Python code, you can put an encoding declaration on the first line of each file. This declaration defines a .py file to be windows-1252: