diff --git a/advanced-iterators.html b/advanced-iterators.html
index 6727d13..2427f4a 100755
--- a/advanced-iterators.html
+++ b/advanced-iterators.html
@@ -6,6 +6,7 @@
@@ -102,6 +103,26 @@ if __name__ == '__main__':
Here’s another example that will stretch your brain a little.
+
+
+>>> re.findall(' s.*? s', "The sixth sick sheikh's sixth sheep's sick.")
+[' sixth s', " sheikh's s", " sheep's s"]
+
+Surprised? The regular expression looks for a space, an s, and then the shortest possible series of any character (.*?), then a space, then another s. Well, looking at that input string, I see five matches:
+
+
+1. The[ sixth s]ick sheikh's sixth sheep's sick.
+2. The sixth[ sick s]heikh's sixth sheep's sick.
+3. The sixth sick[ sheikh's s]ixth sheep's sick.
+4. The sixth sick sheikh's[ sixth s]heep's sick.
+5. The sixth sick sheikh's sixth[ sheep's s]ick.
+But the re.findall() function only returned three matches. Specifically, it returned the first, the third, and the fifth. Why is that? Because it doesn’t return overlapping matches. The first match overlaps with the second, so the first is returned and the second is skipped. Then the third overlaps with the fourth, so the third is returned and the fourth is skipped. Finally, the fifth is returned. Three matches, not five.
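If you do want all five, one common workaround (a sketch, not something the alphametics solver uses) is to wrap the pattern in a zero-width lookahead with a capturing group. The lookahead consumes no characters, so the scan tries every starting position instead of resuming at the end of the previous match. The variable names below are illustrative only.

```python
import re

text = "The sixth sick sheikh's sixth sheep's sick."
pattern = r' s.*? s'

# re.findall() resumes scanning where the previous match ended,
# so a match that starts inside an earlier one is never reported.
non_overlapping = re.findall(pattern, text)

# A lookahead (?=...) matches without consuming any characters.
# The inner group captures what the original pattern would have
# matched at that position, so overlapping matches show up too.
overlapping = re.findall('(?=(' + pattern + '))', text)

print(non_overlapping)
# [' sixth s', " sheikh's s", " sheep's s"]
print(overlapping)
# [' sixth s', ' sick s', " sheikh's s", ' sixth s', " sheep's s"]
```

The second list contains the same five matches enumerated above, in order; the first, third, and fifth are the ones plain re.findall() returns.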
+
+
This has nothing to do with the alphametics solver; I just thought it was interesting.
+
⁂