From e24a73f7a4e6f4d604c3e903a571802a9a32ab0f Mon Sep 17 00:00:00 2001
From: Mark Pilgrim
Date: Wed, 12 May 2010 14:49:54 -0400
Subject: [PATCH] be more precise about (a|b|c)

---
 regular-expressions.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/regular-expressions.html b/regular-expressions.html
index a680d50..a89fc30 100755
--- a/regular-expressions.html
+++ b/regular-expressions.html
@@ -221,7 +221,7 @@ body{counter-reset:h1 5}
  • This matches the start of the string, then the first optional M, then CM, then the optional L and all three optional X characters, then the end of the string. MCMLXXX is the Roman numeral representation of 1980.
  • This matches the start of the string, then the first optional M, then CM, then the optional L and all three optional X characters, then fails to match the end of the string because there is still one more X unaccounted for. So the entire pattern fails to match, and returns None. MCMLXXXX is not a valid Roman numeral.
-
+

    The expression for the ones place follows the same pattern. I’ll spare you the details and show you the end result.

     >>> pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'
@@ -431,7 +431,7 @@ AttributeError: 'NoneType' object has no attribute 'groups'
  • x* matches x zero or more times.
  • x+ matches x one or more times.
  • x{n,m} matches an x character at least n times, but not more than m times.
- • (a|b|c) matches either a or b or c.
+ • (a|b|c) matches exactly one of a, b or c.
  • (x) in general is a remembered group. You can get the value of what matched by using the groups() method of the object returned by re.search.

    Regular expressions are extremely powerful, but they are not the correct solution for every problem. You should learn enough about them to know when they are appropriate, when they will solve your problems, and when they will cause more problems than they solve.
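
Reviewer note, not part of the patch: the wording change can be checked with a short sketch. It assumes Python's standard `re` module and reuses the pattern from the first hunk; `is_roman` is a hypothetical helper name introduced only for illustration.

```python
import re

# Pattern copied verbatim from the first hunk of this patch.
pattern = '^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'

def is_roman(s):
    """Hypothetical helper: True if s matches the whole pattern."""
    return re.search(pattern, s) is not None

# MCMLXXX (1980) matches; MCMLXXXX has one X too many and does not.
valid = is_roman('MCMLXXX')
invalid = is_roman('MCMLXXXX')

# Each parenthesized alternation is also a remembered group.
place_groups = re.search(pattern, 'MCMLXXX').groups()

# (a|b|c) consumes exactly one alternative, so 'ab' cannot satisfy ^(a|b|c)$.
one_alternative = re.search('^(a|b|c)$', 'b').groups()
two_alternatives = re.search('^(a|b|c)$', 'ab')
```

Here `place_groups` holds the hundreds, tens, and ones pieces, and `two_alternatives` is `None`, which is the behavior the new wording "exactly one of a, b or c" describes.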