I am using the following code:
CARRIS_REGEX=r'(\d+) ([\s\w\.\-]+) (\d+:\d+) (\d+m)
-
I took this example from "Regular expression operations" in the Python 2.* documentation; it is described here in detail, with some modifications. To explain the whole example, let's take a string variable,
text = "He was carefully disguised but captured quickly by police."
and a compiled regular expression pattern,
regEX = r"\w+ly"
pattern = re.compile(regEX)
\w matches any word character (alphanumeric or underscore) and + matches one or more of the preceding token, so the whole pattern selects any run of word characters that ends with ly. Only two words in text ('carefully' and 'quickly') satisfy this regular expression.
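As a small sketch of what the pattern does and does not match: the sample strings and the stricter variant with \b below are my own illustrations, not part of the original example.

```python
import re

pattern = re.compile(r"\w+ly")

# Words that end in "ly" are matched in full.
print(pattern.findall("carefully quickly"))   # ['carefully', 'quickly']

# The pattern is not anchored to the end of a word, so it also
# matches the "fly" at the start of "flying".
print(pattern.findall("flying"))              # ['fly']

# Hypothetical stricter variant: \b requires a word boundary after "ly".
strict = re.compile(r"\w+ly\b")
print(strict.findall("flying carefully"))     # ['carefully']
```

For the sentence used in this answer both patterns give the same result, because no other word contains "ly" after at least one word character.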
Before moving on to re.findall() or re.finditer(), let's see what re.search() means in the Python 2.* documentation.
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
The following lines of code give a basic understanding of re.search().
search = pattern.search(text)
print(search)
print(type(search))
#output
<re.Match object; span=(7, 16), match='carefully'>
<class 're.Match'>
It returns a match object (called MatchObject in the Python 2.* documentation, re.Match in Python 3.7), which supports 13 methods and attributes according to the Python 2.* documentation. The span() method returns the start and end positions of the matched word in the text variable (7 and 16 in the example above). re.search() considers only the very first match; if there is no match at all, it returns None.
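A minimal sketch of how the match object from re.search() is typically used, including the None case. The second pattern, r"\d+", is only an illustration of a pattern that does not match this text.

```python
import re

text = "He was carefully disguised but captured quickly by police."
pattern = re.compile(r"\w+ly")

match = pattern.search(text)
if match is not None:
    print(match.group())                      # carefully
    print(match.span())                       # (7, 16)
    print(match.start(), match.end())         # 7 16
    # Slicing the text with the span gives back the matched word.
    print(text[match.start():match.end()])    # carefully

# There are no digits in the text, so search() returns None.
no_match = re.search(r"\d+", text)
print(no_match)                               # None
```

Checking for None before calling group() or span() avoids an AttributeError when nothing matches.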
Now let's move on to the question. Before that, let's see what re.finditer() means in the Python 2.* documentation.
Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result.
The next lines of code give a basic understanding of re.finditer().
finditer = pattern.finditer(text)
print(finditer)
print(type(finditer))
#output
<callable_iterator object at 0x...>
<class 'callable_iterator'>
The example above gives us an iterator object, which has to be looped over. This is obviously not the result we want yet. Let's loop over finditer and see what is inside this iterator.
for anObject in finditer:
    print(anObject)
    print(type(anObject))
    print()
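#output
<re.Match object; span=(7, 16), match='carefully'>
<class 're.Match'>

<re.Match object; span=(40, 47), match='quickly'>
<class 're.Match'>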
These results are very similar to the re.search() result we got earlier, but the output now also contains a new result, <re.Match object; span=(40, 47), match='quickly'>. As mentioned earlier from the Python 2.* documentation, re.search() scans through the string looking for the first location where the regular expression pattern produces a match, whereas re.finditer() scans through the string looking for all the locations where the pattern produces matches, and it returns more details than the re.findall() method.
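To make the comparison concrete, here is a short sketch that pulls the matched word and its span out of every match object from finditer(). The variable names details, words and iterator are my own.

```python
import re

text = "He was carefully disguised but captured quickly by police."
pattern = re.compile(r"\w+ly")

# Each item yielded by finditer() is a match object, so the position
# of every match is available, not just the matched string.
details = [(m.group(), m.span()) for m in pattern.finditer(text)]
print(details)   # [('carefully', (7, 16)), ('quickly', (40, 47))]

# Keeping only group() gives the same list that findall() returns.
words = [m.group() for m in pattern.finditer(text)]
print(words == pattern.findall(text))   # True

# The iterator can be consumed only once; a second pass is empty.
iterator = pattern.finditer(text)
first_pass = list(iterator)
second_pass = list(iterator)
print(len(first_pass), len(second_pass))   # 2 0
```

If the matches are needed more than once, storing them in a list first (as first_pass does) is one way around the single-pass behaviour.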
Here is what re.findall() means in the Python 2.* documentation.
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
Let's see what happens in re.findall().
findall = pattern.findall(text)
print(findall)
print(type(findall))
#output
['carefully', 'quickly']
<class 'list'>
This output gives us only the matched words in the text variable; if nothing matches, re.findall() returns an empty list. Each item in the list corresponds to the match value shown for a match object, that is, the string its group() method returns.
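The documentation quoted above also says that findall() returns tuples when the pattern has more than one group, which matters for patterns with several groups like the one in the question. The sketch below uses a hypothetical two-group variant of the example pattern, (\w+)(ly), chosen only for illustration.

```python
import re

text = "He was carefully disguised but captured quickly by police."

# Hypothetical variant of the example pattern with two capture groups.
grouped = re.compile(r"(\w+)(ly)")

# With more than one group, findall() returns a list of tuples of the
# groups; the full match itself is not included.
print(grouped.findall(text))   # [('careful', 'ly'), ('quick', 'ly')]

# finditer() still gives the whole match, the groups and the position.
rows = [(m.group(0), m.groups(), m.span()) for m in grouped.finditer(text)]
for row in rows:
    print(row)
# ('carefully', ('careful', 'ly'), (7, 16))
# ('quickly', ('quick', 'ly'), (40, 47))
```

So findall() is convenient when only the captured strings are needed, while finditer() is the one to use when the full match or its position is needed as well.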
Here is the full code, which I tried in Python 3.7.
import re
text = "He was carefully disguised but captured quickly by police."
regEX = r"\w+ly"
pattern = re.compile(regEX)
search = pattern.search(text)
print(search)
print(type(search))
print()
findall = pattern.findall(text)
print(findall)
print(type(findall))
print()
finditer = pattern.finditer(text)
print(finditer)
print(type(finditer))
print()
for anObject in finditer:
    print(anObject)
    print(type(anObject))
    print()