+ 1

Python * metacharacter

What's the point of using the asterisk (*) metacharacter when it only needs "zero or more repetitions" to match? Doesn't it just always match? For example, re.match(r"(text)*", "spam") will return a match regardless of the second argument.

5th May 2021, 3:24 PM
Michal Doruch
2 Answers
+ 2
A regular expression describes the set of strings it can match. (text)* matches the empty string, since that is 0 repetitions of 'text'. It also matches 'text', 'texttext', 'texttexttext', and so on.

The match function looks for a match of the expression at the beginning of the specified string. If it can't find a non-empty match there and the regex can match an empty string, a match for the empty string is returned. That result is still a match object, so it counts as true in a condition. Run this script to see what I mean:

    import re

    rexpr = r"(text)*"
    cases = ['', 'text', 'texttext', 'texttexttext', 'ttext', 'hello', 'world', 'hello text', 'text texttext']
    for case in cases:
        print('Calling match for: ' + case)
        # Prints a match object; an empty match shows as match=''
        print(re.match(rexpr, case))

With re.match, the expression is only tried at the beginning of the string (re.search is the function that looks for a match anywhere), and you can use characters like ^ and $ to anchor matches explicitly at the beginning, at the end, or across the entire string.

This must find a match at the beginning of the string:

    r"^(text)*"

For "ttext", the only match for r"^(text)*" is the empty string, since "text" only appears after the pattern has already failed on the starting "tt".

    r"^(text)*$"

Now a match can be an empty string or any number of repetitions of "text", but if even a single character comes before or after the repetitions of "text", there won't be a match. "" will match. "t" will not match at all. "text" will match but "ttext" will not.

The best way to learn regular expressions is by practicing with scripts like the one I shared here, with many cases. It also helps to learn how finite state machines can be converted to and from regular expressions.
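To make the difference between "matches a prefix" and "matches the whole string" easier to see, here is a minimal sketch. It uses re.fullmatch in place of the ^ and $ anchors, which is a substitution on my part and behaves the same for these cases; the names PATTERN and describe are just illustrative.

```python
import re

PATTERN = r"(text)*"

def describe(case):
    """Return (prefix found by re.match, whether the whole string matches)."""
    # re.match only tries the pattern at the start of the string.
    # For this pattern it never returns None, because zero repetitions
    # (the empty string) is always an acceptable match.
    prefix = re.match(PATTERN, case)
    # re.fullmatch requires the pattern to consume the entire string.
    whole = re.fullmatch(PATTERN, case)
    return prefix.group(0), whole is not None

for case in ["", "text", "texttext", "ttext", "spam"]:
    print(repr(case), describe(case))
```

For "spam" and "ttext" the prefix is the empty string and the whole-string check fails, so the star alone always matches something, but it only tells you anything useful once the pattern is anchored or is part of a larger expression.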
5th May 2021, 8:30 PM
Josh Greig
0
Josh Greig well, now it makes sense. It should be explained more precisely in the course...
5th May 2021, 8:39 PM
Michal Doruch