Why is this regex not greedy?

In this regex

$line = 'this is a regular expression';
$line =~ s/^(\w+)\b(.*)\b(\w+)$/$3 $2 $1/;

print $line;

Why is $2 equal to " is a regular "? My thought process is that (.*) should be greedy and match all characters until the end of the line, and therefore $3 would be empty.

That's not happening, though. The regex matcher is somehow stopping right before the last word boundary and populating $3 with what's after the last word boundary and the rest of the string is sent to $2.

Any explanation? Thanks.


$3 can't be empty when using this regex because the corresponding capturing group is (\w+), which must match at least one word character or the whole match will fail.

So what happens is (.*) first matches " is a regular expression", \b matches at the end of the string, and (\w+) fails to match because nothing is left. The regex engine then backtracks, with (.*) giving up one character at a time, until (.*) matches " is a regular " (note the match includes the trailing space). At that point \b matches the word boundary before the e, and (\w+) matches "expression".
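To see the result of that backtracking, here is a minimal sketch that runs the question's pattern as a plain match and prints the three captures. The variable names $first, $middle and $last are only for illustration and are not part of the original code.

```perl
use strict;
use warnings;

my $line = 'this is a regular expression';

# Same pattern as in the question, used as a match (not a substitution)
# so the captured groups can be inspected directly.
my ($first, $middle, $last) = $line =~ /^(\w+)\b(.*)\b(\w+)$/;

print "1: [$first]\n";    # 1: [this]
print "2: [$middle]\n";   # 2: [ is a regular ]
print "3: [$last]\n";     # 3: [expression]
```

The brackets make the leading and trailing spaces in $2 visible.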

If you change (\w+) to (\w*) then you will end up with the result you expected, where (.*) consumes the rest of the string and $3 is empty.
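As a quick check of that variant, the sketch below swaps only the final group to (\w*) and prints the captures again. The variable names are illustrative, and everything else is taken from the question.

```perl
use strict;
use warnings;

my $line = 'this is a regular expression';

# Variant pattern: the last group may now match zero word characters.
my ($first, $middle, $last) = $line =~ /^(\w+)\b(.*)\b(\w*)$/;

print "1: [$first]\n";    # 1: [this]
print "2: [$middle]\n";   # 2: [ is a regular expression]
print "3: [$last]\n";     # 3: []
```

The second \b still matches here because the end of the string, right after the final n, counts as a word boundary.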


Greedy doesn't mean it gets to match absolutely everything. It just means it can take as much as possible and still have the regex succeed.

This means that since you use the + in group 3, it can't be empty and still succeed, as + means 1 or more.

If you want $3 to be empty, just change (\w+) to (\w?). Now since ? means 0 or 1 it can be empty, and therefore the greedy .* takes everything. Note: this seems to work only in Perl, due to how Perl deals with lines.


In order for the regex to match the whole string, ^(\w+)\b requires that the entire first word be $1. Likewise, \b(\w+)$ requires that the entire last word be $3. Therefore, no matter how greedy (.*) is, it can only capture ' is a regular ', otherwise the pattern won't match. At some point while matching the string, .* probably did take up the entire ' is a regular expression', but then it found that it had to backtrack and let the \w+ get its match too.
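The backtracking described here can be imitated by hand. The sketch below is only a simulation, not how Perl's regex engine is actually implemented: it tries every possible end position for (.*), longest first, and checks whether \b(\w+)$ can match from there. The loop and the names $end, $attempts, $middle and $last are assumptions made for illustration.

```perl
use strict;
use warnings;

my $line = 'this is a regular expression';

# Group 1 is fixed by ^(\w+)\b: it has to be the whole first word.
my ($first) = $line =~ /^(\w+)\b/;

my ($middle, $last);
my $attempts = 0;

# Try the longest candidate for (.*) first, then give back one character
# at a time, the way a greedy quantifier backtracks.
for (my $end = length($line); $end >= length($first); $end--) {
    $attempts++;
    pos($line) = $end;                 # where (.*) would stop
    if ($line =~ /\G\b(\w+)$/gc) {     # can \b(\w+)$ match from here?
        $last   = $1;
        $middle = substr($line, length($first), $end - length($first));
        last;
    }
}

print "attempts: $attempts\n";   # attempts: 11
print "2: [$middle]\n";          # 2: [ is a regular ]
print "3: [$last]\n";            # 3: [expression]
```

Using pos() with \G keeps the check inside the full string, so \b still sees the character to the left of the candidate position. The first candidate that succeeds is the longest one that lets the rest of the pattern match, which is what "greedy" means in practice.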
