Alternation in atomic grouping is useless?
I'm aware of the many great answers about atomic grouping, e.g., "Confusion with Atomic Grouping - how it differs from the Grouping in regular expression of Ruby?"
My question is simple: so alternation in atomic grouping is useless, right?
Some examples:
a(?>bc|b)c will never match abc; in fact, it will never try the b alternative inside the ().
(?>.*|b*)[ac] will never match any string, since .* consumes everything and its backtracking positions are then discarded. Do I understand it right?
Some test code in Perl, just in case it might be helpful:
sub tester {
    my ($txt, $pat) = @_;
    if ($txt =~ $pat) {
        print "${pat} matches ${txt}\n";
    } else {
        print "No match\n";
    }
}
$txt = "abcc";
$pat = qr/a(?>bc|b)c/;
tester($txt, $pat);
$txt = "bbabbbabbbbc";
$pat = qr/(?>.*)c/;
tester($txt, $pat);
$pat = qr/(?>.*|b*)[ac]/;
tester($txt, $pat);
$txt = "abadcc";
$pat = qr/a(?>b|dc)c/;
tester($txt, $pat);
I found an explanation here that kinda answers my question:
It (atomic grouping) tells the RegEx engine that once it has found a matching subpattern not to backtrack on any of the quantifiers or alternative that may be inside of it.
It's not useless in the general case.
It doesn't work in your examples because the first alternative always matches, so the second one is never tried. And the engine won't backtrack into the atomic group once it has matched, because that is pretty much the purpose of an atomic group.
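To make the difference concrete, here is a minimal sketch comparing the atomic group from the question with an ordinary non-capturing group. The helper name `matches` and the choice of test strings are just for illustration; it assumes any reasonably recent Perl 5.

```perl
use strict;
use warnings;

# Return 1 if $pat matches $txt, 0 otherwise (hypothetical helper).
sub matches {
    my ($txt, $pat) = @_;
    return $txt =~ $pat ? 1 : 0;
}

my $atomic = qr/a(?>bc|b)c/;   # atomic: keeps the first alternative that matched
my $plain  = qr/a(?:bc|b)c/;   # ordinary group: can backtrack and try "b" instead

for my $txt ("abc", "abcc") {
    printf "%-5s atomic=%d plain=%d\n",
        $txt, matches($txt, $atomic), matches($txt, $plain);
}
# abc:  atomic=0, plain=1  (after "bc", nothing is left for the final "c")
# abcc: atomic=1, plain=1  (first alternative works, no backtracking needed)
```

On "abc" the ordinary group succeeds only by going back and choosing b; the atomic group has already thrown that option away, which is why the two patterns disagree there.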
If you have two disjoint patterns in an atomic group, something like (?>ab|cd), the alternation makes perfect sense: the second alternative is tried whenever the first one fails to match.
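A quick way to check the disjoint case, as a rough sketch: the trailing e and the anchors are added here only so that each test string has to match as a whole, and the sample strings are made up.

```perl
use strict;
use warnings;

# Disjoint alternatives: "ab" and "cd" can never both match at the same position.
my $pat = qr/^(?>ab|cd)e$/;

for my $txt ("abe", "cde", "ade") {
    print "$txt: ", ($txt =~ $pat ? "match" : "no match"), "\n";
}
# abe: match     (first alternative)
# cde: match     ("ab" fails, so "cd" is tried)
# ade: no match  (neither alternative fits)
```

This is the same situation as a(?>b|dc)c against "abadcc" in the question's test code: the atomic group only prevents going back into the group after it has succeeded, it does not stop later alternatives from being tried when earlier ones fail.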