Atomic group and non-capturing group

  • I was wondering how to understand an atomic group, written as (?>expr). What is it used for?

    In http://www.regular-expressions.info/atomic.html, the only example is one where expr is an alternation: the regex a(?>bc|b)c matches abcc but not abc. Are there examples where expr is not an alternation?

  • Are an atomic group and a non-capturing group, written as (?:expr), the same thing?
  • Note that I am not restricting this to one specific regex flavor.


    1) Once the regex engine has matched an atomic group and moved past it, it discards every backtracking position recorded inside the group. If the rest of the pattern then fails, the engine cannot go back into the group to try a different way of matching it.

    With an ordinary alternation, when one alternative succeeds the engine immediately tries to match the rest of the expression, but it remembers the position where the other alternatives are still possible. If the rest of the expression fails, the engine returns to that position and tries the next alternative. With an atomic group, the engine does not keep that position, so for that starting point in the string it simply gives up.

    The example you quote doesn't really explain the purpose of atomic groups; it only demonstrates that backtracking is eliminated. Atomic groups are also useful when there is no alternation at all: a greedy quantifier inside the group records backtracking positions too, one for each character it could give back, and the atomic group throws those away as well.

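    2) Atomic groups and non-capturing groups are different things. A non-capturing group only means that the text it matches is not saved as a numbered group. An atomic group stops the engine from backtracking into the group once the group has matched. An atomic group does not capture either, but that is incidental to its purpose.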

    For example, when the whole string has to match, the regular expression a(?:bc|b)c matches both abcc and abc (without capturing the group), whilst a(?>bc|b)c only matches abcc. If the regex were a(?>b|bc)c, it would only match abc, whilst a(?:b|bc)c would still match both.


    Atomic groups (and possessive quantifiers) are useful for avoiding catastrophic backtracking, which malicious users can exploit to mount denial-of-service attacks by tying up a server's CPU.

    Non-capturing groups are just that -- non-capturing. The regex engine can backtrack into a non-capturing group, but not into an atomic group.
