How do conditionals in lookaround groups work in .NET regex?

Playing around with regular expressions, especially the balanced matching of the .NET flavor, I came to a point where I realized that I do not understand the inner workings of the engine as well as I thought I did. I'd appreciate any input on why my patterns behave the way they do! But first...

Disclaimer: This question is purely theoretical, and any result obtained here will never be used, or modified and used in production code to parse HTML. Ever. I promise. I do fear the pony. =)

Now to my problem. I'll try to match the letter A if it is not preceded by a # anywhere earlier in the string. To demonstrate, I'll always use the string ..A..#..A.. . Here, only the first A should be matched. Of course, this is quite an easy task using "A(?<!^.*#.*)" , but I wish to use conditionals here, since they can be used for balanced matching and other cool things.

What I tried is

"A(?<=^(#(?<q>)|[^#])*(?(q)(?!)))"

The way I interpret it is: when the engine encounters an "A", it goes back to the start of the string and, for every character, adds an empty match to the capturing group q if the character is a #. Then it should fail if q contains a match. What I don't understand is why this expression matches both As in my sample string.

When I simply remove the lookbehind and match the whole string, this works:

"^(#(?<q>)|[^#])*(?(q)(?!))A"

This matches the string up to and including the first A, even though the first group's quantifier is greedy. Inserting a '#' at the beginning also causes the match to fail (as desired).
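To make the comparison easy to reproduce, here is a minimal C# sketch that runs both patterns against the sample string and prints every match. The class name and output format are only illustrative. If the behavior is as described above, the lookbehind version should report both As and the anchored version a single match ending at the first A.

```csharp
using System;
using System.Text.RegularExpressions;

class LookbehindConditionalDemo
{
    static void Main()
    {
        const string input = "..A..#..A..";

        // Pattern from the question: the conditional sits inside a lookbehind.
        var lookbehind = new Regex(@"A(?<=^(#(?<q>)|[^#])*(?(q)(?!)))");

        // Same idea without the lookbehind, anchored at the start of the string.
        var anchored = new Regex(@"^(#(?<q>)|[^#])*(?(q)(?!))A");

        foreach (Match m in lookbehind.Matches(input))
            Console.WriteLine($"lookbehind: '{m.Value}' at index {m.Index}");

        foreach (Match m in anchored.Matches(input))
            Console.WriteLine($"anchored:   '{m.Value}' at index {m.Index}");
    }
}
```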

So: how do lookaround groups, named capturing groups within them, and conditionals play together?

Thanks!

Edit: This problem can be seen more easily in (?<=(?<q>)(?(q)(?!))). , which should not match any character, but matches everything.
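A small C# sketch of this reduced case, in case anyone wants to poke at it. The input "abc" and the class name are arbitrary choices for illustration. Besides listing the matches, it prints whether q holds a capture once each match has completed, which may help narrow down when the conditional is evaluated relative to the group.

```csharp
using System;
using System.Text.RegularExpressions;

class MinimalLookbehindDemo
{
    static void Main()
    {
        // Reduced pattern from the edit: an empty capture into q,
        // then a conditional that is supposed to fail whenever q is set.
        var minimal = new Regex(@"(?<=(?<q>)(?(q)(?!))).");

        foreach (Match m in minimal.Matches("abc"))
        {
            // Groups["q"].Success reports whether q holds a capture
            // after the overall match has finished.
            Console.WriteLine(
                $"'{m.Value}' at index {m.Index}, q captured: {m.Groups["q"].Success}");
        }
    }
}
```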


Conditionals aren't really that useful in balanced matching--or anywhere else, for that matter. ;) Balanced matching works by using a named capture group as a stack; every time that group matches something, the matched text is pushed onto the stack. There's also special syntax for popping the stack. Here's a good introduction:

http://blog.stevenlevithan.com/archives/balancing-groups
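As a rough illustration of the stack idea, here is a minimal C# sketch. The group name open and the test inputs are made up for this example. The pop syntax in .NET is (?<-name>...), which removes the most recent capture of that group and fails if there is none.

```csharp
using System;
using System.Text.RegularExpressions;

class CaptureStackDemo
{
    static void Main()
    {
        // (?<open>\()   pushes a capture onto the "open" stack for each '('.
        // (?<-open>\))  pops one capture for each ')' and fails if the stack is empty.
        // [^()]         consumes any other character without touching the stack.
        var pattern = new Regex(@"^(?:(?<open>\()|(?<-open>\))|[^()])*$");

        foreach (var input in new[] { "(a)(b)", "(a(b)c", "a)" })
        {
            Match m = pattern.Match(input);

            // Whatever was pushed but never popped is still in Captures.
            string left = m.Success
                ? m.Groups["open"].Captures.Count.ToString()
                : "n/a";

            Console.WriteLine($"{input,-8} matched={m.Success} left on stack={left}");
        }
    }
}
```

With these inputs, "(a)(b)" should match with nothing left on the stack, "(a(b)c" should match with one unpopped capture, and "a)" should not match at all because there is nothing to pop when the ')' is reached.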
