Zero-length matches in Java Regex
My code :
Pattern pattern = Pattern.compile("a?");
Matcher matcher = pattern.matcher("ababa");
while (matcher.find()) {
    System.out.println(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
}
Output :
0[a]1
1[]1
2[a]3
3[]3
4[a]5
5[]5
What I know :
Java API says :
What I want to know:
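The ? is a greedy quantifier, so it will first try to match one occurrence before trying zero occurrences. In your string, the engine first matches the a at index 0. At index 1 it finds b, which a cannot match, so it falls back to zero occurrences and returns an empty match there.
It is a bit more complicated than that, but that is the main idea. When one occurrence cannot match, it will then try zero occurrences.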
As for the values of start, end and group: start and end are where the match starts and ends, and group is the text that was matched. So for the first zero-occurrence match in your string, you get 1, 1 and the empty string. I am not sure this really answers your question.
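To see what "greedy" means here, it can help to run the same input against the reluctant variant a??, which tries zero occurrences first. The following is only a minimal sketch: the class name GreedyVsReluctant and the helper findAll are made up for illustration, and the output format copies the one used in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {

    // Hypothetical helper: formats every match as start[group]end,
    // the same way the question prints it.
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            results.add(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
        }
        return results;
    }

    public static void main(String[] args) {
        // Greedy: tries to consume one 'a' first, falls back to the empty match.
        System.out.println(findAll("a?", "ababa"));
        // Reluctant: tries the empty match first, which always succeeds here,
        // so no 'a' is ever consumed and every match is zero-length.
        System.out.println(findAll("a??", "ababa"));
    }
}
```

With the greedy pattern the list is the same as the output in the question. With the reluctant pattern every entry has an empty group and start equal to end.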
Going through a few examples should make the behaviour of matcher.find() clear to you:
The regex engine starts at one position in the string (i.e. ababa) and checks whether the pattern you are looking for can be matched there. If it can, then (as the API mentions):
matcher.start() returns the starting index, and matcher.end() returns the offset after the last character matched.
If the pattern can only match zero characters at that position, start() and end() return the same index, which reflects that the length of the match is zero.
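Since a zero-length match is exactly the case where start() equals end(), that comparison can be used to skip empty matches when only the real hits matter. This is a rough sketch under that assumption. The names ZeroLengthFilter and nonEmptyMatches are hypothetical and not part of the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroLengthFilter {

    // Hypothetical helper: returns only the matches that consumed at least one character.
    static List<String> nonEmptyMatches(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            // start() == end() means the match has length zero, so skip it.
            if (matcher.start() == matcher.end()) {
                continue;
            }
            results.add(matcher.group());
        }
        return results;
    }

    public static void main(String[] args) {
        // Only the three real "a" matches remain; the empty matches are dropped.
        System.out.println(nonEmptyMatches("a?", "ababa"));
    }
}
```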
Consider the following examples:
// Matches either "a" or the empty string ""
Pattern pattern = Pattern.compile("a?");
Matcher matcher = pattern.matcher("abaabbbb");
while (matcher.find()) {
    System.out.println(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
}
Output:
0[a]1
1[]1
2[a]3
3[a]4
4[]4
5[]5
6[]6
7[]7
8[]8
// Matches either "aa" or "a", never the empty string
Pattern pattern = Pattern.compile("aa?");
Matcher matcher = pattern.matcher("abaabbbb");
while (matcher.find()) {
    System.out.println(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
}
Output:
0[a]1
2[aa]4
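The second example produces no empty matches because aa? needs at least one a. For simple patterns like these two, a quick way to check this is to ask whether the pattern matches the empty string as a whole. The sketch below assumes that heuristic. ZeroLengthCheck and canMatchEmpty are illustrative names only, and patterns with lookarounds or anchors may behave differently.

```java
import java.util.regex.Pattern;

class ZeroLengthCheck {

    // Hypothetical helper: true if the whole pattern accepts the empty string.
    // For simple patterns this indicates that find() can report zero-length matches.
    static boolean canMatchEmpty(String regex) {
        return Pattern.matches(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(canMatchEmpty("a?"));  // "a?" accepts zero occurrences of 'a'
        System.out.println(canMatchEmpty("aa?")); // "aa?" requires at least one 'a'
    }
}
```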