RegEx to check for string with given length containing special sequence
I'm looking for a regex that matches strings with a given length (parameterized) that start with "+" or a lowercase letter. It additionally must contain at least one uppercase letter followed by a digit, and it must not end with a digit. In between there can be lowercase and uppercase letters as well as digits ([a-zA-Z0-9]). This string may be part of a larger string.
I'm having difficulty implementing the length restriction. I tried to solve it with a lookahead, but it doesn't work. Let's say the string's length should be 10:
(?!.{10,})[a-z+][a-zA-Z0-9]*([A-Z][0-9])+[a-zA-Z0-9]*[a-zA-Z]
Length of 10:
These example strings should be matched:
c4R9vMh0Lh
+lKj9CnR5x
These example strings should not be matched:
9kR7alcjaa
+5kl9Rk9XZ
aBikJ6clo9
Length of 4:
These example strings should be matched:
aR3v
+K7Z
These example strings should not be matched:
9R3v
+7KZ
aK79
Can you give me some hints?
Kind of a strange requirement, but this seems to do what you want:
/[a-z+]
(?=([A-Za-z0-9]{8}[A-Za-z]))
(?=.{0,6}[A-Z][0-9])
\1
/x
After matching the first character in the normal way, it uses a lookahead to check the length and basic consistency requirements (all letters and digits, doesn't end with a digit). Whatever is matched by the lookahead is captured in group #1.
Then, starting again from the position following the first character, another lookahead checks for the more specific condition: an uppercase letter followed by a digit. If that succeeds, the backreference (\1) goes ahead and consumes the characters that were captured in the first lookahead.
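To make the interplay of the two lookaheads and the backreference concrete, here is a minimal sketch for the length-4 case. The class name LookaheadDemo and the helper describe are made up for illustration; only the pattern itself comes from the explanation above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Length-4 instance of the pattern described above.
    static final Pattern P = Pattern.compile(
        "[a-z+]" +                          // first character, consumed normally
        "(?=([A-Za-z0-9]{2}[A-Za-z]))" +    // lookahead: captures the next 3 chars as group 1
        "(?=.{0,0}[A-Z][0-9])" +            // lookahead: uppercase letter followed by a digit
        "\\1");                             // consume what group 1 captured

    // Hypothetical helper: returns "match|group1", or "no match".
    static String describe(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group() + "|" + m.group(1) : "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe("aR3v")); // whole match plus the captured tail
        System.out.println(describe("aK79")); // ends with a digit, so the first lookahead fails
    }
}
```

Both lookaheads start at the same position (right after the first character), which is why group 1 holds everything except that first character.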
Parameterizing the regex is a simple matter of replacing the numbers inside the braces with numbers or expressions based on the desired length. Here's an example in Java:
import java.util.regex.*;

public class Test
{
    public static void main(String[] args) throws Exception
    {
        String[] inputs = {
            "c4R9vMh0Lh",
            "+lKj9CnR5x",
            "9kR7alcjaa",
            "+5kl9Rk9XZ",
            "aBikJ6clo9",
            "aR3v",
            "+K7Z",
            "9R3v",
            "+7KZ",
            "aK79"
        };

        // Desired total length of the match, taken from the command line.
        int len = Integer.parseInt(args[0]);

        String regex = "[a-z+]" +
            "(?=([A-Za-z0-9]{" + (len-2) + "}[A-Za-z]))" +
            "(?=.{0," + (len-4) + "}[A-Z][0-9])" +
            "\\1";  // backreference to group 1; the backslash must be escaped in a Java string

        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher("");

        System.out.println("length = " + len);
        System.out.println("regex = " + p.pattern());

        for (String s : inputs)
        {
            // find() looks for a match anywhere in the input, not only at the start.
            System.out.printf("%n%12s : %b%n", s, m.reset(s).find());
        }
    }
}
Sample output:
>java Test 4
length = 4
regex = [a-z+](?=([A-Za-z0-9]{2}[A-Za-z]))(?=.{0,0}[A-Z][0-9])\1

  c4R9vMh0Lh : false

  +lKj9CnR5x : true

  9kR7alcjaa : true

  +5kl9Rk9XZ : false

  aBikJ6clo9 : true

        aR3v : true

        +K7Z : true

        9R3v : false

        +7KZ : false

        aK79 : false
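Note that the run above uses find(), so with a length of 4 it also reports true for 10-character inputs that merely contain a valid 4-character substring (for example "kR7a" inside "9kR7alcjaa"). If the whole input is supposed to be the candidate string, one option is to use matches() instead. The sketch below assumes that reading; the class name LengthRegex and the methods build and matchesWhole are hypothetical, and the guard for lengths below 4 is my own addition (the shortest possible string is start character, uppercase letter, digit, final letter).

```java
import java.util.regex.Pattern;

public class LengthRegex {
    // Builds the pattern from the answer for a given total length.
    static Pattern build(int len) {
        if (len < 4) {
            // Below 4, (len-4) would go negative and produce an invalid quantifier.
            throw new IllegalArgumentException("len must be >= 4");
        }
        return Pattern.compile(
            "[a-z+]" +
            "(?=([A-Za-z0-9]{" + (len - 2) + "}[A-Za-z]))" +
            "(?=.{0," + (len - 4) + "}[A-Z][0-9])" +
            "\\1");
    }

    // True only if the entire input is one valid string of the given length.
    static boolean matchesWhole(String s, int len) {
        return build(len).matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] tens = {"c4R9vMh0Lh", "+lKj9CnR5x", "9kR7alcjaa", "+5kl9Rk9XZ", "aBikJ6clo9"};
        for (String s : tens) {
            System.out.printf("%12s : len 10 -> %b, len 4 -> %b%n",
                s, matchesWhole(s, 10), matchesWhole(s, 4));
        }
    }
}
```

If the string really can be embedded in a larger one, find() is still the right call, but you then have to decide what is allowed to surround the match (whitespace, punctuation, other alphanumerics), which the question does not specify.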
Your example uses a negative lookahead instead of a positive one; use ^(?=.{10,}) instead. This should work as long as your regex flavour supports lookahead, of course.
In my opinion, situations like this are often best handled with more than one regex, but that is not always an option.
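As a rough illustration of the "more than one regex" idea, the sketch below splits the requirement into a length check and two simple patterns. It assumes the candidate has already been isolated from any surrounding text; the class name MultiCheck and the method isValid are made up for this example.

```java
import java.util.regex.Pattern;

public class MultiCheck {
    // Starts with '+' or a lowercase letter, alphanumerics in between, ends with a letter.
    static final Pattern SHAPE = Pattern.compile("[a-z+][A-Za-z0-9]*[A-Za-z]");
    // An uppercase letter immediately followed by a digit, anywhere in the string.
    static final Pattern UPPER_DIGIT = Pattern.compile("[A-Z][0-9]");

    static boolean isValid(String s, int len) {
        return s.length() == len                 // length check without any regex
            && SHAPE.matcher(s).matches()        // overall shape of the whole string
            && UPPER_DIGIT.matcher(s).find();    // the required special sequence
    }

    public static void main(String[] args) {
        System.out.println(isValid("c4R9vMh0Lh", 10));
        System.out.println(isValid("aK79", 4));
    }
}
```

Each check is trivial to read on its own, and the length becomes an ordinary integer comparison instead of something encoded in quantifiers.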
This:
#!/usr/bin/perl
$_ = "Hello%20world%20how%20are%20you%20today";
print "<$1>\n" while m{
    \G ( (?: [^%] | % \p{xdigit}{2} )+ )
    (?:
        (?<= \G .{5} )
      | (?<= \G .{4} )
      | (?<= \G .{3} )
    )
}xg;
Produces this:
<Hello>
<%20wo>
<rld>
<%20ho>
<w%20a>
<re%20>
<you>
<%20to>
<day>
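Judging by the output, the Perl loop cuts the input into chunks of 3 to 5 characters without ever splitting a %XX escape in half. For comparison, here is a rough Java sketch that gets the same chunks with a plain greedy loop instead of \G and lookbehind. This is a different technique from the Perl above, the names Chunker and chunks are hypothetical, and unlike the Perl version it also emits a trailing chunk shorter than 3 characters.

```java
import java.util.ArrayList;
import java.util.List;

public class Chunker {
    static boolean isHex(char c) {
        return Character.digit(c, 16) >= 0;
    }

    // Splits s into chunks of at most max characters, keeping each %XX escape intact.
    static List<String> chunks(String s, int max) {
        if (max < 3) {
            throw new IllegalArgumentException("max must be >= 3 to fit a %XX escape");
        }
        List<String> out = new ArrayList<>();
        int start = 0;  // start of the current chunk
        int i = 0;      // current read position
        while (i < s.length()) {
            int tokenLen = 1;
            if (s.charAt(i) == '%') {
                // A '%' must be followed by two hex digits; stop at a malformed escape.
                if (i + 2 >= s.length() || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    break;
                }
                tokenLen = 3;
            }
            // Close the current chunk if the next token would not fit.
            if (i - start + tokenLen > max) {
                out.add(s.substring(start, i));
                start = i;
            }
            i += tokenLen;
        }
        if (i > start) {
            out.add(s.substring(start, i));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String c : chunks("Hello%20world%20how%20are%20you%20today", 5)) {
            System.out.println("<" + c + ">");
        }
    }
}
```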
Whereas this, fed through the same loop:
$_ = <<EOM;
This particularly rapid,
unintelligible patter,
Isn't generally heard,
and if it is it doesn't matter.
EOM
s/(\s)/sprintf("%%%02X", ord $1)/ge;
print "$_\n\n";
produces this:
This%20particularly%20rapid,%20%0Aunintelligible%20patter,%20%0AIsn't%20generally%20heard,%20%0Aand%20if%20it%20is%20it%20doesn't%20matter.%0A
<This>
<%20pa>
<rticu>
<larly>
<%20ra>
<pid,>
<%20>
<%0Aun>
<intel>
<ligib>
<le%20>
<patte>
<r,%20>
<%0AIs>
<n't>
<%20ge>
<neral>
<ly%20>
<heard>
<,%20>
<%0Aan>
<d%20i>
<f%20i>
<t%20i>
<s%20i>
<t%20d>
<oesn'>
<t%20m>
<atter>
<.%0A>