Regular expression to avoid a given set of substrings
That's what you use a negative lookahead assertion for:
^(?!.*(abc|def|ghi))
will match as long as the input string doesn't contain any of the "bad" words.
Note that the lookahead assertion itself doesn't match anything, so the match result (in the case of a successful match) will be an empty string.
In Python:
>>> import re
>>> regex = re.compile(r"^(?!.*(abc|def|ghi))")
>>> [bool(regex.match(s)) for s in ("student", "apple", "maria",
... "definition", "ghint", "abc123")]
[True, True, True, False, False, False]
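If the forbidden words come from a list rather than a hard-coded pattern, the same lookahead can be built programmatically. The following is a minimal sketch, not part of the original answer: `build_avoid_regex` is a hypothetical helper name, `re.escape` is used on the assumption that the words should be treated literally, and `re.DOTALL` is an assumption that multi-line input should be checked in full.

```python
import re

def build_avoid_regex(words):
    """Compile a pattern that matches only strings containing none of `words`.

    `build_avoid_regex` is a hypothetical helper name used for illustration.
    """
    # Escape each word so characters like "." or "+" are treated literally.
    alternation = "|".join(re.escape(w) for w in words)
    # re.DOTALL lets "." cross newlines, so multi-line input is checked in full.
    return re.compile(r"^(?!.*(?:%s))" % alternation, re.DOTALL)

regex = build_avoid_regex(["abc", "def", "ghi"])
samples = ("student", "definition", "a.c", "line one\nabc")
results = [bool(regex.match(s)) for s in samples]
print(results)  # [True, False, True, False]
```

Without `re.DOTALL`, the `.*` inside the lookahead stops at the first newline, so a forbidden word on a later line would go unnoticed.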
You can use lookaheads:
^(?!.*?(?:abc|def|ghi)).*$
(?!...) is called a negative lookahead.
(?:...) is called a non-capturing group.
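Because this variant ends in `.*$`, a successful match consumes the whole string instead of being empty, which is convenient when filtering. A short sketch in Python, reusing the sample words from this thread (the variable names are just for illustration):

```python
import re

# Non-capturing variant: a successful match consumes the whole line.
pattern = re.compile(r"^(?!.*?(?:abc|def|ghi)).*$")

words = ["student", "apple", "maria", "definition", "ghint", "abc123"]
# Keep only the strings the pattern accepts, using the matched text itself.
allowed = [m.group(0) for m in map(pattern.match, words) if m]
print(allowed)  # ['student', 'apple', 'maria']
```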
If you have a string containing the "forbidden" words, like the one below:
student apple maria definition ghint abc123 righit
and you just want to know whether the string contains them, you can use a positive lookahead:
(?=def|abc|ghi)
This will give you 4 zero-width matches, one at the first letter of each forbidden word (*def*inition, *ghi*nt, *abc*123, ri*ghi*t).
If no matches are found in your string, there are no "forbidden" words.
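To locate the forbidden words in Python, one option is a zero-width positive lookahead combined with `re.finditer`. This is a sketch under that assumption rather than the answer's exact method; the names `finder`, `positions`, and `found` are illustrative.

```python
import re

text = "student apple maria definition ghint abc123 righit"

# Zero-width positive lookahead: matches the position where a forbidden word starts.
finder = re.compile(r"(?=abc|def|ghi)")
positions = [m.start() for m in finder.finditer(text)]

# Each forbidden word here is three characters long, so slice three from each position.
found = [text[p:p + 3] for p in positions]
print(found)  # ['def', 'ghi', 'abc', 'ghi']
```

An empty `positions` list means the string contains none of the forbidden words.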
You can also use a regex replace with:
\w*(abc|def|ghi)\w*
which replaces every word containing a "forbidden" substring with "", so you retain all the non-forbidden words.
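In Python the replacement can be done with `re.sub`. A minimal sketch, assuming the word characters are written as `\w` and that leftover whitespace should be collapsed afterwards (that last step is an addition, not part of the answer):

```python
import re

text = "student apple maria definition ghint abc123 righit"

# \w* on both sides widens the match to the whole word containing a forbidden substring.
cleaned = re.sub(r"\w*(?:abc|def|ghi)\w*", "", text)

# Collapse the spaces left behind by the removed words.
cleaned = " ".join(cleaned.split())
print(cleaned)  # student apple maria
```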