RFC822: email address validation with regular expressions
As you know, this is how we validate an email address:
(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*
\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?
[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?
[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[
\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".
\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]
\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?
[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:
\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)
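Can you explain to me what is going on here?
Are we looking at a string and deciding whether it is an email address?
Could you at least explain the first line: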
(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
Can you give me an example of an email address that is rarely seen but still valid?
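I think you should forget about this. If you want to get better at regular expressions, that's one thing, and there are probably better ways to learn. Otherwise, validating email addresses is a very complex and error-prone activity, and I don't know of any cut-and-paste solution that fully covers all the corner cases. That way lies madness. If you have an Internet-connected application, you are better off verifying the address by actually sending a confirmation email.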
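As a rough illustration of the confirmation-email approach, here is a minimal Python sketch (Python rather than Perl, purely for illustration). It assumes a deliberately loose syntax check followed by a one-time token; `looks_like_email` and `ConfirmationStore` are hypothetical names, and actually sending the mail is left out.

```python
import secrets


def looks_like_email(address: str) -> bool:
    """Deliberately loose check: something@something, with no whitespace."""
    local, sep, domain = address.rpartition("@")
    has_parts = bool(sep and local and domain)
    return has_parts and not any(ch.isspace() for ch in address)


class ConfirmationStore:
    """Hypothetical in-memory store of addresses awaiting confirmation."""

    def __init__(self):
        self._pending = {}      # token -> address not yet confirmed
        self.confirmed = set()  # addresses whose owner followed the link

    def start(self, address: str) -> str:
        if not looks_like_email(address):
            raise ValueError("not even roughly an email address")
        token = secrets.token_urlsafe(16)
        self._pending[token] = address
        # A real application would now email a link containing the token.
        return token

    def confirm(self, token: str) -> bool:
        # Each token can be used once; unknown tokens are rejected.
        address = self._pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True
```

The point of the design is that the only authoritative test is whether the owner receives the message and responds; the syntax check merely filters out obvious typos.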
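If you want to understand what is going on, you should look at a decent module like Email::Address and notice how the pattern is built up from its component parts: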
my $CTL = q{\x00-\x1F\x7F};
my $special = q{()<>\\[\\]:;@\\\\,."};
my $text = qr/[^\x0A\x0D]/;
my $quoted_pair = qr/\\$text/;
my $ctext = qr/(?>[^()\\]+)/;
my ($ccontent, $comment) = (q{})x2;
# $COMMENT_NEST_LEVEL is defined elsewhere in the module.
for (1 .. $COMMENT_NEST_LEVEL) {
  $ccontent = qr/$ctext|$quoted_pair|$comment/;
  $comment = qr/\s*\((?:\s*$ccontent)*\s*\)\s*/;
}
my $cfws = qr/$comment|\s+/;
my $atext = qq/[^$CTL$special\\s]/;
my $atom = qr/$cfws*$atext+$cfws*/;
my $dot_atom_text = qr/$atext+(?:\.$atext+)*/;
my $dot_atom = qr/$cfws*$dot_atom_text$cfws*/;
my $qtext = qr/[^\\"]/;
my $qcontent = qr/$qtext|$quoted_pair/;
my $quoted_string = qr/$cfws*"$qcontent+"$cfws*/;
my $word = qr/$atom|$quoted_string/;
And so on:
my $simple_word = qr/$atom|\.|\s*"$qcontent+"\s*/;
my $obs_phrase = qr/$simple_word+/;
my $phrase = qr/$obs_phrase|(?:$word+)/;
my $local_part = qr/$dot_atom|$quoted_string/;
my $dtext = qr/[^\[\]\\]/;
my $dcontent = qr/$dtext|$quoted_pair/;
my $domain_literal = qr/$cfws*\[(?:\s*$dcontent)*\s*\]$cfws*/;
my $domain = qr/$dot_atom|$domain_literal/;
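The same build-it-from-parts idea can be sketched in Python. This is a simplified illustration, not a port of Email::Address: it assumes no comments or folding whitespace, handles only a bare addr-spec, and the constant names merely mirror the grammar terms used above.

```python
import re

# Each constant is a regex fragment named after the grammar rule it stands for.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"      # characters allowed in an atom
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"     # atoms separated by single dots
QTEXT = r'[^\\"]'                               # anything but backslash or quote
QUOTED_PAIR = r"\\."                            # backslash plus any character
QUOTED_STRING = rf'"(?:{QTEXT}|{QUOTED_PAIR})*"'
LOCAL_PART = rf"(?:{DOT_ATOM_TEXT}|{QUOTED_STRING})"
DTEXT = r"[^\[\]\\]"                            # allowed inside [ ... ]
DOMAIN_LITERAL = rf"\[(?:{DTEXT}|{QUOTED_PAIR})*\]"
DOMAIN = rf"(?:{DOT_ATOM_TEXT}|{DOMAIN_LITERAL})"
ADDR_SPEC = re.compile(rf"{LOCAL_PART}@{DOMAIN}")


def is_addr_spec(candidate: str) -> bool:
    """True if the whole string matches the simplified addr-spec pattern."""
    return ADDR_SPEC.fullmatch(candidate) is not None
```

Under these assumptions, `is_addr_spec('"john..doe"@example.org')` is true because the quoted-string branch permits consecutive dots, while `is_addr_spec("a..b@example.com")` is false. Reading the named fragments is far easier than reading the fully expanded expression, which is the point the answer makes.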
I saw this expression yesterday:
/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i
It is from http://fightingforalostcause.net/misc/2006/compare-email-regex.php
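To get a feel for what a compact expression like this accepts and rejects, it can be tried against a few addresses. The Python sketch below uses my own transcription of the pattern, with the backslash escapes written out as I assume they were intended; `matches_simple_regex` is a hypothetical helper name.

```python
import re

# Assumed transcription of the expression quoted above; /i becomes re.IGNORECASE.
SIMPLE_EMAIL = re.compile(
    r"^([\w!#$%&'*+\-/=?^`{|}~]+\.)*[\w!#$%&'*+\-/=?^`{|}~]+@"
    r"((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})"
    r"|(\d{1,3}\.){3}\d{1,3}(:\d{1,5})?)$",
    re.IGNORECASE,
)


def matches_simple_regex(address: str) -> bool:
    """True if the compact pattern accepts the address."""
    return SIMPLE_EMAIL.match(address) is not None


for sample in (
    "user@example.com",          # ordinary address
    "user@127.0.0.1",            # bare IPv4 host, allowed by the second branch
    "user@example.technology",   # top-level domain longer than six letters
    '"john..doe"@example.org',   # quoted local part
):
    print(sample, matches_simple_regex(sample))
```

With this transcription the first two samples match and the last two do not, even though long top-level domains and quoted local parts are legitimate. That is the kind of corner case the answer above warns about.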