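Regexp recognition of email address hard?

I recently read somewhere that writing a regexp to match an email address, taking into account all the variations and possibilities of the standard, is extremely hard and significantly more complicated than one would initially assume.

Can anyone provide some insight into why that is?

Are there any known and proven regexps that actually do this fully?

What are some good alternatives to using regexps for matching email addresses?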


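For the formal email spec, yes, it is technically impossible with a regex because of recursive constructs such as comments (especially if you don't first reduce comments to whitespace) and the variety of formats (an email address isn't always someone@somewhere.tld). You can get close (with some massive and incomprehensible regex patterns), but a far better way of checking an email address is the very familiar handshake:

  • They tell you their email address
  • You email them a confirmation link containing a GUID
  • When they click the link, you know that:
      • The email address is correct
      • It exists
      • They own it

That is far better than blindly accepting an email address.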
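The handshake above can be sketched in a few lines. This is only a minimal illustration, not a production design: the in-memory `pending` and `confirmed` stores, the function names, and the `https://example.com/confirm` URL are all assumptions made for the example, and actually sending the message is left out.

```python
import secrets

# Hypothetical in-memory stores; a real application would use a database.
pending = {}       # token -> email address awaiting confirmation
confirmed = set()  # addresses whose owners clicked the link


def start_confirmation(email, base_url="https://example.com/confirm"):
    """Create a one-time token and return the link to email to the user."""
    token = secrets.token_urlsafe(32)  # unguessable, plays the role of the GUID
    pending[token] = email
    return f"{base_url}?token={token}"


def confirm(token):
    """Handle a click on the link; return the confirmed address or None."""
    email = pending.pop(token, None)  # each token is accepted only once
    if email is not None:
        confirmed.add(email)
    return email
```

A click on the link would call `confirm` with the token from the query string; an unknown or already used token simply returns `None`.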


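    There are a number of Perl modules (for example) that do this. Don't try to write your own regexp to do it. Look at:

    Mail::VRFY, which does syntax and network checks (does an SMTP server somewhere accept this address?)

    https://metacpan.org/pod/Mail::VRFY

    RFC::RFC822::Address, a recursive descent email address parser

    https://metacpan.org/pod/RFC::RFC822::Address

    Mail::RFC822::Address, regexp-based address validation, worth a look just for the insane regexp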

    http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
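One reason a parser copes where a plain regexp struggles is that comments in an address may nest, and tracking nesting means counting depth. The sketch below shows only that idea. The function name `strip_comments` is made up for this example, and the code deliberately ignores quoted strings and backslash escapes, which a real RFC 822 parser has to handle.

```python
def strip_comments(text):
    """Remove parenthesised comments, which may nest, from an address.

    Simplified sketch: quoted strings and backslash escapes are ignored.
    Raises ValueError if the parentheses are unbalanced.
    """
    out = []
    depth = 0  # how many comments we are currently inside
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            depth -= 1
        elif depth == 0:
            out.append(ch)  # keep only characters outside every comment
    if depth != 0:
        raise ValueError("unbalanced '('")
    return "".join(out)
```

For instance, `strip_comments("john(a (nested) comment)@example.com")` yields `john@example.com`.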

    Other languages have similar tools. The insane regexp is below...

    (?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
    )+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
    \r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
    ?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
    \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-
    \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*
    \](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
    (?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
    (?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
    |(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
    ?[ \t])*)*<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
    \r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
     \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
    ?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
    )*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
     \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
    )(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
    )+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
    *:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
    |\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
    \n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
    \r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
    ]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
    ]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
    ?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
    :(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
    :\r\n)?[ \t])*))*>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
    :(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
    [ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
    \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
    \\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
    @,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
    (?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
    )*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
    \\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
    :[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
    \]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
    \031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
    ?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
    :\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
    ^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
    .\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[
    \]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".
    \[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]
    \r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
    \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
    |\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] 
    \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
    \\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
    ;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
    :[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
    (?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
    \[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
    ^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
    ]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*>(?:(?:\r\n)?[ \t])*)(?:,\s*(
    ?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
    \\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
    ?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
    \["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
    ])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
    ])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
    :\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
    \Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
    [^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
    \]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*<(?:(?:\r\n)
    ?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
    ()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
    ?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
    @,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
     \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
    ;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
    )*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:
    \\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
    (?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
    \[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
    \r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
    "()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
    *))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
    +|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:
    \.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
    |(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*>(?:(
    ?:\r\n)?[ \t])*))*)?;\s*)
    

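    Validating email addresses isn't really very helpful anyway. It will not catch common typos or made-up email addresses, since these tend to look syntactically like valid addresses.

    If you want to be sure an address is valid, you have no choice but to send a confirmation email.

    If you just want to be sure that the user typed something that looks like an email address rather than just "asdf", then check for an @. More complex validation does not really provide any benefit.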
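As a rough sketch of that minimal check: the function name `looks_like_email` is made up here, and requiring at least one character on each side of the @ goes slightly beyond the bare "check for an @" advice, so treat that part as an assumption.

```python
def looks_like_email(text):
    """Loose sanity check: some text, an '@', then some more text."""
    # rpartition splits on the last '@'; all three parts are empty
    # strings or missing pieces when the input has no usable '@'.
    local, at, domain = text.rpartition("@")
    return bool(at and local and domain)
```

Anything that passes still has to be proven by a confirmation email; this only filters out input such as "asdf".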

    (I know this doesn't answer your questions, but I think it's worth mentioning anyway.)
