Extract email and name with regex

Time: 2023-01-14 11:36:59

What would be the regular expressions to extract the name and email from strings like these?

johndoe@example.com
John <johndoe@example.com>
John Doe <johndoe@example.com>
"John Doe" <johndoe@example.com>

It can be assumed that the email is valid. The name will be separated from the email by a single space, and might be quoted.

The expected results are:

johndoe@example.com
Name: nil
Email: johndoe@example.com

John <johndoe@example.com>
Name: John
Email: johndoe@example.com

John Doe <johndoe@example.com>
Name: John Doe
Email: johndoe@example.com

"John Doe" <johndoe@example.com>
Name: John Doe
Email: johndoe@example.com

This is my progress so far:

(("?(.*)"?)\s)?(<?(.*@.*)>?)

(which can be tested here: http://regexr.com/?337i5)

5 answers

#1


12  

The following regex appears to work on all inputs and uses only two capturing groups:

(?:"?([^"]*)"?\s)?(?:<?(.+@[^>]+)>?)

http://regex101.com/r/dR8hL3

Thanks to @RohitJain and @burning_LEGION for introducing the idea of non-capturing groups and character exclusion respectively.
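
As a quick check, here is a minimal Python sketch that runs the regex from this answer against the four sample strings from the question. The function name parse_address and the use of fullmatch are assumptions made for illustration; they are not part of the original answer.

```python
import re

# Regex from the answer: group 1 is the optional name, group 2 is the email.
PATTERN = re.compile(r'(?:"?([^"]*)"?\s)?(?:<?(.+@[^>]+)>?)')


def parse_address(text):
    """Return (name, email), or None if the text does not match.

    The name is None when the input contains only an email address.
    parse_address is a hypothetical helper name chosen for this sketch.
    """
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


samples = [
    'johndoe@example.com',
    'John <johndoe@example.com>',
    'John Doe <johndoe@example.com>',
    '"John Doe" <johndoe@example.com>',
]

for sample in samples:
    print(sample, '->', parse_address(sample))
```

For the four sample inputs the name comes back as None, John, John Doe and John Doe, and the email is johndoe@example.com each time. Other input shapes are not covered by this check.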

#2


1  

Use this regex: "?([^"]*)"?\s*([^\s]+@.+)

Group 1 contains the name.

Group 2 contains the email.

#3


0  

You can try this (the same pattern as yours, but improved). You need to check the returned groups after matching, because the email is returned in either group 2 or group 3, depending on whether a name is given.

(?:("?(?:.*)"?)\s)?<(.*@.*)>|(.*@.*)
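
The group check described above could look like the following Python sketch. The helper name parse_with_fallback is hypothetical, and stripping the quotes from group 1 is an extra step added here because this pattern captures the name together with its surrounding quotes.

```python
import re

# Regex from the answer: group 1 is the name (quotes included),
# group 2 is an email inside angle brackets, group 3 is a bare email.
PATTERN = re.compile(r'(?:("?(?:.*)"?)\s)?<(.*@.*)>|(.*@.*)')


def parse_with_fallback(text):
    """Return (name, email), or None if the text does not match.

    parse_with_fallback is a hypothetical helper name for this sketch.
    """
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    name, bracketed, bare = match.groups()
    # Only one of the two email groups participates in a given match.
    email = bracketed if bracketed is not None else bare
    if name is not None:
        # Assumption: quotes around the name are not wanted in the result.
        name = name.strip('"')
    return name, email


for sample in ['johndoe@example.com', '"John Doe" <johndoe@example.com>']:
    print(sample, '->', parse_with_fallback(sample))
```

Unmatched groups are returned as None by Python's re module, which is what makes the fallback from group 2 to group 3 possible.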

#4


0  

This way you can match the input with or without a name, with the quotes removed.

\"*?(([\p{L}0-9-_ ]+)\"?)*?\b\ *<?([a-z0-9-_\.]+@[a-z0-9-_\.]+\.[a-z]+)>?

#5


0  

(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))

https://regex101.com/r/pVV5TI/1
