Email validation regex takes a very long time to complete on medium-length strings

Time: 2022-10-19 18:51:25

I validate email addresses by returning true or false with:

return (/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,8})+$/.test(str));

When str is testing123@testing123.testing123, the test takes about 25 seconds to complete.

In general, shorter strings take less than 1 second.

This is most likely due to backtracking. I am not very good with regex; can someone help me reduce the time it takes to validate an email address? For example, it must have letter(s), then @, then letter(s), then a dot, then letter(s), and it must not be too long.

6 Answers

#1


2  

Just use

\S+@\S+

Or even (with anchors)

^\S+@\S+$

and actually send an email to that address rather than using a complicated, likely error-prone expression.

#2


1  

Remove ? after each [.-]:

/^\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*(?:\.\w{2,8})+$/

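In ([.-]?\w+)*, the [.-]? matches 1 or 0 occurrences of . or -, so whenever the separator is absent the group effectively reduces to (\w+)* right after \w+. The same run of word characters can then be split between the repetitions in exponentially many ways, which causes a huge number of redundant backtracking steps when the overall match fails.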
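
To see the effect, here is a minimal sketch that times both patterns on failing inputs of growing length. The names slowEmail, fastEmail and timeTest are made up for this illustration, and the inputs are kept short on purpose so the original pattern still finishes quickly.

```javascript
// Original pattern from the question: [.-]? is optional, so the group can
// degenerate to (\w+)* and split one run of word characters in many ways.
const slowEmail = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,8})+$/;

// Same pattern with the separator made mandatory inside each repeated group.
const fastEmail = /^\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*(?:\.\w{2,8})+$/;

// Hypothetical helper: runs one test and reports the result and elapsed time.
function timeTest(regex, str) {
  const start = Date.now();
  const matched = regex.test(str);
  return { matched, ms: Date.now() - start };
}

// Failing inputs of growing length: the last label has 9 characters,
// which is one too many for \w{2,8}, so every split has to be tried.
for (const n of [3, 5, 7]) {
  const label = "a".repeat(n);
  const input = `${label}@${label}.${"b".repeat(9)}`;
  console.log(n, timeTest(slowEmail, input), timeTest(fastEmail, input));
}
```

The time for slowEmail should grow sharply with each step, while fastEmail stays flat. Avoid feeding the full string from the question to slowEmail unless you are prepared to wait.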

Also, it is a good idea to use non-capturing groups if you are only using the grouping construct to quantify a group of subpatterns.

Now, regarding

it must have letter(s) then @ then letter(s) then . then letter(s) and must not be too long

I see others suggest solutions like ^\S+@\S+\.\S+$, which is a good idea; just make sure you understand that \S matches any character other than whitespace (not just letters). Besides, this does not fully solve the problem, since the "must not be too long" condition is not met: + matches one or more occurrences with no upper bound.
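
If the check has to run in JavaScript, one way to add the missing length condition is to test the length before the regex. This is only a sketch: looksLikeEmail and MAX_EMAIL_LENGTH are hypothetical names, and the limit of 256 is an assumption that mirrors the maxlength value used in this answer.

```javascript
// Assumed limit for illustration; it mirrors maxlength="256" from this answer.
const MAX_EMAIL_LENGTH = 256;

// Permissive shape check: something, @, something, a dot, something.
const emailShape = /^\S+@\S+\.\S+$/;

// Hypothetical helper: the cheap length test runs first, so the regex
// only ever sees strings that are short enough.
function looksLikeEmail(str) {
  return typeof str === "string" &&
    str.length <= MAX_EMAIL_LENGTH &&
    emailShape.test(str);
}

console.log(looksLikeEmail("name@myhost.com"));               // true
console.log(looksLikeEmail("abc@abc"));                       // false, no dot after @
console.log(looksLikeEmail("a".repeat(300) + "@myhost.com")); // false, too long
```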

I suggest using the pattern inside an HTML5 pattern attribute and restricting the number of characters a user can type with the maxlength attribute:

input:valid {
  color: black;
}
input:invalid {
  color: red;
}
<form name="form1"> 
 <input pattern="\S+@\S+\.\S+" maxlength="256" title="Please enter an email address like name@myhost.com!" placeholder="name@myhost.com"/>
 <input type="Submit"/> 
</form>

NOTE: the pattern regex is compiled by enclosing the pattern in ^(?: and )$, so you do not need to use ^ and $ in the regex here. So, pattern="\S+@\S+\.\S+" is translated into:

  • ^(?: (this is added by HTML5) - start of the string (and a non-capturing group starts)
  • \S+ - any 1 or more non-whitespace chars
  • @ - a @ char
  • \S+ - any 1 or more non-whitespace chars
  • \. - a dot
  • \S+ - any 1 or more non-whitespace chars
  • )$ (this is added by HTML5) - the non-capturing group ends and the end of the string is matched.
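
The wrapping described in the list above can be imitated in plain JavaScript. This is a rough sketch, not the browser's actual implementation: compilePatternAttribute is a hypothetical helper, and the "u" flag is an assumption about how the attribute value is compiled.

```javascript
// Hypothetical helper that mimics the ^(?: ... )$ wrapping.
// Assumption: the "u" flag; check the HTML specification for the exact flag.
function compilePatternAttribute(pattern) {
  return new RegExp("^(?:" + pattern + ")$", "u");
}

const attributeValue = String.raw`\S+@\S+\.\S+`;
const anchored = compilePatternAttribute(attributeValue);
const unanchored = new RegExp(attributeValue);

console.log(anchored.source);
console.log(unanchored.test("mail me: name@myhost.com")); // true, a substring matches
console.log(anchored.test("mail me: name@myhost.com"));   // false, the whole value must match
console.log(anchored.test("name@myhost.com"));            // true
```

The contrast between the anchored and unanchored results shows why the same pattern needs explicit ^ and $ when it is used with test() outside the pattern attribute.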

#3


0  

You can use this regex to check emails:

var emailregex = /^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$/;

#4


0  

This regex follows the RFC 2822 standard for email addresses. It can match 99.9% of the email addresses in use today.

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
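
As posted, the pattern has no anchors and lists only lowercase letters, so it would match an address anywhere inside a longer string and reject uppercase input. Below is a sketch of one way to use it for whole-string validation in JavaScript; the ^ and $ anchors, the "i" flag, and the name matchesRfc2822Shape are additions made for this illustration, not part of the pattern above.

```javascript
// The pattern from this answer, wrapped in ^...$ and given the "i" flag.
// The / inside the character classes is escaped for the regex literal.
const rfc2822Email = /^[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/i;

// Hypothetical helper name.
function matchesRfc2822Shape(str) {
  return rfc2822Email.test(str);
}

console.log(matchesRfc2822Shape("Name.Surname@Example.com")); // true
console.log(matchesRfc2822Shape("abc@abc"));                  // false, no dot in the domain
console.log(matchesRfc2822Shape("two words@example.com"));    // false, space is not allowed
```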

If you want to just catch syntax errors, you can simply use

\S+@\S+

Taken from one of the answers to another question.

#5


0  

Replace ()* with ()?. PS: That is a pretty weird expression for matching emails :)

#6


0  

An efficient method of matching emails is:

\S+@\S+\.\S+

It's short, matches almost any email, and won't match:

abc@abc

as some of these other answers might.
