How to properly escape characters in a regexp

Date: 2022-05-22 22:24:25

I want to search for a string inside another string, simply by calling MySTR.search(Needle).

The problem occurs when the needle string contains special regex characters such as *, + and so on. The search then fails with an "invalid quantifier" error.

I browsed the web and found that a string can be escaped with \Q some string \E.

However, this does not always produce the desired behavior. For example:

var sNeedle = '*Stars!*';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

Result is -1. OK.

var sNeedle = '**Stars!**';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

The result is an "invalid quantifier" error. This seems to happen when two or more special characters are 'touching' each other, because:

var sNeedle = '*Dont touch me*Stars!*Dont touch me*';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

Will work OK.

I know I could write a function escapeAllBadChars(sInStr) that adds a double backslash before every possible special regex character, but I'm wondering whether there is a simpler way to do it.

4 answers

#1


30  

\Q...\E doesn't work in JavaScript (at least, they don't escape anything...) as you can see:

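var s = "*";
// \Q and \E are not quoting markers in JavaScript, so this pattern does not match "*"
console.log(s.search(/\Q*\E/));
// escaping the asterisk with a backslash does match
console.log(s.search(/\*/));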

produces:

-1
0

as you can see on Ideone.
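
To make the behavior from the question concrete, here is a minimal sketch of what seems to be going on. It assumes a non-Unicode (no u flag) regex, and compiles() is a hypothetical helper introduced only for this illustration. In a string literal, '\Q' and '\E' are not recognized escapes, so the backslashes are dropped and the letters Q and E become part of the pattern; in a regex literal, \Q and \E match the plain letters Q and E.

```javascript
// '\Q' and '\E' are not recognized string escapes, so the backslashes are dropped.
const source = '\Q' + '**Stars!**' + '\E';
console.log(source); // Q**Stars!**E

// Hypothetical helper: reports whether a pattern source compiles.
function compiles(patternSource) {
  try {
    new RegExp(patternSource);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// "Q*" is followed by a second "*", which has nothing left to repeat.
console.log(compiles('\Q' + '**Stars!**' + '\E')); // false

// Here every "*" follows an ordinary character, so the pattern compiles.
console.log(compiles('\Q' + '*Stars!*' + '\E'));   // true

// In a regex literal, \Q*\E means: zero or more "Q", then "E".
console.log('QQQE'.search(/\Q*\E/)); // 0
console.log('*'.search(/\Q*\E/));    // -1
```

So the first and third examples in the question do not fail only because the accidental Q* and e* quantifiers happen to be valid; the needle is still not matched literally.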

The following chars need to be escaped:

  • (
  • )
  • [
  • {
  • *
  • +
  • .
  • $
  • ^
  • \
  • |
  • ?

So, something like this would do:

function quote(regex) {
  return regex.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}

No, ] and } don't need to be escaped: on their own they have no special meaning; only their opening counterparts do.

Note that when using a literal regex, /.../, you also need to escape the / char. However, / is not a regex meta character: when the pattern is passed as a string to a RegExp object, it doesn't need an escape.
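
As a rough usage sketch of the quote() function above (the haystack and needle values are made up for illustration, and the parameter is renamed to text), the escaped needle can be handed to the RegExp constructor and then to search():

```javascript
// quote() as given in this answer (parameter renamed for clarity).
function quote(text) {
  return text.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}

// Illustrative values, not taken from the question.
const haystack = 'Look at the **Stars!** tonight';
const needle = '**Stars!**';

console.log(quote(needle));                              // \*\*Stars!\*\*
console.log(haystack.search(new RegExp(quote(needle)))); // 12

// "/" is left alone, and the RegExp constructor accepts it as-is.
console.log(quote('a/b'));                         // a/b
console.log(new RegExp(quote('a/b')).test('a/b')); // true

// "]" and "}" are left alone too; without the u flag they match literally.
console.log(new RegExp(quote('x]y}')).test('x]y}')); // true
```

One caveat, stated as an assumption about how the pattern will be compiled: the claim about ] and } holds for patterns without the u flag. In Unicode mode a lone ] or } is rejected, so a stricter escaper would be needed there.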

#2


4  

I'm just dipping my toes into JavaScript, but is there a reason you need to use the regex engine at all? How about:

var sNeedle = '*Stars!*';
var sMySTR = 'The contents of this string have no importance';
if ( sMySTR.indexOf(sNeedle) > -1 ) {
   //found it
}
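
A small sketch of the same idea, with made-up sample strings: indexOf() treats the needle as plain text, so no escaping is involved, and it returns the same kind of position that search() would. String.prototype.includes() is shown as an alternative for engines that support it, when only a yes/no answer is needed.

```javascript
// Illustrative values, not taken from the question.
const haystack = 'abc *Stars!* xyz';

// The needle is compared character by character; "*" has no special meaning.
console.log(haystack.indexOf('*Stars!*'));   // 4
console.log(haystack.indexOf('**Stars!**')); // -1 (not found, and no error)

// includes() returns a boolean instead of a position.
console.log(haystack.includes('*Stars!*'));  // true
```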

#3


1  

I performed a quick Google search to see what's out there and it appears that you've got a few options for escaping regular expression characters. According to one page, you can define & run a function like below to escape problematic characters:

// Prefix every character in the class with a backslash ("$&" is the matched text)
RegExp.escape = function(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
};

Alternatively, you can try and use a separate library such as XRegExp, which already handles nuances you're trying to re-solve.

#4


0  

Duplicate of https://*.com/a/6969486/151312

This is proper as per MDN (see explanation in post above):

function escapeRegExp(str) {
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}
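
One case where an escaping function like this is still useful, even though indexOf() handles a plain search, is when regex flags are wanted. The sketch below is only an illustration: replaceAllIgnoreCase() and the sample strings are hypothetical and not part of the linked answer.

```javascript
// escapeRegExp() as given above.
function escapeRegExp(str) {
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

// Hypothetical helper: replace every occurrence of a literal needle,
// ignoring case, by compiling the escaped needle with the "g" and "i" flags.
function replaceAllIgnoreCase(haystack, needle, replacement) {
  const pattern = new RegExp(escapeRegExp(needle), 'gi');
  // A function is used as the replacement so that "$" sequences in
  // `replacement` are inserted as-is rather than interpreted.
  return haystack.replace(pattern, () => replacement);
}

const text = 'Price: $5.00 (USD), price: $5.00 (usd)';
console.log(replaceAllIgnoreCase(text, '$5.00 (usd)', 'n/a'));
// Price: n/a, price: n/a
```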
