How do I escape and unescape quotes in JavaScript?

Time: 2022-09-15 15:35:54

Here's a short piece of code:

var utility = {
    escapeQuotes: function(string) {
        return string.replace(new RegExp('"', 'g'),'\\"');
    },
    unescapeQuotes: function(string) {
        return string.replace(new RegExp('\\"', 'g'),'"');
    }
};

var a = 'hi "';

var b = utility.escapeQuotes(a);
var c = utility.unescapeQuotes(b);

console.log(b + ' | ' + c);

I would expect this code to work; however, the output I get is:

hi \" | hi \"

If I change the first argument of the RegExp constructor in the unescapeQuotes method to use four backslashes, everything works as it should:

string.replace(new RegExp('\\\\"', 'g'),'"');

The result:

hi \" | hi " 

Why are four backslashes needed as the first parameter of the new RegExp constructor? Why doesn't it work with only 2 of them?

1 solution

#1


The problem is that you're using the RegExp constructor, which accepts a string, rather than using a regular expression literal. So in this line in your unescape:

return string.replace(new RegExp('\\"', 'g'),'"');

...the \\ is interpreted by the JavaScript parser as part of handling the string literal, so only a single backslash is handed to the regular expression parser. The expression the regular expression parser sees is therefore \". The backslash is an escape character in regex, too, but \" doesn't mean anything special and just ends up matching ". To match an actual backslash in a regex, you need two of them; to write that in a string literal, you need four (so they survive both layers of interpretation).
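The two layers of interpretation can be checked directly. The following is a minimal sketch, not part of the original question; the variable names (twoBackslashes, fourBackslashes, loose, strict, escaped) are made up for illustration:

```javascript
// Layer 1: the string literal. Each \\ in the source becomes ONE backslash.
var twoBackslashes = '\\"';     // 2 characters: backslash, quote
var fourBackslashes = '\\\\"';  // 3 characters: backslash, backslash, quote

console.log(twoBackslashes.length);   // 2
console.log(fourBackslashes.length);  // 3

// Layer 2: the regex parser reads those characters as a pattern.
var loose = new RegExp(twoBackslashes, 'g');    // pattern \"  matches a bare quote
var strict = new RegExp(fourBackslashes, 'g');  // pattern \\" matches backslash + quote

var escaped = 'hi \\"';  // the text: hi \"

var looseResult = escaped.replace(loose, '"');    // replaces " with ", so nothing changes
var strictResult = escaped.replace(strict, '"');  // replaces \" with "

console.log(looseResult);   // hi \"
console.log(strictResult);  // hi "
```

With two backslashes, the quote is replaced by itself and the backslash in front of it is left in place, which is why the output in the question looks unchanged.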

Unless you have a very good reason to use the RegExp constructor (e.g., you have to use some varying input), always use the literal form:

var utility = {
    escapeQuotes: function(string) {
        return string.replace(/"/g, '\\"');
    },
    unescapeQuotes: function(string) {
        return string.replace(/\\"/g, '"');
    }
};

It's a lot less confusing.
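As a quick check, here is the literal-form version run against the same input used in the question. This is a sketch for verification only; it repeats the utility object so that it runs on its own:

```javascript
var utility = {
    // Put a backslash in front of every double quote.
    escapeQuotes: function (string) {
        return string.replace(/"/g, '\\"');
    },
    // Remove the backslash in front of every escaped double quote.
    // In a regex literal, \\ matches one literal backslash.
    unescapeQuotes: function (string) {
        return string.replace(/\\"/g, '"');
    }
};

var a = 'hi "';
var b = utility.escapeQuotes(a);
var c = utility.unescapeQuotes(b);

console.log(b + ' | ' + c);  // hi \" | hi "
```

The regex literal goes through only one layer of interpretation, so two backslashes are enough to match one literal backslash.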
