Remove zero-width space characters from a JavaScript string

Time: 2022-08-22 13:12:35

I take user input (JS code) and execute it in real time to show some output.

Sometimes the code contains zero-width spaces, which is really weird. I don't know how users are entering them. Example: "(​$".length === 3 (an invisible U+200B sits between "(" and "$").

I need to be able to remove that character from the code in JS. How do I do that? Or is there some other way to execute the JS code so that the browser doesn't take the zero-width space characters into account?

4 answers

#1


61  

Unicode has the following zero-width characters:

  • U+200B zero width space
  • U+200C zero width non-joiner
  • U+200D zero width joiner
  • U+FEFF zero width no-break space

To remove them from a string in JavaScript, you can use a simple regular expression:

var userInput = 'a\u200Bb\u200Cc\u200Dd\uFEFFe';
console.log(userInput.length); // 9
var result = userInput.replace(/[\u200B-\u200D\uFEFF]/g, '');
console.log(result.length); // 5
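Applied to the example from the question, the same regular expression could be wrapped in a small helper and run on the user's code before it is executed. This is only a sketch: the name stripZeroWidth is hypothetical, and the example string is written with an escape so that the hidden character is visible in the source.

```javascript
// Hypothetical helper wrapping the regular expression from this answer.
function stripZeroWidth(str) {
  return str.replace(/[\u200B-\u200D\uFEFF]/g, '');
}

// The question's example, with the zero width space written as an escape.
var code = '(\u200B$';
console.log(code.length);                 // 3
console.log(stripZeroWidth(code).length); // 2
```

Keep in mind that this also removes the characters from inside string literals in the user's code, and that U+200C and U+200D are meaningful in some scripts and in emoji sequences, so stripping them everywhere may change text the user intended to keep.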

Note that many other characters may also be invisible, for example some of the ASCII control characters.
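To find out which invisible characters users are actually submitting, it may help to report them before deciding what to strip. The sketch below is an assumption rather than part of the original answer: the helper name findInvisible and the exact character class (ASCII control characters other than tab, line feed and carriage return, the four zero-width characters above, and U+2028/U+2029) are illustrative choices.

```javascript
// Hypothetical character class; extend or narrow it as needed.
var INVISIBLE = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F\u200B-\u200D\u2028\u2029\uFEFF]/g;

// Returns the index and code point of every matching character in str.
function findInvisible(str) {
  var hits = [];
  var match;
  INVISIBLE.lastIndex = 0; // the regex is global and reused, so reset it
  while ((match = INVISIBLE.exec(str)) !== null) {
    var hex = match[0].charCodeAt(0).toString(16).toUpperCase();
    hits.push({ index: match.index, codePoint: 'U+' + ('0000' + hex).slice(-4) });
  }
  return hits;
}

// One hit is reported: index 1, code point U+200B.
console.log(findInvisible('(\u200B$'));
```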

#2


5  

I had a problem where some invisible characters were corrupting my JSON and causing an "Unexpected token ILLEGAL" exception, which was crashing my site.

Here is my solution using a RegExp variable:

    // The "g" flag replaces every occurrence, not only the first one.
    var re = new RegExp("[\\u2028\\u2029]", "g");
    var result = text.replace(re, '');
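If the separators are part of the data and should survive, an alternative to deleting them is to escape them. The sketch below assumes the JSON text was being embedded in, or evaluated as, script source, which is where older engines rejected raw U+2028 and U+2029; the function name toScriptSafeJson is hypothetical.

```javascript
// Hypothetical helper: serialize a value, then escape the two separators
// that JSON.stringify leaves as raw characters.
function toScriptSafeJson(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

var json = toScriptSafeJson({ note: 'line\u2028break' });
console.log(json.indexOf('\u2028'));                      // -1, no raw separator left
console.log(JSON.parse(json).note === 'line\u2028break'); // true, data is preserved
```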

You can find more about JavaScript and zero-width spaces here: Zero Width Spaces

#3


2  

// replace() returns a new string, so assign the result.
str = str.replace(/\u200B/g, '');

200B is the hexadecimal form of 8203, the code point of the zero-width space. Replacing it with an empty string removes it.
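The relationship between the two notations can be checked directly in a console. This short sketch uses only built-in number and string methods:

```javascript
console.log(0x200B === 8203);                        // true
console.log('\u200B'.charCodeAt(0));                 // 8203
console.log((8203).toString(16).toUpperCase());      // "200B"
console.log(String.fromCharCode(8203) === '\u200B'); // true
```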

#4


0  

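// filter() returns an array of characters, so join it back into a string.
[].filter.call( str, function( c ) {
    return c.charCodeAt( 0 ) !== 8203;
} ).join( '' );

Filter each character to drop char code 8203 (the Unicode number of the zero-width space), then join the remaining characters back into a string.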
