
Time: 2023-02-12 20:14:00

So I was working on a regex and came across some weird behavior.


I had a character class in the regex that included a bunch of characters (alphanumeric) and ended with a space, a dash, and a plus. The weird behavior is reproducible using the following regex.


^[ -+]*$

So what happens is that a space is valid text input and so is the plus. However, for some reason the dash is not valid text input. The regex can be fixed by rearranging the characters in the class like so:


^[ +-]*$

Now all the characters are valid input. This has been reproduced in Chrome using jsFiddle and also using Expresso.


My question is basically, am I doing something wrong or is this just weird? :)


2 Answers



The - character has special meaning inside character classes. When it appears between two characters, it creates a range, e.g. [0-9] matches any character between 0 and 9, inclusive. However, when placed at the start or the end of the character class (or when escaped) it represents a literal - character.


  • [ -+] matches any character between a space (char code 32) and a + (char code 43), inclusive. The - (char code 45) lies outside that range, so it is rejected.

  • [ +-] matches a space (char code 32), a + (char code 43), or a - (char code 45).
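
To see exactly what the range covers, here is a minimal sketch. It assumes Python's re module, which as far as I know treats the - in these two classes the same way as the JavaScript and .NET engines from the question. The variable names are only for illustration.

```python
import re

# Printable ASCII characters (codes 32..126) to test against each class.
printable = [chr(code) for code in range(32, 127)]

# [ -+] is a range from space (32) to + (43).
range_class = re.compile(r"[ -+]")
in_range = [ch for ch in printable if range_class.fullmatch(ch)]

# [ +-] lists three literals: space, +, and a trailing (literal) -.
literal_class = re.compile(r"[ +-]")
in_literals = [ch for ch in printable if literal_class.fullmatch(ch)]

print([ord(ch) for ch in in_range])     # codes covered by the range
print([ord(ch) for ch in in_literals])  # codes covered by the literal list
```

The first list has twelve entries (codes 32 through 43), which includes characters such as ! and * that were probably never meant to be allowed, and it leaves out the dash.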



Because in the first regex, the - is treated as the "to" (range) operator, as in a-z.

So the class becomes "space to +", which is a range. Either escape the - by prepending a \, or put it first or last in the class.
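
A quick check of the three fixes, again as a sketch assuming Python's re module. The sample string and the dictionary keys are made up for the example.

```python
import re

sample = "+ - +"  # contains a space, a plus, and a dash

# The three ways to get a literal dash inside the class.
fixes = {
    "escaped": r"^[ \-+]*$",  # dash escaped with a backslash
    "first": r"^[- +]*$",     # dash placed first
    "last": r"^[ +-]*$",      # dash placed last
}
fixed_results = {name: bool(re.match(pattern, sample))
                 for name, pattern in fixes.items()}

# The original class, where the dash forms the range "space to +".
original_result = bool(re.match(r"^[ -+]*$", sample))

print(fixed_results)
print(original_result)
```

All three fixed patterns accept the sample, while the original rejects it because of the dash. Escaping is the most robust choice if the class is likely to be edited later, since it does not depend on where the dash ends up.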


