In a regular expression, if the first parenthesized group doesn't match, is $1 an empty string when using replace() in JavaScript and gsub in Ruby?

Time: 2022-07-22 16:54:01

Using JavaScript, to extract the prefix foo. (including the .) from foo.bar, I could use:

> "foo.bar".replace(/(\w+.)(.*)/, "$1")

But if there is no such prefix, I'd expect it to give an empty string or null, but instead it gives the full string:


> "foobar".replace(/(\w+.)(.*)/, "$1")

Why does $1 give the whole string? I thought it referred to the first parenthesized group.

  1. Maybe it means the first parenthesis that actually matched?

  2. If #1 is true, then maybe a common, standard technique is to make the group optional with ?, which works in Ruby:

    using irb:

    > "foo.bar".gsub(/(\w+\.)?(.*)/, '\1')
    > "foobar".gsub(/(\w+\.)?(.*)/, '\1')

    Because ? makes the group optional, the pattern matches either way. However, it doesn't work in JavaScript:

    > "foobar".replace(/(\w+.)?(.*)/, "$1")

    I could use match() in JavaScript to do this, and it would be quite clean, but for the sake of understanding replace() better:

  3. Why does it work differently in Ruby and JavaScript? Do #1 and #2 above apply, and what is a good alternative way, using replace(), to "grab" the prefix or get "" if it doesn't exist?


2 Solutions



FYI, I think your JavaScript regex isn't correct, since it doesn't escape the . (dot) character.

The whole string comes back, but not because $1 holds a proper prefix. With your unescaped ., the first group can match all of foobar (the . matches the final r). With the dot escaped, the pattern doesn't match foobar at all:

/* your js regex is /(\w+.)/, I use /(\w+\.)/ instead to demonstrate it */
"foobar".replace(/(\w+\.)/, "$1"); // 'foobar'

That's because the pattern finds no match, so there is nothing to replace and replace() simply returns the original string; $1 is never substituted. The following examples should make this clearer.

"foobar".replace(/(\w+\.)/, '-');    // 'foobar' (no match, so nothing gets replaced)
"foobar".replace(/(\w+\.)/, '$1');   // 'foobar' (no match, so $1 is never substituted)
"foobar.a".replace(/(\w+\.)/, '-');  // '-a' (matches 'foobar.', replaces it with '-', leaves 'a')
"foobar.a".replace(/(\w+\.)/, '$1'); // 'foobar.a' (matches 'foobar.', replaces it with itself, leaves 'a')
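
For the original goal (get the prefix, or "" when there is none), one option is to keep the dot escaped and make the first group optional, as in the Ruby version. A minimal sketch, assuming an engine that follows the ECMAScript rule that $n for a group that did not participate in the match expands to an empty string; extractPrefix is a hypothetical helper name, not something from the question:

```javascript
// Hypothetical helper: returns the leading "word." prefix, or "" if there is none.
function extractPrefix(input) {
  // (\w+\.)? is optional, so the pattern matches at index 0 either way.
  // When the group does not participate in the match, "$1" expands to "".
  return input.replace(/(\w+\.)?(.*)/, "$1");
}

console.log(extractPrefix("foo.bar")); // "foo."
console.log(extractPrefix("foobar"));  // ""
```

Note that . does not match line terminators, so this sketch is only meant for single-line input.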



The replace method in JavaScript gives you a copy of the original string whether you successfully altered it or not.


So for instance:


alert( "atari.teenageRiot".replace(/5/,'reverse polarity of the neutron flow') );

Replace isn't about finding matches. It's about altering the string: whatever the first argument matches is replaced with the second argument, so you always get back the string that was meant to be changed, whether it was changed or not.
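
A small sketch of that point: the return value of replace() alone can't tell you whether anything matched, whereas match() returns null when nothing does. The helper name prefixOrEmpty and the ^ anchor are my own additions for illustration, not part of the question:

```javascript
const input = "atari.teenageRiot";

// No "5" in the input, so replace() returns an unchanged string.
const unchanged = input.replace(/5/, "reverse polarity of the neutron flow");
console.log(unchanged === input); // true

// Hypothetical helper: match() returns null on no match, so the
// "no prefix" case can be handled explicitly.
function prefixOrEmpty(str) {
  const m = str.match(/^(\w+\.)/); // ^ anchors the group to the start (assumption)
  return m ? m[1] : "";
}

console.log(prefixOrEmpty("foo.bar")); // "foo."
console.log(prefixOrEmpty("foobar"));  // ""
```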


Also, I would use this instead:


"foo.bar".replace(/(\w+\.)(.*)/, "$1")

You didn't have the \ before the ., so it was treated as a wildcard that matches any character except a line terminator.
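
To see what the missing backslash changes, it can help to inspect the capture groups directly with match(). This is only an illustrative sketch; the variable names are made up:

```javascript
// Unescaped dot: \w+ backtracks by one character so that . can match "r",
// which means group 1 ends up capturing the entire string.
const unescaped = "foobar".match(/(\w+.)(.*)/);
console.log(unescaped[1]); // "foobar"
console.log(unescaped[2]); // ""

// Escaped dot: "foobar" contains no literal ".", so there is no match at all.
const escaped = "foobar".match(/(\w+\.)(.*)/);
console.log(escaped); // null
```

With the unescaped pattern, replacing the match with "$1" therefore reproduces the whole string, which matches what the question observed.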



