Regex to match text between words, even if there is no ending word

Date: 2022-09-13 09:36:24

I want to match specific parts of the following strings (the comma-separated names after \doc, such as doc1,doc2 and doc3,doc4):

  • \doc doc1,doc2
  • \doc doc1,doc2 \in filed1,field2
  • \doc doc1,doc2 \in filed1,field2 \doc doc3,doc4 \in field3,field4

I came up with this regex: /\\doc(.*?)\\in/g (https://regex101.com/r/dV7mF4/1)

But it doesn't match doc1,doc2 in the first string. What do I need to add to my regex so that it matches in all of the strings above?
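
The problem can be reproduced with a minimal sketch. It assumes Python's re module (the question does not name a host language); the helper name captures is hypothetical, and finditer plays the role of the g flag.

```python
import re

# The three sample strings from the question.
samples = [
    r"\doc doc1,doc2",
    r"\doc doc1,doc2 \in filed1,field2",
    r"\doc doc1,doc2 \in filed1,field2 \doc doc3,doc4 \in field3,field4",
]

# The pattern from the question: a closing \in is mandatory.
original = re.compile(r"\\doc(.*?)\\in")


def captures(pattern, text):
    """Return group 1 of every match in text, with surrounding spaces removed."""
    return [m.group(1).strip() for m in pattern.finditer(text)]


results = [captures(original, s) for s in samples]
for s, r in zip(samples, results):
    print(f"{s!r} -> {r}")
```

The first string yields no match at all, because the pattern cannot succeed without a literal \in after the captured text.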

2 Solutions

#1


You may use an alternation in the positive lookahead to set the context:

\\doc(.*?)(?=$|\\in)
          ^^^^^^^^^^

See the regex demo

The (?=$|\\in) will allow .*? to match up to the end of string (the $ branch) or up to the first \in (the second branch).
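
As a rough illustration of the lookahead version, here is a small sketch that assumes Python's re module (the answer itself is language-neutral). The name lookahead is hypothetical, and the captured text is stripped because the group also picks up the spaces around the names.

```python
import re

# Same pattern as above: stop lazily before \in or at the end of the string.
lookahead = re.compile(r"\\doc(.*?)(?=$|\\in)")

samples = [
    r"\doc doc1,doc2",
    r"\doc doc1,doc2 \in filed1,field2",
    r"\doc doc1,doc2 \in filed1,field2 \doc doc3,doc4 \in field3,field4",
]

# finditer stands in for the g flag: it walks through all matches in a string.
lookahead_results = [
    [m.group(1).strip() for m in lookahead.finditer(s)] for s in samples
]

for s, r in zip(samples, lookahead_results):
    print(f"{s!r} -> {r}")
```

Because the lookahead does not consume \in, the next search simply resumes at that position and finds the following \doc, if there is one.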

As an alternative, you may just specify you want to match anything but \in after \doc:

\\doc([^\\]*(?:\\(?!in)[^\\]*)*)

See this regex demo

Here, [^\\]*(?:\\(?!in)[^\\]*)* matches zero or more characters other than \, then zero or more sequences of a \ that is not followed by in, each followed by zero or more characters other than \. In effect, it matches any text up to, but not including, the next \in.
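
A short sketch of this second pattern, again assuming Python's re module. The name unrolled and the fourth input string are hypothetical; that extra string is there only to show a backslash that does not begin \in being kept inside the match.

```python
import re

# Same pattern as above: non-backslash runs, plus any \ not followed by "in".
unrolled = re.compile(r"\\doc([^\\]*(?:\\(?!in)[^\\]*)*)")

samples = [
    r"\doc doc1,doc2",
    r"\doc doc1,doc2 \in filed1,field2",
    r"\doc doc1,doc2 \in filed1,field2 \doc doc3,doc4 \in field3,field4",
    # Hypothetical extra input: the backslash in "\x" does not start \in.
    r"\doc doc1\x,doc2 \in filed1",
]

unrolled_results = [
    [m.group(1).strip() for m in unrolled.finditer(s)] for s in samples
]

for s, r in zip(samples, unrolled_results):
    print(f"{s!r} -> {r}")
```

This form needs no lazy quantifier, since each character class runs straight to the next backslash and the negative lookahead decides whether to continue.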

#2


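Change your regular expression to: /\\doc\s+(.*?)(?:\\in|\s|$)/g

The \s+ consumes the whitespace after \doc, so the lazy group cannot stop there with an empty capture, and the $ branch covers strings that have no \in.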

Demo and explanation

