Regex to match double quotes and single quotes

Time: 2022-09-15 16:14:04

I'm currently writing a parser for ColdFusion code. I'm using a regex (in C#) to extract the value of the datasource attribute of the cfquery tag.

For the time being, the regex is the following:

<cfquery\s.*datasource\s*=\s*(?:'|")(.*)(?:'|")

It works well for strings like <cfquery datasource="myDS" or <cfquery datasource='myDS'

But it goes wrong when parsing strings like <cfquery datasource="#GetSourceName('myDS')#"

Obviously the (?:'|") part of the regex is the cause. Is there a way to match a closing single quote only when the opening quote was a single quote, and a closing double quote only when the opening quote was a double quote?
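To make the failure concrete, here is a minimal sketch. It is written in Python purely for illustration; the regex constructs used here behave the same way in .NET. The helper name extract_original and the sample tag with an extra name attribute are assumptions, not taken from the question.

```python
import re

# The pattern from the question: the closing (?:'|") accepts either quote,
# regardless of which quote opened the value.
ORIGINAL = re.compile(r"""<cfquery\s.*datasource\s*=\s*(?:'|")(.*)(?:'|")""")

def extract_original(tag):
    """Return what the question's pattern captures, or None if it does not match."""
    m = ORIGINAL.search(tag)
    return m.group(1) if m else None

# Simple case: the capture is the attribute value.
print(extract_original('<cfquery datasource="myDS">'))  # myDS

# Assumed input with a second, differently quoted attribute: the greedy (.*)
# runs on to the last quote of either kind, so the capture is too long.
print(extract_original("""<cfquery datasource='myDS' name="q">"""))  # myDS' name="q
```

The second call shows the capture swallowing the closing single quote and part of the next attribute, because nothing ties the closing quote to the opening one.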

Thanks in advance!

3 answers

#1


6  

Edit: I think this should work in C#. You just need to use a backreference:

datasource\s*=\s*('|")(.*)(?:\1)

This matches datasource="#GetSourceName('myDS')#" because \1 refers back to whichever quote character the first group captured. Use \1 rather than $1 here: in .NET, $1 is substitution syntax for replacement strings, and inside a pattern it is not a backreference.

Of course, the first group has to be a capturing group for this to work, so you cannot mark it with ?:. Also, you may want to make the quantifier lazy, (.*?), so that the match does not run on to a later quote.
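As a rough illustration of the backreference approach, here is a minimal sketch. It uses Python only so that it is easy to run; \1 and the lazy (.*?) mean the same thing in a .NET pattern. The function name extract_datasource and the sample tags are assumptions made for the example.

```python
import re

# ('|") captures the opening quote; \1 requires the same character to close.
# (.*?) is lazy, so the value ends at the first matching closing quote.
BACKREF = re.compile(r"""datasource\s*=\s*('|")(.*?)\1""")

def extract_datasource(tag):
    """Return the datasource attribute value, or None if it is absent."""
    m = BACKREF.search(tag)
    return m.group(2) if m else None

print(extract_datasource('<cfquery datasource="myDS">'))  # myDS
print(extract_datasource("<cfquery datasource='myDS'>"))  # myDS
# The single quotes inside the double-quoted value no longer end the match.
print(extract_datasource("""<cfquery datasource="#GetSourceName('myDS')#" name="q">"""))  # #GetSourceName('myDS')#
```

In C#, the same pattern could presumably be written as a verbatim string, @"datasource\s*=\s*('|"")(.*?)\1", with the value read from Groups[2]. A value that contains its own delimiter character would still need extra handling.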

#2


1  

I would suggest using two different regexes if possible, or splitting the regex in a different way.

For a single regex, considering the question @Mike linked to, you could use ("[^"]*")|('[^']*') and then strip the quotes from the match.
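The alternation idea can be sketched as follows. This is a variation on the pattern above rather than a literal copy: it assumes the datasource\s*=\s* prefix from the question and moves the quotes outside the capture groups, so no stripping is needed afterwards. Python is used only for illustration, and the function name extract_datasource is an assumption.

```python
import re

# One alternative per quote style. Each character class excludes only its own
# delimiter, so a double-quoted value may contain single quotes and vice versa.
QUOTED = re.compile(r"""datasource\s*=\s*(?:"([^"]*)"|'([^']*)')""")

def extract_datasource(tag):
    """Return the datasource attribute value, or None if it is absent."""
    m = QUOTED.search(tag)
    if m is None:
        return None
    # Only one of the two groups takes part in any given match.
    return m.group(1) if m.group(1) is not None else m.group(2)

print(extract_datasource('<cfquery datasource="myDS">'))
print(extract_datasource("<cfquery datasource='myDS'>"))
print(extract_datasource("""<cfquery datasource="#GetSourceName('myDS')#">"""))
```

Compared with a backreference, this needs no backtracking over the value, at the cost of checking two groups to find the result.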

The other potential way of doing this is by using lookahead/lookbehind, but that tends to get messy and isn't universally supported.

#3


0  

Try looking at this post:

How can I match a quote-delimited string with a regex?

They seem to be dealing with the same problem.
