REGEX_MATCH: matching PG but not PG13, and vice versa

Time: 2023-02-13 15:27:58

I'm using BigQuery and need to match PG for movies that are rated PG, and PG13 for movies that are rated PG13.

I'm struggling to find a good reference for BigQuery's implementation of REGEX_MATCH and was hoping for some assistance.

To find PG I tried SELECT REGEX_MATCH(PC_Rating, r'PG'), which finds the value fine. But when I try to exclude PG13 with SELECT REGEX_MATCH(PC_Rating, r'PG![0-9]{2}') or SELECT REGEX_MATCH(PC_Rating, r'PG^[0-9]{2}'), PG no longer evaluates to true.

My column has either PG or PG13*, where * can be one or more of the following: [VSLNP].

Thanks.

2 Solutions

#1


2  

Use $ in the regex to do an exact match.

SELECT REGEXP_MATCH(PC_Rating, r'PG$')

r'PG$' matches every string that ends with PG. For a stricter, exact match, also add the start-of-line anchor ^ at the beginning of the pattern.

SELECT REGEXP_MATCH(PC_Rating, r'^PG$')
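
To see why the anchored pattern works while the two attempts in the question do not, the patterns can be tried locally. This is only a sketch: it uses Python's re module as a stand-in for BigQuery's regex engine (an assumption; the two agree on patterns this simple), it assumes the BigQuery function reports a match found anywhere in the string, and the sample ratings are made up for illustration.

```python
import re

# Hypothetical sample values, modeled on the column described in the question.
ratings = ["PG", "PG13", "PG13V", "PG13SL"]

def matches(pattern, values):
    """Return the values in which the pattern is found anywhere in the string."""
    return [v for v in values if re.search(pattern, v)]

# Unanchored: every sample value contains the text "PG".
unanchored = matches(r"PG", ratings)

# "!" is an ordinary character here, not a negation, so this pattern
# looks for the literal text "PG!" followed by two digits.
with_bang = matches(r"PG![0-9]{2}", ratings)

# Outside a character class, "^" is a start-of-string anchor.
# It can never hold right after "PG", so nothing matches.
with_caret = matches(r"PG^[0-9]{2}", ratings)

# Anchored at both ends: only the exact value "PG" matches.
exact = matches(r"^PG$", ratings)

print("unanchored:", unanchored)
print("with '!':  ", with_bang)
print("with '^':  ", with_caret)
print("exact:     ", exact)
```

The negation meaning of ^ only applies inside square brackets, as in [^0-9], which is why PG^[0-9]{2} does not exclude anything.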

#2


1  

To match "PG" in a list of ratings you can use the pattern below. It does not depend on where in the list the rating appears (start, end, or middle).

WHERE REGEXP_MATCH(PC_Rating, r"\bPG\b")

Note that REGEXP_MATCH is a relatively expensive function, so if "PG" is the only value you expect in the column, you should rather use

WHERE PC_Rating = "PG" 

And to match PG13*, where * can be one or more of the following [VSLNP], you can use the pattern below

WHERE REGEXP_MATCH(PC_Rating, r"\bPG13(V|S|L|N|P)*\b")  
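
The word-boundary behaviour is easy to check locally. The sketch below again uses Python's re module as a stand-in for BigQuery's regex engine (an assumption), writes the suffix group as the character class [VSLNP]*, which accepts the same strings as (V|S|L|N|P)*, and uses a hypothetical classify helper and made-up sample values.

```python
import re

# \b is a word boundary. In "PG13" there is no boundary between "G" and "1",
# because both are word characters, so \bPG\b does not match "PG13".
PG_ONLY = re.compile(r"\bPG\b")

# "PG13" followed by zero or more of the letters V, S, L, N, P.
PG13_WITH_SUFFIX = re.compile(r"\bPG13[VSLNP]*\b")

def classify(rating):
    """Hypothetical helper: label a rating string as 'PG13', 'PG', or 'other'."""
    if PG13_WITH_SUFFIX.search(rating):
        return "PG13"
    if PG_ONLY.search(rating):
        return "PG"
    return "other"

# Made-up sample values, including a list-style value and a non-matching suffix.
samples = ["PG", "PG13", "PG13VS", "R, PG, G", "PG13X", "R"]
labels = {s: classify(s) for s in samples}

for sample, label in labels.items():
    print(f"{sample!r}: {label}")
```

Because the trailing \b requires the match to end at a non-word character or at the end of the string, a value such as PG13X, whose suffix is outside [VSLNP], is not matched by either pattern.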
