Oracle REGEXP_LIKE and word boundaries

Time: 2022-09-13 08:57:06

I am having a problem matching word boundaries with REGEXP_LIKE. The following query returns a single row, as expected.

select 1 from dual
where regexp_like('DOES TEST WORK HERE','TEST');

But I want to match on word boundaries as well. Adding the "\b" escape sequences gives this query:

select 1 from dual
where regexp_like('DOES TEST WORK HERE','\bTEST\b');

Running this returns zero rows. Any ideas?

2 solutions

#1


37  

I believe you want to try

 select 1 from dual 
  where regexp_like ('does test work here', '(^|\s)test(\s|$)');

because \b does not appear on this list of supported operators: http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm#i1007670

The \s makes sure that test is preceded and followed by whitespace. That alone is not sufficient, however, since test could also appear at the very start or end of the string being matched. Therefore, I use alternation (indicated by the |) with ^ for start of string and $ for end of string.
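To make the effect of the anchors concrete, here is a small sketch that runs the pattern against a few sample strings. The sample strings are invented for illustration; only the pattern comes from the answer. It assumes Oracle 10g Release 2 or later, where \s is available.

```sql
-- Hypothetical sample strings; only the pattern is taken from the answer.
select s
from (
  select 'test at start'        as s from dual union all  -- word at start of string
  select 'ends with test'       from dual union all       -- word at end of string
  select 'a test in the middle' from dual union all       -- word surrounded by spaces
  select 'testing'              from dual union all       -- followed by a letter
  select 'contest'              from dual                 -- preceded by a letter
)
where regexp_like(s, '(^|\s)test(\s|$)');
```

Under these assumptions the first three strings should be returned, while 'testing' and 'contest' should be filtered out because test is not delimited on both sides.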

Update (after 3+ years): As it happens, I needed this functionality today, and it appears to me that an even better regular expression is (^|\s|\W)test($|\s|\W), which stands in for the \b special character that Oracle's regular expressions lack.
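The difference between the two patterns shows up when the word is delimited by punctuation rather than whitespace. The following sketch compares them side by side; the sample strings and column aliases are made up for illustration. Since \W already covers whitespace, the \s alternative is left out of the second pattern here, which should not change what it matches.

```sql
-- Hypothetical sample strings and aliases, chosen to contrast the two patterns.
select s,
       case when regexp_like(s, '(^|\s)test(\s|$)') then 'Y' else 'N' end as whitespace_only,
       case when regexp_like(s, '(^|\W)test(\W|$)') then 'Y' else 'N' end as non_word
from (
  select 'run the test, then stop' as s from dual union all  -- followed by a comma
  select '(test)'                  from dual union all       -- wrapped in parentheses
  select 'test_case'               from dual                 -- underscore counts as a word character
);
```

The first two strings should match only the \W variant. 'test_case' should match neither, because the underscore is a word character, which is also how \b behaves in engines that support it.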

#2


0  

In general, I would stick with René's solution, the exception being when you need the boundary match to be zero-length, i.e. you don't want to actually consume the non-word character at the beginning/end.

For example, if our string is test test, then in a regex engine that supports \b, (\b)test(\b) will match twice, but (^|\s|\W)test($|\s|\W) will only match the first occurrence: the space between the two words is consumed by the first match and is no longer available to start the second. At least, that's certainly the case if you try to use regexp_substr.

Example

SELECT regexp_substr('test test', '(^|\s|\W)test($|\s|\W)', 1, 1, 'i'),
       regexp_substr('test test', '(^|\s|\W)test($|\s|\W)', 1, 2, 'i')
FROM dual;

Returns

test |NULL
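If the goal is only to extract the word without its delimiters, one possible workaround is to wrap the word in its own group and ask regexp_substr for that subexpression. This is a sketch, not part of the original answers: it assumes Oracle 11g or later, where regexp_substr accepts a sixth subexpression argument and regexp_count exists. The column aliases are made up.

```sql
-- Assumes Oracle 11g+: the 6th argument of regexp_substr selects a subexpression.
select regexp_substr('test test', '(^|\W)(test)($|\W)', 1, 1, 'i', 2) as word_only,
       -- regexp_count still sees one occurrence, because the separating
       -- space is consumed by the first match.
       regexp_count('test test', '(^|\W)test($|\W)', 1, 'i') as occurrences
from dual;
```

This returns the first occurrence without the trailing space, but it does not make the second occurrence findable; the delimiter is still consumed, so the count stays at one.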
