Converting a search string into a (My)SQL WHERE clause with regex grouping

Date: 2022-09-19 07:35:54

I want to develop a simple search for my webpage, which uses PHP and a MySQL database. I thought it would be a good idea to use a single text field where the user can enter a basic search term with support for OR, - and ". I don't want to use multiple form inputs, just one text field for a better user experience, like Google does.

The idea was to write a parser that uses regexes to extract all sub-search groups and then builds the SQL statement from them.

So valid search terms and their subgroups would be:

a b c -> ['a', 'b', 'c']

a b OR c -> ['a', 'b OR c']

a -b -> ['a', '-b']

a "b c" -> ['a', '"b c"']

a b OR c -d -> ['a', 'b OR c', '-d']

a "b c" -d -> ['a', '"b c"', '-d']

a "b c" OR d -e -> ['a', '"b c" OR d', '-e']

a "b c" OR d OR "e f" -g -> ['a', '"b c" OR d OR "e f"', '-g']

The result group could then be used to dynamically create the where clause.
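
As a rough illustration of that step, the Python sketch below turns a list of groups into a parameterized WHERE clause. The column name `title`, the helper names, the choice of LIKE matching, and the `%s` placeholder style (as used by common Python MySQL drivers) are assumptions made for the example and are not part of the question; the same idea carries over to PHP with prepared statements.

```python
import re

# Splits one group into its terms: a "..." phrase or a run of non-whitespace.
TERM_RE = re.compile(r'"[^"]*"|\S+')


def like_pattern(text):
    """Wrap text in % wildcards, escaping LIKE metacharacters so they match literally."""
    escaped = text.replace("\\", "\\\\").replace("%", r"\%").replace("_", r"\_")
    return f"%{escaped}%"


def build_where(groups, column="title"):
    """Return (sql, params): groups are ANDed, terms inside a group are ORed."""
    clauses, params = [], []
    for group in groups:
        # Drop the bare OR keyword; a quoted "OR" keeps its quotes and survives.
        terms = [t for t in TERM_RE.findall(group) if t != "OR"]
        parts = []
        for term in terms:
            negated = term.startswith("-") and len(term) > 1
            if negated:
                term = term[1:]
            if len(term) >= 2 and term[0] == term[-1] == '"':
                term = term[1:-1]  # strip the surrounding quotes of a phrase
            operator = "NOT LIKE" if negated else "LIKE"
            parts.append(f"{column} {operator} %s")
            params.append(like_pattern(term))
        if parts:
            clauses.append("(" + " OR ".join(parts) + ")")
    return " AND ".join(clauses), params


if __name__ == "__main__":
    sql, params = build_where(["a", '"b c" OR d', "-e"])
    print(sql)
    print(params)
```

Passing the values as parameters rather than concatenating them into the SQL string keeps user input out of the statement itself.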

I tried it myself with the regex ([\-a-z])|(\"[a-z\s]+\"), but it fails when it comes to grouping by OR, which can occur two or more times (see the last example).
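
To make the failure concrete, here is a small Python sketch that runs this attempted pattern over two inputs. The function name `naive_tokens` is made up for the illustration. It suggests two reasons the approach falls short: `[\-a-z]` matches a single character at a time, and the uppercase OR is never matched, so the grouping information is lost.

```python
import re

# The pattern from the question, unchanged.
NAIVE_RE = re.compile(r'([\-a-z])|(\"[a-z\s]+\")')


def naive_tokens(query):
    """Return every full match of the attempted pattern, in order."""
    return [m.group(0) for m in NAIVE_RE.finditer(query)]


if __name__ == "__main__":
    print(naive_tokens("a b OR c"))
    print(naive_tokens("foo -b"))
```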

1 Solution

#1


You may use

(?:"[^"]*"|\S+)(?:\s+OR\s+(?:"[^"]*"|\S+))*

See the regex demo

Details

  • (?:"[^"]*"|\S+) - a "..." substring or 1+ non-whitespace characters
  • (?:\s+OR\s+(?:"[^"]*"|\S+))* - 0+ sequences of:
    • \s+OR\s+ - an OR substring enclosed with 1+ whitespace characters
    • (?:"[^"]*"|\S+) - a "..." substring or 1+ non-whitespace characters
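
The pattern can be tried out quickly with the Python sketch below, which applies it to some of the example inputs from the question. The name `split_groups` is only an illustrative helper; since the pattern uses nothing beyond plain groups and character classes, it should behave the same way with PHP's `preg_match_all`.

```python
import re

# The pattern from the answer: a term, optionally followed by "OR term" repeats.
GROUP_RE = re.compile(r'(?:"[^"]*"|\S+)(?:\s+OR\s+(?:"[^"]*"|\S+))*')


def split_groups(query):
    """Return the sub-search groups of a query string, in order."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return GROUP_RE.findall(query)


if __name__ == "__main__":
    for query in ['a b c', 'a b OR c', 'a "b c" -d', 'a "b c" OR d OR "e f" -g']:
        print(query, "->", split_groups(query))
```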

NOTE: If the "..." substrings can have escape sequences, you will need to alter that part of the expression depending on the escape char.
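
As one possible way to do that, the sketch below assumes the escape character is a backslash (an assumption; the answer does not name one) and swaps the quoted-substring part for a variant that allows `\"` inside the quotes. The names `QUOTED`, `TERM` and `split_groups_escaped` are illustrative only.

```python
import re

# A quoted substring in which a backslash escapes the next character.
QUOTED = r'"[^"\\]*(?:\\.[^"\\]*)*"'
TERM = rf'(?:{QUOTED}|\S+)'
GROUP_ESC_RE = re.compile(rf'{TERM}(?:\s+OR\s+{TERM})*')


def split_groups_escaped(query):
    """Split a query into groups, honoring backslash escapes inside quotes."""
    return GROUP_ESC_RE.findall(query)


if __name__ == "__main__":
    print(split_groups_escaped(r'a "b \"c\" d" OR e -f'))
```

A different escape character would need a correspondingly different `QUOTED` pattern.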
