How does CONTAINSTABLE work with many search words?

Time: 2022-04-06 20:03:54

For example, I have a table Companies with a field FullName that has a full-text index. I join that table with CONTAINSTABLE, passing a @search_word such as "company*" AND "name*" AND "oil*" AND "propan*" AND "liquid*"..., containing from 1 to 10 words.

I know that each word (with the * prefix wildcard) has this number of matches:

  • company - 10k
  • name - 5k
  • oil - 2k
  • propan - 1k
  • liquid - 500
  • all of the above words in one row - 300 matches

So, if I search with the words in this order:

@search = '"company*" AND "name*" AND "oil*" AND "propan*" AND "liquid*"'

and then in this order:

@search = '"liquid*" AND "propan*" AND "oil*" AND "name*" AND "company*"'

SELECT [FullName]
FROM dbo.Companies c
INNER JOIN CONTAINSTABLE (dbo.Companies, [FullName], @search) as s ON s.[KEY] = c.[KEY_FIELD];

Will there be any difference in the speed of my query?

1 Solution

#1



I ran a few tests, monitoring the "Query cost" figures in the Actual Execution Plan.

It seems the overall cost of CONTAINSTABLE for any number of words joined with "AND" in a search phrase is equal to the cost of the least popular of those words alone.

The overall cost of CONTAINSTABLE for any number of words joined with "OR" is equal to the cost of the most popular of those words alone.

That suggests the Full-Text Search engine prioritizes the words in the search string by their popularity (occurrence count) in the index. Hence I think there is no benefit in pre-ordering the search words on the client.
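To see the popularity figures the engine has to work with, one option is to read the index contents through the sys.dm_fts_index_keywords dynamic management function. The sketch below is only an illustration: it assumes the dbo.Companies table and its full-text index from the question, the @prefixes table variable is a hypothetical helper, and the function may require elevated permissions on your server.

```sql
-- Hypothetical helper: the prefixes used in the question's search string.
DECLARE @prefixes TABLE (prefix nvarchar(50) NOT NULL);
INSERT INTO @prefixes (prefix)
VALUES (N'company'), (N'name'), (N'oil'), (N'propan'), (N'liquid');

-- For each prefix, count the indexed terms that start with it and
-- sum their per-term document counts.
-- Assumption: dbo.Companies is the full-text-indexed table from the question.
SELECT p.prefix,
       COUNT(DISTINCT k.display_term) AS matching_terms,
       SUM(k.document_count)          AS summed_document_count
FROM @prefixes AS p
LEFT JOIN sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID(N'dbo.Companies')) AS k
       ON k.display_term LIKE p.prefix + N'%'
GROUP BY p.prefix
ORDER BY summed_document_count;
```

The summed figure is only a rough popularity indicator: a row containing two different terms with the same prefix is counted once per term, and if several columns are indexed each column contributes its own entries.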

Here are my Full-Text Search tests:


Declare @Word1        nvarchar(50) = N'"Word1*"';
Declare @Word2        nvarchar(50) = N'"Word2*"';
Declare @SearchString nvarchar(100) = '';

PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' Start';
Set @SearchString = @Word1;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;
Set @SearchString = @Word2;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;
Set @SearchString = @Word1 + ' AND ' + @Word2;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;
Set @SearchString = @Word2 + ' AND ' + @Word1;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;
Set @SearchString = @Word1 + ' OR ' + @Word2;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;
Set @SearchString = @Word2 + ' OR ' + @Word1;
Select * From CONTAINSTABLE([Table], [Field], @SearchString);
PRINT SUBSTRING(CONVERT(varchar, SYSDATETIME(), 121), 12, 11) + ' ' + @SearchString;

Please replace [Table] and [Field] with your actual full-text-indexed table and field names, and set @Word1 and @Word2 to a popular word and a rare word from your data set.
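As a complement to the PRINT timestamps, SQL Server can report elapsed time and I/O per statement through SET STATISTICS TIME and SET STATISTICS IO. The following is a minimal sketch of comparing the two word orders from the question that way; it assumes the dbo.Companies table and FullName column from the question, and the variable names are illustrative only.

```sql
-- Report CPU/elapsed time and I/O for each statement in the Messages tab.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- The same five prefix terms in two different orders (taken from the question).
DECLARE @CommonFirst nvarchar(200) =
    N'"company*" AND "name*" AND "oil*" AND "propan*" AND "liquid*"';
DECLARE @RareFirst nvarchar(200) =
    N'"liquid*" AND "propan*" AND "oil*" AND "name*" AND "company*"';

-- Assumption: dbo.Companies has a full-text index on FullName.
SELECT COUNT(*) AS matches_common_first
FROM CONTAINSTABLE(dbo.Companies, FullName, @CommonFirst);

SELECT COUNT(*) AS matches_rare_first
FROM CONTAINSTABLE(dbo.Companies, FullName, @RareFirst);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running each statement several times and comparing the later runs is probably the fairer test, since the first execution may include the cost of loading index data into memory.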
