有没有可能在没有SSIS的情况下执行T-SQL模糊查找?

时间:2022-12-27 19:21:51

SSIS 2005/2008 does fuzzy lookups and groupings. Is there a feature that does the same in T-SQL?

SSIS 2005/2008做模糊查找和分组。在T-SQL中是否有相同的特性?

4 个解决方案

#1


7  

Fuzzy lookup uses a q-gram approach, by breaking strings up into tiny sub-strings and indexing them. You can then then search input by breaking it up into equally sized strings. You can inspect the format of their index and write a CLR function to use the same style of index but you might be talking about a fair chunk of work.

模糊查找使用q-gram方法,将字符串分解成小的子字符串并对它们进行索引。然后你可以通过将输入分成大小相同的字符串来搜索输入。您可以检查它们的索引的格式,并编写一个CLR函数来使用相同的索引样式,但是您可能正在讨论大量的工作。

It is actually quite interesting how they did it, very simple yet provides very robust matching and is very configurable.

他们的做法非常有趣,非常简单,但是提供了非常健壮的匹配,并且是非常可配置的。

From that I recall of the index when I last looked at it, each q-gram or substring is stored in a row in an table (the index). That row contains an nvarchar column (among other values) that is used as binary data and contains references to the rows that match.

我记得上次查看索引时,每个q-gram或子字符串都存储在表(索引)中的一行中。该行包含一个nvarchar列(以及其他值),该列用作二进制数据,并包含对匹配行的引用。

There is also an open feedback suggestion on Microsoft Connect for this feature.

对于这个特性,微软Connect也有一个开放的反馈建议。

#2


7  

SQL Server has a SOUNDEX() function:

SQL Server具有SOUNDEX()函数:

SELECT * 
FROM Customers
WHERE SOUNDEX(Lastname) = SOUNDEX('Stonehouse')
AND SOUNDEX(Firstname) = SOUNDEX('Scott')

#3


3  

Full Text Search is a great fuzzy tool. Brief primer here

全文搜索是一个很好的模糊工具。这里简短的底漆

#4


1  

On March 5 2009 I will have an article posted on www.sqlservercentral.com with a sample of Jaro-Winkler TSQL

2009年3月5日,我将在www.sqlservercentral.com上发布一篇文章,其中包含Jaro-Winkler TSQL的示例

#1


7  

Fuzzy lookup uses a q-gram approach, by breaking strings up into tiny sub-strings and indexing them. You can then then search input by breaking it up into equally sized strings. You can inspect the format of their index and write a CLR function to use the same style of index but you might be talking about a fair chunk of work.

模糊查找使用q-gram方法,将字符串分解成小的子字符串并对它们进行索引。然后你可以通过将输入分成大小相同的字符串来搜索输入。您可以检查它们的索引的格式,并编写一个CLR函数来使用相同的索引样式,但是您可能正在讨论大量的工作。

It is actually quite interesting how they did it, very simple yet provides very robust matching and is very configurable.

他们的做法非常有趣,非常简单,但是提供了非常健壮的匹配,并且是非常可配置的。

From that I recall of the index when I last looked at it, each q-gram or substring is stored in a row in an table (the index). That row contains an nvarchar column (among other values) that is used as binary data and contains references to the rows that match.

我记得上次查看索引时,每个q-gram或子字符串都存储在表(索引)中的一行中。该行包含一个nvarchar列(以及其他值),该列用作二进制数据,并包含对匹配行的引用。

There is also an open feedback suggestion on Microsoft Connect for this feature.

对于这个特性,微软Connect也有一个开放的反馈建议。

#2


7  

SQL Server has a SOUNDEX() function:

SQL Server具有SOUNDEX()函数:

SELECT * 
FROM Customers
WHERE SOUNDEX(Lastname) = SOUNDEX('Stonehouse')
AND SOUNDEX(Firstname) = SOUNDEX('Scott')

#3


3  

Full Text Search is a great fuzzy tool. Brief primer here

全文搜索是一个很好的模糊工具。这里简短的底漆

#4


1  

On March 5 2009 I will have an article posted on www.sqlservercentral.com with a sample of Jaro-Winkler TSQL

2009年3月5日,我将在www.sqlservercentral.com上发布一篇文章,其中包含Jaro-Winkler TSQL的示例