How do I query a SQL Server 2008 database by first and last name and order the results by relevance?

Time: 2022-01-29 01:06:31

Basically I have a table like this:

CREATE TABLE Person(
    PersonID int IDENTITY(1,1) NOT NULL,
    FirstName nvarchar(512) NOT NULL,
    LastName nvarchar(512) NULL
)

And I need to find the top n results based on a user-query like this:

"Joh Smi"

The following query returns the rows I need (I think), just not in order of relevance.

SELECT
    PersonID, FirstName, LastName
FROM
    Person
WHERE
    FirstName LIKE 'Joh%' OR
    LastName LIKE 'Joh%' OR
    FirstName LIKE 'Smi%' OR
    LastName LIKE 'Smi%'

If the following names were in the database and our user query was "Joh Smi", the names should appear in the following order (or similar):

  1. John Smith
  2. Johnny Smith
  3. John Jacob
  4. David Smithsonian
  5. Daniel Johnson

I'm hoping to get it to work similarly to Facebook's autocomplete friend search.

So, how do I return the top n most relevant rows in SQL Server 2008?

3 solutions

#1


2  

As an addendum to OMG Ponies' answer...

To get the best results from a full text search you might want to create an indexed view that concatenates the first and last name fields.

This will allow you to weight individual parts of the full name more precisely.

Example code as follows:

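    -- Schema-bound view that exposes the full name as one searchable column.
    -- ISNULL guards against a NULL LastName, which would make the whole name NULL.
    CREATE VIEW [dbo].[vFullname] WITH SCHEMABINDING AS
    SELECT PersonID, FirstName + ' ' + ISNULL(LastName, '') AS name
    FROM dbo.Person;
    GO

    -- The view must have a unique clustered index and a full-text index on name
    -- before CONTAINSTABLE can search it.
    WITH ranks AS (
        SELECT FT_TBL.PersonID,
               FT_TBL.name,
               KEY_TBL.RANK
        FROM dbo.vFullname AS FT_TBL
            INNER JOIN CONTAINSTABLE(dbo.vFullname, (name),
                'ISABOUT ("Smith" WEIGHT (0.4), "Smi*" WEIGHT (0.2), "John" WEIGHT (0.3), "Joh*" WEIGHT (0.1))') AS KEY_TBL
                ON FT_TBL.PersonID = KEY_TBL.[KEY]
    )
    -- Join back to the base table to return the separate name columns.
    SELECT r.PersonID,
           p.FirstName,
           p.LastName,
           r.RANK
    FROM ranks AS r
        INNER JOIN dbo.Person AS p ON r.PersonID = p.PersonID
    ORDER BY r.RANK DESC;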

The CTE, joined back to the person table, just lets you return the individual firstname and lastname fields. If you don't need these as output, you can drop it and query the view directly.
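
One prerequisite is worth spelling out: CONTAINSTABLE can only search a view that has a unique clustered index (which is what makes it an indexed view) and a full-text index on the searched column. A minimal sketch of that setup, assuming the vFullname view from the example already exists; the index name UQ_vFullname_PersonID and the catalog name ftPersonCatalog are made up for illustration:

```sql
-- Hypothetical names: UQ_vFullname_PersonID and ftPersonCatalog are
-- placeholders, not part of the original answer.

-- 1. Materialize the view. The full-text key must be a unique,
--    single-column, non-nullable index, which PersonID satisfies.
CREATE UNIQUE CLUSTERED INDEX UQ_vFullname_PersonID
    ON dbo.vFullname (PersonID);
GO

-- 2. Create a catalog to hold the full-text index.
CREATE FULLTEXT CATALOG ftPersonCatalog;
GO

-- 3. Index the concatenated name column and keep it up to date automatically.
CREATE FULLTEXT INDEX ON dbo.vFullname (name)
    KEY INDEX UQ_vFullname_PersonID
    ON ftPersonCatalog
    WITH CHANGE_TRACKING AUTO;
GO
```

The initial population of a full-text index runs in the background, so a query issued immediately after these statements may return no rows until it finishes.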

#2


5  

I recommend implementing Full Text Search (FTS) for two reasons:

  1. Simplify the query (it shouldn't need the ORs, which will perform poorly)
  2. Use the FTS ranking functionality - see this article
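
As a rough sketch of what such a ranked query could look like: this assumes a full-text index already exists on the FirstName and LastName columns of dbo.Person (that setup is not shown here), and @n is a placeholder for the number of rows wanted.

```sql
-- Assumption: dbo.Person already has a full-text index covering
-- FirstName and LastName. @n is a hypothetical parameter.
DECLARE @n int = 10;

SELECT TOP (@n)
       p.PersonID,
       p.FirstName,
       p.LastName,
       k.RANK
FROM dbo.Person AS p
    -- One full-text predicate with prefix terms replaces the four LIKE ... OR tests.
    INNER JOIN CONTAINSTABLE(dbo.Person, (FirstName, LastName),
                             '"Joh*" OR "Smi*"') AS k
        ON p.PersonID = k.[KEY]
ORDER BY k.RANK DESC;
```

The RANK values are only meaningful relative to other rows in the same result set, so whether the order matches what the question expects would need to be checked against real data.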

#3


1  

You first need to decide what you mean by relevance. Here is my ranking, from most to least relevant:

  1. Both words exactly match the first and last name
  2. Same as the first, but in inverse order
  3. One is a full word, one is a substring
  4. Same as the 3rd
  5. Both are substrings. Within this tier you can rate by the length of the substring (longer is better)
  6. Only one word is a substring

So I would recommend creating a function that takes FirstName, LastName, and the input string and returns an int or float according to these rules, then sorting by it.
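
One possible shape for such a function, as a sketch rather than a definitive implementation. It simplifies the tiers above: each name column scores 2 for an exact match and 1 for a prefix match (using the question's LIKE 'Joh%' style in place of a general substring test), and the two words are tried both in query order and swapped. The function name fnWordScore and the variables @w1, @w2, and @n are hypothetical, and the user query is assumed to be split into two words already.

```sql
-- Hypothetical helper: scores one name column against one search word.
-- 2 = exact match, 1 = prefix match, 0 = no match (including a NULL name).
CREATE FUNCTION dbo.fnWordScore (@name nvarchar(512), @word nvarchar(512))
RETURNS int
AS
BEGIN
    RETURN CASE
               WHEN @name = @word THEN 2
               WHEN @name LIKE @word + N'%' THEN 1
               ELSE 0
           END;
END;
GO

-- The two words of the user query "Joh Smi", assumed to be split by the caller.
DECLARE @w1 nvarchar(512) = N'Joh',
        @w2 nvarchar(512) = N'Smi',
        @n  int           = 10;

SELECT TOP (@n) p.PersonID, p.FirstName, p.LastName
FROM dbo.Person AS p
    CROSS APPLY (
        -- inOrder: words matched as (first, last); swapped: as (last, first).
        SELECT dbo.fnWordScore(p.FirstName, @w1) + dbo.fnWordScore(p.LastName, @w2) AS inOrder,
               dbo.fnWordScore(p.FirstName, @w2) + dbo.fnWordScore(p.LastName, @w1) AS swapped
    ) AS s
WHERE s.inOrder > 0 OR s.swapped > 0
ORDER BY
    CASE WHEN s.inOrder >= s.swapped THEN s.inOrder ELSE s.swapped END DESC, -- best tier first
    CASE WHEN s.inOrder >= s.swapped THEN 0 ELSE 1 END,                      -- query order before inverse order
    LEN(p.FirstName) + LEN(ISNULL(p.LastName, N''));                         -- shorter names first
```

By my reading, this orders the five sample names the way the question lists them. Two caveats: a single exact match scores the same as two prefix matches, which the list above does not address, and any %, _, or [ characters in the user's words act as LIKE wildcards unless escaped. Scalar functions are also evaluated row by row, so this is likely to be slow on a large table.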
