SQL Server performance of parameterized queries with leading wildcards

I have a SQL Server 2008 R2 database with about 2 million rows in one of the tables, and I am struggling with the performance of a specific query when using parameterized SQL.

The table has a field containing a name:

[PatientsName] nvarchar NULL,

There's also a simple index on the field:

CREATE NONCLUSTERED INDEX [IX_Study_PatientsName] ON [dbo].[Study] 
(
    [PatientsName] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [INDEXES]
GO

When I run this query in Management Studio, it takes around 4 seconds to execute:

declare @StudyPatientsName nvarchar(64)
set @StudyPatientsName= '%Jones%'

SELECT COUNT(*) FROM Study WHERE Study.PatientsName like @StudyPatientsName

But, when I execute this query:

SELECT COUNT(*) FROM Study WHERE Study.PatientsName like '%Jones%'

it takes a bit more than a half second to execute.

Looking at the execution plans, the query without parameterization does an Index Scan using the above mentioned index, which obviously is efficient. The parameterized query uses the index, but does a range seek on the index.
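
To put numbers next to the two plans, both forms can be run in one batch with I/O and timing statistics switched on. This is only a sketch of how the comparison might be set up; it reuses the table and variable names from the question and assumes the table lives in the dbo schema, as the index definition suggests.

```sql
-- Report logical reads and CPU/elapsed time for each statement
-- (shown on the Messages tab in Management Studio).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @StudyPatientsName nvarchar(64);
SET @StudyPatientsName = N'%Jones%';

-- Parameterized form: reported as a range seek on IX_Study_PatientsName.
SELECT COUNT(*) FROM dbo.Study WHERE PatientsName LIKE @StudyPatientsName;

-- Literal form: reported as an index scan on IX_Study_PatientsName.
SELECT COUNT(*) FROM dbo.Study WHERE PatientsName LIKE N'%Jones%';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Enabling "Include Actual Execution Plan" for the same batch shows the seek and the scan side by side.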

Part of the issue is having the leading wildcard. When I remove the leading wildcard, both queries return in a fraction of a second. Unfortunately, I do need to support leading wildcards.

The problem originated in our home-grown ORM, which issues parameterized queries. These queries are built from user input, so parameterization makes sense to avoid things like SQL injection attacks. Is there a way to make the parameterized query perform as well as the non-parameterized one?

I've done some research looking at different ways to give hints to the query optimizer, trying to force the optimizer to redo the query plan on each query, but haven't found anything yet to improve the performance. I tried this query:

SELECT COUNT(*) FROM Study WHERE Study.PatientsName like @StudyPatientsName
OPTION ( OPTIMIZE FOR (@StudyPatientsName = '%Jones%'))

which was mentioned as a solution in this question, but it didn't make a difference.

Any help would be appreciated.

4 Solutions

#1

It seems like you want to force a scan. There is a FORCESEEK hint but I couldn't see any analogous FORCESCAN hint. This should do it though.

SELECT COUNT(*) 
FROM Study 
WHERE Study.PatientsName + '' like @StudyPatientsName

Alternatively, you could try the following on your data and see how it works out.

SELECT COUNT(*) 
FROM Study 
WHERE Study.PatientsName  like @StudyPatientsName
option (recompile)
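
Since the queries come from an ORM rather than an ad hoc batch, the hint would have to be part of the statement text the ORM sends. A minimal sketch of what that might look like, assuming the ORM submits its statements through sp_executesql (an assumption; the question does not say how the ORM executes them):

```sql
-- The hint is part of the statement text; the search value stays a parameter.
DECLARE @sql nvarchar(max);
SET @sql = N'SELECT COUNT(*)
FROM dbo.Study
WHERE PatientsName LIKE @StudyPatientsName
OPTION (RECOMPILE);';

EXEC sys.sp_executesql
    @sql,
    N'@StudyPatientsName nvarchar(64)',   -- parameter declaration
    @StudyPatientsName = N'%Jones%';      -- value supplied by the user
```

The user input is still passed as a parameter, so the injection protection is unchanged. Whether the recompiled plan actually switches to a scan should be checked against the actual execution plan.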

#2

I think your best chance of improving performance here is to look into using a full text index.
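
A rough sketch of what that could involve is below. The catalog name StudyCatalog and the key index name PK_Study are hypothetical; a full-text index needs an existing unique, single-column, non-nullable index on the table to use as its key.

```sql
-- Hypothetical names: StudyCatalog (catalog) and PK_Study (unique key index).
CREATE FULLTEXT CATALOG StudyCatalog;
GO

CREATE FULLTEXT INDEX ON dbo.Study (PatientsName)
    KEY INDEX PK_Study
    ON StudyCatalog
    WITH CHANGE_TRACKING AUTO;
GO

-- The search condition can still be passed as a parameter.
DECLARE @SearchTerm nvarchar(64);
SET @SearchTerm = N'"Jones*"';   -- prefix term: words starting with "Jones"

SELECT COUNT(*)
FROM dbo.Study
WHERE CONTAINS(PatientsName, @SearchTerm);
```

Note that full-text search matches whole words and word prefixes, not arbitrary substrings, so it is not a drop-in replacement for LIKE '%Jones%': a name that merely contains "Jones" inside a longer word would not be counted.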

#3

I'm having trouble finding the documentation to verify this, but IIRC, COUNT(*) does a full table scan in MS SQL (as opposed to using a cached value). If you run it against a column which cannot be null and/or has an index defined, I believe (again, still can't find docs to confirm, so I could be off base here) that will be faster.

What happens when you modify the query to something like:

SELECT COUNT(id) FROM Study WHERE Study.PatientsName Like @StudyPatientsName

SELECT COUNT(PatientsName) FROM Study 
WHERE Study.PatientsName 
LIKE @StudyPatientsName

#4

If all else fails you could try

SELECT COUNT(*) FROM Study WITH(INDEX(0)) WHERE Study.PatientsName like @StudyPatientsName

Perhaps you could wrap it in an IF

IF substring(@StudyPatientsName, 1, 1) = '%'
    SELECT COUNT(*) FROM Study WITH(INDEX(0)) WHERE Study.PatientsName like @StudyPatientsName
ELSE
    SELECT COUNT(*) FROM Study WHERE Study.PatientsName like @StudyPatientsName

Edit: As Martin pointed out, for this specific query this is probably not the best approach, since a scan of the existing nonclustered index is likely faster than the scan that INDEX(0) forces. It might be applicable in similar situations, though.
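
For reference, builds from SQL Server 2008 R2 SP1 onward have a FORCESCAN table hint, which can be combined with an INDEX hint to ask for a scan of a specific index. A sketch under that version assumption, using the index from the question:

```sql
DECLARE @StudyPatientsName nvarchar(64);
SET @StudyPatientsName = N'%Jones%';

-- FORCESCAN requires SQL Server 2008 R2 SP1 or later.
-- Combined with INDEX(...), only scan access paths through
-- IX_Study_PatientsName are considered.
SELECT COUNT(*)
FROM dbo.Study WITH (FORCESCAN, INDEX(IX_Study_PatientsName))
WHERE PatientsName LIKE @StudyPatientsName;
```

On builds without FORCESCAN, this statement fails to compile, and the `PatientsName + ''` workaround from answer #1 remains the way to rule out the seek.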
