为什么在SQL Server中使用游标被认为是不好的做法?

I knew of some performance reasons back in the SQL 7 days, but do the same issues still exist in SQL Server 2005? If I have a resultset in a stored procedure that I want to act upon individually, are cursors still a bad choice? If so, why?

我在SQL 7天前就知道一些性能原因，但是在SQL Server 2005中还存在同样的问题吗?如果我在存储过程中有一个要单独操作的resultset，那么游标仍然是一个糟糕的选择吗?如果是这样,为什么?

11 个解决方案

#1

Because cursors take up memory and create locks.

因为游标占用内存并创建锁。

What you are really doing is attempting to force set-based technology into non-set based functionality. And, in all fairness, I should point out that cursors do have a use, but they are frowned upon because many folks who are not used to using set-based solutions use cursors instead of figuring out the set-based solution.

您真正要做的是尝试将基于集合的技术强制为基于非集合的功能。而且，在所有的公平性中，我应该指出游标是有用处的，但是它们是不受欢迎的，因为许多不习惯使用基于集合的解决方案的人使用游标，而不是使用基于集合的解决方案。

But, when you open a cursor, you are basically loading those rows into memory and locking them, creating potential blocks. Then, as you cycle through the cursor, you are making changes to other tables and still keeping all of the memory and locks of the cursor open.

但是，当您打开游标时，基本上是将这些行加载到内存中并锁定它们，从而创建潜在的块。然后，当您循环遍历游标时，您正在对其他表进行更改，并且仍然保持游标的所有内存和锁处于打开状态。

All of which has the potential to cause performance issues for other users.

所有这些都有可能导致其他用户的性能问题。

So, as a general rule, cursors are frowned upon. Especially if that's the first solution arrived at in solving a problem.

因此，一般来说，游标是不受欢迎的。特别是如果这是解决问题的第一个方法。

#2

The above comments about SQL being a set-based environment are all true. However there are times when row-by-row operations are useful. Consider a combination of metadata and dynamic-sql.

上面关于SQL是基于集合的环境的评论都是正确的。然而，有时逐行操作是有用的。考虑元数据和动态sql的组合。

As a very simple example, say I have 100+ records in a table that define the names of tables that I want to copy/truncate/whatever. Which is best? Hardcoding the SQL to do what I need to? Or iterate through this resultset and use dynamic-SQL (sp_executesql) to perform the operations?

作为一个非常简单的例子，假设我在一个表中有100多个记录，它定义了我想要复制/截去的表的名称。哪个是最好的呢?硬编码SQL以完成我需要的任务?或者迭代这个resultset并使用dynamic-SQL (sp_executesql)执行操作?

There is no way to achieve the above objective using set-based SQL.

使用基于集合的SQL无法实现上述目标。

So, to use cursors or a while loop (pseudo-cursors)?

那么，要使用游标还是while循环(伪游标)呢?

SQL Cursors are fine as long as you use the correct options:

只要使用正确的选项，SQL游标就可以:

INSENSITIVE will make a temporary copy of your result set (saving you from having to do this yourself for your pseudo-cursor).

不敏感将生成结果集的临时副本(避免您自己为伪游标做此操作)。

READ_ONLY will make sure no locks are held on the underlying result set. Changes in the underlying result set will be reflected in subsequent fetches (same as if getting TOP 1 from your pseudo-cursor).

READ_ONLY将确保在底层结果集上没有锁。底层结果集的更改将反映在后续的获取中(与您的伪游标上的TOP 1相同)。

FAST_FORWARD will create an optimised forward-only, read-only cursor.

FAST_FORWARD将创建一个优化的只读游标。

Read about the available options before ruling all cursors as evil.

在将所有游标都视为邪恶之前，先阅读一下可用选项。

#3

There is a work around about cursors that I use every time I need one.

有一个关于游标的工作，每当我需要游标时我都会用到它。

I create a table variable with an identity column in it.

我创建了一个包含标识列的表变量。

insert all the data i need to work with in it.

在其中插入我需要处理的所有数据。

Then make a while block with a counter variable and select the data I want from the table variable with a select statement where the identity column matches the counter.

然后使用计数器变量创建一个while块，并使用select语句从表变量中选择我想要的数据，其中标识列与计数器匹配。

This way i dont lock anything and use alot less memory and its safe, i will not lose anything with a memory corruption or something like that.

这样我就不会锁定任何东西，使用更少的内存和它的安全性，我不会丢失任何东西，因为内存损坏或类似的东西。

And the block code is easy to see and handle.

而且块代码很容易看到和处理。

This is a simple example:

这是一个简单的例子:

DECLARE @TAB TABLE(ID INT IDENTITY, COLUMN1 VARCHAR(10), COLUMN2 VARCHAR(10))

DECLARE @COUNT INT,
        @MAX INT, 
        @CONCAT VARCHAR(MAX), 
        @COLUMN1 VARCHAR(10), 
        @COLUMN2 VARCHAR(10)

SET @COUNT = 1

INSERT INTO @TAB VALUES('TE1S', 'TE21')
INSERT INTO @TAB VALUES('TE1S', 'TE22')
INSERT INTO @TAB VALUES('TE1S', 'TE23')
INSERT INTO @TAB VALUES('TE1S', 'TE24')
INSERT INTO @TAB VALUES('TE1S', 'TE25')

SELECT @MAX = @@IDENTITY

WHILE @COUNT <= @MAX BEGIN
    SELECT @COLUMN1 = COLUMN1, @COLUMN2 = COLUMN2 FROM @TAB WHERE ID = @COUNT

    IF @CONCAT IS NULL BEGIN
        SET @CONCAT = '' 
    END ELSE BEGIN 
        SET @CONCAT = @CONCAT + ',' 
    END

    SET @CONCAT = @CONCAT + @COLUMN1 + @COLUMN2

    SET @COUNT = @COUNT + 1
END

SELECT @CONCAT

#4

I think cursors get a bad name because SQL newbies discover them and think "Hey a for loop! I know how to use those!" and then they continue to use them for everything.

我认为游标有一个不好的名字，因为SQL新手发现它们并认为“嘿，for loop!”我知道怎么用它们!”然后他们继续用它们做所有的事情。

If you use them for what they're designed for, I can't find fault with that.

如果你把它们用在它们设计的用途上，我找不出有什么错。

#5

SQL is a set based language--that's what it does best.

SQL是一种基于集合的语言——这是它最擅长的。

I think cursors are still a bad choice unless you understand enough about them to justify their use in limited circumstances.

我认为游标仍然是一个糟糕的选择，除非您对它们有足够的了解，以证明在有限的情况下使用它们是合理的。

Another reason I don't like cursors is clarity. The cursor block is so ugly that it's difficult to use in a clear and effective way.

我不喜欢游标的另一个原因是清晰度。游标块非常难看，因此很难以清晰有效的方式使用。

All that having been said, there are some cases where a cursor really is best--they just aren't usually the cases that beginners want to use them for.

说了这么多，有一些情况下光标真的是最好的——它们通常不是初学者想要使用它们的情况。

#6

Sometimes the nature of the processing you need to perform requires cursors, though for performance reasons it's always better to write the operation(s) using set-based logic if possible.

有时，您需要执行的处理的性质需要游标，但出于性能原因，最好尽可能使用基于集合的逻辑编写操作。

I wouldn't call it "bad practice" to use cursors, but they do consume more resources on the server (than an equivalent set-based approach) and more often than not they aren't necessary. Given that, my advice would be to consider other options before resorting to a cursor.

我不认为使用游标是一种“坏习惯”，但它们确实在服务器上消耗了更多的资源(而不是类似的基于集合的方法)，而且常常不是必需的。鉴于此，我的建议是在使用光标之前考虑其他选项。

There are several types of cursors (forward-only, static, keyset, dynamic). Each one has different performance characteristics and associated overhead. Make sure you use the correct cursor type for your operation. Forward-only is the default.

有几种类型的游标(只向前、静态、键集、动态)。每一个都有不同的性能特征和相关的开销。确保您在操作中使用了正确的游标类型。之所以是默认的。

One argument for using a cursor is when you need to process and update individual rows, especially for a dataset that doesn't have a good unique key. In that case you can use the FOR UPDATE clause when declaring the cursor and process updates with UPDATE ... WHERE CURRENT OF.

使用游标的一个参数是，当您需要处理和更新单个行时，特别是对于没有良好唯一键的数据集时。在这种情况下，您可以在声明游标时使用FOR UPDATE子句，并使用UPDATE…在当前的。

Note that "server-side" cursors used to be popular (from ODBC and OLE DB), but ADO.NET does not support them, and AFAIK never will.

注意，“服务器端”游标过去很流行(来自ODBC和OLE DB)，但是ADO。NET不支持他们，而AFAIK永远不会支持他们。

#7

There are very, very few cases where the use of a cursor is justified. There are almost no cases where it will outperform a relational, set-based query. Sometimes it is easier for a programmer to think in terms of loops, but the use of set logic, for example to update a large number of rows in a table, will result in a solution that is not only many less lines of SQL code, but that runs much faster, often several orders of magnitude faster.

在非常、非常少的情况下，使用游标是合理的。几乎没有哪种情况下，它的性能会优于基于关系、集的查询。有时是一个程序员更容易考虑循环,但是使用的逻辑,例如更新大量的表中的行,将导致一个解决方案,不仅是许多不如行SQL代码,但运行更快,经常快几个数量级。

Even the fast forward cursor in Sql Server 2005 can't compete with set-based queries. The graph of performance degradation often starts to look like an n^2 operation compared to set-based, which tends to be more linear as the data set grows very large.

甚至Sql Server 2005中的快进游标也无法与基于集合的查询相竞争。性能下降的曲线通常开始看起来像一个n ^ 2操作基于集合的相比,往往更为线性随着数据集非常大。

#8

@ Daniel P -> you don't need to use a cursor to do it. You can easily use set based theory to do it. Eg: with Sql 2008

你不需要用光标来做。你可以很容易地使用基于集合的理论来做。例如:Sql 2008

DECLARE @commandname NVARCHAR(1000) = '';

SELECT @commandname += 'truncate table ' + tablename + '; ';
FROM tableNames;

EXEC sp_executesql @commandname;

will simply do what you have said above. And you can do the same with Sql 2000 but the syntax of query would be different.

将简单地做你上面所说的。使用Sql 2000也可以这样做，但是查询的语法不同。

However, my advice is to avoid cursors as much as possible.

然而，我的建议是尽量避免游标。

Gayam

#9

Cursors do have their place, however I think it's mainly because they are often used when a single select statement would suffice to provide aggregation and filtering of results.

游标确实有自己的位置，但是我认为这主要是因为当一个select语句足以提供聚合和结果过滤时，常常使用游标。

Avoiding cursors allows SQL Server to more fully optimize the performance of the query, very important in larger systems.

避免游标允许SQL Server更充分地优化查询的性能，这在大型系统中非常重要。

#10

Cursors are usually not the disease, but a symptom of it: not using the set-based approach (as mentioned in the other answers).

游标通常不是疾病，而是疾病的症状:不使用基于集合的方法(如其他答案所述)。

Not understanding this problem, and simply believing that avoiding the "evil" cursor will solve it, can make things worse.

不理解这个问题，并且仅仅相信避免“邪恶”的光标将会解决它，会使事情变得更糟。

For example, replacing cursor iteration by other iterative code, such as moving data to temporary tables or table variables, to loop over the rows in a way like:

例如，用其他迭代代码替换游标迭代，例如将数据移动到临时表或表变量，以以下方式循环行:

SELECT * FROM @temptable WHERE Id=@counter

或

SELECT TOP 1 * FROM @temptable WHERE Id>@lastId

Such an approach, as shown in the code of another answer, makes things much worse and doesn't fix the original problem. It's an anti-pattern called cargo cult programming: not knowing WHY something is bad and thus implementing something worse to avoid it! I recently changed such code (using a #temptable and no index on identity/PK) back to a cursor, and updating slightly more than 10000 rows took only 1 second instead of almost 3 minutes. Still lacking set-based approach (being the lesser evil), but the best I could do that moment.

这样的方法，如另一个答案的代码所示，会使事情变得更糟，而且无法解决最初的问题。这是一种反模式，称为货物崇拜编程:不知道为什么某些东西是坏的，因此实现了一些更坏的东西来避免它!我最近将这类代码(使用#temptable，没有标识/PK上的索引)更改为游标，并且更新稍微超过10000行只需要1秒，而不是3分钟。仍然缺乏基于集合的方法(小的邪恶)，但我能做的最好的那一刻。

Another symptom of this lack of understanding can be what I sometimes call "one object disease": database applications which handle single objects through data access layers or object-relational mappers. Typically code like:

这种缺乏理解的另一个症状可能是我有时称之为“一种对象疾病”:通过数据访问层或对象关系映射器处理单个对象的数据库应用程序。通常代码:

var items = new List<Item>();
foreach(int oneId in itemIds)
{
    items.Add(dataAccess.GetItemById(oneId);
}

instead of

而不是

var items = dataAccess.GetItemsByIds(itemIds);

The first will usually flood the database with tons of SELECTs, one round trip for each, especially when object trees/graphs come into play and the infamous SELECT N+1 problem strikes.

第一个通常会向数据库注入大量的选择，每一次都有一次往返，特别是当对象树/图形开始发挥作用，以及臭名昭著的SELECT N+1问题的时候。

This is the application side of not understanding relational databases and set based approach, just the same way cursors are when using procedural database code, like T-SQL or PL/SQL!

这是不理解关系数据库和基于集合的方法的应用程序方面，就像使用过程数据库代码时游标一样，比如T-SQL或PL/SQL!

#11

The basic issue, I think, is that databases are designed and tuned for set-based operations -- selects, updates, and deletes of large amounts of data in a single quick step based on relations in the data.

我认为，最基本的问题是，数据库是为基于集的操作设计和调优的——根据数据中的关系，在一个快速的步骤中选择、更新和删除大量数据。

In-memory software, on the other hand, is designed for individual operations, so looping over a set of data and potentially performing different operations on each item serially is what it is best at.

另一方面，内存中的软件是为单独的操作而设计的，所以它最擅长的是在一组数据上循环，并可能对每个项目串行地执行不同的操作。

Looping is not what the database or storage architecture are designed for, and even in SQL Server 2005, you are not going to get performance anywhere close to you get if you pull the basic data set out into a custom program and do the looping in memory, using data objects/structures that are as lightweight as possible.

循环不是数据库或存储架构设计,甚至在SQL Server 2005中,你不会得到性能接近你如果你把基本数据集到一个自定义的程序在内存中做循环,使用数据对象/结构尽可能的轻。

#1