Does @@ROWCOUNT after an UPDATE reliably measure *matching* rows?

Time: 2021-09-11 17:00:29

Does @@ROWCOUNT reliably tell you how many rows matched the WHERE clause in an UPDATE, as opposed to how many were actually changed by it?

In the documentation for @@ROWCOUNT it says:

Data manipulation language (DML) statements set the @@ROWCOUNT value to the number of rows *affected* by the query and return that value to the client.

(My emphasis.)

But if I have

CREATE TABLE [Foo] ([a] INT, [b] INT)
GO
INSERT INTO [Foo] ([a], [b]) VALUES (1, 1),(1, 2),(1, 3),(2, 2)
GO
UPDATE [Foo] SET [b] = 1 WHERE [a] = 1
SELECT @@ROWCOUNT
GO

...I see 3 (the number of rows matching [a] = 1), not 2 (the number of rows modified by the UPDATE — one of the three rows already had the value 1 for b). This seems like an odd definition of "affected" (not wrong, just at odds with how I'd normally use the word — it's actually quite handy for what I want to do, in fact).

(The similar MySQL ROW_COUNT function, for instance, would return 2 in this situation.)

Is this reliable behavior, ideally documented somewhere I just haven't found? Or are there odd edge cases...

To be clear: I'm not asking if 3 is the right answer. I'm asking if it's a reliable answer, or are there edge cases where SQL Server will leave out rows that matched but didn't require a change.

Update: A couple of people have asked (or hinted at) what kind of "reliability" issues I'm worried about. The fact is they're quite nebulous, but, dunno, replication? Transactions? Partitioning? Indexes it could use to avoid seeking to rows because it knows that b is already 1, and so it skips those? ...?

Update: I was hoping for someone with a more "insider" view of how SQL Server works to answer this question, but it looks like the triggers example (and others I've played with) by xacinay is as close as we're going to get. And it seems pretty darned solid; if it behaves that way in the normal case and it didn't behave that way despite partitioning or whatsit, as someone said, surely that would qualify as a bug. It's just empirical rather than academic.

4 answers

#1


9  

The documentation for @@ROWCOUNT is telling you the truth: 3 rows are reliably affected, in contrast to what MySQL's ROW_COUNT() reports.

not 2 (the number of rows modified by the UPDATE — one of the three rows already had the value 1 for b).

For UPDATE it does not matter whether the new and previous values are identical. It simply does what it's told to: it finds the data source, filters rows according to the provided condition, and applies the 'set' changes to the filtered rows.

That's the way SQL Server works, without any reservations. MySQL may work differently. A row counting procedure is not part of the SQL standard, so you have to look before you leap for these kinds of artefacts every time you switch from one RDBMS to another.

Some triggers to see actual update behaviour:

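-- FOR UPDATE behaves the same as AFTER UPDATE in SQL Server.
CREATE TRIGGER [dbo].[trgFooForUpd]
ON [dbo].[Foo]
FOR UPDATE
AS
BEGIN
    -- INSERTED holds the new image of every row the UPDATE touched.
    SELECT * FROM INSERTED;
END;
GO
CREATE TRIGGER [dbo].[trgFooAfterUpd]
ON [dbo].[Foo]
AFTER UPDATE
AS
    -- Read @@ROWCOUNT first thing in the trigger, before another statement resets it.
    PRINT 'update done for ' + CAST(COALESCE(@@ROWCOUNT, -1) AS VARCHAR(12)) + ' rows';
GO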

#2


7  

To expand on xacinay's answer, which is correct:

You have 3 rows changed and therefore @@ROWCOUNT is accurate. SQL Server changes all matched rows; it does not verify that a value is in fact different before changing it, as that would add a lot of overhead to update commands. Just imagine having to check a VARCHAR(MAX) column to see whether the value was actually changed or not.

The easiest way to illustrate this is to change your UPDATE query to something like this:

UPDATE [Foo]  SET [b] = 1
OUTPUT INSERTED.b
WHERE [a] = 1

It will output 3 rows from INSERTED, the 'pseudo' table that holds the new values for a given update/insert command. The fact that the value is already b = 1 in one instance does not matter.
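
As a minimal sketch of the same idea (not part of the original answer), the OUTPUT clause can also capture the old and new values side by side, so the matched count and the actually-changed count can be compared for a single statement. It assumes the [Foo] table from the question; the table variable @changes, the variable @matched and the column aliases are made-up names for illustration.

```sql
-- Sketch only: assumes the [Foo] table from the question.
-- @changes, @matched and the aliases below are hypothetical names.
DECLARE @changes TABLE (old_b INT, new_b INT);

UPDATE [Foo]
SET [b] = 1
OUTPUT DELETED.[b], INSERTED.[b] INTO @changes (old_b, new_b)
WHERE [a] = 1;

-- Capture @@ROWCOUNT right away; later statements overwrite it.
DECLARE @matched INT = @@ROWCOUNT;

SELECT
    @matched AS matched_rows,   -- rows that satisfied the WHERE clause
    COUNT(*) AS output_rows,    -- rows reported through OUTPUT
    -- A NULL old value compares as "not equal" here and so counts as changed.
    SUM(CASE WHEN old_b = new_b THEN 0 ELSE 1 END) AS changed_rows
FROM @changes;
```

Run against the question's four rows as originally inserted, this should report 3 matched rows, of which 2 held a different value before the update.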

If you want that to matter you'll need to include it in your WHERE clause:

UPDATE [Foo]  SET [b] = 1
WHERE [a] = 1 AND [b] <> 1
SELECT @@ROWCOUNT

Alternatively, and as a more general way of doing this check, you can create a trigger that compares the values in the DELETED table with the values in the INSERTED table, and use that as the basis for deciding whether a row has actually 'changed'.
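
A rough sketch of such a trigger follows. It is an illustration, not the answerer's code, and it rests on one loud assumption: that [Foo] has a key column, here called [id], that the UPDATE does not modify. The question's table has no such column, and without one the old and new row images cannot be paired up. The trigger name is likewise hypothetical.

```sql
-- Sketch only: assumes [Foo] has a hypothetical, never-updated key column [id]
-- (the question's table has none) so old and new row images can be paired.
CREATE TRIGGER [dbo].[trgFooCountChanged]
ON [dbo].[Foo]
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @matched INT, @changed INT;

    -- INSERTED contains one row per row touched by the UPDATE.
    SELECT @matched = COUNT(*) FROM INSERTED;

    -- EXCEPT treats two NULLs as equal, so NULL -> NULL is not counted as a change.
    SELECT @changed = COUNT(*)
    FROM INSERTED AS i
    JOIN DELETED AS d ON d.[id] = i.[id]
    WHERE EXISTS (SELECT i.[a], i.[b] EXCEPT SELECT d.[a], d.[b]);

    PRINT 'matched: ' + CAST(@matched AS VARCHAR(12))
        + ', changed: ' + CAST(@changed AS VARCHAR(12));
END;
```

The EXCEPT comparison is used instead of `i.[b] <> d.[b]` because a plain inequality silently skips rows where either side is NULL.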

So 3 is the accurate number: you updated 3 rows, because 3 rows matched [a] = 1.

#3


3  

I think the documentation is correct: even though one row in your example already had 1 as the value in [b], that row still satisfied the criteria in the WHERE clause, so the value was 'updated'.

We can see proof of this by extending your example a little and including a TIMESTAMP column, as in this SQLFiddle. After the update, the TIMESTAMP on all rows that were matched by the WHERE clause has changed, indicating that each row itself was changed and not merely assessed and discarded because the target value matched the one already present.
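
Since the fiddle itself is not reproduced here, the following is a guess at what such an experiment might look like rather than the answerer's actual script. The table name [FooRv], the column name [rv] and the output aliases are invented for illustration; ROWVERSION is the current name for the TIMESTAMP data type in SQL Server.

```sql
-- Sketch only: [FooRv], [rv] and the aliases are hypothetical names.
CREATE TABLE [FooRv] ([a] INT, [b] INT, [rv] ROWVERSION);

INSERT INTO [FooRv] ([a], [b]) VALUES (1, 1), (1, 2), (1, 3), (2, 2);

-- Show the row version before and after the update for each matched row.
UPDATE [FooRv]
SET [b] = 1
OUTPUT DELETED.[b]  AS old_b,
       INSERTED.[b] AS new_b,
       DELETED.[rv]  AS old_rv,
       INSERTED.[rv] AS new_rv
WHERE [a] = 1;
```

If the observation in this answer holds, old_rv and new_rv should differ on all three returned rows, including the one whose [b] was already 1.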

#4


0  

In summary, you are asking whether @@ROWCOUNT is deterministic. If it were non-deterministic, don't you think you would see that in the documentation? It is reasonable to assume it is deterministic. @@VERSION and @@MAX_PRECISION are not documented as deterministic either; on that basis, are you questioning whether there are edge cases where they are not deterministic? I seriously doubt there is an edge case where it fails, but if there is, Microsoft will accept it as a bug. They are not going to come back with "@@ROWCOUNT is nondeterministic; the documentation did not explicitly state that it is deterministic."

There are numerous examples showing that "affected" in MSSQL T-SQL counts a row set to the same value (it only cares about the WHERE clause).
And you have examples showing that MSSQL does in fact assign the same value (timestamp).
You ask: "But SQL Server could care, so how do I know there is not an edge case where it does?"
Because that is not reasonable behavior.
Any program should produce consistent output.
The order without a sort is not guaranteed, although the rows are the same; that is specifically documented by Microsoft and SQL as nondeterministic.
If @@ROWCOUNT were nondeterministic, I trust Microsoft would document that.
It is reasonable to assume @@ROWCOUNT is deterministic.
C# and Java do not always have the same behavior.
That is not a reason for me to suspect that C# and Java are not reliable.

Look at the query plan in MSSQL.
There is a predicate on [a].
There is no predicate on [b].

If you change the query to

UPDATE [Foo] SET [b] = 1 WHERE [a] <> 1 and [b] = 1;

Then you will see predicates on both [a] and [b].

The query optimizer decides how to process the query most efficiently; it is not going to change the query.
Introducing a predicate on [b] in the first query would be changing the query.
A proper database just does not do that.

I strongly suspect that in MySQL, if you look at the query plan, there will be a predicate on [b] in the first query.

Rather than requiring proof that it is reliable, create an edge case that proves it is not reliable.
If you can create a situation where it is not reliable, I submit that Microsoft would accept it as a bug.

What is the business case?
If you have a business case where a value must be updated with the same value, then by definition you have something to test.
The only examples I can think of are a timestamp or a trigger.
If you have a true need for an update to the same value to take place, then you have something to measure.
Do you have any evidence that a value is not updated to the same value?
If you still don't trust it, then wrap it in a transaction.

If you don't need an update to the same value, then why not just add [b] <> 1? That is more efficient.

SO is for specific programming questions.
What is the programming question?
"Prove MSSQL is reliable" is not a programming question.
