Should a WHERE clause be added when updating with a string replacement?

Time: 2022-09-10 23:35:17

I want to perform a string replacement on an entire column, changing all instances of one phrase to another:

UPDATE `some_records`
SET `some_column` = REPLACE(`some_column`, 'foo', 'bar');

Since many of the rows do not contain the string 'foo' they will be unaffected by this query, which is fine; I only care about the rows that do contain it. My question is, is there any reason to add a WHERE clause to explicitly target the rows that will be affected? e.g.

UPDATE `some_records`
SET `some_column` = REPLACE(`some_column`, 'foo', 'bar')
WHERE `some_column` LIKE '%foo%';

As far as I can tell, both queries have the exact same effect. Is there any advantage to the 2nd version? Does it provide better performance or any other benefits? So far I haven't found documentation to say one is better than the other.

2 Answers

#1


If there's a BEFORE/AFTER UPDATE trigger defined on the table, the difference in the queries is whether the trigger is fired for all rows in the table, or just the rows that satisfy the predicate in the WHERE clause.
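
To see the trigger difference for yourself, something like the following could be tried on a scratch copy of the table. This is only a sketch: the audit table `some_records_audit` and the trigger name `some_records_au` are made up for illustration and are not part of the question.

```sql
-- Hypothetical audit table, used only to record trigger invocations.
CREATE TABLE some_records_audit (
  id INT AUTO_INCREMENT PRIMARY KEY,
  old_value TEXT,
  new_value TEXT
);

-- Hypothetical AFTER UPDATE trigger: logs one audit row per processed row.
CREATE TRIGGER some_records_au
AFTER UPDATE ON some_records
FOR EACH ROW
  INSERT INTO some_records_audit (old_value, new_value)
  VALUES (OLD.some_column, NEW.some_column);

-- Without WHERE: the trigger runs for every row in some_records,
-- including rows that REPLACE() leaves unchanged.
UPDATE some_records
SET some_column = REPLACE(some_column, 'foo', 'bar');

-- With WHERE: the trigger runs only for rows that contain 'foo'.
UPDATE some_records
SET some_column = REPLACE(some_column, 'foo', 'bar')
WHERE some_column LIKE '%foo%';

-- Compare how many audit rows each statement produced.
SELECT COUNT(*) FROM some_records_audit;
```

Running each UPDATE against a fresh copy of the data and comparing the audit counts should show the difference described above.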

Otherwise, in MySQL, these two queries are equivalent. MySQL doesn't count (or report) a row as being "affected" by an UPDATE if the value assigned to the column is identical to the value already in the column. (Other relational databases do count such rows in the "affected" count.)

Because of the leading percent sign in the LIKE comparison, that condition has to be evaluated for every row in the table, so there's generally not going to be any difference in performance. If there's an index on some_records(some_column), MySQL might choose to do a full index scan, which might be slightly faster in some cases.
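
Rather than taking the performance and "affected rows" points on faith, both can be checked on your own data. A minimal sketch, assuming MySQL 5.6 or later (where EXPLAIN accepts UPDATE statements) and the table from the question:

```sql
-- Compare the access method MySQL picks for each form of the statement.
EXPLAIN UPDATE some_records
SET some_column = REPLACE(some_column, 'foo', 'bar');

EXPLAIN UPDATE some_records
SET some_column = REPLACE(some_column, 'foo', 'bar')
WHERE some_column LIKE '%foo%';

-- Run the update, then ask how many rows were actually changed.
-- By default MySQL reports only rows whose stored value changed.
UPDATE some_records
SET some_column = REPLACE(some_column, 'foo', 'bar');
SELECT ROW_COUNT();
```

The plans depend on the table definition, indexes, and data distribution, so no expected output is shown here.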

If you're familiar with other relational databases (Oracle, SQL Server, et al.) then adding the WHERE clause is second nature.

Aside from those issues, it doesn't really matter if you add the WHERE clause or not.

The reasons I can see for bothering to add a WHERE clause:

  • avoid firing BEFORE/AFTER UPDATE triggers
  • familiar pattern used in other relational databases
  • possibly improved performance (if the rows are really long, if the index is much, much shorter, and a small fraction of the rows will satisfy the condition)

#2


AFAIK, if you have an index on a column which is used as a condition in the WHERE clause it should speed up the lookup of the rows which are supposed to be updated.

If you don't have a WHERE clause, the database by default reads all the rows from disk and then does the replace. For strings that don't qualify for the replacement, that is an unnecessary read from disk.
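
Whether an index actually helps depends on the pattern, so it is worth checking the plan rather than assuming. A rough sketch, assuming some_column is a VARCHAR and using a hypothetical index name `idx_some_column` (neither is stated in the question): a B-tree index can narrow the search when the pattern begins with a literal prefix, but not when it begins with a wildcard.

```sql
-- Hypothetical index; a TEXT column would need a prefix length here.
CREATE INDEX idx_some_column ON some_records (some_column);

-- Literal prefix: the index can be used for a range scan.
EXPLAIN SELECT * FROM some_records WHERE some_column LIKE 'foo%';

-- Leading wildcard, as in the question: every row (or every index
-- entry) still has to be examined.
EXPLAIN SELECT * FROM some_records WHERE some_column LIKE '%foo%';
```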
