SQL Server: PIVOT vs. multiple JOINs

时间:2022-10-19 09:30:37

Which is more efficient in SQL Server 2005: PIVOT or multiple JOINs?

For example, I have this query, which uses two joins:

SELECT p.name, pc1.code as code1, pc2.code as code2
FROM product p
    INNER JOIN product_code pc1
    ON p.product_id=pc1.product_id AND pc1.type=1
    INNER JOIN product_code pc2
    ON p.product_id=pc2.product_id AND pc2.type=2

I can do the same using PIVOT:

SELECT name, [1] as code1, [2] as code2
FROM (
    SELECT p.name, pc.type, pc.code
    FROM product p
        INNER JOIN product_code pc
        ON p.product_id=pc.product_id
    WHERE pc.type IN (1,2)) prods1
PIVOT(
    MAX(code) FOR type IN ([1], [2])) prods2

Which one will be more efficient?

2 answers

#1


27  

The answer will of course be "it depends" but based on testing this end...

Assuming

  1. 1 million products
  2. product has a clustered index on product_id
  3. Most (if not all) products have corresponding information in the product_code table
  4. Ideal indexes present on product_code for both queries.

The PIVOT version ideally needs an index product_code(product_id, type) INCLUDE (code) whereas the JOIN version ideally needs an index product_code(type,product_id) INCLUDE (code)
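
As a rough sketch, the two indexes described above could be created like this. The index names are made up for illustration; only the key columns and the included column come from the answer.

```sql
-- Hypothetical index names; key order and INCLUDE column follow the answer's description.

-- For the PIVOT query: rows for one product are adjacent, ordered by type.
CREATE NONCLUSTERED INDEX IX_product_code_product_type
    ON product_code (product_id, type)
    INCLUDE (code);

-- For the JOIN query: all rows of one type are adjacent, ordered by product_id.
CREATE NONCLUSTERED INDEX IX_product_code_type_product
    ON product_code (type, product_id)
    INCLUDE (code);
```

Both indexes cover their query, since code is carried at the leaf level and no lookup into the base table is needed.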

If these are in place giving the plans below

[Figure: execution plans for the PIVOT and JOIN queries]

then the JOIN version is more efficient.

If type 1 and type 2 are the only types in the table, the PIVOT version has a slight edge in the number of reads because it doesn't have to seek into product_code twice, but that is more than outweighed by the additional overhead of the stream aggregate operator.
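
To see where the aggregate comes from, it may help to write the PIVOT query as an explicit GROUP BY with CASE expressions. This is only an illustrative rewrite of the question's query, not something taken from the answer, and the optimizer is not guaranteed to produce an identical plan for it.

```sql
-- Manual equivalent of the PIVOT query: one pass over product_code,
-- then an aggregate that collapses the type 1 and type 2 rows into one row per name.
SELECT p.name,
       MAX(CASE WHEN pc.type = 1 THEN pc.code END) AS code1,
       MAX(CASE WHEN pc.type = 2 THEN pc.code END) AS code2
FROM product p
    INNER JOIN product_code pc
    ON p.product_id = pc.product_id
WHERE pc.type IN (1, 2)
GROUP BY p.name;
```

Note that this form, like the PIVOT, groups by name and returns a row when only one of the two types exists (with NULL for the other), whereas the two-join query returns only products that have both types.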

PIVOT

Table 'product_code'. Scan count 1, logical reads 10467
Table 'product'. Scan count 1, logical reads 4750
   CPU time = 3297 ms,  elapsed time = 3260 ms.

JOIN

Table 'product_code'. Scan count 2, logical reads 10471
Table 'product'. Scan count 1, logical reads 4750
   CPU time = 1906 ms,  elapsed time = 1866 ms.
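
Figures in this format come from SQL Server's STATISTICS IO and STATISTICS TIME session settings. A minimal sketch for collecting them on your own data, using the two queries from the question:

```sql
SET STATISTICS IO ON;    -- reports scan count and logical reads per table
SET STATISTICS TIME ON;  -- reports CPU time and elapsed time

-- JOIN version
SELECT p.name, pc1.code AS code1, pc2.code AS code2
FROM product p
    INNER JOIN product_code pc1
    ON p.product_id = pc1.product_id AND pc1.type = 1
    INNER JOIN product_code pc2
    ON p.product_id = pc2.product_id AND pc2.type = 2;

-- PIVOT version
SELECT name, [1] AS code1, [2] AS code2
FROM (
    SELECT p.name, pc.type, pc.code
    FROM product p
        INNER JOIN product_code pc
        ON p.product_id = pc.product_id
    WHERE pc.type IN (1, 2)) prods1
PIVOT(
    MAX(code) FOR type IN ([1], [2])) prods2;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The numbers appear in the Messages output rather than in the result grid, and timings will vary with hardware and cache state, so run each query more than once before comparing.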

If there are additional type records other than 1 and 2, the JOIN version will increase its advantage: it just does merge joins on the relevant sections of the (type, product_id) index, whereas the PIVOT plan uses the (product_id, type) index and so has to scan over the additional type rows that are intermingled with the type 1 and type 2 rows.

#2


5  

I don't think anyone can tell you which will be more efficient without knowledge of your indexing and table size.

That said, rather than hypothesizing about which is more efficient, you should analyze the execution plans of these two queries.
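
One way to get those plans as text, sketched here for the JOIN query only: SET SHOWPLAN_XML returns the estimated plan without running the statement. GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement, and is needed because SET SHOWPLAN_XML must be the only statement in its batch.

```sql
SET SHOWPLAN_XML ON;
GO

-- The query is compiled but not executed; the result is its estimated plan as XML.
SELECT p.name, pc1.code AS code1, pc2.code AS code2
FROM product p
    INNER JOIN product_code pc1
    ON p.product_id = pc1.product_id AND pc1.type = 1
    INNER JOIN product_code pc2
    ON p.product_id = pc2.product_id AND pc2.type = 2;
GO

SET SHOWPLAN_XML OFF;
GO
```

Repeat with the PIVOT query and compare the operators and estimated costs. To capture the actual plan with runtime row counts instead, SET STATISTICS XML ON can be used in the same way; it executes the query.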
