SQL Server 2005 T-SQL问题:您能信任查询优化器吗?我知道我不能!

时间:2023-01-21 04:01:02

This question is linked to my previous one ( posted as an anonymous user - now I have an account ) and, before I begin, I would like to give the credit to Rob Farley for providing the right indexing schema.

这个问题与我之前的问题相关联(以匿名用户身份发布 - 现在我有一个帐户),在开始之前,我想向Rob Farley提供正确的索引架构。

But the problem is not the indexing schema.

It's the Query Optimizer !

The query :

查询:

SELECT s.ID_i
     , s.ShortName_v
     , sp.Path_v
     , ( SELECT TOP 1 1         -- has also user access on subsites ?
           FROM SitePath_T usp
              , UserSiteRight_t usr
          WHERE usr.SiteID_i = usp.SiteID_i
            AND usp.Path_v LIKE sp.Path_v + '%_'
            AND usr.UserID_i = 1 )
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp
                     , UserSiteRight_t usr
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = 1 )

... runs in :

...运行于:

CPU   Reads  Writes Duration
2073  49572  0      2241      -- more than 2 sec

Execution plan :

执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Expr1016], [Expr1017], [Expr1018], [Expr1019]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID_<Path>] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1020]) WITH UNORDERED PREFETCH)
            |         |--Clustered Index Scan(OBJECT:([dbo].[SitePath_T].[IDXC_SitePath_Path+SiteID] AS [usp]), WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]))
            |         |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1021]) WITH UNORDERED PREFETCH)
                           |--Clustered Index Scan(OBJECT:([dbo].[SitePath_T].[IDXC_SitePath_Path+SiteID] AS [usp]), WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'))
                           |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)

But if I enforce the indexes, the following query :

但是,如果我强制执行索引,则以下查询:

SELECT s.ID_i
     , s.ShortName_v
     , sp.Path_v
     , ( SELECT TOP 1 1        -- has also user access on subsites ?
           FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) )
                               -- same performance when using WITH ( INDEX ( [IDX_SitePath_Path_INC<SiteID>] ) )
              , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
          WHERE usr.SiteID_i = usp.SiteID_i
            AND usp.Path_v LIKE sp.Path_v + '%_'
            AND usr.UserID_i = 1)
  FROM Site_T s
     , SitePath_T sp WITH ( INDEX ( [IDX_SitePath_SiteID+Path] ) )
                     -- same performance when using WITH ( INDEX ( [IDX_SitePath_SiteID_INC<Path>] ) )
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
                                      -- same performance when using WITH ( INDEX ( [IDX_SitePath_Path_INC<SiteID>] ) )
                     , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = 1 )

will run in :

将运行于:

CPU  Reads  Writes  Duration
50   11237  0       55

the duration will drop to 55 milliseconds ( from more than 2 sec ) !!!!

持续时间将降至55毫秒(超过2秒)!!!!

And I'm happy with this result !

Execution plan :

执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, OUTER REFERENCES:([Expr1016], [Expr1017], [Expr1018], [Expr1019]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID_<Path>] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1023]) WITH UNORDERED PREFETCH)
            |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1017], [Expr1018], [Expr1019]))
            |         |    |--Compute Scalar(DEFINE:([Expr1017]=[Expr1017], [Expr1018]=[Expr1018], [Expr1019]=[Expr1019]))
            |         |    |    |--Constant Scan
            |         |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1017] AND [usp].[Path_v] < [Expr1018]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]) ORDERED FORWARD)
            |         |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, OUTER REFERENCES:([usp].[SiteID_i], [Expr1027]) WITH UNORDERED PREFETCH)
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1024], [Expr1025], [Expr1026]))
                           |    |--Compute Scalar(DEFINE:([Expr1024]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1025]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1026]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_')))
                           |    |    |--Constant Scan
                           |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1024] AND [usp].[Path_v] < [Expr1025]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_') ORDERED FORWARD)
                           |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=(1) AND [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]) ORDERED FORWARD)

The next step is to run it for different users, thus I will declare UserID_i as a variable :

下一步是为不同的用户运行它,因此我将UserID_i声明为变量:

DECLARE @UserID_i INT 
SELECT @UserID_i = 1

BUT NOW THE BELOW QUERY BECOMES CRAZY SLOW !!!

但现在以下的查询变得疯狂!

SELECT s.ID_i
  , s.ShortName_v
  , sp.Path_v
  , ( SELECT TOP 1 1        -- has also user access on subsites ?
        FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
           , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
       WHERE usr.SiteID_i = usp.SiteID_i
         AND usp.Path_v LIKE sp.Path_v + '%_'
         AND usr.UserID_i = @UserID_i)
  FROM Site_T s
     , SitePath_T sp WITH ( INDEX ( [IDX_SitePath_SiteID+Path] ) )
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
   AND EXISTS ( SELECT *
                  FROM SitePath_T usp WITH ( INDEX ( [IDX_SitePath_Path+SiteID] ) ) 
                     , UserSiteRight_t usr WITH ( INDEX ( [IDX_UserSiteRight_UserID+SiteID] ) )
                 WHERE usr.SiteID_i = usp.SiteID_i
                   AND usp.Path_v LIKE sp.Path_v + '%'
                   AND usr.UserID_i = @UserID_i )

The duration is now over 7 seconds !!!

持续时间现在超过7秒!

CPU     Reads   Writes  Duration
7421    149984  35      7625

And the execution plan :

和执行计划:

  |--Compute Scalar(DEFINE:([Expr1014]=[Expr1014]))
       |--Nested Loops(Left Outer Join, OUTER REFERENCES:([sp].[Path_v]))
            |--Nested Loops(Left Semi Join, WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1016]))
            |    |--Merge Join(Inner Join, MERGE:([sp].[SiteID_i])=([s].[ID_i]), RESIDUAL:([dbo].[SitePath_T].[SiteID_i] as [sp].[SiteID_i]=[dbo].[Site_T].[ID_i] as [s].[ID_i]))
            |    |    |--Compute Scalar(DEFINE:([Expr1016]=[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1017]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1018]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1019]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
            |    |    |    |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_SiteID+Path] AS [sp]), ORDERED FORWARD)
            |    |    |--Sort(ORDER BY:([s].[ID_i] ASC))
            |    |         |--Clustered Index Seek(OBJECT:([dbo].[Site_T].[IDXC_Site_ParentID+ShortName+ID] AS [s]), SEEK:([s].[ParentID_i]=(1) AND [s].[ShortName_v] >= '9þþþþþ' AND [s].[ShortName_v] < 'Z'),  WHERE:([dbo].[Site_T].[ShortName_v] as [s].[ShortName_v] like '[a-y]%') ORDERED FORWARD)
            |    |--Table Spool
            |         |--Hash Match(Inner Join, HASH:([usr].[SiteID_i])=([usp].[SiteID_i]))
            |              |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=[@UserID_i]) ORDERED FORWARD)
            |              |--Index Scan(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]))
            |--Compute Scalar(DEFINE:([Expr1014]=(1)))
                 |--Top(TOP EXPRESSION:((1)))
                      |--Nested Loops(Inner Join, WHERE:([dbo].[UserSiteRight_T].[SiteID_i] as [usr].[SiteID_i]=[dbo].[SitePath_T].[SiteID_i] as [usp].[SiteID_i]))
                           |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1020], [Expr1021], [Expr1022]))
                           |    |--Compute Scalar(DEFINE:([Expr1020]=LikeRangeStart([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1021]=LikeRangeEnd([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_'), [Expr1022]=LikeRangeInfo([dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_')))
                           |    |    |--Constant Scan
                           |    |--Index Seek(OBJECT:([dbo].[SitePath_T].[IDX_SitePath_Path+SiteID] AS [usp]), SEEK:([usp].[Path_v] > [Expr1020] AND [usp].[Path_v] < [Expr1021]),  WHERE:([dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%_') ORDERED FORWARD)
                           |--Table Spool
                                |--Index Seek(OBJECT:([dbo].[UserSiteRight_T].[IDX_UserSiteRight_UserID+SiteID] AS [usr]), SEEK:([usr].[UserID_i]=[@UserID_i]) ORDERED FORWARD)

The execution plan is changing completely when I'm using a variable instead of hard coding the UserID_i value !

当我使用变量而不是硬编码UserID_i值时,执行计划正在完全改变!

Why the query optimizer is behaving like this ?

How can I enforce the execution plan to be the same as the second fast query ?

Thank you.

谢谢。


UPDATE 1


Deleted ( irrelevant )

删除(不相关)


UPDATE 2


It seems that I am not the only one having this problem.

看来我不是唯一一个遇到这个问题的人。

Please check the following topics :
Why does the SqlServer optimizer get so confused with parameters?
Known issue?: SQL Server 2005 stored procedure fails to complete with a parameter

请检查以下主题:为什么SqlServer优化器与参数混淆?已知问题?:SQL Server 2005存储过程无法使用参数完成


UPDATE 3


An excellent article from SQL Server Query Optimization Team covering parameter sniffing : I Smell a Parameter !

SQL Server查询优化团队的一篇优秀文章,涵盖了参数嗅探:我闻到了一个参数!

4 个解决方案

#1


2  

Is there a reason you can't use index hints (as in your second query) when you're using the variable (in the third query)? It's odd that the query optimizer makes such a bad decision when there's an index available, but it only knows a limited amount about your data, and it chooses as best it can.

当您使用变量(在第三个查询中)时,是否有理由不能使用索引提示(如在第二个查询中)?奇怪的是,当有可用的索引时,查询优化器会做出如此糟糕的决定,但它只知道有关您数据的数量有限,并且它会尽可能地选择。

Some statistics on the indexed columns might help you out, actually - they keep track of the data, the data layout, and some other information about what the table actually contains, whereas the indexes themselves are only built on top of the table metadata, and the query optimizer doesn't choose on the data itself (unless there are statistics there to help it do so).

有关索引列的一些统计信息可能会帮助您,实际上 - 它们会跟踪数据,数据布局以及有关表实际包含内容的其他一些信息,而索引本身仅构建在表元数据之上,并且查询优化器不会选择数据本身(除非有统计数据可以帮助它这样做)。

Have you run the "Database Tuning Advisor" on the query? Highlighting the query and selecting "Analyze Query in Database Engine Tuning Advisor" from the "Query" menu in SSMS will use the table data to suggest some statistics for you - that might make a huge difference.

您是否在查询中运行“数据库调优顾问”?突出显示查询并从SSMS中的“查询”菜单中选择“分析数据库引擎优化顾问中的查询”将使用表数据为您建议一些统计信息 - 这可能会产生巨大的差异。

#2


1  

After reading the above articles ( provided in Update 2 and Update 3 ) I finally understood more about how Sql Server is treating / caching execution plans.

在阅读了上述文章(在Update 2和Update 3中提供)后,我终于更了解了Sql Server如何处理/缓存执行计划。

Adding OPTION ( RECOMPILE ) at the end of my SELECT statements will force Sql Server to recalculate the execution plan ( and not using the cached one ) every time the query will be run, thus choosing the best plan matching the variable.

在SELECT语句结束时添加OPTION(RECOMPILE)将强制Sql Server在每次运行查询时重新计算执行计划(而不是使用缓存的计划),从而选择与变量匹配的最佳计划。

#3


0  

As Peter was asking why I did not use recursion, I am providing below the recursive cte which will return the correct result :

当Peter问我为什么不使用递归时,我在递归cte下面提供了返回正确结果:

; WITH Site_R AS (
SELECT s.ID_i
  , sp.Path_v
     , s.ID_i AS SubSiteID_i
  , sp.Path_v AS SubPath_v
  , 0 AS Depth_i
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
 UNION ALL
SELECT sr.ID_i
  , sr.Path_v
     , s.ID_i
  , sp.Path_v
  , Depth_i+1
  FROM Site_T s
  , Site_R sr
  , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ParentID_i = sr.SubSiteID_i
)
SELECT us.*
     , ( SELECT usr.UserID_i FROM UserSiteRight_T usr WHERE usr.SiteID_i = us.SubSiteID_i AND UseriD_i = 1 ) AS UserID_i
  FROM Site_R us

Result's first rows with added UserSiteRight_T.UserID_i column showing the access to the SubSiteID_i :

结果的第一行添加了UserSiteRight_T.UserID_i列,显示对SubSiteID_i的访问权限:

ID_i    Path_v      SubSiteID_i SubPath_v       Depth_i     UserSiteRight_T.UserID_i
------- ----------- ----------- --------------- ----------- -----------
2       1.2.        2           1.2.            0           1
3       1.3.        3           1.3.            0           NULL
3       1.3.        4           1.3.4.          1           1
3       1.3.        5           1.3.15863.      1           1
3       1.3.        6           1.3.6.          1           NULL
3       1.3.        7           1.3.6.7.        2           1
3       1.3.        8           1.3.8.          1           1
9       1.9.        9           1.9.            0           NULL
9       1.9.        10          1.9.10.         1           NULL
9       1.9.        11          1.9.10.11.      2           1
9       1.9.        12          1.9.10.12.      2           1
9       1.9.        13          1.9.13.         1           NULL
9       1.9.        14          1.9.13.14.      2           NULL
9       1.9.        15          1.9.13.14.15.   3           1
9       1.9.        16          1.9.13.14.16.   3           1
9       1.9.        17          1.9.13.17.      2           NULL
9       1.9.        18          1.9.13.17.18.   3           1
9       1.9.        19          1.9.19.         1           1
9       1.9.        20          1.9.20.         1           NULL

My final result should be a Group By on the first column having last column NOT NULL.
Or the following recursive query :

我的最终结果应该是第一列上的Group By,其中最后一列为NOT NULL。或者以下递归查询:

; WITH Site_R AS (
SELECT s.ID_i
  , sp.Path_v
     , s.ID_i AS SubSiteID_i
  , sp.Path_v AS SubPath_v
  , 0 AS Depth_i
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
 UNION ALL
SELECT sr.ID_i
  , sr.Path_v
     , s.ID_i
  , sp.Path_v
  , Depth_i+1
  FROM Site_T s
  , Site_R sr
  , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ParentID_i = sr.SubSiteID_i
)
SELECT us.ID_i
  FROM Site_R us
  , UserSiteRight_T usr 
 WHERE usr.SiteID_i = us.SubSiteID_i
   AND UseriD_i = 1
 GROUP BY ID_i

which basically builds the whole tree and selects only the ancestors having SubSiteID_i accessible by UserID_i. Or :

它基本上构建整个树并仅选择具有UserID_i可访问的SubSiteID_i的祖先。要么 :

; WITH Site_R AS (
SELECT s.ID_i
     , s.ID_i AS SubSiteID_i
     , 0 AS Depth_i
     , ( SELECT 1 FROM UserSiteRight_T usr WHERE usr.SiteID_i = s.ID_i AND usr.UserID_i = @UserID_i ) AS HasRight_b
  FROM Site_T s
 WHERE s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = @ParentID_i
 UNION ALL
SELECT sr.ID_i
     , s.ID_i
     , Depth_i+1
     , ( SELECT 1 FROM UserSiteRight_T usr WHERE usr.SiteID_i = s.ID_i AND usr.UserID_i = @UserID_i )
  FROM Site_T s
     , Site_R sr
 WHERE s.ParentID_i = sr.SubSiteID_i
   AND ( sr.HasRight_b IS NULL OR Depth_i = 0 )
)
SELECT * FROM Site_R Where HasRight_b IS NOT NULL

#4


0  

EDIT:

编辑:

First, you need a covering index on (ParentID_i, ID_i). Do you have one?

首先,您需要一个覆盖索引(ParentID_i,ID_i)。你是否有一个?

Second:

第二:

I am trying to obtain all the sites with depth = 0 which are having subsites accessible by a user.

我试图获得所有深度= 0的网站,这些网站具有用户可访问的子网站。

This description doesn't match the queries you provided here.

此说明与您在此处提供的查询不符。

This will return all sites w/ depth = 0 (ie, no more parents) that have subsites accessible to the user:

这将返回具有深度= 0的所有站点(即,不再有父项),这些站点具有用户可访问的子站点:

; WITH Site_R AS (
SELECT s.ID_i
     , s.ParentID_i
  FROM Site_T s
     , UserSiteRight_T usr
 WHERE usr.SiteID_i = s.ID_i 
   AND usr.UserID_i = @UserID_i -- plus any other filters
 UNION ALL
SELECT s.ID_i
     , s.ParentID_i
  FROM Site_T s
     , Site_R sr
 WHERE s.ID_i = sr.ParentID_i
)
SELECT DISTINCT ID_i
  FROM Site_R 
 WHERE ParentID_i IS NULL

Is this the result set you want?

这是你想要的结果集吗?

Don't add any unnecessary columns to the recursive CTE. Join them in later, post-recurse, post-reduce.

不要向递归CTE添加任何不必要的列。加入他们之后,后递减,后减少。

#1


2  

Is there a reason you can't use index hints (as in your second query) when you're using the variable (in the third query)? It's odd that the query optimizer makes such a bad decision when there's an index available, but it only knows a limited amount about your data, and it chooses as best it can.

当您使用变量(在第三个查询中)时,是否有理由不能使用索引提示(如在第二个查询中)?奇怪的是,当有可用的索引时,查询优化器会做出如此糟糕的决定,但它只知道有关您数据的数量有限,并且它会尽可能地选择。

Some statistics on the indexed columns might help you out, actually - they keep track of the data, the data layout, and some other information about what the table actually contains, whereas the indexes themselves are only built on top of the table metadata, and the query optimizer doesn't choose on the data itself (unless there are statistics there to help it do so).

有关索引列的一些统计信息可能会帮助您,实际上 - 它们会跟踪数据,数据布局以及有关表实际包含内容的其他一些信息,而索引本身仅构建在表元数据之上,并且查询优化器不会选择数据本身(除非有统计数据可以帮助它这样做)。

Have you run the "Database Tuning Advisor" on the query? Highlighting the query and selecting "Analyze Query in Database Engine Tuning Advisor" from the "Query" menu in SSMS will use the table data to suggest some statistics for you - that might make a huge difference.

您是否在查询中运行“数据库调优顾问”?突出显示查询并从SSMS中的“查询”菜单中选择“分析数据库引擎优化顾问中的查询”将使用表数据为您建议一些统计信息 - 这可能会产生巨大的差异。

#2


1  

After reading the above articles ( provided in Update 2 and Update 3 ) I finally understood more about how Sql Server is treating / caching execution plans.

在阅读了上述文章(在Update 2和Update 3中提供)后,我终于更了解了Sql Server如何处理/缓存执行计划。

Adding OPTION ( RECOMPILE ) at the end of my SELECT statements will force Sql Server to recalculate the execution plan ( and not using the cached one ) every time the query will be run, thus choosing the best plan matching the variable.

在SELECT语句结束时添加OPTION(RECOMPILE)将强制Sql Server在每次运行查询时重新计算执行计划(而不是使用缓存的计划),从而选择与变量匹配的最佳计划。

#3


0  

As Peter was asking why I did not use recursion, I am providing below the recursive cte which will return the correct result :

当Peter问我为什么不使用递归时,我在递归cte下面提供了返回正确结果:

; WITH Site_R AS (
SELECT s.ID_i
  , sp.Path_v
     , s.ID_i AS SubSiteID_i
  , sp.Path_v AS SubPath_v
  , 0 AS Depth_i
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
 UNION ALL
SELECT sr.ID_i
  , sr.Path_v
     , s.ID_i
  , sp.Path_v
  , Depth_i+1
  FROM Site_T s
  , Site_R sr
  , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ParentID_i = sr.SubSiteID_i
)
SELECT us.*
     , ( SELECT usr.UserID_i FROM UserSiteRight_T usr WHERE usr.SiteID_i = us.SubSiteID_i AND UseriD_i = 1 ) AS UserID_i
  FROM Site_R us

Result's first rows with added UserSiteRight_T.UserID_i column showing the access to the SubSiteID_i :

结果的第一行添加了UserSiteRight_T.UserID_i列,显示对SubSiteID_i的访问权限:

ID_i    Path_v      SubSiteID_i SubPath_v       Depth_i     UserSiteRight_T.UserID_i
------- ----------- ----------- --------------- ----------- -----------
2       1.2.        2           1.2.            0           1
3       1.3.        3           1.3.            0           NULL
3       1.3.        4           1.3.4.          1           1
3       1.3.        5           1.3.15863.      1           1
3       1.3.        6           1.3.6.          1           NULL
3       1.3.        7           1.3.6.7.        2           1
3       1.3.        8           1.3.8.          1           1
9       1.9.        9           1.9.            0           NULL
9       1.9.        10          1.9.10.         1           NULL
9       1.9.        11          1.9.10.11.      2           1
9       1.9.        12          1.9.10.12.      2           1
9       1.9.        13          1.9.13.         1           NULL
9       1.9.        14          1.9.13.14.      2           NULL
9       1.9.        15          1.9.13.14.15.   3           1
9       1.9.        16          1.9.13.14.16.   3           1
9       1.9.        17          1.9.13.17.      2           NULL
9       1.9.        18          1.9.13.17.18.   3           1
9       1.9.        19          1.9.19.         1           1
9       1.9.        20          1.9.20.         1           NULL

My final result should be a Group By on the first column having last column NOT NULL.
Or the following recursive query :

我的最终结果应该是第一列上的Group By,其中最后一列为NOT NULL。或者以下递归查询:

; WITH Site_R AS (
SELECT s.ID_i
  , sp.Path_v
     , s.ID_i AS SubSiteID_i
  , sp.Path_v AS SubPath_v
  , 0 AS Depth_i
  FROM Site_T s
     , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = 1
 UNION ALL
SELECT sr.ID_i
  , sr.Path_v
     , s.ID_i
  , sp.Path_v
  , Depth_i+1
  FROM Site_T s
  , Site_R sr
  , SitePath_T sp
 WHERE sp.SiteID_i = s.ID_i
   AND s.ParentID_i = sr.SubSiteID_i
)
SELECT us.ID_i
  FROM Site_R us
  , UserSiteRight_T usr 
 WHERE usr.SiteID_i = us.SubSiteID_i
   AND UseriD_i = 1
 GROUP BY ID_i

which basically builds the whole tree and selects only the ancestors having SubSiteID_i accessible by UserID_i. Or :

它基本上构建整个树并仅选择具有UserID_i可访问的SubSiteID_i的祖先。要么 :

; WITH Site_R AS (
SELECT s.ID_i
     , s.ID_i AS SubSiteID_i
     , 0 AS Depth_i
     , ( SELECT 1 FROM UserSiteRight_T usr WHERE usr.SiteID_i = s.ID_i AND usr.UserID_i = @UserID_i ) AS HasRight_b
  FROM Site_T s
 WHERE s.ShortName_v LIKE '[a-y]%'
   AND s.ParentID_i = @ParentID_i
 UNION ALL
SELECT sr.ID_i
     , s.ID_i
     , Depth_i+1
     , ( SELECT 1 FROM UserSiteRight_T usr WHERE usr.SiteID_i = s.ID_i AND usr.UserID_i = @UserID_i )
  FROM Site_T s
     , Site_R sr
 WHERE s.ParentID_i = sr.SubSiteID_i
   AND ( sr.HasRight_b IS NULL OR Depth_i = 0 )
)
SELECT * FROM Site_R Where HasRight_b IS NOT NULL

#4


0  

EDIT:

编辑:

First, you need a covering index on (ParentID_i, ID_i). Do you have one?

首先,您需要一个覆盖索引(ParentID_i,ID_i)。你是否有一个?

Second:

第二:

I am trying to obtain all the sites with depth = 0 which are having subsites accessible by a user.

我试图获得所有深度= 0的网站,这些网站具有用户可访问的子网站。

This description doesn't match the queries you provided here.

此说明与您在此处提供的查询不符。

This will return all sites w/ depth = 0 (ie, no more parents) that have subsites accessible to the user:

这将返回具有深度= 0的所有站点(即,不再有父项),这些站点具有用户可访问的子站点:

; WITH Site_R AS (
SELECT s.ID_i
     , s.ParentID_i
  FROM Site_T s
     , UserSiteRight_T usr
 WHERE usr.SiteID_i = s.ID_i 
   AND usr.UserID_i = @UserID_i -- plus any other filters
 UNION ALL
SELECT s.ID_i
     , s.ParentID_i
  FROM Site_T s
     , Site_R sr
 WHERE s.ID_i = sr.ParentID_i
)
SELECT DISTINCT ID_i
  FROM Site_R 
 WHERE ParentID_i IS NULL

Is this the result set you want?

这是你想要的结果集吗?

Don't add any unnecessary columns to the recursive CTE. Join them in later, post-recurse, post-reduce.

不要向递归CTE添加任何不必要的列。加入他们之后,后递减,后减少。