SQL Server一个特殊的阻塞案例分析2

时间:2023-03-09 12:45:28
SQL Server一个特殊的阻塞案例分析2

最近发现一个非常奇怪的阻塞问题,如下截图所示(来自监控工具DPA),会话583被会话1036阻塞,而且阻塞发生在tempdb,被阻塞的SQL如下截图所示,会话等待类型为LCK_M_S

SQL Server一个特殊的阻塞案例分析2

SQL Server一个特殊的阻塞案例分析2

因为DPA工具不好截取全部信息,使用下面SQL语句获取了阻塞会话的详细信息如下,来自Microsoft SQL Server Management Studio - Transact-SQL IntelliSense的的SQL被阻塞了

SELECT  s.session_id ,

        r.status ,

        r.blocking_session_id 'Blk by' ,

        r.wait_type ,

        wait_resource ,

        r.wait_time / ( 1000 * 60 ) 'Wait M' ,

        r.cpu_time ,

        r.logical_reads ,

        r.reads ,

        r.writes ,

        r.total_elapsed_time / ( 1000 * 60 ) 'Elaps M' ,

        SUBSTRING(st.text, ( r.statement_start_offset / 2 ) + 1,

                  ( ( CASE r.statement_end_offset

                        WHEN -1 THEN DATALENGTH(st.text)

                        ELSE r.statement_end_offset

                      END - r.statement_start_offset ) / 2 ) + 1) AS statement_text ,

        COALESCE(QUOTENAME(DB_NAME(st.dbid)) + N'.'

                 + QUOTENAME(OBJECT_SCHEMA_NAME(st.objectid, st.dbid)) + N'.'

                 + QUOTENAME(OBJECT_NAME(st.objectid, st.dbid)), '') AS command_text ,

        r.command ,

        s.login_name ,

        s.host_name ,

        s.program_name ,

        s.last_request_end_time ,

        s.login_time ,

        r.open_transaction_count

FROM    sys.dm_exec_sessions AS s

        JOIN sys.dm_exec_requests AS r ON r.session_id = s.session_id

        CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st

WHERE r.session_id =583

ORDER BY r.cpu_time DESC;

SQL Server一个特殊的阻塞案例分析2

而会话1036处于睡眠(sleeping)状态,有一个Open的事务。通过wait_resource的值“KEY: 2:1125899909070848 (79c68d10aa42)”,我们定位到是一个系统表sysschobj.它是sys.triggers这个系统视图的基表。如下所示:

SQL Server一个特殊的阻塞案例分析2

SQL Server一个特殊的阻塞案例分析2

SET QUOTED_IDENTIFIER ON

SET ANSI_NULLS ON

GO

CREATE VIEW sys.triggers AS

    SELECT o.name,

        object_id = o.id,

        parent_class = o.pclass,

        parent_class_desc = pc.name,

        parent_id = o.pid,

        type = o.type,

        type_desc = n.name,

        create_date = o.created,

        modify_date = o.modified,

        is_ms_shipped = sysconv(bit, o.status & 1),         -- OBJALL_MSSHIPPED

        is_disabled = sysconv(bit, o.status & 256),         -- OBJTRG_DISABLED

        is_not_for_replication = sysconv(bit, o.status & 512), -- OBJTRG_NOTFORREPL

        is_instead_of_trigger = sysconv(bit, o.status & 1024)     -- OBJTRG_INSTEADOF

    FROM sys.sysschobjs o

    LEFT JOIN sys.syspalnames n ON n.class = 'OBTY' AND n.value = o.type

    LEFT JOIN sys.syspalvalues pc ON pc.class = 'UNCL' AND pc.value = o.pclass

    WHERE o.type IN ('TA','TR') AND o.pclass <> 100

        AND has_access('TR', o.id, o.pid, o.nsclass) = 1

 

GO

使用下面脚本可以查到对应的锁信息(其实,正确的做法应该是加上条件过滤spid),从而可以判断,会话1036可能因为事务中,间接涉及对基表sysschobj的DML操作(有可能是衍生的系统视图),而由于事务没有提交也没有释放,导致在基表上的锁没有释放,从而导致出现这样一个阻塞。个人也很好奇是什么样的SQL会导致这样一个阻塞出现,不过这种阻塞源头是Sleeping状态的会话,已经无法定位阻塞源头SQL,只能通过Profile去跟踪捕获。但是这种阻塞出现时机和场景不清楚,很难一下子捕获到。

 

USE tempdb

GO

SELECT  request_session_id AS spid ,

        DB_NAME(resource_database_id) AS dbname ,

        CASE WHEN resource_type = 'OBJECT'

             THEN OBJECT_NAME(resource_associated_entity_id)

             WHEN resource_associated_entity_id = 0 THEN 'n/a'

             ELSE OBJECT_NAME(p.object_id)

        END AS entity_name ,

        index_id ,

        resource_type AS resource ,

        resource_description AS description ,

        request_mode AS mode ,

        request_status AS status

FROM    sys.dm_tran_locks t

LEFT JOIN sys.partitions p

ON      p.partition_id = t.resource_associated_entity_id

WHERE   resource_database_id = DB_ID()

SQL Server一个特殊的阻塞案例分析2

今天早上有发现同样的阻塞,猜测是SQL里面有创建临时表或对临时表做DDL或DML操作的迹象。不过依然没有找到源头SQL,只能静候下次机会。找出“罪魁祸首”。不过让我头痛的是,对于这种特殊阻塞情况,即使是使用Profile跟踪也是相当麻烦,因为不知道它什么时间点出现、出现在那个会话。

SELECT  tl.resource_type AS [Resource Type] ,

        DB_NAME(tl.resource_database_id) AS [DB Name] ,

        CASE tl.resource_type

          WHEN 'OBJECT'

          THEN OBJECT_NAME(tl.resource_associated_entity_id,

                           tl.resource_database_id)

          WHEN 'DATABASE' THEN 'DB'

          ELSE CASE WHEN tl.resource_database_id = DB_ID()

                    THEN ( SELECT   OBJECT_NAME(object_id,

                                                tl.resource_database_id)

                           FROM     sys.partitions

                           WHERE    hobt_id = tl.resource_associated_entity_id

                         )

                    ELSE '(Run under DB context)'

               END

        END AS [Object] ,

        tl.resource_description AS [Resource] ,

        tl.request_session_id AS [Session] ,

        tl.request_mode AS [Mode] ,

        tl.request_status AS [Status] ,

        wt.wait_duration_ms AS [Wait (ms)] ,

        qi.sql ,

        qi.query_plan

FROM    sys.dm_tran_locks tl WITH ( NOLOCK )

LEFT OUTER JOIN sys.dm_os_waiting_tasks wt WITH ( NOLOCK )

ON      tl.lock_owner_address = wt.resource_address

        AND tl.request_status = 'WAIT'

OUTER APPLY ( SELECT    SUBSTRING(s.text,

                                  ( er.statement_start_offset / 2 ) + 1,

                                  ( ( CASE er.statement_end_offset

                                        WHEN -1 THEN DATALENGTH(s.text)

                                        ELSE er.statement_end_offset

                                      END - er.statement_start_offset ) / 2 )

                                  + 1) AS sql ,

                        qp.query_plan

              FROM      sys.dm_exec_requests er WITH ( NOLOCK )

              CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) s

              OUTER APPLY sys.dm_exec_query_plan(er.plan_handle) qp

              WHERE     tl.request_session_id = er.session_id

            ) qi

WHERE   tl.request_session_id =134

SQL Server一个特殊的阻塞案例分析2

链接资料

一个特殊的SQL Server阻塞案例分析

SQL Server一个特殊的阻塞案例分析2