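Inside Microsoft SQL Server 2008: T-SQL Querying, Reading Notes: Query Tuning

Date: 2023-03-08 19:36:24

I. Top-down tuning methodology

1. Analyze waits at the instance level

At the instance level, find out which types of waits account for most of the wait time by querying sys.dm_os_wait_stats.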

select
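wait_type, -- wait type
waiting_tasks_count, -- number of waits on this wait type
wait_time_ms, -- cumulative wait time so far, in milliseconds
max_wait_time_ms, -- longest single wait
signal_wait_time_ms -- time from the thread being signaled that the resource is available until it gets the CPU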
from sys.dm_os_wait_stats
order by wait_type

Common problematic wait types:

I/O (PAGEIOLATCH_xx and similar)

Network (ASYNC_NETWORK_IO)

CPU (CMEMTHREAD)

Parallelism (CXPACKET)

Transaction log (WRITELOG)

tempdb (PAGELATCH_UP)

Isolating the heaviest waits

-- Isolate top waits
WITH Waits AS
(
SELECT
wait_type,
wait_time_ms / 1000. AS wait_time_s,
100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct,
ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn,
100. * signal_wait_time_ms / wait_time_ms as signal_pct
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
AND wait_type NOT LIKE N'%SLEEP%'
AND wait_type NOT LIKE N'%IDLE%'
AND wait_type NOT LIKE N'%QUEUE%'
AND wait_type NOT IN( N'CLR_AUTO_EVENT'
, N'REQUEST_FOR_DEADLOCK_SEARCH'
, N'SQLTRACE_BUFFER_FLUSH'
/* filter out additional irrelevant waits */ )
)
SELECT
W1.wait_type,
CAST(W1.wait_time_s AS NUMERIC(12, 2)) AS wait_time_s,
CAST(W1.pct AS NUMERIC(5, 2)) AS pct,
CAST(SUM(W2.pct) AS NUMERIC(5, 2)) AS running_pct,
CAST(W1.signal_pct AS NUMERIC(5, 2)) AS signal_pct
FROM Waits AS W1
JOIN Waits AS W2
ON W2.rn <= W1.rn
GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct, W1.signal_pct
HAVING SUM(W2.pct) - W1.pct < 80 -- percentage threshold
OR W1.rn <= 5
ORDER BY W1.rn;
GO

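a) Create a wait table that stores the wait information as of a point in time

b) Schedule a job that inserts a snapshot into the wait table at regular intervals

c) Use an Excel PivotChart to observe the heaviest waits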
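
Steps a) and b) can be sketched as follows. This is a minimal sketch, not the book's exact listing: the table name dbo.WaitStats and the index name idx_dt_type are assumptions chosen for illustration, and the INSERT is meant to be the body of a scheduled job (for example a SQL Server Agent job step).

```sql
-- Hypothetical snapshot table; create it in a utility database of your choice
CREATE TABLE dbo.WaitStats
(
  dt                  DATETIME     NOT NULL DEFAULT (CURRENT_TIMESTAMP),
  wait_type           NVARCHAR(60) NOT NULL,
  waiting_tasks_count BIGINT       NOT NULL,
  wait_time_ms        BIGINT       NOT NULL,
  max_wait_time_ms    BIGINT       NOT NULL,
  signal_wait_time_ms BIGINT       NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX idx_dt_type ON dbo.WaitStats(dt, wait_type);
GO

-- Statement to run on a schedule: one row per wait type per snapshot
INSERT INTO dbo.WaitStats
  (wait_type, waiting_tasks_count, wait_time_ms,
   max_wait_time_ms, signal_wait_time_ms)
SELECT wait_type, waiting_tasks_count, wait_time_ms,
  max_wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats;
```

The values in sys.dm_os_wait_stats are cumulative since the instance last started (or since the statistics were cleared), so the wait time for an interval is the difference between two consecutive snapshots of the same wait_type.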

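2. Correlate waits with queues

Based on the wait types identified in step 1, examine the queues of the related resources. Details are available from the system Performance Monitor, the SQL Server data collector, and the DMV sys.dm_os_performance_counters.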
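
As an illustration of reading counters from the DMV, here is a small sketch. The choice of counters is an assumption made for the example; pick the ones that match the wait types found in step 1.

```sql
-- Read a few performance counters exposed by SQL Server itself.
-- Trailing spaces in counter_name are ignored by the = / IN comparison.
SELECT object_name, counter_name, instance_name, cntr_value, cntr_type
FROM sys.dm_os_performance_counters
WHERE counter_name IN
  (N'Page life expectancy',
   N'Batch Requests/sec',
   N'SQL Compilations/sec',
   N'SQL Re-Compilations/sec');
```

Note that the "/sec" counters are reported as cumulative totals in cntr_value, so a rate has to be derived from two samples taken a known number of seconds apart.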

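3. Determine a course of action

Once the wait types and the resources involved have been identified, decide on the next step:

For I/O-related waits, drill down further to the database level

For CPU-related waits such as compilations/recompilations, take a different course of action

4. Drill down to the database/file level

Use the dynamic management function (DMF) sys.dm_io_virtual_file_stats to determine which database generates the most I/O.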

-- Analyze DB IO
WITH DBIO AS
(
SELECT
DB_NAME(IVFS.database_id) AS db,
MF.type_desc,
SUM(IVFS.num_of_bytes_read + IVFS.num_of_bytes_written) AS io_bytes,
SUM(IVFS.io_stall) AS io_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS IVFS
JOIN sys.master_files AS MF
ON IVFS.database_id = MF.database_id
AND IVFS.file_id = MF.file_id
GROUP BY DB_NAME(IVFS.database_id), MF.type_desc
)
SELECT db, type_desc,
CAST(1. * io_bytes / (1024 * 1024) AS NUMERIC(12, 2)) AS io_mb,
CAST(io_stall_ms / 1000. AS NUMERIC(12, 2)) AS io_stall_s,
CAST(100. * io_stall_ms / SUM(io_stall_ms) OVER()
AS NUMERIC(10, 2)) AS io_stall_pct,
ROW_NUMBER() OVER(ORDER BY io_stall_ms DESC) AS rn
FROM DBIO
ORDER BY io_stall_ms DESC;

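5. Drill down to the process level

At this level the goal is to find out which queries are problematic. There are two approaches:

a) Start a trace that records the cost of each statement, then aggregate the data to find the most time-consuming queries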

SET NOCOUNT ON;
USE master;
GO

IF OBJECT_ID('dbo.PerfworkloadTraceStart', 'P') IS NOT NULL
  DROP PROC dbo.PerfworkloadTraceStart;
GO

CREATE PROC dbo.PerfworkloadTraceStart
  @dbid      AS INT,
  @tracefile AS NVARCHAR(245),
  @traceid   AS INT OUTPUT
AS

-- Create a Queue
DECLARE @rc          AS INT;
DECLARE @maxfilesize AS BIGINT;

SET @maxfilesize = 5;

EXEC @rc = sp_trace_create @traceid OUTPUT, 0, @tracefile, @maxfilesize, NULL
IF (@rc != 0) GOTO error;

-- Set the events
DECLARE @on AS BIT;
SET @on = 1;

-- RPC:Completed
exec sp_trace_setevent @traceid, 10, 15, @on;
exec sp_trace_setevent @traceid, 10, 8, @on;
exec sp_trace_setevent @traceid, 10, 16, @on;
exec sp_trace_setevent @traceid, 10, 48, @on;
exec sp_trace_setevent @traceid, 10, 1, @on;
exec sp_trace_setevent @traceid, 10, 17, @on;
exec sp_trace_setevent @traceid, 10, 10, @on;
exec sp_trace_setevent @traceid, 10, 18, @on;
exec sp_trace_setevent @traceid, 10, 11, @on;
exec sp_trace_setevent @traceid, 10, 12, @on;
exec sp_trace_setevent @traceid, 10, 13, @on;
exec sp_trace_setevent @traceid, 10, 6, @on;
exec sp_trace_setevent @traceid, 10, 14, @on;

-- SP:Completed
exec sp_trace_setevent @traceid, 43, 15, @on;
exec sp_trace_setevent @traceid, 43, 8, @on;
exec sp_trace_setevent @traceid, 43, 48, @on;
exec sp_trace_setevent @traceid, 43, 1, @on;
exec sp_trace_setevent @traceid, 43, 10, @on;
exec sp_trace_setevent @traceid, 43, 11, @on;
exec sp_trace_setevent @traceid, 43, 12, @on;
exec sp_trace_setevent @traceid, 43, 13, @on;
exec sp_trace_setevent @traceid, 43, 6, @on;
exec sp_trace_setevent @traceid, 43, 14, @on;

-- SP:StmtCompleted
exec sp_trace_setevent @traceid, 45, 8, @on;
exec sp_trace_setevent @traceid, 45, 16, @on;
exec sp_trace_setevent @traceid, 45, 48, @on;
exec sp_trace_setevent @traceid, 45, 1, @on;
exec sp_trace_setevent @traceid, 45, 17, @on;
exec sp_trace_setevent @traceid, 45, 10, @on;
exec sp_trace_setevent @traceid, 45, 18, @on;
exec sp_trace_setevent @traceid, 45, 11, @on;
exec sp_trace_setevent @traceid, 45, 12, @on;
exec sp_trace_setevent @traceid, 45, 13, @on;
exec sp_trace_setevent @traceid, 45, 6, @on;
exec sp_trace_setevent @traceid, 45, 14, @on;
exec sp_trace_setevent @traceid, 45, 15, @on;

-- SQL:BatchCompleted
exec sp_trace_setevent @traceid, 12, 15, @on;
exec sp_trace_setevent @traceid, 12, 8, @on;
exec sp_trace_setevent @traceid, 12, 16, @on;
exec sp_trace_setevent @traceid, 12, 48, @on;
exec sp_trace_setevent @traceid, 12, 1, @on;
exec sp_trace_setevent @traceid, 12, 17, @on;
exec sp_trace_setevent @traceid, 12, 6, @on;
exec sp_trace_setevent @traceid, 12, 10, @on;
exec sp_trace_setevent @traceid, 12, 14, @on;
exec sp_trace_setevent @traceid, 12, 18, @on;
exec sp_trace_setevent @traceid, 12, 11, @on;
exec sp_trace_setevent @traceid, 12, 12, @on;
exec sp_trace_setevent @traceid, 12, 13, @on;

-- SQL:StmtCompleted
exec sp_trace_setevent @traceid, 41, 15, @on;
exec sp_trace_setevent @traceid, 41, 8, @on;
exec sp_trace_setevent @traceid, 41, 16, @on;
exec sp_trace_setevent @traceid, 41, 48, @on;
exec sp_trace_setevent @traceid, 41, 1, @on;
exec sp_trace_setevent @traceid, 41, 17, @on;
exec sp_trace_setevent @traceid, 41, 10, @on;
exec sp_trace_setevent @traceid, 41, 18, @on;
exec sp_trace_setevent @traceid, 41, 11, @on;
exec sp_trace_setevent @traceid, 41, 12, @on;
exec sp_trace_setevent @traceid, 41, 13, @on;
exec sp_trace_setevent @traceid, 41, 6, @on;
exec sp_trace_setevent @traceid, 41, 14, @on;

-- Set the Filters
-- Application name filter
EXEC sp_trace_setfilter @traceid, 10, 0, 7, N'SQL Server Profiler%';
-- Database ID filter
EXEC sp_trace_setfilter @traceid, 3, 0, 0, @dbid;

-- Set the trace status to start
EXEC sp_trace_setstatus @traceid, 1;

-- Print trace id and file name for future references
PRINT 'Trace ID: ' + CAST(@traceid AS VARCHAR(10))
  + ', Trace File: ''' + @tracefile + '.trc''';

GOTO finish;
error: PRINT 'Error Code: ' + CAST(@rc AS VARCHAR(10));
finish:
GO

-- Start the trace
DECLARE @dbid AS INT, @traceid AS INT;
SET @dbid = DB_ID('Performance');

EXEC master.dbo.PerfworkloadTraceStart
  @dbid      = @dbid,
  @tracefile = 'D:\SQLServer2008\Trace\trace20130525',
  @traceid   = @traceid OUTPUT;
GO

-- Stop the trace (assuming trace id was 2)
EXEC sp_trace_setstatus 2, 0;
EXEC sp_trace_setstatus 2, 2;
GO

---------------------------------------------------------------------
-- Analyze Trace Data
---------------------------------------------------------------------

-- Load trace data to table
SET NOCOUNT ON;
USE Performance;
IF OBJECT_ID('dbo.Workload', 'U') IS NOT NULL DROP TABLE dbo.Workload;
GO

SELECT CAST(TextData AS NVARCHAR(MAX)) AS tsql_code,
  Duration AS duration
INTO dbo.Workload
FROM sys.fn_trace_gettable('D:\SQLServer2008\Trace\trace20130525.trc', NULL) AS T
WHERE Duration > 0
  AND EventClass IN(41, 45);
GO

select * from dbo.Workload
order by duration desc;

-- Aggregate trace data by query
SELECT
  tsql_code,
  SUM(duration) AS total_duration
FROM dbo.Workload
GROUP BY tsql_code;

-- Query Signature
-- Query template
DECLARE @my_templatetext AS NVARCHAR(MAX);
DECLARE @my_parameters   AS NVARCHAR(MAX);

EXEC sp_get_query_template
  N'SELECT * FROM dbo.T1 WHERE col1 = 3 AND col2 > 78',
  @my_templatetext OUTPUT,
  @my_parameters OUTPUT;

SELECT @my_templatetext AS querysig, @my_parameters AS params;
GO

-- Creation Script for the SQLSig UDF
IF OBJECT_ID('dbo.SQLSig', 'FN') IS NOT NULL
  DROP FUNCTION dbo.SQLSig;
GO

CREATE FUNCTION dbo.SQLSig
  (@p1 NTEXT, @parselength INT = 4000)
RETURNS NVARCHAR(4000)
--
-- This function is provided "AS IS" with no warranties,
-- and confers no rights.
-- Use of included script samples are subject to the terms specified at
-- http://www.microsoft.com/info/cpyright.htm
--
-- Strips query strings
AS
BEGIN
  DECLARE @pos AS INT;
  DECLARE @mode AS CHAR(10);
  DECLARE @maxlength AS INT;
  DECLARE @p2 AS NCHAR(4000);
  DECLARE @currchar AS CHAR(1), @nextchar AS CHAR(1);
  DECLARE @p2len AS INT;

  SET @maxlength = LEN(RTRIM(SUBSTRING(@p1,1,4000)));
  SET @maxlength = CASE WHEN @maxlength > @parselength
                     THEN @parselength ELSE @maxlength END;
  SET @pos = 1;
  SET @p2 = '';
  SET @p2len = 0;
  SET @currchar = '';
  set @nextchar = '';
  SET @mode = 'command';

  WHILE (@pos <= @maxlength)
  BEGIN
    SET @currchar = SUBSTRING(@p1,@pos,1);
    SET @nextchar = SUBSTRING(@p1,@pos+1,1);
    IF @mode = 'command'
    BEGIN
      SET @p2 = LEFT(@p2,@p2len) + @currchar;
      SET @p2len = @p2len + 1 ;
      IF @currchar IN (',','(',' ','=','<','>','!')
        AND @nextchar BETWEEN '0' AND '9'
      BEGIN
        SET @mode = 'number';
        SET @p2 = LEFT(@p2,@p2len) + '#';
        SET @p2len = @p2len + 1;
      END
      IF @currchar = ''''
      BEGIN
        SET @mode = 'literal';
        SET @p2 = LEFT(@p2,@p2len) + '#''';
        SET @p2len = @p2len + 2;
      END
    END
    ELSE IF @mode = 'number' AND @nextchar IN (',',')',' ','=','<','>','!')
      SET @mode= 'command';
    ELSE IF @mode = 'literal' AND @currchar = ''''
      SET @mode= 'command';

    SET @pos = @pos + 1;
  END
  RETURN @p2;
END
GO

-- Test SQLSig Function
SELECT dbo.SQLSig
  (N'SELECT * FROM dbo.T1 WHERE col1 = 3 AND col2 > 78', 4000);
GO

-- Listing 4-3: RegexReplace Function
/*
using Microsoft.SqlServer.Server;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public partial class RegExp
{
  [SqlFunction(IsDeterministic = true, DataAccess = DataAccessKind.None)]
  public static SqlString RegexReplace(
    SqlString input, SqlString pattern, SqlString replacement)
  {
    return (SqlString)Regex.Replace(
      input.Value, pattern.Value, replacement.Value);
  }
}
*/

-- Enable CLR
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Create assembly
USE Performance;
CREATE ASSEMBLY RegExp
FROM 'D:\TestProject\RegExp\RegExp\bin\Debug\RegExp.dll';
GO

-- Create RegexReplace function
CREATE FUNCTION dbo.RegexReplace(
  @input       AS NVARCHAR(MAX),
  @pattern     AS NVARCHAR(MAX),
  @replacement AS NVARCHAR(MAX))
RETURNS NVARCHAR(MAX)
WITH RETURNS NULL ON NULL INPUT
EXTERNAL NAME RegExp.RegExp.RegexReplace;
GO

-- Return trace data with query signature
SELECT
  dbo.RegexReplace(tsql_code,
    N'([\s,(=<>!](?![^\]]+[\]]))(?:(?:(?:(?#    expression coming
     )(?:([N])?('')(?:[^'']|'''')*(''))(?#      character
     )|(?:0x[\da-fA-F]*)(?#                     binary
     )|(?:[-+]?(?:(?:[\d]*\.[\d]*|[\d]+)(?#     precise number
     )(?:[eE]?[\d]*)))(?#                       imprecise number
     )|(?:[~]?[-+]?(?:[\d]+))(?#                integer
     ))(?:[\s]?[\+\-\*\/\%\&\|\^][\s]?)?)+(?#   operators
     ))',
    N'$1$2$3#$4') AS sig,
  duration
FROM dbo.Workload;

-- Add cs column to Workload table
ALTER TABLE dbo.Workload ADD cs AS CHECKSUM(dbo.RegexReplace(tsql_code,
    N'([\s,(=<>!](?![^\]]+[\]]))(?:(?:(?:(?#    expression coming
     )(?:([N])?('')(?:[^'']|'''')*(''))(?#      character
     )|(?:0x[\da-fA-F]*)(?#                     binary
     )|(?:[-+]?(?:(?:[\d]*\.[\d]*|[\d]+)(?#     precise number
     )(?:[eE]?[\d]*)))(?#                       imprecise number
     )|(?:[~]?[-+]?(?:[\d]+))(?#                integer
     ))(?:[\s]?[\+\-\*\/\%\&\|\^][\s]?)?)+(?#   operators
     ))',
    N'$1$2$3#$4')) PERSISTED;

select * from dbo.Workload

-- Aggregate data by query signature checksum
-- Load aggregate data into temporary table
IF OBJECT_ID('tempdb..#AggQueries', 'U') IS NOT NULL DROP TABLE #AggQueries;

SELECT cs, SUM(duration) AS total_duration,
  100. * SUM(duration) / SUM(SUM(duration)) OVER() AS pct,
  ROW_NUMBER() OVER(ORDER BY SUM(duration) DESC) AS rn
INTO #AggQueries
FROM dbo.Workload
GROUP BY cs;

CREATE CLUSTERED INDEX idx_cl_cs ON #AggQueries(cs);
GO

-- Show aggregate data
SELECT cs, total_duration, pct, rn
FROM #AggQueries
ORDER BY rn;

-- Show running totals
SELECT AQ1.cs,
  CAST(AQ1.total_duration / 1000000.
    AS NUMERIC(12, 2)) AS total_s,
  CAST(SUM(AQ2.total_duration) / 1000000.
    AS NUMERIC(12, 2)) AS running_total_s,
  CAST(AQ1.pct AS NUMERIC(12, 2)) AS pct,
  CAST(SUM(AQ2.pct) AS NUMERIC(12, 2)) AS run_pct,
  AQ1.rn
FROM #AggQueries AS AQ1
  JOIN #AggQueries AS AQ2
    ON AQ2.rn <= AQ1.rn
GROUP BY AQ1.cs, AQ1.total_duration, AQ1.pct, AQ1.rn
HAVING SUM(AQ2.pct) - AQ1.pct <= 80 -- percentage threshold
--  OR AQ1.rn <= 5
ORDER BY AQ1.rn;

-- Isolate top offenders
WITH RunningTotals AS
(
  SELECT AQ1.cs,
    CAST(AQ1.total_duration / 1000000.
      AS NUMERIC(12, 2)) AS total_s,
    CAST(SUM(AQ2.total_duration) / 1000000.
      AS NUMERIC(12, 2)) AS running_total_s,
    CAST(AQ1.pct AS NUMERIC(12, 2)) AS pct,
    CAST(SUM(AQ2.pct) AS NUMERIC(12, 2)) AS run_pct,
    AQ1.rn
  FROM #AggQueries AS AQ1
    JOIN #AggQueries AS AQ2
      ON AQ2.rn <= AQ1.rn
  GROUP BY AQ1.cs, AQ1.total_duration, AQ1.pct, AQ1.rn
  HAVING SUM(AQ2.pct) - AQ1.pct <= 80 -- percentage threshold
--  OR AQ1.rn <= 5
)
SELECT RT.rn, RT.pct, W.tsql_code
FROM RunningTotals AS RT
  JOIN dbo.Workload AS W
    ON W.cs = RT.cs
ORDER BY RT.rn;

-- Isolate sig of top offenders and a sample query of each sig
WITH RunningTotals AS
(
  SELECT AQ1.cs,
    CAST(AQ1.total_duration / 1000000.
      AS NUMERIC(12, 2)) AS total_s,
    CAST(SUM(AQ2.total_duration) / 1000000.
      AS NUMERIC(12, 2)) AS running_total_s,
    CAST(AQ1.pct AS NUMERIC(12, 2)) AS pct,
    CAST(SUM(AQ2.pct) AS NUMERIC(12, 2)) AS run_pct,
    AQ1.rn
  FROM #AggQueries AS AQ1
    JOIN #AggQueries AS AQ2
      ON AQ2.rn <= AQ1.rn
  GROUP BY AQ1.cs, AQ1.total_duration, AQ1.pct, AQ1.rn
  HAVING SUM(AQ2.pct) - AQ1.pct <= 80 -- percentage threshold
)
SELECT RT.rn, RT.pct, S.sig, S.tsql_code AS sample_query
FROM RunningTotals AS RT
  CROSS APPLY
    (SELECT TOP(1) tsql_code, dbo.RegexReplace(tsql_code,
       N'([\s,(=<>!](?![^\]]+[\]]))(?:(?:(?:(?#    expression coming
        )(?:([N])?('')(?:[^'']|'''')*(''))(?#      character
        )|(?:0x[\da-fA-F]*)(?#                     binary
        )|(?:[-+]?(?:(?:[\d]*\.[\d]*|[\d]+)(?#     precise number
        )(?:[eE]?[\d]*)))(?#                       imprecise number
        )|(?:[~]?[-+]?(?:[\d]+))(?#                integer
        ))(?:[\s]?[\+\-\*\/\%\&\|\^][\s]?)?)+(?#   operators
        ))',
       N'$1$2$3#$4') AS sig
     FROM dbo.Workload AS W
     WHERE W.cs = RT.cs) AS S
ORDER BY RT.rn;
GO

b) Query the dynamic management view of query statistics (sys.dm_exec_query_stats)

-- Query Statistics
SELECT TOP (5)
MAX(query) AS sample_query,
SUM(execution_count) AS cnt,
SUM(total_worker_time) AS cpu,
SUM(total_physical_reads) AS reads,
SUM(total_logical_reads) AS logical_reads,
SUM(total_elapsed_time) AS duration
FROM (SELECT
QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1
) AS query
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) AS ST
CROSS APPLY sys.dm_exec_plan_attributes(QS.plan_handle) AS PA
WHERE PA.attribute = 'dbid'
AND PA.value = DB_ID('Performance')) AS D
GROUP BY query_hash
ORDER BY duration DESC;

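6. Tune indexes/queries

II. Tools for query tuning

1. Query plan related objects

a. Cached query plans: sys.dm_exec_cached_plans

b. Query plan attributes: sys.dm_exec_plan_attributes

c. Text of the query: sys.dm_exec_sql_text

d. Query plan in XML format: sys.dm_exec_query_plan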
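
The plan-cache objects are typically combined in one query. The following is a minimal sketch; the TOP (10) cutoff and the ordering by usecounts are arbitrary choices made for illustration.

```sql
-- Most frequently reused cached plans, with their text and XML plan
SELECT TOP (10)
  CP.usecounts,      -- number of times the cached plan was reused
  CP.cacheobjtype,   -- e.g. Compiled Plan
  CP.objtype,        -- e.g. Adhoc, Prepared, Proc
  ST.text,           -- batch text
  QP.query_plan      -- XML showplan
FROM sys.dm_exec_cached_plans AS CP
  CROSS APPLY sys.dm_exec_sql_text(CP.plan_handle) AS ST
  CROSS APPLY sys.dm_exec_query_plan(CP.plan_handle) AS QP
ORDER BY CP.usecounts DESC;
```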

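2. Clearing the cache

--clear clean pages from the data cache (run CHECKPOINT first so that dirty pages are flushed)
dbcc dropcleanbuffers

--clear all cached query plans
dbcc freeproccache

--clear the cached plans of one database
dbcc flushprocindb(<db_id>)

--clear a specific cache store, e.g. 'SQL Plans'
dbcc freesystemcache(<cachestore>)

3. Dynamic management objects

4. STATISTICS IO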

SET STATISTICS IO ON;

SET STATISTICS IO OFF

5. Measuring query run time

SET STATISTICS TIME ON;

SET STATISTICS TIME OFF;

SYSDATETIME() (capture the value before and after the query and take the difference)
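
A minimal sketch of how the session options and SYSDATETIME() might be combined around a query. The query against sys.objects is only a stand-in for the statement being measured.

```sql
SET STATISTICS IO ON;    -- reports logical/physical reads per table
SET STATISTICS TIME ON;  -- reports parse/compile and execution CPU and elapsed time

DECLARE @start AS DATETIME2 = SYSDATETIME();

SELECT COUNT(*) AS cnt FROM sys.objects;  -- stand-in for the query being measured

SELECT DATEDIFF(ms, @start, SYSDATETIME()) AS elapsed_ms;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The STATISTICS output appears on the Messages tab in SSMS, while elapsed_ms is returned as an ordinary result set.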

6. Analyzing execution plans

Estimated execution plan: CTRL + L

Actual execution plan: CTRL + M

SET SHOWPLAN_TEXT ON

SET SHOWPLAN_XML ON
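
A short usage sketch for the SHOWPLAN options. The SET SHOWPLAN statements must be the only statement in their batch, hence the GO separators; the sample query is an arbitrary stand-in.

```sql
SET SHOWPLAN_XML ON;
GO

-- Not executed while SHOWPLAN_XML is on; only the estimated plan is returned
SELECT name, create_date
FROM sys.objects
WHERE type = 'U';
GO

SET SHOWPLAN_XML OFF;
GO
```

To get the actual plan from T-SQL instead, SET STATISTICS XML ON executes the query and returns the plan alongside the results.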

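7. Hints

8. Profiler

9. Database Engine Tuning Advisor

10. Data collection and the management data warehouse

III. Table and index fundamentals

Page: the basic unit of storage in SQL Server, 8 KB

Extent: 8 physically contiguous pages

Table organization

Heap: no clustered index

B-tree: has a clustered index

Estimating the number of index levels (L):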

rows_per_leaf_page = floor((page_size - header_size) * page_density / leaf_row_size)

num_leaf_pages = ceiling(num_rows / rows_per_leaf_page)

rows_per_non_leaf_page = floor((page_size - header_size) / non_leaf_row_size)

L - 1 = ceiling(log(num_leaf_pages) / log(rows_per_non_leaf_page)), i.e. the logarithm of num_leaf_pages to the base rows_per_non_leaf_page
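
A worked example of the estimate, as a sketch under assumed inputs: 1,000,000 rows, a 200-byte leaf row, a 20-byte non-leaf row and full pages (page_density = 1.0) are made-up numbers, while the 8,192-byte page and 96-byte page header are SQL Server's actual sizes. Per-row overhead such as the slot array entry is ignored for simplicity.

```sql
DECLARE
  @page_size         AS INT    = 8192,
  @header_size       AS INT    = 96,
  @page_density      AS FLOAT  = 1.0,     -- assumed: pages are full
  @leaf_row_size     AS INT    = 200,     -- assumed
  @non_leaf_row_size AS INT    = 20,      -- assumed
  @num_rows          AS BIGINT = 1000000; -- assumed

DECLARE @rows_per_leaf_page AS INT =
  FLOOR((@page_size - @header_size) * @page_density / @leaf_row_size);

DECLARE @rows_per_non_leaf_page AS INT =
  FLOOR(1.0 * (@page_size - @header_size) / @non_leaf_row_size);

DECLARE @num_leaf_pages AS BIGINT =
  CEILING(1.0 * @num_rows / @rows_per_leaf_page);

SELECT
  @rows_per_leaf_page     AS rows_per_leaf_page,
  @num_leaf_pages         AS num_leaf_pages,
  @rows_per_non_leaf_page AS rows_per_non_leaf_page,
  CEILING(LOG(@num_leaf_pages) / LOG(@rows_per_non_leaf_page)) + 1
                          AS index_levels;
```

With these inputs, 40 rows fit in a leaf page, so there are 25,000 leaf pages; 404 rows fit in a non-leaf page, and since 404 < 25,000 <= 404^2, two non-leaf levels are needed, giving L = 3.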

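IV. Index access methods

a. Table scan: scans the pages in allocation order based on the IAM

b. Unordered clustered index scan

c. Unordered covering nonclustered index scan

d. Ordered clustered index scan

e. Ordered covering nonclustered index scan

How the storage engine processes scans

1. Allocation order scan (IAM bitmap): when a B-tree is scanned this way, page splits during the scan may cause some rows to be returned more than once or skipped

2. Index order scan (following the index linked list): an update to the index key during the scan may cause a row to be read twice or skipped

f. Nonclustered index seek + ordered partial scan + lookups

g. Unordered nonclustered index scan + lookups

h. Clustered index seek + ordered partial scan

i. Nonclustered index seek (covering) + ordered partial scan

V. Indexed views

Creating a clustered index on a view can speed up aggregation and grouping queries.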
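
A minimal sketch of an indexed view for a grouped aggregate. The table dbo.OrdersDemo, the view dbo.VEmpOrders and the index name are hypothetical names introduced only for this example. An indexed view requires SCHEMABINDING, two-part object names, COUNT_BIG(*) when GROUP BY is used, and a unique clustered index as its first index; SUM may not reference a nullable expression, which is why qty is declared NOT NULL.

```sql
-- Hypothetical base table
CREATE TABLE dbo.OrdersDemo
(
  orderid INT NOT NULL PRIMARY KEY,
  empid   INT NOT NULL,
  qty     INT NOT NULL
);
GO

-- The view must be schema-bound and must include COUNT_BIG(*)
CREATE VIEW dbo.VEmpOrders
  WITH SCHEMABINDING
AS
SELECT empid, SUM(qty) AS total_qty, COUNT_BIG(*) AS cnt
FROM dbo.OrdersDemo
GROUP BY empid;
GO

-- Materializes the aggregate; maintained automatically on data changes
CREATE UNIQUE CLUSTERED INDEX idx_uc_empid ON dbo.VEmpOrders(empid);
GO

-- Grouped query that can be answered from the view's index
SELECT empid, total_qty
FROM dbo.VEmpOrders WITH (NOEXPAND);
```

The NOEXPAND hint forces use of the view's index; whether the optimizer matches the indexed view automatically for queries written against the base table depends on the edition, so treat that behavior as something to verify on your own instance.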