SQL Server 内存管理

时间:2023-03-09 12:59:13
SQL Server 内存管理

windows memory:  Memory: Cache Bytes 是系统的working set, 也就是系统使用的物理内存数目。 可以观察Windows用了多少物理内存。

  1. System Cache Resident Bytes

  2. System Driver Resident Bytes

  3. System Code Resident Bytes

  4. Pool Paged Resident Bytes

SQL Server 动态管理内存:

SQL Server 是通过以下的API去感知windows是否有内存压力:

APIQueryMemoryResourceNotification -》 windows memory -》 decrease target server memory.

Total server memory : SQL Server启动账户拥有 Lock pages in memory 权限, 锁定内存,避免windows抢夺内存

Target server memory:  SQL在启动时候, 比较AWE, max server memory, physical memory 三者选一个最小的值作为Target server memory.

comparison

Result

Remark

Total <   target

Windows   has enough memory, SQL can allocate new memory for new data

Total is   increasing

Total =   target

SQL   used all of memory, SQL don’t allocate new memory for new data

SQL   clean up memory for new data, such as Lazy writer

Total >   target

Windows   has memory pressure, SQL decrease Target

SQL   clean up memory for new data, such as Lazy writer to release buffer pool and   cached plan

SQL server 内存管理概念: DBCC memorystatus -- 查看内存使用情况

1. Reserved memory:

2. Committed memory: = Physical memory + Page file = Shared memory + Private Bytes

3. Shared memory:

4. Private Bytes:

5. Working Set: = shared memory + Private Bytes - Page file

6. Page Fault(soft/hard)

32位SQL:

memToLeave: 256MB+256thread*521KB=384MB (SQL 启动的时候就计算好了,不能变大): extended stored procedure, third party dirver, and linked server : 启动参数 -g512 = 512MB+256thread*521KB

BufferPool = 2G-384MB=1664MB (包括 database cache + stolen memory)

AWE开启以后, 只能给先reserve再commit的部分使用,即只能给database cache来扩展使用(Physical memory-2GB就是AWE扩展后database cache所能用的内存数量),stolen memory只能用1664MB里的内存。

SQL内存使用情况分析:

1. SQL性能计数器:

  memory manager:

    Total server memory,

    target server memory,

    Optimizer memory

    SQL cache memory

    Lock memory

    Connection memory

    Granted workspace memory:  hash, sort, bulk-insert, create index...

    Memory grants pending: 等待工作空间内存授权的进程总数, 不等于0, 意味着比较严重的内存瓶颈

  Buffer manager:

    Buffer Cache Hit Ratio: 应该>99%, 如果<95%, 通常就是内存不足的问题

    Checkpoint pages/sec: 这个和内存压力没有关系,和用户的行为有关。用户做很多insert/update/delete, 这个值就会很大,脏数据多。

    database pages: 就是 database cache

    free pages:

    Lazy writes/sec: 如果SQL 内存压力不大,不会经常触发lazy writer。 如果经常触发, 那么就应该是有内存的瓶颈

    page life expectancy: 页不被引用,将在缓冲池中停留的秒数。只有Lazy writer 被触发, Page life expectancy 才会突然下降。如果总是高高低低的, 应该是有内存压力

    page reads/sec: 这个值正常情况下应该很低。 如果比较高, 一般page life expectancy 会下降, Lazy writes/sec 会上升。

    stolen pages: 所有非database pages, 包括执行计划缓存。

内存DMV:sys.dm_os_memory_clerk

  select type,

    sum(virtual_memory_reserved_kb) as [vm reserved],

    sum(virtual_memory_committed_kb) as [vm committed],

    sum(awe_allocated_kb) as [AWE allocated],

    sum(shared_memory_reserved_kb) as [SM reserved],

    sum(shared_memory_committed_kb) as [SM committed],

    sum(multi_pages_kb) as [Multipage allocator],

    sum(single_pages_kb) as [Singlepage allocator]

  from sys.dm_os_memory_clerks

  group by type

  order by type

内存中的数据页面由哪些表格组成,各占多少: sys.dm_os_buffer_descriptors

  dbcc dropcleanbuffers  --clean up data buffer

  go

  

  

declare @name varchar(100)

declare @cmd varchar(5000)

declare c1 cursor for select name from sys.sysdatabases

open c1

fetch next from c1 into @name

while @@FETCH_STATUS=0

begin

print @name

set @cmd= 'select b.database_id, db=db_name(b.database_id), object_name(p.object_id), p.index_id, buffer_count_kb=count(*)*8 from '  +@name+'.sys.allocation_units a, '+@name+'.sys.dm_os_buffer_descriptors b,'+@name  +'.sys.partitions p where a.allocation_unit_id=b.allocation_unit_id and a.container_id=p.hobt_id '  +' and b.database_id=db_id('''+@name+''')'  +' group by b.database_id, p.object_id, p.index_id '  +' order by b.database_id, buffer_count_kb desc '

print @cmd

exec(@cmd)

fetch next from c1 into @name

end

close c1

deallocate c1

go

执行计划都缓存了些什么? 哪些比较占内存?

  select objtype, sum(size_in_bytes) as sum_size_in_bytes, count(bucketid) as cache_counts

  from sys.dm_exec_cached_plans

  group by objtype

  select usecounts,refcounts, size_in_bytes, cacheobjtype, objtype, text

  from sys.dm_exec_cached_plans cp

  cross apply sys.dm_exec_sql_text(plan_handle)

  order by objtype desc;

  go

-----------------华丽的分割线: 下面开始分析 数据页面(database page),  buffer pool 里的stolen, multi-page 这三部分------------------------------------

数据缓冲区压力分析:

  Lazy writes/sec 高

  Page life expectancy 低

  Page reads/sec 高: 正常这个值应该接近0, 指从数据文件读取的数据量

  Stolen pages 降低

  sys.sysprocesses 中出现一些连接等待I/O完成的现象: PAGEIOLATCH_SH

  使用DMV分析启动以来做read最多的语句:

  --按照物理读的页面数排序,前50名:

  select top 50

  qs.total_physical_reads, qs.execution_count,

  qs.total_physical_reads / qs.execution_count as [avg IO],

  substring(qt.text, qs.statement_start_offset/2,

    (case when qs.statement_end_offset = -1 then len(convert(nvarchar(max), qt.text)) * 2

      else qs.statement_end_offset end - qs.statement_start_offset)/2) as query_text,

  qt.dbid, dbname=db_name(qt.dbid),

  qt.objectid, qs.sql_handle, qs.plan_handle

  from sys.dm_exec_query_stats qs

  cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt

  order by qs.total_physical_reads desc

  --按照逻辑读的页面数排序,前50名

  select top 50

  qs.total_logical_reads, qs.execution_count,

  qs.total_logical_reads / qs.execution_count as [avg IO],

  substring(qt.text, qs.statement_start_offset/2,

    (case when qs.statement_end_offset = -1 then len(convert(nvarchar(max), qt.text)) * 2

      else qs.statement_end_offset end - qs.statement_start_offset)/2) as query_text,

  qt.dbid, dbname=db_name(qt.dbid),

  qt.objectid, qs.sql_handle, qs.plan_handle

  from sys.dm_exec_query_stats qs

  cross apply sys.dm_exec_sql_text(qs.sql_handle) as qt

  order by qs.total_logical_reads desc

  --使用trace 文件分析做read最多的语句

  select * into sample

  from fn_trace_gettable('c:\sample.trc',default)

  where eventclass in (10, 12)  -- 10: RPC completed    12: SQL batch completed

  select top 1000 textdata, databaseid, hostname, applicationname, loginname, spid

    starttime, endtime,duration,reads,writes,cpu

  from sample

  order by reads desc

  --使用ReadTrace工具分析trace文件,找出使用大量系统资源的语句

  使用方法见稍后发出。。。

Stolen Memory缓存压力分析:

  Stolen memory 凡是以8K为分配单位的,保存在buffer pool里。 大于8K的,保存在MemToLeave里。

  缓存: 执行计划,用户安全上下文,连接的数据结构和输入/输出缓冲区

  没有缓存: 语义分析,优化,排序,Hash,计算

  Stolen memory 在不同的SQL版本,最大的限制是不同的,见下表:

  SQL Server 内存管理

  表征与解决方法:查看 sys.sysprocesses 里面的连接等待waittype字段不等于0x0000: (可以手动 DBCC FREEPROCCACHE)

  1. CMEMTHREAD (0x00B9)

  2. SOS_RESERVEDMEMBLOCKLIST (0x007B)

  3. RESOURCE_SEMAPHORE_QUERY_COMPILE(0x011A)