使用动态管理视图监视 Microsoft Azure SQL 托管实例的性能
适用于:Azure SQL 托管实例
Microsoft Azure SQL 托管实例支持通过一部分动态管理视图 (DMV) 来诊断性能问题,这些问题可能由查询受阻或长时间运行、资源瓶颈、不良查询计划等原因造成。 本文介绍有关如何通过使用动态管理视图检测常见性能问题。
本文介绍 Azure SQL 托管实例,另请参阅使用动态管理视图监视 Microsoft Azure SQL 数据库性能。
权限
在 Azure SQL 托管实例中,查询动态管理视图需要 VIEW SERVER STATE 权限。
GRANT VIEW SERVER STATE TO database_user;
在 SQL Server 实例和 Azure SQL 托管实例中,动态管理视图会返回服务器状态信息。
识别 CPU 性能问题
如果 CPU 使用率长时间超过 80%,请考虑以下故障排除步骤:
目前正在发生 CPU 问题
如果目前已出现问题,则可能存在两种情况:
存在许多单独的查询,它们共同消耗了很多的 CPU 资源
使用以下查询来识别消耗量靠前的查询哈希:
PRINT '-- top 10 Active CPU Consuming Queries (aggregated)--';
SELECT TOP 10 GETDATE() runtime, *
FROM (SELECT query_stats.query_hash, SUM(query_stats.cpu_time) 'Total_Request_Cpu_Time_Ms', SUM(logical_reads) 'Total_Request_Logical_Reads', MIN(start_time) 'Earliest_Request_start_Time', COUNT(*) 'Number_Of_Requests', SUBSTRING(REPLACE(REPLACE(MIN(query_stats.statement_text), CHAR(10), ' '), CHAR(13), ' '), 1, 256) AS "Statement_Text"
FROM (SELECT req.*, SUBSTRING(ST.text, (req.statement_start_offset / 2)+1, ((CASE statement_end_offset WHEN -1 THEN DATALENGTH(ST.text)ELSE req.statement_end_offset END-req.statement_start_offset)/ 2)+1) AS statement_text
FROM sys.dm_exec_requests AS req
CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS ST ) AS query_stats
GROUP BY query_hash) AS t
ORDER BY Total_Request_Cpu_Time_Ms DESC;
消耗 CPU 的长时间运行的查询仍在运行
使用以下查询来识别这些查询:
PRINT '--top 10 Active CPU Consuming Queries by sessions--';
SELECT TOP 10 req.session_id, req.start_time, cpu_time 'cpu_time_ms', OBJECT_NAME(ST.objectid, ST.dbid) 'ObjectName', SUBSTRING(REPLACE(REPLACE(SUBSTRING(ST.text, (req.statement_start_offset / 2)+1, ((CASE statement_end_offset WHEN -1 THEN DATALENGTH(ST.text)ELSE req.statement_end_offset END-req.statement_start_offset)/ 2)+1), CHAR(10), ' '), CHAR(13), ' '), 1, 512) AS statement_text
FROM sys.dm_exec_requests AS req
CROSS APPLY sys.dm_exec_sql_text(req.sql_handle) AS ST
ORDER BY cpu_time DESC;
GO
过去发生了 CPU 问题
如果问题是在过去发生的,你想要执行根本原因分析,请使用查询存储。 拥有数据库访问权限的用户可以使用 T-SQL 对查询存储数据执行查询。 查询存储的默认配置使用 1 小时粒度。 使用以下查询来查看 CPU 消耗量较高的查询的活动。 此查询将返回 CPU 消耗量最高的 15 个查询。 请记得更改 rsi.start_time >= DATEADD(hour, -2, GETUTCDATE()
:
-- Top 15 CPU consuming queries by query hash
-- note that a query hash can have many query id if not parameterized or not parameterized properly
-- it grabs a sample query text by min
WITH AggregatedCPU AS (SELECT q.query_hash, SUM(count_executions * avg_cpu_time / 1000.0) AS total_cpu_millisec, SUM(count_executions * avg_cpu_time / 1000.0)/ SUM(count_executions) AS avg_cpu_millisec, MAX(rs.max_cpu_time / 1000.00) AS max_cpu_millisec, MAX(max_logical_io_reads) max_logical_reads, COUNT(DISTINCT p.plan_id) AS number_of_distinct_plans, COUNT(DISTINCT p.query_id) AS number_of_distinct_query_ids, SUM(CASE WHEN rs.execution_type_desc='Aborted' THEN count_executions ELSE 0 END) AS Aborted_Execution_Count, SUM(CASE WHEN rs.execution_type_desc='Regular' THEN count_executions ELSE 0 END) AS Regular_Execution_Count, SUM(CASE WHEN rs.execution_type_desc='Exception' THEN count_executions ELSE 0 END) AS Exception_Execution_Count, SUM(count_executions) AS total_executions, MIN(qt.query_sql_text) AS sampled_query_text
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id=q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id=p.query_id
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id=p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi ON rsi.runtime_stats_interval_id=rs.runtime_stats_interval_id
WHERE rs.execution_type_desc IN ('Regular', 'Aborted', 'Exception')AND rsi.start_time>=DATEADD(HOUR, -2, GETUTCDATE())
GROUP BY q.query_hash), OrderedCPU AS (SELECT query_hash, total_cpu_millisec, avg_cpu_millisec, max_cpu_millisec, max_logical_reads, number_of_distinct_plans, number_of_distinct_query_ids, total_executions, Aborted_Execution_Count, Regular_Execution_Count, Exception_Execution_Count, sampled_query_text, ROW_NUMBER() OVER (ORDER BY total_cpu_millisec DESC, query_hash ASC) AS RN
FROM AggregatedCPU)
SELECT OD.query_hash, OD.total_cpu_millisec, OD.avg_cpu_millisec, OD.max_cpu_millisec, OD.max_logical_reads, OD.number_of_distinct_plans, OD.number_of_distinct_query_ids, OD.total_executions, OD.Aborted_Execution_Count, OD.Regular_Execution_Count, OD.Exception_Execution_Count, OD.sampled_query_text, OD.RN
FROM OrderedCPU AS OD
WHERE OD.RN<=15
ORDER BY total_cpu_millisec DESC;
识别出有问题的查询后,可以优化这些查询,以降低 CPU 利用率。 如果没有时间优化查询,也可以选择升级托管实例的 SLO 来解决问题。
识别 IO 性能问题
识别 IO 性能问题时,与 IO 问题最相关的等待类型包括:
PAGEIOLATCH_*
数据文件 IO 问题(包括
PAGEIOLATCH_SH
、PAGEIOLATCH_EX
、PAGEIOLATCH_UP
)。 如果等待类型名称中包含 IO,则此类型与某个 IO 问题相关。 如果页面闩锁等待名称中不包含 IO,则此类型与不同类型的问题(例如tempdb
争用)相关。WRITE_LOG
事务日志 IO 问题。
如果目前已经出现了 IO 问题
使用 sys.dm_exec_requests 或 sys.dm_os_waiting_tasks 查看 wait_type
和 wait_time
。
使用查询存储查看缓冲区相关的 IO
对于选项 2,可以针对缓冲区相关 IO 的查询存储使用以下查询,以查看过去两个小时的跟踪活动:
-- top queries that waited on buffer
-- note these are finished queries
WITH Aggregated AS (SELECT q.query_hash, SUM(total_query_wait_time_ms) total_wait_time_ms, SUM(total_query_wait_time_ms / avg_query_wait_time_ms) AS total_executions, MIN(qt.query_sql_text) AS sampled_query_text, MIN(wait_category_desc) AS wait_category_desc
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id=q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id=p.query_id
JOIN sys.query_store_wait_stats AS waits ON waits.plan_id=p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi ON rsi.runtime_stats_interval_id=waits.runtime_stats_interval_id
WHERE wait_category_desc='Buffer IO' AND rsi.start_time>=DATEADD(HOUR, -2, GETUTCDATE())
GROUP BY q.query_hash), Ordered AS (SELECT query_hash, total_executions, total_wait_time_ms, sampled_query_text, wait_category_desc, ROW_NUMBER() OVER (ORDER BY total_wait_time_ms DESC, query_hash ASC) AS RN
FROM Aggregated)
SELECT OD.query_hash, OD.total_executions, OD.total_wait_time_ms, OD.sampled_query_text, OD.wait_category_desc, OD.RN
FROM Ordered AS OD
WHERE OD.RN<=15
ORDER BY total_wait_time_ms DESC;
GO
查看 WRITELOG 等待类型的日志 IO 总计
如果等待类型为 WRITELOG
,请使用以下查询按语句查看日志 IO 总计:
-- Top transaction log consumers
-- Adjust the time window by changing
-- rsi.start_time >= DATEADD(hour, -2, GETUTCDATE())
WITH AggregatedLogUsed
AS (SELECT q.query_hash,
SUM(count_executions * avg_cpu_time / 1000.0) AS total_cpu_millisec,
SUM(count_executions * avg_cpu_time / 1000.0) / SUM(count_executions) AS avg_cpu_millisec,
SUM(count_executions * avg_log_bytes_used) AS total_log_bytes_used,
MAX(rs.max_cpu_time / 1000.00) AS max_cpu_millisec,
MAX(max_logical_io_reads) max_logical_reads,
COUNT(DISTINCT p.plan_id) AS number_of_distinct_plans,
COUNT(DISTINCT p.query_id) AS number_of_distinct_query_ids,
SUM( CASE
WHEN rs.execution_type_desc = 'Aborted' THEN
count_executions
ELSE
0
END
) AS Aborted_Execution_Count,
SUM( CASE
WHEN rs.execution_type_desc = 'Regular' THEN
count_executions
ELSE
0
END
) AS Regular_Execution_Count,
SUM( CASE
WHEN rs.execution_type_desc = 'Exception' THEN
count_executions
ELSE
0
END
) AS Exception_Execution_Count,
SUM(count_executions) AS total_executions,
MIN(qt.query_sql_text) AS sampled_query_text
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS rsi
ON rsi.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE rs.execution_type_desc IN ( 'Regular', 'Aborted', 'Exception' )
AND rsi.start_time >= DATEADD(HOUR, -2, GETUTCDATE())
GROUP BY q.query_hash),
OrderedLogUsed
AS (SELECT query_hash,
total_log_bytes_used,
number_of_distinct_plans,
number_of_distinct_query_ids,
total_executions,
Aborted_Execution_Count,
Regular_Execution_Count,
Exception_Execution_Count,
sampled_query_text,
ROW_NUMBER() OVER (ORDER BY total_log_bytes_used DESC, query_hash ASC) AS RN
FROM AggregatedLogUsed)
SELECT OD.total_log_bytes_used,
OD.number_of_distinct_plans,
OD.number_of_distinct_query_ids,
OD.total_executions,
OD.Aborted_Execution_Count,
OD.Regular_Execution_Count,
OD.Exception_Execution_Count,
OD.sampled_query_text,
OD.RN
FROM OrderedLogUsed AS OD
WHERE OD.RN <= 15
ORDER BY total_log_bytes_used DESC;
GO
识别 tempdb
性能问题
识别 IO 性能问题时,与 tempdb
问题最相关的等待类型是 PAGELATCH_*
(而不是 PAGEIOLATCH_*
)。 但是,出现 PAGELATCH_*
等待并不总是意味着发生了 tempdb
争用。 这种等待可能还意味着,由于并发请求面向相同的数据页面,发生了用户对象数据页面争用。 若要进一步确认 tempdb
争用,请使用 sys.dm_exec_requests 确认 wait_resource 值是否以 2:x:y
开头,其中,tempdb
是数据库 ID,x
是文件 ID,y
是页 ID。
对于 tempdb
争用,常用的方法是减少或重写依赖于 tempdb
的应用程序代码。 常见的 tempdb
使用区域包括:
- 临时表
- 表变量
- 表值参数
- 版本存储使用(与长时间运行的事务关联的用法)
- 包含使用排序、哈希联接和 spool 的查询计划的查询
使用表变量和临时表的最相关查询
使用以下查询来识别使用表变量和临时表的最相关查询:
SELECT plan_handle, execution_count, query_plan
INTO #tmpPlan
FROM sys.dm_exec_query_stats
CROSS APPLY sys.dm_exec_query_plan(plan_handle);
GO
WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2004/07/showplan' AS sp)
SELECT plan_handle, stmt.stmt_details.value('@Database', 'varchar(max)') 'Database', stmt.stmt_details.value('@Schema', 'varchar(max)') 'Schema', stmt.stmt_details.value('@Table', 'varchar(max)') 'table'
INTO #tmp2
FROM(SELECT CAST(query_plan AS XML) sqlplan, plan_handle FROM #tmpPlan) AS p
CROSS APPLY sqlplan.nodes('//sp:Object') AS stmt(stmt_details);
GO
SELECT t.plan_handle, [Database], [Schema], [table], execution_count
FROM(SELECT DISTINCT plan_handle, [Database], [Schema], [table]
FROM #tmp2
WHERE [table] LIKE '%@%' OR [table] LIKE '%#%') AS t
JOIN #tmpPlan AS t2 ON t.plan_handle=t2.plan_handle;
识别长时间运行的事务
使用以下查询来识别长时间运行的事务。 长时间运行的事务会阻止版本存储的清理。
SELECT DB_NAME(dtr.database_id) 'database_name',
sess.session_id,
atr.name AS 'tran_name',
atr.transaction_id,
transaction_type,
transaction_begin_time,
database_transaction_begin_time,
transaction_state,
is_user_transaction,
sess.open_transaction_count,
TRIM(REPLACE(
REPLACE(
SUBSTRING(
SUBSTRING(
txt.text,
(req.statement_start_offset / 2) + 1,
((CASE req.statement_end_offset
WHEN -1 THEN
DATALENGTH(txt.text)
ELSE
req.statement_end_offset
END - req.statement_start_offset
) / 2
) + 1
),
1,
1000
),
CHAR(10),
' '
),
CHAR(13),
' '
)
) Running_stmt_text,
recenttxt.text 'MostRecentSQLText'
FROM sys.dm_tran_active_transactions AS atr
INNER JOIN sys.dm_tran_database_transactions AS dtr
ON dtr.transaction_id = atr.transaction_id
LEFT JOIN sys.dm_tran_session_transactions AS sess
ON sess.transaction_id = atr.transaction_id
LEFT JOIN sys.dm_exec_requests AS req
ON req.session_id = sess.session_id
AND req.transaction_id = sess.transaction_id
LEFT JOIN sys.dm_exec_connections AS conn
ON sess.session_id = conn.session_id
OUTER APPLY sys.dm_exec_sql_text(req.sql_handle) AS txt
OUTER APPLY sys.dm_exec_sql_text(conn.most_recent_sql_handle) AS recenttxt
WHERE atr.transaction_type != 2
AND sess.session_id != @@spid
ORDER BY start_time ASC;
识别内存授予等待性能问题
如果最相关等待类型为 RESOURCE_SEMAHPORE
,但你未遇到 CPU 使用率偏高的问题,则可能是出现了内存授予等待问题。
确定 RESOURCE_SEMAHPORE
等待是否为最相关的等待
使用以下查询来确定 RESOURCE_SEMAHPORE
等待是否为最相关的等待
SELECT wait_type,
SUM(wait_time) AS total_wait_time_ms
FROM sys.dm_exec_requests AS req
JOIN sys.dm_exec_sessions AS sess
ON req.session_id = sess.session_id
WHERE is_user_process = 1
GROUP BY wait_type
ORDER BY SUM(wait_time) DESC;
识别内存消耗量较高的语句
如果遇到内存不足错误,请查看 sys.dm_os_out_of_memory_events。
使用以下查询来识别内存消耗量较高的语句:
SELECT IDENTITY(INT, 1, 1) rowId,
CAST(query_plan AS XML) query_plan,
p.query_id
INTO #tmp
FROM sys.query_store_plan AS p
JOIN sys.query_store_runtime_stats AS r
ON p.plan_id = r.plan_id
JOIN sys.query_store_runtime_stats_interval AS i
ON r.runtime_stats_interval_id = i.runtime_stats_interval_id
WHERE start_time > '2018-10-11 14:00:00.0000000'
AND end_time < '2018-10-17 20:00:00.0000000';
GO
;WITH cte
AS (SELECT query_id,
query_plan,
m.c.value('@SerialDesiredMemory', 'INT') AS SerialDesiredMemory
FROM #tmp AS t
CROSS APPLY t.query_plan.nodes('//*:MemoryGrantInfo[@SerialDesiredMemory[. > 0]]') AS m(c) )
SELECT TOP 50
cte.query_id,
t.query_sql_text,
cte.query_plan,
CAST(SerialDesiredMemory / 1024. AS DECIMAL(10, 2)) SerialDesiredMemory_MB
FROM cte
JOIN sys.query_store_query AS q
ON cte.query_id = q.query_id
JOIN sys.query_store_query_text AS t
ON q.query_text_id = t.query_text_id
ORDER BY SerialDesiredMemory DESC;
识别内存授予
使用以下查询来识别最活跃的 10 个内存授予:
SELECT TOP 10
CONVERT(VARCHAR(30), GETDATE(), 121) AS runtime,
r.session_id,
r.blocking_session_id,
r.cpu_time,
r.total_elapsed_time,
r.reads,
r.writes,
r.logical_reads,
r.row_count,
wait_time,
wait_type,
r.command,
OBJECT_NAME(txt.objectid, txt.dbid) 'Object_Name',
TRIM(REPLACE(
REPLACE(
SUBSTRING(
SUBSTRING(
text,
(r.statement_start_offset / 2) + 1,
((CASE r.statement_end_offset
WHEN -1 THEN
DATALENGTH(text)
ELSE
r.statement_end_offset
END - r.statement_start_offset
) / 2
) + 1
),
1,
1000
),
CHAR(10),
' '
),
CHAR(13),
' '
)
) stmt_text,
mg.dop, --Degree of parallelism
mg.request_time, --Date and time when this query requested the memory grant.
mg.grant_time, --NULL means memory has not been granted
mg.requested_memory_kb / 1024.0 requested_memory_mb, --Total requested amount of memory in megabytes
mg.granted_memory_kb / 1024.0 AS granted_memory_mb, --Total amount of memory actually granted in megabytes. NULL if not granted
mg.required_memory_kb / 1024.0 AS required_memory_mb, --Minimum memory required to run this query in megabytes.
max_used_memory_kb / 1024.0 AS max_used_memory_mb,
mg.query_cost, --Estimated query cost.
mg.timeout_sec, --Time-out in seconds before this query gives up the memory grant request.
mg.resource_semaphore_id, --Non-unique ID of the resource semaphore on which this query is waiting.
mg.wait_time_ms, --Wait time in milliseconds. NULL if the memory is already granted.
CASE mg.is_next_candidate --Is this process the next candidate for a memory grant
WHEN 1 THEN
'Yes'
WHEN 0 THEN
'No'
ELSE
'Memory has been granted'
END AS 'Next Candidate for Memory Grant',
qp.query_plan
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_query_memory_grants AS mg
ON r.session_id = mg.session_id
AND r.request_id = mg.request_id
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS txt
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS qp
ORDER BY mg.granted_memory_kb DESC;
计算数据库和对象大小
下面的查询将返回数据库的大小(以 MB 为单位):
-- Calculates the size of the database.
SELECT SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8192.) / 1024 / 1024 AS DatabaseSizeInMB
FROM sys.database_files
WHERE type_desc = 'ROWS';
GO
下面的查询将返回数据库中各个对象的大小(以 MB 为单位):
-- Calculates the size of individual database objects.
SELECT sys.objects.name, SUM(reserved_page_count) * 8.0 / 1024
FROM sys.dm_db_partition_stats, sys.objects
WHERE sys.dm_db_partition_stats.object_id = sys.objects.object_id
GROUP BY sys.objects.name;
GO
监视连接
可以使用 sys.dm_exec_connections 视图检索与特定托管实例建立的连接的相关信息和每个连接的详细信息。 此外,sys.dm_exec_sessions 视图在检索有关所有活动用户连接和内部任务的信息时非常有用。
下面的查询将检索当前连接上的信息:
SELECT
c.session_id, c.net_transport, c.encrypt_option,
c.auth_scheme, s.host_name, s.program_name,
s.client_interface_name, s.login_name, s.nt_domain,
s.nt_user_name, s.original_login_name, c.connect_time,
s.login_time
FROM sys.dm_exec_connections AS c
JOIN sys.dm_exec_sessions AS s
ON c.session_id = s.session_id
WHERE c.session_id = @@SPID;
监视资源使用情况
可以使用查询存储监视资源使用情况,就像在 SQL Server 中一样。
还可以使用 sys.dm_db_resource_stats 和 sys.server_resource_stats 监视使用情况。
sys.dm_db_resource_stats
可以在每个数据库中使用 sys.dm_db_resource_stats 视图。 sys.dm_db_resource_stats
视图显示相对于服务层级的最新资源使用数据。 CPU、数据 IO、日志写入以及内存的平均百分比每 15 秒记录一次,并保留 1 小时。
由于此视图提供了更精细的资源使用情况,因此首先将 sys.dm_db_resource_stats
用于任何当前状态分析或故障排除。 例如,此查询显示过去一小时的当前数据库平均和最大资源使用情况:
SELECT
AVG(avg_cpu_percent) AS 'Average CPU use in percent',
MAX(avg_cpu_percent) AS 'Maximum CPU use in percent',
AVG(avg_data_io_percent) AS 'Average data IO in percent',
MAX(avg_data_io_percent) AS 'Maximum data IO in percent',
AVG(avg_log_write_percent) AS 'Average log write use in percent',
MAX(avg_log_write_percent) AS 'Maximum log write use in percent',
AVG(avg_memory_usage_percent) AS 'Average memory use in percent',
MAX(avg_memory_usage_percent) AS 'Maximum memory use in percent'
FROM sys.dm_db_resource_stats;
有关其他查询,请参阅 sys.dm_db_resource_stats 中的示例。
sys.server_resource_stats
可以使用 sys.server_resource_stats 返回 Azure SQL 托管实例的 CPU 使用率、IO 和存储数据。 在五分钟间隔内收集并聚合数据。 其中有一行用于显示每隔 15 秒报告的信息。 返回的数据包括 CPU 使用率、存储大小、IO 利用率和托管实例 SKU。 历史数据保留大约 14 天。
以下示例演示可以用不同方式使用 sys.server_resource_stats
目录视图,以获取有关实例如何使用资源的信息。
以下示例返回过去七天内的平均 CPU 使用率:
DECLARE @s datetime; DECLARE @e datetime; SET @s= DateAdd(d,-7,GetUTCDate()); SET @e= GETUTCDATE(); SELECT AVG(avg_cpu_percent) AS Average_Compute_Utilization FROM sys.server_resource_stats WHERE start_time BETWEEN @s AND @e; GO
以下示例返回实例每天使用的平均存储空间,以便进行增长趋势分析:
DECLARE @s datetime; DECLARE @e datetime; SET @s= DateAdd(d,-7,GetUTCDate()); SET @e= GETUTCDATE(); SELECT Day = convert(date, start_time), AVG(storage_space_used_mb) AS Average_Space_Used_mb FROM sys.server_resource_stats WHERE start_time BETWEEN @s AND @e GROUP BY convert(date, start_time) ORDER BY convert(date, start_time); GO
最大并发请求数
若要查看当前并发请求数,请在数据库中运行以下 Transact-SQL 查询:
SELECT COUNT(*) AS [Concurrent_Requests]
FROM sys.dm_exec_requests R;
若要分析单个数据库的工作负载,请修改此查询,针对要分析的特定数据库进行筛选。 例如,如果有一个名为 MyDatabase
的数据库,则以下 Transact-SQL 查询返回该数据库中并发请求的计数:
SELECT COUNT(*) AS [Concurrent_Requests]
FROM sys.dm_exec_requests R
INNER JOIN sys.databases D ON D.database_id = R.database_id
AND D.name = 'MyDatabase';
这只是某一时刻的快照。 若要更好地了解工作负荷和并发请求需求,需在一定时间内收集多个样本。
最大并发登录数
可以通过分析用户和应用程序模式来了解登录频率。 还可以在测试环境中运行实际负荷,确保不会超过本文所介绍的这样或那样的限制。 无法通过单一查询或动态管理视图 (DMV) 了解并发登录计数或历史记录。
如果多个客户端使用相同的连接字符串,该服务也会对每个登录名进行身份验证。 如果 10 个用户使用相同的用户名和密码同时连接到数据库,则会有 10 个并发登录。 此限制仅针对使用登录名进行身份验证的那段时间。 如果这 10 个用户依次连接到数据库,则并发登录数始终不会超过 1。
最大会话数
若要查看当前的活动会话数,请在数据库中运行以下 Transact-SQL 查询:
SELECT COUNT(*) AS [Sessions]
FROM sys.dm_exec_connections;
若要分析 SQL Server 工作负荷,可以对查询进行修改,使之专注于特定的数据库。 此查询有助于确定数据库可能的会话需求(如果考虑将其移至 Azure)。
SELECT COUNT(*) AS [Sessions]
FROM sys.dm_exec_connections C
INNER JOIN sys.dm_exec_sessions S ON (S.session_id = C.session_id)
INNER JOIN sys.databases D ON (D.database_id = S.database_id)
WHERE D.name = 'MyDatabase';
同样,这些查询返回时间点计数。 如果在一段时间内收集多个样本,则可更好地了解会话使用情况。
监视查询性能
缓慢或长时间运行的查询会消耗大量系统资源。 本部分演示如何使用动态管理视图来检测一些常见的查询性能问题。
查找前 n 个查询
下列示例返回了按平均 CPU 时间排名的前五个查询的信息。 该示例根据查询散列收集了查询,以便逻辑上等值的查询能够根据累积资源消耗分组。
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;
监视受阻的查询
缓慢或长时间运行的查询会造成过多的资源消耗并会导致查询受阻。 受阻的原因可能是应用程序设计欠佳、查询计划不良、缺乏有用的索引等。 可以使用 sys.dm_tran_locks 视图来获取有关数据库中当前锁定活动的信息。 有关示例代码,请参阅 sys.dm_tran_locks。 有关故障排除阻塞的详细信息,请参阅了解并解决 Azure SQL 阻塞问题。
监视死锁
在某些情况下,两个或多个查询可能会相互阻止,从而导致死锁。
可以创建扩展事件跟踪数据库以捕获死锁事件,然后在查询存储中查找相关查询及其执行计划。
有关 Azure SQL 托管实例,请参阅死锁指南中的死锁工具。
监视查询计划
低效的查询计划还可能会增加 CPU 占用率。 下面的示例使用 sys.dm_exec_query_stats 视图来确定哪一个查询使用最多的 CPU 累计时间。
SELECT
highest_cpu_queries.plan_handle,
highest_cpu_queries.total_worker_time,
q.dbid,
q.objectid,
q.number,
q.encrypted,
q.[text]
FROM
(SELECT TOP 50
qs.plan_handle,
qs.total_worker_time
FROM
sys.dm_exec_query_stats qs
ORDER BY qs.total_worker_time desc) AS highest_cpu_queries
CROSS APPLY sys.dm_exec_sql_text(plan_handle) AS q
ORDER BY highest_cpu_queries.total_worker_time DESC;