Queries for the Event table

For information on using these queries in the Azure portal, see the Log Analytics tutorial. For the REST API, see Query.

Memory usage percentage

For a cluster, view the average node memory usage percentage.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
// MemoryUsage is a percentage (%); TotalMemory and UsedMemory are in bytes
// Use Nodename to set an alert for each node
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3000"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend servers_information = parse_json(RenderedDescription).m_servers
| mv-expand servers_information
| extend Nodename = tostring(servers_information.m_name)
| extend TotalMemory = todecimal(servers_information.m_totalPhysicalMemoryInBytes)
| extend UsedMemory = iff(TotalMemory == 0.0, todecimal(0.0), todecimal(servers_information.m_usedPhysicalMemoryInBytes))
| extend MemoryUsage = iff(TotalMemory == 0.0, todecimal(0.0), todecimal(round(UsedMemory / TotalMemory * 100, 0)))
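
To make the arithmetic in the last two `extend` steps concrete, the following Python sketch applies the same rule (a rounded percentage, or 0 when total memory is 0) to a made-up payload. The sample JSON and the function name `memory_usage_percent` are assumptions for illustration; the sample holds only the three `m_servers` fields the query reads, and a real `RenderedDescription` likely carries more.

```python
import json

# Hypothetical sample payload: only the m_servers fields the query reads.
SAMPLE_DESCRIPTION = json.dumps({
    "m_servers": [
        {"m_name": "node1",
         "m_totalPhysicalMemoryInBytes": 8000,
         "m_usedPhysicalMemoryInBytes": 2000},
        {"m_name": "node2",
         "m_totalPhysicalMemoryInBytes": 0,
         "m_usedPhysicalMemoryInBytes": 0},
    ]
})


def memory_usage_percent(rendered_description):
    """Return {node name: memory usage in percent} for each server entry."""
    usage = {}
    for server in json.loads(rendered_description)["m_servers"]:
        total = server["m_totalPhysicalMemoryInBytes"]
        used = server["m_usedPhysicalMemoryInBytes"]
        # Same guard as the query: report 0 when total memory is 0.
        # Python's round() rounds exact halves to even, which may differ
        # from the query's round() in that edge case.
        usage[server["m_name"]] = 0 if total == 0 else round(used / total * 100)
    return usage


print(memory_usage_percent(SAMPLE_DESCRIPTION))
```

The zero guard matters because a node that reports no physical memory would otherwise cause a division by zero.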

Average node CPU usage percentage

For a cluster, view the average node CPU usage percentage.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
// UsedCpuPercentage is a percentage (%)
// Use Nodename to set an alert for each node
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3000"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend servers_information = parse_json(RenderedDescription).m_servers
| mv-expand servers_information
| extend Nodename = tostring(servers_information.m_name)
| extend UsedCpuPercentage = toint(servers_information.m_totalProcessorsUsedPercentage)

Virtual machines failed

For a cluster, view the number of failed virtual machines.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3003"
| extend ClusterName = tostring(parse_xml(EventData).DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(parse_xml(EventData).DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend description = parse_json(RenderedDescription)
| extend VmsFailed = toint(description.m_totalVmsFailed)

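Total number of virtual machines in a cluster

For a cluster, view the total number of virtual machines, along with the running, stopped, and failed virtual machines. As written, the query extracts only the stopped count (VmsStopped).

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID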
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3003"
| extend ClusterName = tostring(parse_xml(EventData).DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(parse_xml(EventData).DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend description = parse_json(RenderedDescription)
| extend VmsStopped = toint(description.m_totalVmsStopped)

Available volume capacity in a cluster

View the available capacity of the volumes in a cluster, in bytes.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3002"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend volumes_information = parse_json(RenderedDescription).VolumeList
| mv-expand volumes_information
| extend Volumes = tostring(volumes_information.m_Label)
| extend TotalCap = todecimal(volumes_information.m_Size)
| extend AvailableCap = TotalCap - todecimal(volumes_information.m_SizeUsed)
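
Because the query reports `AvailableCap` in bytes, the values can be hard to read for large volumes. The sketch below shows the same subtraction (size minus used size) followed by a conversion to GiB. The helper name `available_capacity_gib` and the sample sizes are hypothetical.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def available_capacity_gib(size_bytes, size_used_bytes):
    """Available capacity (size minus used size) converted from bytes to GiB."""
    return (size_bytes - size_used_bytes) / GIB


# Hypothetical volume: 3 GiB in total, 1 GiB used.
print(available_capacity_gib(3 * GIB, 1 * GIB))
```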

Volume latency

This query shows the latency of volumes.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3002"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend volumes_information = parse_json(RenderedDescription).VolumeList
| mv-expand volumes_information
| extend VolumeName = tostring(volumes_information.m_Label)
| extend Latency = todouble(volumes_information.m_AverageLatency)
| extend Latency = iff(Latency < 0, 0.0, Latency)
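
The final `extend` replaces any negative latency reading with 0 and leaves other values untouched. The Python sketch below mirrors that `iff` expression. The function name and the sample readings are hypothetical, and the source does not say why a negative reading might occur.

```python
def clamp_non_negative(value):
    """Mirror iff(Latency < 0, 0.0, Latency): negative readings become 0.0."""
    return 0.0 if value < 0 else value


# Hypothetical readings; the negative one is replaced by 0.0.
readings = [0.0012, -1.0, 0.0]
print([clamp_non_negative(r) for r in readings])
```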

Volume IOPS

This query shows the input/output operations per second (IOPS) of volumes in a cluster.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID to view the IOPS of volumes in a cluster
// Iops is reported in operations per second
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3002"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend volumes_information = parse_json(RenderedDescription).VolumeList
| mv-expand volumes_information
| extend VolumesName = tostring(volumes_information.m_Label)
| extend Iops = todouble(volumes_information.m_TotalIops)
| extend Iops = iff(Iops < 0, 0.0, Iops)

Volume throughput

This query shows the throughput of volumes in a cluster.

// Select your Log Analytics workspace and replace 'enter cluster ID' with your cluster ARM ID
// Throughput is in bytes per second (B/s)
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3002"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
| where ClusterArmId =~ 'enter cluster ID'
| summarize arg_max(TimeGenerated, RenderedDescription)
| extend volumes_information = parse_json(RenderedDescription).VolumeList
| mv-expand volumes_information
| extend VolumeName = tostring(volumes_information.m_Label)
| extend Throughput = todouble(volumes_information.m_TotalThroughput)
| extend Throughput = iff(Throughput < 0, 0.0, Throughput)

Cluster node down

Get an alert if a node within a cluster goes down.

// Select your Log Analytics workspace. To limit results to specific clusters, uncomment the ClusterArmId filter and replace clusterarmId1 with your cluster ARM ID
// To set up alerts for each node within a cluster, split dimensions by cluster ARM ID and by faulting resource ID. Select 'Include all future values' to get alerts for dimension values that appear later.
Event
| where EventLog =~ "Microsoft-Windows-Health/Operational"
| extend description = parse_json(RenderedDescription)
| extend CorrelationId = tostring(description.CorrelationId)
| join kind=leftsemi (Event
    | where EventLog =~ "Microsoft-Windows-Health/Operational"
    | extend description = parse_json(RenderedDescription)
    | extend ClusterArmId = tostring(description.ArmId)
    //| where ClusterArmId in~ ('clusterarmId1', 'clusterarmId2', 'clusterarmId3')
    | where tostring(description.IsLastMessage) =~ 'true'
    | extend CorrelationId = tostring(description.CorrelationId)
    | summarize arg_max(TimeGenerated, *) by ClusterArmId
    | project CorrelationId)
    on CorrelationId
| extend ClusterArmId = tostring(description.ArmId)
| where tostring(description.Fault.RootObjectType) == 'Microsoft.Health.EntityType.Cluster'
| extend Fault = description.Fault
| extend ShortDescription = split(tostring(Fault.Type), '.')[-1]
| extend Faulttype= Fault.Type
| where Faulttype == "Microsoft.Health.FaultType.Server.Down"
| extend Severity = toint(Fault.Severity)
| extend FaultingResourceType = split(tostring(Fault.ObjectType), '.')[-1]
| extend FaultingResourceId = tostring(Fault.ObjectId)
| extend ReportedTime = datetime_add('Microsecond', tolong(Fault.Timestamp) / 10, make_datetime(1601, 1, 1))
| extend Detail = pack(
    "Severity", iff(Severity == 0, "Healthy", iff(Severity == 1, "Warning", iff(Severity == 2, "Critical", "Unknown"))),
    "Faulting Resource ID", FaultingResourceId,
    "Faulting Resource Type", FaultingResourceType,
    "Faulttype", Faulttype,
    "Reported Time", ReportedTime,
    "Short Description", ShortDescription,
    "Description", tostring(Fault.Description),
    "clusterARMId", tostring(ClusterArmId),
    "Remediation", tostring(Fault.Remediation))
| sort by ReportedTime asc
| limit 100
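
Two expressions in this query are easy to misread. `ReportedTime` treats `Fault.Timestamp` as a count of 100-nanosecond ticks since 1601-01-01 (dividing by 10 yields microseconds), and `ShortDescription` keeps the last dot-separated segment of the fault type. A minimal Python sketch of both follows; the function names are hypothetical, and the tick value in the example is simply the one that corresponds to 1970-01-01.

```python
from datetime import datetime, timedelta

EPOCH_1601 = datetime(1601, 1, 1)


def reported_time(ticks):
    """Convert 100-ns ticks since 1601-01-01 to a datetime.

    Integer division by 10 turns ticks into whole microseconds,
    matching tolong(Fault.Timestamp) / 10 in the query.
    """
    return EPOCH_1601 + timedelta(microseconds=ticks // 10)


def short_description(fault_type):
    """Keep the last dot-separated segment, like split(...)[-1]."""
    return fault_type.split(".")[-1]


print(reported_time(116444736000000000))
print(short_description("Microsoft.Health.FaultType.Server.Down"))
```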

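Memory usage percentage

For a cluster, view the average node memory usage percentage.

// Select your Log Analytics workspace. To limit results to specific clusters, uncomment the ClusterArmId filter and replace clusterarmId1 with your cluster ARM ID
// MemoryUsage is a percentage (%); TotalMemory and UsedMemory are in bytes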
Event
| where EventLog =~ "Microsoft-Windows-SDDC-Management/Operational" and EventID == "3000"
| extend ClusterData = parse_xml(EventData)
| extend ClusterName = tostring(ClusterData.DataItem.UserData.EventData["ClusterName"])
| extend ClusterArmId = tostring(ClusterData.DataItem.UserData.EventData["ArmId"])
//| where ClusterArmId in~ ('clusterarmId1', 'clusterarmId2', 'clusterarmId3')
| summarize arg_max(TimeGenerated, *) by ClusterArmId
| extend servers_information = parse_json(RenderedDescription).m_servers
| mv-expand servers_information
| extend Nodename = tostring(servers_information.m_name)
| extend TotalMemory = todecimal(servers_information.m_totalPhysicalMemoryInBytes)
| extend UsedMemory = iff(TotalMemory == 0.0, todecimal(0.0), todecimal(servers_information.m_usedPhysicalMemoryInBytes))
| extend MemoryUsage = iff(TotalMemory == 0.0, todecimal(0.0), todecimal(round(UsedMemory / TotalMemory * 100, 0)))
| extend MemoryUsageint = toint(MemoryUsage)
| where Nodename != ""
| limit 100
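
Unlike the single-cluster variant, this query uses `summarize arg_max(TimeGenerated, *) by ClusterArmId`, which keeps the most recent event for each cluster rather than one event overall. The Python sketch below illustrates that selection on hypothetical records. The function name and the sample values are assumptions.

```python
from datetime import datetime


def latest_per_cluster(records):
    """Keep, for each ClusterArmId, the record with the greatest TimeGenerated."""
    latest = {}
    for record in records:
        key = record["ClusterArmId"]
        if key not in latest or record["TimeGenerated"] > latest[key]["TimeGenerated"]:
            latest[key] = record
    return list(latest.values())


# Hypothetical records: two events for cluster-a, one for cluster-b.
RECORDS = [
    {"ClusterArmId": "cluster-a", "TimeGenerated": datetime(2024, 1, 1, 10), "RenderedDescription": "old"},
    {"ClusterArmId": "cluster-a", "TimeGenerated": datetime(2024, 1, 1, 11), "RenderedDescription": "new"},
    {"ClusterArmId": "cluster-b", "TimeGenerated": datetime(2024, 1, 1, 9), "RenderedDescription": "only"},
]

for record in latest_per_cluster(RECORDS):
    print(record["ClusterArmId"], record["RenderedDescription"])
```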

Ingestion latency (end-to-end) timechart - Event table

Chart the ingestion latency of the Event table over the last day.

Event
| where TimeGenerated > ago(1d)
| project TimeGenerated, IngestionDurationSeconds = (ingestion_time()-TimeGenerated)/1s
| render timechart title = "Ingestion latency: Event table" 
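
The `project` step computes latency as the gap between when a record was ingested and when it was generated, expressed in seconds by dividing the timespan by `1s`. A minimal Python sketch of the same calculation follows. The function name and the two sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def ingestion_duration_seconds(time_generated, ingestion_time):
    """Seconds elapsed between event generation and ingestion."""
    return (ingestion_time - time_generated).total_seconds()


# Hypothetical event generated at 12:00:00 and ingested 42.5 seconds later.
generated = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
ingested = datetime(2024, 1, 1, 12, 0, 42, 500000, tzinfo=timezone.utc)
print(ingestion_duration_seconds(generated, ingested))
```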

Show trend of a selected event

Chart the number of times a given event was reported over the last day.

// To create an alert for this query, click '+ New alert rule'
Event
| where EventID == 44 // this ID indicates Windows Update started downloading an update
| summarize count() by bin(TimeGenerated, 1h), Computer, _ResourceId // bin is used to set the time grain to 1 hour
| render barchart
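
`bin(TimeGenerated, 1h)` rounds each timestamp down to the start of its hour, so `summarize count()` yields one count per hour and computer. The sketch below reproduces that grouping in Python on hypothetical events. It omits the `_ResourceId` grouping key for brevity, and the function names are assumptions.

```python
from collections import Counter
from datetime import datetime


def bin_1h(timestamp):
    """Round a timestamp down to the start of its hour."""
    return timestamp.replace(minute=0, second=0, microsecond=0)


def count_by_hour(events):
    """Count (timestamp, computer) events per (hour bin, computer)."""
    return Counter((bin_1h(ts), computer) for ts, computer in events)


# Hypothetical events as (TimeGenerated, Computer) pairs.
EVENTS = [
    (datetime(2024, 1, 1, 10, 5), "vm1"),
    (datetime(2024, 1, 1, 10, 55), "vm1"),
    (datetime(2024, 1, 1, 11, 0), "vm1"),
    (datetime(2024, 1, 1, 10, 30), "vm2"),
]

for (hour, computer), count in sorted(count_by_hour(EVENTS).items()):
    print(hour, computer, count)
```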

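Error events on computers missing security or critical updates

Error events for computers that are missing critical or required security updates.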

// To create an alert for this query, click '+ New alert rule'
Event
| where EventLevelName =~ "error" // =~ matches regardless of case
| join kind=inner (
    Update
    | where (Classification == "Security Updates" or Classification == "Critical Updates") and UpdateState == "Needed" and Optional == "false"
    | distinct Computer
    ) on Computer
| sort by TimeGenerated desc
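
Because the right side of the join is reduced to distinct computer names, the inner join acts as a filter: an error event is kept only if its computer appears in the set of computers that need a non-optional security or critical update. The Python sketch below shows that logic on hypothetical records. The function name is an assumption, the input events are taken to be error events already, and `Optional` is modeled as a boolean for simplicity.

```python
from datetime import datetime

WANTED = {"Security Updates", "Critical Updates"}


def events_on_computers_needing_updates(error_events, updates):
    """Keep error events from computers with needed, non-optional updates."""
    computers = {
        u["Computer"]
        for u in updates
        if u["Classification"] in WANTED
        and u["UpdateState"] == "Needed"
        and not u["Optional"]
    }
    matched = [e for e in error_events if e["Computer"] in computers]
    # Newest first, like "sort by TimeGenerated desc".
    return sorted(matched, key=lambda e: e["TimeGenerated"], reverse=True)


# Hypothetical sample data.
ERROR_EVENTS = [
    {"Computer": "vm1", "TimeGenerated": datetime(2024, 1, 1, 9)},
    {"Computer": "vm2", "TimeGenerated": datetime(2024, 1, 1, 10)},
    {"Computer": "vm1", "TimeGenerated": datetime(2024, 1, 1, 11)},
]
UPDATES = [
    {"Computer": "vm1", "Classification": "Security Updates", "UpdateState": "Needed", "Optional": False},
    {"Computer": "vm2", "Classification": "Critical Updates", "UpdateState": "Installed", "Optional": False},
]

for event in events_on_computers_needing_updates(ERROR_EVENTS, UPDATES):
    print(event["Computer"], event["TimeGenerated"])
```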

All events in the past hour

List all events from the past hour, newest first.

Event
| where TimeGenerated > ago(1h)
| sort by TimeGenerated desc

Events started

Count of events whose description contains "started", by event ID.

Event
| where RenderedDescription contains "started" 
| summarize count() by EventID

Events by event source

Count of events by event source.

Event
| summarize count() by Source

Events by event ID

Top 10 event IDs by number of events.

Event 
| summarize count() by EventID
| top 10 by count_
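
`summarize count() by EventID` followed by `top 10 by count_` amounts to counting occurrences per ID and keeping the largest counts. For readers who want to check the logic offline, a minimal Python sketch on a hypothetical list of event IDs follows. The function name is an assumption.

```python
from collections import Counter


def top_event_ids(event_ids, n=10):
    """Return the n most frequent event IDs as (event_id, count) pairs."""
    return Counter(event_ids).most_common(n)


# Hypothetical event IDs, one per logged event.
print(top_event_ids([1, 1, 2, 3, 3, 3], n=2))
```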

Warning events

Warning events, newest first.

Event 
| where EventLevelName =~ "warning" // =~ matches regardless of case
| sort by TimeGenerated desc

Count of warning events

Count of warning events by event ID.

Event 
| where EventLevelName =~ "warning" // =~ matches regardless of case
| summarize count() by EventID

Events in OM between 2000 and 3000

Operations Manager events with IDs in the range 2000 to 3000.

Event 
| where EventLog == "Operations Manager" and (EventID >= 2000 and EventID <= 3000) 
| sort by TimeGenerated desc

Windows Firewall policy settings

Events in which Windows Firewall policy settings changed.

Event
| where EventLog == "Microsoft-Windows-Windows Firewall With Advanced Security/Firewall" and EventID == 2008 
| sort by TimeGenerated desc

Computers on which Windows Firewall policy settings changed

Count of Windows Firewall policy setting changes, by computer.

Event 
| where EventLog == "Microsoft-Windows-Windows Firewall With Advanced Security/Firewall" and EventID == 2008 
| summarize count() by Computer 
| limit 10000