Network Performance Monitor solution: Performance monitoring

The Performance Monitor capability in Network Performance Monitor helps you monitor network connectivity across various points in your network. You can monitor cloud deployments and on-premises locations, multiple data centers and branch offices, and mission-critical multitier applications or microservices. With Performance Monitor, you can detect network issues before your users complain. Key advantages are that you can:

  • Monitor loss and latency across various subnets and set alerts.
  • Monitor all paths (including redundant paths) on the network.
  • Troubleshoot transient and point-in-time network issues, which are difficult to replicate.
  • Determine the specific network segment that is responsible for degraded performance.
  • Monitor the health of the network without the need for SNMP.

Figure: Network Performance Monitor

Configuration

To open the configuration for Network Performance Monitor, open the Network Performance Monitor solution, and select Configure.

Figure: Configure Network Performance Monitor

Create new networks

A network in Network Performance Monitor is a logical container for subnets. It helps you organize the monitoring of your network infrastructure according to your needs. You can create a network with a friendly name and add subnets to it according to your business logic. For example, you can create a network named London and add all the subnets in your London data center. Or you can create a network named ContosoFrontEnd and add to this network all the subnets named Contoso that serve the front end of your app. The solution automatically creates a default network, which contains all the subnets discovered in your environment.

Whenever you create a network and add a subnet to it, that subnet is removed from the default network. If you delete a network, all its subnets are automatically returned to the default network. The default network acts as a container for all the subnets that aren't contained in any user-defined network. You can't edit or delete the default network. It always remains in the system. You can create as many custom networks as you need. In most cases, the subnets in your organization are arranged in more than one network. Create one or more networks to group your subnets according to your business logic.

To create a new network:

  1. Select the Networks tab.
  2. Select Add network, and then enter the network name and description.
  3. Select one or more subnets, and then select Add.
  4. Select Save to save the configuration.
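
The relationship between the default network and user-defined networks can be pictured with a small model. The following is a minimal sketch, not the solution's implementation: the class name NetworkRegistry, the "Default" label, and the subnet addresses are hypothetical, and it assumes a subnet can be added to a custom network only while it sits in the default network.

```python
class NetworkRegistry:
    """Toy model: every subnet belongs to exactly one network."""

    DEFAULT = "Default"  # hypothetical label for the default network

    def __init__(self, discovered_subnets):
        # All discovered subnets start out in the default network.
        self.networks = {self.DEFAULT: set(discovered_subnets)}

    def create_network(self, name, subnets):
        if name in self.networks:
            raise ValueError(f"network {name!r} already exists")
        subnets = set(subnets)
        missing = subnets - self.networks[self.DEFAULT]
        if missing:
            # Simplifying assumption: subnets move only out of the default network.
            raise ValueError(f"subnets not in default network: {sorted(missing)}")
        # Adding a subnet to a custom network removes it from the default one.
        self.networks[self.DEFAULT] -= subnets
        self.networks[name] = subnets

    def delete_network(self, name):
        if name == self.DEFAULT:
            raise ValueError("the default network can't be deleted")
        # Subnets of a deleted network return to the default network.
        self.networks[self.DEFAULT] |= self.networks.pop(name)
```

For example, creating a London network from two of three discovered subnets leaves only the third subnet in the default network, and deleting London puts all three back.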

Create monitoring rules

Performance Monitor generates health events when the performance threshold of a network connection between two subnetworks or between two networks is breached. The system can learn these thresholds automatically. You also can provide custom thresholds. The system automatically creates a default rule, which generates a health event whenever loss or latency between any pair of network or subnetwork links breaches the system-learned threshold. The default rule lets the solution monitor your network infrastructure until you create monitoring rules explicitly. If the default rule is enabled, all the nodes send synthetic transactions to all the other nodes that you enabled for monitoring. The default rule is useful with small networks. An example is a scenario where you have a small number of servers running a microservice and you want to make sure that all the servers have connectivity to each other.

Note

We recommend that you disable the default rule and create custom monitoring rules, especially with large networks where you use a large number of nodes for monitoring. Custom monitoring rules can reduce the traffic generated by the solution and help you organize the monitoring of your network.

Create monitoring rules according to your business logic. For example, suppose you want to monitor the performance of the network connectivity from two office sites to headquarters. Group all the subnets in office site 1 in network O1. Then group all the subnets in office site 2 in network O2. Finally, group all the subnets in the headquarters in network H. Create two monitoring rules: one between O1 and H and the other between O2 and H.
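
To see why custom rules can reduce probe traffic, it can help to count the node pairs that are tested. The sketch below is illustrative only: the node names are made up, and it assumes that a rule between two networks tests every node in one network against every node in the other, which may not match how the solution actually schedules its tests.

```python
from itertools import combinations


def default_rule_pairs(nodes_by_network):
    """Node pairs tested when every node probes every other node."""
    all_nodes = [n for nodes in nodes_by_network.values() for n in nodes]
    return len(list(combinations(all_nodes, 2)))


def custom_rule_pairs(nodes_by_network, rules):
    """Node pairs tested when only the listed network pairs are monitored."""
    return sum(
        len(nodes_by_network[a]) * len(nodes_by_network[b]) for a, b in rules
    )


# Hypothetical agents in the two office networks and headquarters.
nodes = {
    "O1": ["o1-a", "o1-b"],
    "O2": ["o2-a", "o2-b"],
    "H": ["h-a", "h-b", "h-c"],
}
rules = [("O1", "H"), ("O2", "H")]

print(default_rule_pairs(nodes))        # 21 pairs among 7 nodes
print(custom_rule_pairs(nodes, rules))  # 2*3 + 2*3 = 12 pairs
```

Under these assumptions the two custom rules skip the O1-to-O2 pairs and the pairs inside each network, and the gap widens quickly as the number of nodes grows.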

To create custom monitoring rules:

  1. Select Add Rule on the Monitor tab, and enter the rule name and description.
  2. Select the pair of network or subnetwork links to monitor from the lists.
  3. Select the network that contains the subnetworks you want from the network drop-down list. Then select the subnetworks from the corresponding subnetwork drop-down list. If you want to monitor all the subnetworks in a network link, select All subnetworks. Similarly, select the other subnetworks you want. To exclude monitoring for particular subnetwork links from the selections you made, select Add Exception.
  4. Choose between ICMP and TCP protocols to execute synthetic transactions.
  5. If you don't want to create health events for the items you selected, clear Enable Health Monitoring on the links covered by this rule.
  6. Choose monitoring conditions. To set custom thresholds for health-event generation, enter threshold values. Whenever the value of a condition exceeds its selected threshold for the selected network or subnetwork pair, a health event is generated.
  7. Select Save to save the configuration.
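
The threshold check described in step 6 can be sketched as a simple comparison. This is a hedged illustration rather than the solution's logic: the Thresholds class, its field names, and the units (percent and milliseconds) are assumptions, and a threshold left as None stands in for a condition that isn't monitored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Thresholds:
    # Hypothetical custom thresholds; None means the condition is not checked.
    max_loss_percent: Optional[float] = None
    max_latency_ms: Optional[float] = None


def breached_conditions(
    loss_percent: float, latency_ms: float, thresholds: Thresholds
) -> List[str]:
    """Return the conditions whose measured value exceeds its threshold."""
    breached = []
    if (
        thresholds.max_loss_percent is not None
        and loss_percent > thresholds.max_loss_percent
    ):
        breached.append("loss")
    if (
        thresholds.max_latency_ms is not None
        and latency_ms > thresholds.max_latency_ms
    ):
        breached.append("latency")
    return breached
```

A non-empty result corresponds to the case where a health event would be generated for the link. A value equal to the threshold isn't treated as a breach here, because the text says the value must exceed it.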

After you save a monitoring rule, you can integrate that rule with Alert Management by selecting Create Alert. An alert rule is automatically created with the search query. Other required parameters are automatically filled in. Using an alert rule, you can receive email-based alerts, in addition to the existing alerts within Network Performance Monitor. Alerts also can trigger remedial actions with runbooks, or they can integrate with existing service management solutions by using webhooks. Select Manage Alert to edit the alert settings.

You can now create more Performance Monitor rules or move to the solution dashboard to use the capability.

Choose the protocol

Network Performance Monitor uses synthetic transactions to calculate network performance metrics like packet loss and link latency. To understand this concept better, consider a Network Performance Monitor agent connected to one end of a network link. This agent sends probe packets to a second Network Performance Monitor agent connected to the other end of the link. The second agent replies with response packets. This process repeats a few times. By measuring the number of replies and the time taken to receive each reply, the first agent assesses link latency and packet drops.
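
The arithmetic behind this measurement can be sketched in a few lines. The function below is an illustration under stated assumptions, not the agent's algorithm: it takes one round-trip time per probe, uses None for a probe that got no reply, and reports a plain average, whereas the real agent may aggregate differently.

```python
def summarize_probes(round_trip_times_ms):
    """Compute loss and average latency from a list of probe results.

    round_trip_times_ms holds one entry per probe sent; None means that
    no reply arrived for that probe.
    """
    sent = len(round_trip_times_ms)
    if sent == 0:
        raise ValueError("at least one probe is required")
    replies = [rtt for rtt in round_trip_times_ms if rtt is not None]
    loss_percent = 100.0 * (sent - len(replies)) / sent
    # With no replies at all, latency is unknown rather than zero.
    avg_latency_ms = sum(replies) / len(replies) if replies else None
    return loss_percent, avg_latency_ms
```

For four probes with results 10.0, 12.0, None, and 14.0 milliseconds, one of four probes was lost, so the loss is 25 percent, and the average latency of the three replies is 12.0 milliseconds.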

The format, size, and sequence of these packets are determined by the protocol that you choose when you create monitoring rules. Based on the protocol of the packets, the intermediate network devices, such as routers and switches, might process these packets differently. Consequently, your protocol choice affects the accuracy of the results. Your protocol choice also determines whether you must take any manual steps after you deploy the Network Performance Monitor solution.

Network Performance Monitor offers you the choice between ICMP and TCP protocols for executing synthetic transactions. If you choose ICMP when you create a synthetic transaction rule, the Network Performance Monitor agents use ICMP ECHO messages to calculate the network latency and packet loss. ICMP ECHO uses the same message that's sent by the conventional ping utility. When you use TCP as the protocol, Network Performance Monitor agents send TCP SYN packets over the network. This step is followed by a TCP handshake completion, and the connection is then torn down by using RST packets.
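
A rough way to reproduce the TCP variant from ordinary user code is to time a TCP connect and then close the socket with a reset. The sketch below is an approximation and not the agent's implementation: tcp_probe is a hypothetical helper, it works over IPv4 only, it relies on the operating system to perform the handshake, and the SO_LINGER structure layout shown is the one used on Linux and macOS.

```python
import socket
import struct
import time


def tcp_probe(host, port, timeout=2.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    # SO_LINGER enabled with a zero timeout makes close() abort the
    # connection with RST instead of the usual FIN exchange.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    start = time.perf_counter()
    try:
        # connect() returns once the SYN / SYN-ACK / ACK handshake completes.
        sock.connect((host, port))
        return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return None
    finally:
        sock.close()
```

Calling tcp_probe repeatedly against a listening port and collecting the results gives one round-trip sample per call, with None marking a probe that didn't complete.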

Consider the following information before you choose a protocol:

  • Discovery of multiple network routes. TCP is more accurate when discovering multiple routes, and it needs fewer agents in each subnet. For example, one or two agents that use TCP can discover all redundant paths between subnets. You need several agents that use ICMP to achieve similar results. Using ICMP, if there are N routes between two subnets, you need more than 5N agents in either the source or the destination subnet.

  • Accuracy of results. Routers and switches tend to assign lower priority to ICMP ECHO packets compared to TCP packets. In certain situations, when network devices are heavily loaded, the data obtained by TCP more closely reflects the loss and latency experienced by applications. This occurs because most of the application traffic flows over TCP. In such cases, ICMP provides less-accurate results compared to TCP.

  • Firewall configuration. The TCP protocol requires that TCP packets are sent to a destination port. The default port used by Network Performance Monitor agents is 8084. You can change the port when you configure agents. Make sure that your network firewalls or network security group (NSG) rules (in Azure) allow traffic on the port. You also need to make sure that the local firewall on the computers where agents are installed is configured to allow traffic on this port. You can use PowerShell scripts to configure firewall rules on your computers running Windows, but you need to configure your network firewall manually. In contrast, ICMP doesn't operate by using a port. In most enterprise scenarios, ICMP traffic is permitted through the firewalls to allow you to use network diagnostics tools like the ping utility. If you can ping one machine from another, you can use the ICMP protocol without having to configure firewalls manually.

Note

Some firewalls might block ICMP, which might lead to retransmission that results in a large number of events in your security information and event management system. Make sure the protocol that you choose isn't blocked by a network firewall or NSG. Otherwise, Network Performance Monitor can't monitor the network segment. We recommend that you use TCP for monitoring. Use ICMP in scenarios where you can't use TCP, such as when:

  • You use Windows client-based nodes, because TCP raw sockets aren't allowed in Windows clients.
  • Your network firewall or NSG blocks TCP.
  • You don't know how to switch the protocol.
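
The guidance above amounts to a short decision rule: prefer TCP, and fall back to ICMP only when TCP can't be used. The following sketch encodes that rule. The function name and its boolean parameters are hypothetical, and the check for blocked ICMP reflects the earlier statement that a blocked protocol can't monitor the segment.

```python
def choose_protocol(windows_client_nodes, tcp_blocked, icmp_blocked=False):
    """Pick the probe protocol for a monitoring rule.

    windows_client_nodes: True if any monitoring node runs a Windows client OS.
    tcp_blocked / icmp_blocked: True if a network firewall or NSG blocks it.
    """
    if not windows_client_nodes and not tcp_blocked:
        return "TCP"  # recommended whenever it is usable
    if icmp_blocked:
        raise ValueError("neither TCP nor ICMP can be used on this link")
    return "ICMP"
```

For example, choose_protocol(False, False) returns "TCP", while choose_protocol(True, False) returns "ICMP" because Windows client nodes can't send the TCP probes.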

If you chose to use ICMP during deployment, you can switch to TCP at any time by editing the default monitoring rule.

  1. Go to Network Performance > Monitor > Configure > Monitor. Then select Default rule.
  2. Scroll to the Protocol section, and select the protocol that you want to use.
  3. Select Save to apply the setting.

Even if the default rule uses a specific protocol, you can create new rules with a different protocol. You can even create a mix of rules where some rules use ICMP and others use TCP.

Walkthrough

Now look at a simple investigation into the root cause of a health event.

On the solution dashboard, a health event shows that a network link is unhealthy. To investigate the issue, select the Network links being monitored tile.

The drill-down page shows that the DMZ2-DMZ1 network link is unhealthy. Select View subnet links for this network link.

The drill-down page shows all the subnetwork links in the DMZ2-DMZ1 network link. For both subnetwork links, the latency crossed the threshold, which makes the network link unhealthy. You also can see the latency trends of both subnetwork links. Use the time selection control in the graph to focus on the required time range. You can see the time of day when latency reached its peak. Search the logs later for this time period to investigate the issue. Select View node links to drill down further.
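
If you export the latency trend as a series of samples, finding the time range to search in the logs is a small calculation. The sketch below assumes a hypothetical list of (timestamp, latency in milliseconds) pairs and an arbitrary 15-minute margin around the peak. Neither comes from the solution itself.

```python
from datetime import datetime, timedelta


def peak_window(samples, margin=timedelta(minutes=15)):
    """Return (start, end, peak_latency_ms) centered on the latency peak.

    samples is a non-empty sequence of (timestamp, latency_ms) pairs.
    """
    peak_time, peak_latency = max(samples, key=lambda sample: sample[1])
    return peak_time - margin, peak_time + margin, peak_latency


# Made-up latency samples for one subnetwork link.
samples = [
    (datetime(2024, 1, 1, 9, 0), 20.0),
    (datetime(2024, 1, 1, 10, 0), 180.0),
    (datetime(2024, 1, 1, 11, 0), 35.0),
]
start, end, peak = peak_window(samples)
print(start, end, peak)
```

With these samples the peak is 180.0 milliseconds at 10:00, so the window runs from 09:45 to 10:15.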

Figure: Subnetwork links page

Similar to the previous page, the drill-down page for the particular subnetwork link lists its constituent node links. You can perform similar actions here as you did in the previous step. Select View topology to view the topology between the two nodes.

Figure: Node links page

All the paths between the two selected nodes are plotted in the topology map. You can visualize the hop-by-hop topology of routes between two nodes on the topology map. It gives you a clear picture of how many routes exist between the two nodes and what paths the data packets take. Network performance bottlenecks are shown in red. To locate a faulty network connection or a faulty network device, look at the red elements on the topology map.

Figure: Topology dashboard with topology map

You can review the loss, latency, and the number of hops in each path in the Action pane. Use the scrollbar to view the details of the unhealthy paths. Use the filters to select the paths with the unhealthy hop so that the topology for only the selected paths is plotted. To zoom in or out of the topology map, use your mouse wheel.
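
The filter described above can be pictured as a selection over path records. This is a minimal sketch over a made-up data shape: the dictionary keys (loss_percent, latency_ms, hops, healthy, ip) and the addresses are hypothetical and don't reflect how the solution stores topology data.

```python
def paths_with_unhealthy_hops(paths):
    """Keep only the paths that contain at least one unhealthy hop."""
    return [
        path for path in paths
        if any(not hop["healthy"] for hop in path["hops"])
    ]


def summarize_path(path):
    """Return the per-path details: loss, latency, and number of hops."""
    return {
        "loss_percent": path["loss_percent"],
        "latency_ms": path["latency_ms"],
        "hop_count": len(path["hops"]),
    }


# Two hypothetical paths between the same pair of nodes.
paths = [
    {"loss_percent": 0.0, "latency_ms": 12.0,
     "hops": [{"ip": "10.0.0.1", "healthy": True},
              {"ip": "10.0.1.1", "healthy": True}]},
    {"loss_percent": 4.0, "latency_ms": 95.0,
     "hops": [{"ip": "10.0.0.1", "healthy": True},
              {"ip": "10.0.2.1", "healthy": False},
              {"ip": "10.0.3.1", "healthy": True}]},
]
for path in paths_with_unhealthy_hops(paths):
    print(summarize_path(path))
```

Only the second path is kept here, because it is the only one that passes through a hop marked unhealthy.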

In the following image, the red paths and hops narrow the root cause of the problem to a specific section of the network. Select a node in the topology map to reveal the properties of the node, including its FQDN and IP address. Selecting a hop shows the IP address of the hop.

Figure: Topology map with node properties selected

Next steps

Search logs to view detailed network performance data records.