Troubleshoot Linux update agent issues
There can be many reasons why your machine isn't showing up as ready (healthy) in Update Management. You can check the health of a Linux Hybrid Runbook Worker agent to determine the underlying problem. The following are the three readiness states for a machine:
- Ready: The Hybrid Runbook Worker is deployed and was last seen less than one hour ago.
- Disconnected: The Hybrid Runbook Worker is deployed and was last seen over one hour ago.
- Not configured: The Hybrid Runbook Worker isn't found or hasn't finished deployment.
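The three states above can be read as a simple classification on the time the worker was last seen. The following is a minimal sketch of that rule, not the portal's actual logic: the function name `readiness_state` and its arguments are hypothetical, and treating exactly one hour as Disconnected is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def readiness_state(last_seen: Optional[datetime], now: datetime) -> str:
    """Map a worker's last-seen time to one of the three readiness states.

    last_seen is None when the worker isn't found or hasn't finished
    deployment. Both datetimes must be naive, or both timezone-aware.
    """
    if last_seen is None:
        return "Not configured"
    if now - last_seen < timedelta(hours=1):
        return "Ready"
    # Assumption: a worker seen exactly one hour ago counts as Disconnected.
    return "Disconnected"
```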
Note
There can be a slight delay between what the Azure portal shows and the current state of a machine.
This article discusses how to run the troubleshooter for Azure machines from the Azure portal and for non-Azure machines in the offline scenario.
Note
The troubleshooter script currently doesn't route traffic through a proxy server if one is configured.
Start the troubleshooter
For Azure machines, select the troubleshoot link under the Update Agent Readiness column in the portal to open the Troubleshoot Update Agent page. For non-Azure machines, the link brings you to this article. To troubleshoot a non-Azure machine, see the instructions in the Troubleshoot offline section.
Note
The checks require the VM to be running. If the VM isn't running, a Start the VM option appears.
On the Troubleshoot Update Agent page, select Run Checks to start the troubleshooter. The troubleshooter uses Run command to run a script on the machine to verify the dependencies.
When the checks are finished, the results are returned in the window. The following sections describe what each check looks for.
Prerequisite checks
Operating system
The operating system check verifies if the Hybrid Runbook Worker is running one of the supported operating systems.
Dmidecode check
To verify that a VM is an Azure VM, check the asset tag value by using the following command:
sudo dmidecode
If the asset tag is different from 7783-7084-3265-9085-8269-3286-77, reboot the VM to initiate re-registration.
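The comparison above can be scripted by extracting the asset tag from the `dmidecode` text output. The following is a minimal sketch, assuming the output contains lines of the form `Asset Tag: <value>`. The helper names `asset_tags` and `looks_like_azure_vm` are hypothetical and aren't part of the troubleshooter.

```python
import re

# Asset tag value quoted in the text above.
AZURE_ASSET_TAG = "7783-7084-3265-9085-8269-3286-77"

# Assumption: dmidecode prints asset tags as "Asset Tag: <value>" lines.
ASSET_TAG_RE = re.compile(r"^[ \t]*Asset Tag:[ \t]*(.+)$", re.MULTILINE)


def asset_tags(dmidecode_output: str) -> list:
    """Return every asset tag value found in the dmidecode text output."""
    return [m.group(1).strip() for m in ASSET_TAG_RE.finditer(dmidecode_output)]


def looks_like_azure_vm(dmidecode_output: str) -> bool:
    """True if any reported asset tag matches the Azure value."""
    return AZURE_ASSET_TAG in asset_tags(dmidecode_output)
```

To use the sketch, capture the output of `sudo dmidecode` (for example, by redirecting it to a file) and pass the text to `looks_like_azure_vm`.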
Monitoring agent service health checks
Monitoring Agent
This check verifies that the following file is present:
/etc/opt/microsoft/omsagent/conf/omsadmin.conf
To fix this issue, install the Azure Log Analytics Linux agent and ensure that it communicates with the required endpoints. For more information, see Install Log Analytics agent on Linux computers.
Monitoring Agent status
To fix this issue, you must start the OMS Agent service by using the following command:
sudo /opt/microsoft/omsagent/bin/service_control restart
To validate, check for the process by using the following commands:
process_name="omsagent"
ps aux | grep "$process_name" | grep -v grep
For more information, see Troubleshoot issues with the Log Analytics agent for Linux.
Multihoming
This check determines if the agent is reporting to multiple workspaces. Update Management doesn't support multihoming.
To fix this issue, purge the OMS Agent completely and reinstall it with the workspace linked to Update Management.
To validate that the agent is no longer multihomed, check the directories under this path:
/var/opt/microsoft/omsagent
Each workspace has its own directory, so the number of workspace directories equals the number of workspaces onboarded to the OMS Agent.
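Counting the workspace directories can be scripted. The following is a minimal sketch that assumes each workspace directory is named with its workspace ID in GUID form and that any other directory under the path is not a workspace. The helper names are hypothetical.

```python
import os
import re

OMSAGENT_ROOT = "/var/opt/microsoft/omsagent"

# Assumption: workspace directories are named with the workspace ID (a GUID).
GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def workspace_dirs(names) -> list:
    """Keep only the directory names that look like workspace IDs."""
    return sorted(name for name in names if GUID_RE.match(name))


def is_multihomed(names) -> bool:
    """True if more than one workspace directory is present."""
    return len(workspace_dirs(names)) > 1


def list_omsagent_dirs(root: str = OMSAGENT_ROOT) -> list:
    """List the directory names directly under the agent's state path."""
    return [entry.name for entry in os.scandir(root) if entry.is_dir()]
```

On a machine with the agent installed, `is_multihomed(list_omsagent_dirs())` returning `False` suggests that the agent reports to at most one workspace.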
Hybrid Runbook Worker
To fix the issue, run the following command:
sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/PerformRequiredConfigurationChecks.py'
This command forces the omsconfig agent to talk to Azure Monitor and retrieve the latest configuration.
Validate that the following two paths exist:
/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/VERSION
/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/MSFT_nxOMSAutomationWorkerResource/automationworker/worker/configuration.py
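The path validation above can be sketched as a short existence check. The function name `missing_paths` is hypothetical. The `exists` parameter is only there so the check can be exercised without the agent installed.

```python
import os

# The two paths listed in the text above.
REQUIRED_PATHS = (
    "/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/VERSION",
    "/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/"
    "MSFT_nxOMSAutomationWorkerResource/automationworker/worker/configuration.py",
)


def missing_paths(paths=REQUIRED_PATHS, exists=os.path.exists) -> list:
    """Return the required paths that are not present on this machine."""
    return [path for path in paths if not exists(path)]
```

An empty list from `missing_paths()` means both paths are present.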
Hybrid Runbook Worker status
This check makes sure the Hybrid Runbook Worker is running on the machine. The processes in the example below should be present if the Hybrid Runbook Worker is running correctly.
ps -ef | grep python
nxautom+ 8567 1 0 14:45 ? 00:00:00 python /opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/MSFT_nxOMSAutomationWorkerResource/automationworker/worker/main.py /var/opt/microsoft/omsagent/state/automationworker/oms.conf rworkspace:<workspaceId> <Linux hybrid worker version>
nxautom+ 8593 1 0 14:45 ? 00:00:02 python /opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/MSFT_nxOMSAutomationWorkerResource/automationworker/worker/hybridworker.py /var/opt/microsoft/omsagent/state/automationworker/worker.conf managed rworkspace:<workspaceId> rversion:<Linux hybrid worker version>
nxautom+ 8595 1 0 14:45 ? 00:00:02 python /opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/DSCResources/MSFT_nxOMSAutomationWorkerResource/automationworker/worker/hybridworker.py /var/opt/microsoft/omsagent/<workspaceId>/state/automationworker/diy/worker.conf managed rworkspace:<workspaceId> rversion:<Linux hybrid worker version>
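A process listing like the one above can be checked programmatically. The following is a minimal sketch that looks for the two worker scripts shown in the example (`main.py` and `hybridworker.py`) in captured `ps -ef` output. The function name `missing_worker_scripts` is hypothetical, and matching on the path substring is an assumption about how the processes appear.

```python
# Module directory shown in the example process listing above.
WORKER_DIR = "/opt/microsoft/omsconfig/modules/nxOMSAutomationWorker/"
EXPECTED_SCRIPTS = ("main.py", "hybridworker.py")


def missing_worker_scripts(ps_output: str) -> list:
    """Return the expected worker scripts that no listed process is running."""
    worker_lines = [line for line in ps_output.splitlines() if WORKER_DIR in line]
    return [
        script
        for script in EXPECTED_SCRIPTS
        if not any("/worker/" + script in line for line in worker_lines)
    ]
```

Pass the text captured from `ps -ef` to the function. An empty result means both scripts were found in the listing.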
Update Management downloads Hybrid Runbook Worker packages from the operations endpoint. Therefore, if the Hybrid Runbook Worker is not running and the operations endpoint check fails, the update can fail.
To fix this issue, run the following command:
sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/PerformRequiredConfigurationChecks.py'
This command forces the omsconfig agent to talk to Azure Monitor and retrieve the latest configuration.
If the issue persists, run the omsagent Log Collector tool.
Connectivity checks
Proxy enabled check
To fix the issue, either remove the proxy or make sure that the proxy address is able to access the prerequisite URL.
You can validate the proxy setting by printing the HTTP_PROXY environment variable with the following command:
echo $HTTP_PROXY
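The same inspection can be done from Python. The following is a minimal sketch. Only `HTTP_PROXY` is named in this article, so the other variable names checked here are an assumption based on common Linux conventions, and the function name `configured_proxies` is hypothetical.

```python
import os

# HTTP_PROXY comes from the text above; the other names are assumed conventions.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy")


def configured_proxies(environ=os.environ) -> dict:
    """Return the proxy variables that are set to a non-empty value."""
    return {name: environ[name] for name in PROXY_VARS if environ.get(name)}
```

An empty dictionary means that none of these variables is set in the current environment.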
IMDS connectivity check
To fix this issue, allow access to IP 169.254.169.254. For more information, see Access Azure Instance Metadata Service.
After the network changes, you can either rerun the troubleshooter or run the following command to validate:
curl -H "Metadata: true" "http://169.254.169.254/metadata/instance?api-version=2018-02-01"
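The same request can be sent from Python with the standard library. The following is a minimal sketch of the curl call above, not the troubleshooter's own check. The function names are hypothetical, and bypassing any configured proxy is an assumption based on the metadata service being a local, non-routable address.

```python
import json
import urllib.request

# URL and header taken from the curl command above.
IMDS_URL = "http://169.254.169.254/metadata/instance?api-version=2018-02-01"


def build_imds_request(url: str = IMDS_URL) -> urllib.request.Request:
    """Build the metadata request with the required Metadata header."""
    return urllib.request.Request(url, headers={"Metadata": "true"})


def imds_reachable(timeout: float = 5.0) -> bool:
    """True if the metadata service answers with HTTP 200 and valid JSON."""
    # Assumption: the request should not go through a proxy, so use an
    # opener with an empty proxy map.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(build_imds_request(), timeout=timeout) as response:
            json.load(response)
            return response.status == 200
    except (OSError, ValueError):
        # URLError and socket timeouts are OSError; invalid JSON is ValueError.
        return False
```

Calling `imds_reachable()` on an Azure VM with access to 169.254.169.254 should return `True`. On a non-Azure machine it is expected to return `False` after the timeout.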
General internet connectivity
This check makes sure that the machine has access to the internet. You can ignore it if you have blocked internet access and allowed only specific URLs.
To validate, run curl against any HTTP URL.
Registration endpoint
This check determines if the Hybrid Runbook Worker can properly communicate with Azure Automation in the Log Analytics workspace.
Proxy and firewall configurations must allow the Hybrid Runbook Worker agent to communicate with the registration endpoint. For a list of addresses and ports to open, see Network planning.
Fix this issue by allowing the prerequisite URLs. For more information, see Update Management.
After the network changes, you can either rerun the troubleshooter or run curl against the provided JRDS endpoint.
Operations endpoint
This check determines if the Log Analytics agent can properly communicate with the Job Runtime Data Service.
Proxy and firewall configurations must allow the Hybrid Runbook Worker agent to communicate with the Job Runtime Data Service. For a list of addresses and ports to open, see Network planning.
Log Analytics endpoint 1
This check verifies that your machine has access to the endpoints needed by the Log Analytics agent.
Fix this issue by allowing the prerequisite URLs.
After the network changes, you can either rerun the troubleshooter or run curl against the provided ODS endpoint.
Log Analytics endpoint 2
This check verifies that your machine has access to the endpoints needed by the Log Analytics agent.
Fix this issue by allowing the prerequisite URLs.
After the network changes, you can either rerun the troubleshooter or run curl against the provided OMS endpoint.
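As an alternative to curl, the endpoint checks above can be approximated with a plain TCP connection test on port 443, which is what the offline script's sample output reports. The following is a minimal sketch. The function names are hypothetical, and the wording of the failure line is an assumption because the sample output shows only passing results.

```python
import socket


def tcp_test(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def format_result(host: str, ok: bool, port: int = 443) -> str:
    """Format a result line in the style of the troubleshooter output."""
    if ok:
        return f"Passed: TCP test for {{{host}}} (port {port}) succeeded"
    # Assumption: the failure wording is illustrative only.
    return f"Failed: TCP test for {{{host}}} (port {port}) failed"
```

A connection test confirms only that the port is reachable. It doesn't validate the TLS handshake or the HTTP response, so a curl request against the endpoint remains the more complete check.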
Software repositories
Fix this issue by allowing the prerequisite repository URLs.
After the network changes, you can either rerun the troubleshooter or run curl against the software repositories configured in the package manager.
You can also confirm communication with the repositories by running the command for your package manager:
sudo apt-get check
sudo yum check-update
Note
This check is available only in offline mode.
Troubleshoot offline
You can use the troubleshooter offline on a Hybrid Runbook Worker by running the script locally. The Python script, UM_Linux_Troubleshooter_Offline.py, can be found in GitHub.
Note
The current version of the troubleshooter script does not support Ubuntu 20.04.
The following example shows the output of this script:
Debug: Machine Information: Static hostname: LinuxVM2
Icon name: computer-vm
Chassis: vm
Machine ID: 00000000000000000000000000000000
Boot ID: 00000000000000000000000000000000
Virtualization: microsoft
Operating System: Ubuntu 16.04.5 LTS
Kernel: Linux 4.15.0-1025-azure
Architecture: x86-64
Passed: Operating system version is supported
Passed: Microsoft Monitoring agent is installed
Debug: omsadmin.conf file contents:
WORKSPACE_ID=00000000-0000-0000-0000-000000000000
AGENT_GUID=00000000-0000-0000-0000-000000000000
LOG_FACILITY=local0
CERTIFICATE_UPDATE_ENDPOINT=https://00000000-0000-0000-0000-000000000000.oms.opinsights.azure.cn/ConfigurationService.Svc/RenewCertificate
URL_TLD=opinsights.azure.cn
DSC_ENDPOINT=https://scus-agentservice-prod-1.azure-automation.cn/Accounts/00000000-0000-0000-0000-000000000000/Nodes\(AgentId='00000000-0000-0000-0000-000000000000'\)
OMS_ENDPOINT=https://00000000-0000-0000-0000-000000000000.ods.opinsights.azure.cn/OperationalData.svc/PostJsonDataItems
AZURE_RESOURCE_ID=/subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/myresourcegroup/providers/microsoft.compute/virtualmachines/linuxvm2
OMSCLOUD_ID=0000-0000-0000-0000-0000-0000-00
UUID=00000000-0000-0000-0000-000000000000
Passed: Microsoft Monitoring agent is running
Passed: Machine registered with log analytics workspace:['00000000-0000-0000-0000-000000000000']
Passed: Hybrid worker package is present
Passed: Hybrid worker is running
Passed: Machine is connected to internet
Passed: TCP test for {scus-agentservice-prod-1.azure-automation.cn} (port 443) succeeded
Passed: TCP test for {eus2-jobruntimedata-prod-su1.azure-automation.cn} (port 443) succeeded
Passed: TCP test for {00000000-0000-0000-0000-000000000000.ods.opinsights.azure.cn} (port 443) succeeded
Passed: TCP test for {00000000-0000-0000-0000-000000000000.oms.opinsights.azure.cn} (port 443) succeeded
Passed: TCP test for {ods.systemcenteradvisor.com} (port 443) succeeded
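The hostnames tested in the output above come from the endpoint URLs in omsadmin.conf. The following is a minimal sketch of how they could be extracted from the file contents, assuming the `KEY=value` layout shown in the debug output. The function names are hypothetical, and this isn't necessarily how the troubleshooter script reads the file.

```python
from urllib.parse import urlsplit

# Endpoint keys that appear in the omsadmin.conf contents shown above.
ENDPOINT_KEYS = ("DSC_ENDPOINT", "OMS_ENDPOINT", "CERTIFICATE_UPDATE_ENDPOINT")


def parse_omsadmin(text: str) -> dict:
    """Parse KEY=value lines; the value keeps any later '=' characters."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            settings[key] = value
    return settings


def endpoint_hosts(settings: dict) -> list:
    """Return the hostname of each endpoint URL present in the settings."""
    hosts = []
    for key in ENDPOINT_KEYS:
        url = settings.get(key)
        if url:
            hosts.append(urlsplit(url).hostname)
    return hosts
```

For example, reading /etc/opt/microsoft/omsagent/conf/omsadmin.conf and passing its text through `parse_omsadmin` and then `endpoint_hosts` yields the hostnames to test on port 443.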