This article helps you troubleshoot errors you might experience with the Log Analytics agent for Linux in Azure Monitor.
Caution
This article references CentOS, a Linux distribution that reached end of support status. Consider your use and planning accordingly. For more information, see the CentOS End Of Life guidance.
Log Analytics Troubleshooting Tool
The Log Analytics agent for Linux Troubleshooting Tool is a script that helps find and diagnose problems with the Log Analytics agent. The agent installation automatically includes the tool. Run the tool as the first step in diagnosing an issue.
Use the Troubleshooting Tool
To run the Troubleshooting Tool, paste the following command into a terminal window on a machine with the Log Analytics agent:
sudo /opt/microsoft/omsagent/bin/troubleshooter
Manual installation
The installation of the Log Analytics agent automatically includes the Troubleshooting Tool. If the installation fails, you can also install the tool manually:
- Ensure that the GNU Project Debugger (GDB) is installed on the machine because the troubleshooter relies on it.
- Copy the troubleshooter bundle onto your machine:
wget https://raw.github.com/microsoft/OMS-Agent-for-Linux/master/source/code/troubleshooter/omsagent_tst.tar.gz
- Unpack the bundle:
tar -xzvf omsagent_tst.tar.gz
- Run the manual installation:
sudo ./install_tst
Scenarios covered
The Troubleshooting Tool checks the following scenarios:
- The agent is unhealthy; the heartbeat doesn't work properly.
- The agent doesn't start or can't connect to Log Analytics.
- The agent Syslog isn't working.
- The agent has high CPU or memory usage.
- The agent has installation problems.
- The agent custom logs aren't working.
- Agent logs can't be collected.
For more information, see the Troubleshooting Tool documentation on GitHub.
Note
Run the Log Collector tool when you experience an issue. Having the logs initially helps the support team troubleshoot your problem faster.
Purge and reinstall the Linux agent
A clean reinstall of the agent fixes most problems. This task might be the first suggestion from the support team to get the agent into an uncorrupted state. Running the Troubleshooting Tool and Log Collector tool and attempting a clean reinstall helps to solve problems more quickly.
Download the purge script:
$ wget https://raw.githubusercontent.com/microsoft/OMS-Agent-for-Linux/master/tools/purge_omsagent.sh
Run the purge script (with sudo permissions):
$ sudo sh purge_omsagent.sh
Important log locations and the Log Collector tool
| File | Path |
|---|---|
| Log Analytics agent for Linux log file | /var/opt/microsoft/omsagent/<workspace id>/log/omsagent.log |
| Log Analytics agent configuration log file | /var/opt/microsoft/omsconfig/omsconfig.log |
Use the Log Collector tool to get important logs for troubleshooting or before you submit a GitHub issue. For more information about the tool and how to run it, see OMS Linux Agent Log Collector.
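Before collecting a full log bundle, it can help to skim the agent log for recent problems. The following is a minimal sketch, not part of the agent tooling: the function name `recent_agent_errors` is hypothetical, and it assumes that log lines carry a bracketed level such as `[warn]` or `[error]`, as in the messages quoted elsewhere in this article.

```shell
#!/bin/sh
# Print the most recent warning and error lines from an agent log file.
# Assumption: levels appear in the log as "[warn]" or "[error]".
# $1 = path to the log file, $2 = maximum number of lines (default 20).
recent_agent_errors() {
    log_file=$1
    max_lines=${2:-20}
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Keep only warn/error lines, then show the newest ones.
    grep -E '\[(warn|error)\]' "$log_file" | tail -n "$max_lines"
}
```

For example, after loading the function in a root shell, `recent_agent_errors "/var/opt/microsoft/omsagent/<workspace id>/log/omsagent.log" 50` would show up to 50 recent warning and error lines from the agent log listed in the table.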
Important configuration files
| Category | File location |
|---|---|
| Syslog | /etc/syslog-ng/syslog-ng.conf, /etc/rsyslog.conf, or /etc/rsyslog.d/95-omsagent.conf |
| Performance, Nagios, Zabbix, Log Analytics output, and general agent | /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf |
| Extra configurations | /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.d/*.conf |
Note
If you configure the collection from the agent's configuration in the Azure portal for your workspace, it overwrites any edits to configuration files for performance counters and Syslog. To disable configuration for all agents, disable collection from Legacy agents management. For a single agent, run the following script:
sudo /opt/microsoft/omsconfig/Scripts/OMS_MetaConfigHelper.py --disable && sudo rm /etc/opt/omi/conf/omsconfig/configuration/Current.mof* /etc/opt/omi/conf/omsconfig/configuration/Pending.mof*
Installation error codes
| Error code | Meaning |
|---|---|
| NOT_DEFINED | The installer can't install the auoms auditd plug-in because the necessary dependencies aren't installed. Installation of auoms failed. Install package auditd. |
| 2 | Invalid option provided to the shell bundle. Run sudo sh ./omsagent-*.universal*.sh --help for usage. |
| 3 | No option provided to the shell bundle. Run sudo sh ./omsagent-*.universal*.sh --help for usage. |
| 4 | Invalid package type or invalid proxy settings. The omsagent-rpm.sh packages can only be installed on RPM-based systems. The omsagent-deb.sh packages can only be installed on Debian-based systems. Use the universal installer from the latest release. Also verify your proxy settings. |
| 5 | The shell bundle must be executed as root or there was a 403 error returned during onboarding. Run your command by using sudo. |
| 6 | Invalid package architecture or a non-200 HTTP error was returned during onboarding. The omsagent-*x64.sh packages can only be installed on 64-bit systems. The omsagent-*x86.sh packages can only be installed on 32-bit systems. Download the correct package for your architecture from the latest release. |
| 17 | Installation of OMS package failed. Look through the command output for the root failure. |
| 18 | Installation of OMSConfig package failed. Look through the command output for the root failure. |
| 19 | Installation of OMI package failed. Look through the command output for the root failure. |
| 20 | Installation of SCX package failed. Look through the command output for the root failure. |
| 21 | Installation of Provider kits failed. Look through the command output for the root failure. |
| 22 | Installation of bundled package failed. Look through the command output for the root failure. |
| 23 | SCX or OMI package already installed. Use --upgrade instead of --install to install the shell bundle. |
| 30 | Internal bundle error. File a GitHub issue with details from the output. |
| 55 | Unsupported openssl version or can't connect to Azure Monitor or dpkg is locked or missing curl program. |
| 61 | Missing Python ctypes library. Install the Python ctypes library or package (python-ctypes). |
| 62 | Missing tar program. Install tar. |
| 63 | Missing sed program. Install sed. |
| 64 | Missing curl program. Install curl. |
| 65 | Missing gpg program. Install gpg. |
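Error codes 62 through 65 in the table above each point to a missing program. As a rough preflight sketch, you could check for those programs before running the shell bundle. The function name `missing_tools` and the variable `REQUIRED_TOOLS` are hypothetical, and the check only confirms that a program is on `PATH`, not that its version is supported.

```shell
#!/bin/sh
# List which of the given programs are not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Programs that the installer reports as missing through codes 62 to 65.
REQUIRED_TOOLS="tar sed curl gpg"
```

Running `missing_tools $REQUIRED_TOOLS` prints nothing when all four programs are present, and otherwise prints the names to install.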
Onboarding error codes
| Error code | Meaning |
|---|---|
| 2 | Invalid option provided to the omsadmin script. Run sudo sh /opt/microsoft/omsagent/bin/omsadmin.sh -h for usage. |
| 3 | Invalid configuration provided to the omsadmin script. Run sudo sh /opt/microsoft/omsagent/bin/omsadmin.sh -h for usage. |
| 4 | Invalid proxy provided to the omsadmin script. Verify the proxy and see the documentation for using an HTTP proxy. |
| 5 | 403 HTTP error received from Azure Monitor. See the full output of the omsadmin script for details. |
| 6 | Non-200 HTTP error received from Azure Monitor. See the full output of the omsadmin script for details. |
| 7 | Unable to connect to Azure Monitor. See the full output of the omsadmin script for details. |
| 8 | Error onboarding to Log Analytics workspace. See the full output of the omsadmin script for details. |
| 30 | Internal script error. File a GitHub issue with details from the output. |
| 31 | Error generating agent ID. File a GitHub issue with details from the output. |
| 32 | Error generating certificates. See the full output of the omsadmin script for details. |
| 33 | Error generating metaconfiguration for omsconfig. File a GitHub issue with details from the output. |
| 34 | Metaconfiguration generation script not present. Retry onboarding with sudo sh /opt/microsoft/omsagent/bin/omsadmin.sh -w <Workspace ID> -s <Workspace Key>. |
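When onboarding is scripted, a small lookup can turn a numeric code into the meaning from the table above. This is an illustrative sketch: the function name `explain_onboarding_code` is hypothetical, and it assumes the exit status of the omsadmin script is the error code shown in the table.

```shell
#!/bin/sh
# Map an onboarding error code to its meaning, as listed in the table above.
# Returns 1 for codes that the table doesn't list.
explain_onboarding_code() {
    case "$1" in
        2)  echo "Invalid option provided to the omsadmin script" ;;
        3)  echo "Invalid configuration provided to the omsadmin script" ;;
        4)  echo "Invalid proxy provided to the omsadmin script" ;;
        5)  echo "403 HTTP error received from Azure Monitor" ;;
        6)  echo "Non-200 HTTP error received from Azure Monitor" ;;
        7)  echo "Unable to connect to Azure Monitor" ;;
        8)  echo "Error onboarding to Log Analytics workspace" ;;
        30) echo "Internal script error" ;;
        31) echo "Error generating agent ID" ;;
        32) echo "Error generating certificates" ;;
        33) echo "Error generating metaconfiguration for omsconfig" ;;
        34) echo "Metaconfiguration generation script not present" ;;
        *)  echo "Code $1 is not listed in the onboarding table"; return 1 ;;
    esac
}
```

A wrapper script could run the onboarding command and then call `explain_onboarding_code "$?"` to log a readable reason next to the full omsadmin output.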
Enable debug logging
OMS output plug-in debug
FluentD supports plug-in-specific logging levels, so you can set different log levels for inputs and outputs. To set a different log level for OMS output, update the general agent configuration at /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf.
In the OMS output plug-in, change the log_level property from info to debug before the end of the configuration file:
<match oms.** docker.**>
type out_oms
log_level debug
num_threads 5
buffer_chunk_limit 5m
buffer_type file
buffer_path /var/opt/microsoft/omsagent/<workspace id>/state/out_oms*.buffer
buffer_queue_limit 10
flush_interval 20s
retry_limit 10
retry_wait 30s
</match>
Debug logging shows you batched uploads to Azure Monitor, separated by type, number of data items, and time taken to send.
Here's an example debug-enabled log:
Success sending oms.nagios x 1 in 0.14s
Success sending oms.omi x 4 in 0.52s
Success sending oms.syslog.authpriv.info x 1 in 0.91s
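To see which data types dominate the uploads, the debug lines can be totaled per type. The sketch below assumes the message text has exactly the shape shown in the example (`Success sending <type> x <count> in <seconds>s`), possibly preceded by a timestamp or level prefix. The function name `summarize_uploads` is hypothetical.

```shell
#!/bin/sh
# Read agent log lines on stdin and print, per data type:
#   <type> <total items> <total seconds>
# Lines that don't contain a "Success sending" message are ignored.
summarize_uploads() {
    # Extract "<type> <count> <seconds>" from each matching line.
    sed -n 's/.*Success sending \([^ ]*\) x \([0-9][0-9]*\) in \([0-9.][0-9.]*\)s.*/\1 \2 \3/p' |
        # Sum item counts and send times per type.
        awk '{ items[$1] += $2; secs[$1] += $3 }
             END { for (tag in items) printf "%s %d %.2f\n", tag, items[tag], secs[tag] }' |
        LC_ALL=C sort
}
```

For example, `summarize_uploads < omsagent.log` run against a copy of the agent log prints one summary line per data type, sorted by type name.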
Verbose output
Instead of using the OMS output plug-in, you can send data items directly to stdout. You can see this output in the Log Analytics agent for Linux log file.
In the Log Analytics general agent configuration file at /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf, comment out the OMS output plug-in by adding a # in front of each line:
#<match oms.** docker.**>
# type out_oms
# log_level info
# num_threads 5
# buffer_chunk_limit 5m
# buffer_type file
# buffer_path /var/opt/microsoft/omsagent/<workspace id>/state/out_oms*.buffer
# buffer_queue_limit 10
# flush_interval 20s
# retry_limit 10
# retry_wait 30s
#</match>
Below the output plug-in, uncomment the following section by removing the # in front of each line:
<match **>
type stdout
</match>
Issue: Unable to connect through proxy to Azure Monitor
Probable causes
- You specified an incorrect proxy during onboarding.
- Your datacenter's approved list doesn't include the Azure Monitor and Azure Automation service endpoints.
Resolution
Reonboard to Azure Monitor by using the Log Analytics agent for Linux. Use the following command with the -v option enabled. This option provides verbose output of the agent connecting through the proxy to Azure Monitor:
/opt/microsoft/omsagent/bin/omsadmin.sh -w <Workspace ID> -s <Workspace Key> -p <Proxy Conf> -v
Review the section Update proxy settings to verify you properly configured the agent to communicate through a proxy server.
Double-check that the endpoints outlined in the Azure Monitor network firewall requirements list are added to an allow list correctly. If you use Azure Automation, the necessary network configuration steps are also linked previously.
Issue: You receive a 403 error when trying to onboard
Probable causes
- Date and time are incorrect on the Linux server.
- The workspace ID and workspace key aren't correct.
Resolution
- Check the time on your Linux server by using the date command. If the time is more than 15 minutes different from the current time, onboarding fails. To correct this problem, update the date and time or the time zone of your Linux server.
- Verify that you installed the latest version of the Log Analytics agent for Linux. The newest version now notifies you if time skew causes the onboarding failure.
- Reonboard by using the correct workspace ID and workspace key in the installation instructions earlier in this article.
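The 15-minute rule can be expressed as a simple comparison of two epoch timestamps. The following is a sketch only: the function name `skew_exceeds_limit` and the variable `MAX_SKEW_SECONDS` are hypothetical, and how you obtain a trustworthy reference time is left open because this article doesn't specify one.

```shell
#!/bin/sh
# 15 minutes, the skew threshold stated in this article.
MAX_SKEW_SECONDS=900

# Succeed (exit status 0) when two epoch timestamps differ by more than
# MAX_SKEW_SECONDS. $1 = local time, $2 = reference time, both in seconds.
skew_exceeds_limit() {
    local_epoch=$1
    reference_epoch=$2
    diff=$((local_epoch - reference_epoch))
    # Use the absolute difference so that fast and slow clocks are both caught.
    [ "$diff" -lt 0 ] && diff=$((-diff))
    [ "$diff" -gt "$MAX_SKEW_SECONDS" ]
}
```

With a reference timestamp in a variable such as `reference_epoch`, a check like `skew_exceeds_limit "$(date +%s)" "$reference_epoch" && echo "clock skew too large"` flags a server whose clock would cause onboarding to fail.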
Issue: You see a 500 and 404 error in the log file right after onboarding
This error is a known problem that occurs on the first upload of Linux data into a Log Analytics workspace. This problem doesn't affect data being sent or service experience.
Issue: You see omiagent using 100% CPU
Probable causes
A regression in the nss-pem package v1.0.3-5.el7 causes a severe performance problem. This problem frequently appears in Red Hat and CentOS 7.x distributions. For more information about this problem, see 1667121 Performance regression in libcurl.
Performance-related bugs don't always happen and they're difficult to reproduce. If you experience such a problem with omiagent, use the script omiHighCPUDiagnostics.sh. The script collects the stack trace of the omiagent when it exceeds a certain threshold.
Download the script:
wget https://raw.githubusercontent.com/microsoft/OMS-Agent-for-Linux/master/tools/LogCollector/source/omiHighCPUDiagnostics.sh
Run diagnostics for 24 hours with 30% CPU threshold:
bash omiHighCPUDiagnostics.sh --runtime-in-min 1440 --cpu-threshold 30
The call stack is dumped in the omiagent_trace file. If you notice many curl and NSS function calls, follow these resolution steps.
Resolution
Upgrade the nss-pem package to v1.0.3-5.el7_6.1:
sudo yum upgrade nss-pem
If nss-pem isn't available for upgrade, which mostly happens on CentOS, downgrade curl to 7.29.0-46. If you run "yum update" by mistake, curl is upgraded to 7.29.0-51 and the problem happens again:
sudo yum downgrade curl libcurl
Restart OMI:
sudo scxadmin -restart
Problem: You don't see forwarded Syslog messages
Probable causes
- The configuration you applied to the Linux server doesn't allow collection of the sent facilities or log levels.
- Syslog isn't forwarding correctly to the Linux server.
- The number of messages being forwarded per second is too great for the base configuration of the Log Analytics agent for Linux to handle.
Resolution
- Verify the configuration in the Log Analytics workspace for Syslog has all the facilities and the correct log levels. Review configure Syslog collection in the Azure portal.
- Verify the native Syslog messaging daemons (rsyslog, syslog-ng) can receive the forwarded messages.
- To ensure that messages aren't being blocked, check firewall settings on the Syslog server.
- Simulate a Syslog message to Log Analytics by using a logger command:
logger -p local0.err "This is my test message"
Issue: You're receiving Errno address already in use in omsagent log file
You see [error]: unexpected error error_class=Errno::EADDRINUSE error=#<Errno::EADDRINUSE: Address already in use - bind(2) for "127.0.0.1" port 25224> in omsagent.log.
Probable causes
This error indicates that the Linux diagnostic extension (LAD) is installed side by side with the Log Analytics Linux VM extension. They're both using the same port for Syslog data collection as omsagent.
Resolution
As root, run the following commands. Port number 25224 is an example, and you might see a different port number used by LAD in your environment.
/opt/microsoft/omsagent/bin/configure_syslog.sh configure LAD 25229
sed -i -e 's/25224/25229/' /etc/opt/microsoft/omsagent/LAD/conf/omsagent.d/syslog.conf
You then need to edit the correct rsyslogd or syslog_ng config file and change the LAD-related configuration to write to port 25229.
If the VM is running rsyslogd, modify /etc/rsyslog.d/95-omsagent.conf (if it exists) or /etc/rsyslog. If the VM is running syslog_ng, modify /etc/syslog-ng/syslog-ng.conf.
Restart omsagent by running sudo /opt/microsoft/omsagent/bin/service_control restart.
Restart the Syslog service.
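After you edit the rsyslog configuration, you might want to confirm which local ports it still forwards to. The sketch below assumes that forwarding targets are written as `127.0.0.1:<port>` (for example `@127.0.0.1:25224`). It doesn't understand syslog-ng syntax or other address forms, and the function name `forwarded_local_ports` is hypothetical.

```shell
#!/bin/sh
# Print each distinct local port that a Syslog daemon config forwards to.
# Assumption: targets look like 127.0.0.1:<port>; commented lines are skipped.
# $1 = path to the configuration file.
forwarded_local_ports() {
    grep -v '^[[:space:]]*#' "$1" |
        grep -oE '127\.0\.0\.1:[0-9]+' |
        cut -d: -f2 |
        sort -un
}
```

Running `forwarded_local_ports /etc/rsyslog.d/95-omsagent.conf` after the change should list 25229 for the LAD-related rules if the edit took effect.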
Issue: You can't uninstall omsagent by using the purge option
Probable causes
- The Linux diagnostic extension is installed.
- The Linux diagnostic extension was installed and uninstalled. However, you still see an error about omsagent being used by mdsd and it can't be removed.
Resolution
- Uninstall the Linux diagnostic extension.
- Remove Linux diagnostic extension files from the machine if they're present in the following locations: /var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic-<version>/ and /var/opt/microsoft/omsagent/LAD/.
Issue: You can't see any Nagios data
Probable causes
- The omsagent user doesn't have permissions to read from the Nagios log file.
- The Nagios source and filter aren't uncommented in the omsagent.conf file.
Resolution
Allow the omsagent user to read the Nagios file by following these instructions.
In the Log Analytics agent for Linux general configuration file at /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf, ensure that both the Nagios source and filter are uncommented.
<source>
type tail
path /var/log/nagios/nagios.log
format none
tag oms.nagios
</source>
<filter oms.nagios>
type filter_nagios_log
</filter>
Problem: You don't see any Linux data
Probable causes
- Onboarding to Azure Monitor failed.
- Connection to Azure Monitor is blocked.
- Virtual machine was rebooted.
- OMI package was manually upgraded to a newer version compared to what version the Log Analytics agent for Linux package installed.
- OMI is frozen, blocking the OMS agent.
- DSC resource logs class not found error in omsconfig.log log file.
- Log Analytics agent data is backed up.
- DSC logs Current configuration doesn't exist. Execute Start-DscConfiguration command with -Path parameter to specify a configuration file and create a current configuration first. in omsconfig.log log file, but no log message exists about PerformRequiredConfigurationChecks operations.
Resolution
Install all dependencies, such as the auditd package.
Check if onboarding to Azure Monitor was successful by checking if the following file exists: /etc/opt/microsoft/omsagent/<workspace id>/conf/omsadmin.conf. If it wasn't, reonboard by using the omsadmin.sh command-line instructions.
If you're using a proxy, check the preceding proxy troubleshooting steps.
In some Azure distribution systems, the omid OMI server daemon doesn't start after the virtual machine is rebooted. If this condition is true, you don't see Audit, ChangeTracking, or UpdateManagement solution-related data. The workaround is to manually start the OMI server by running sudo /opt/omi/bin/service_control restart.
After you manually upgrade the OMI package to a newer version, you must manually restart it for the Log Analytics agent to continue functioning. This step is required for some distros where the OMI server doesn't automatically start after upgrade. To restart the OMI, run sudo /opt/omi/bin/service_control restart.
In some situations, the OMI can become frozen. The OMS agent might enter a blocked state waiting for the OMI, which blocks all data collection. The OMS agent process is running but there's no activity. There are no new log lines (such as sent heartbeats) present in omsagent.log. Restart the OMI by using sudo /opt/omi/bin/service_control restart to recover the agent.
If you see a DSC resource class not found error in omsconfig.log, run sudo /opt/omi/bin/service_control restart.
In some cases, when the Log Analytics agent for Linux can't talk to Azure Monitor, the agent backs up data to the full buffer size of 50 MB. Restart the agent by running the following command: /opt/microsoft/omsagent/bin/service_control restart.
Note
This issue is fixed in agent version 1.1.0-28 or later.
If the omsconfig.log log file doesn't indicate that PerformRequiredConfigurationChecks operations are running periodically on the system, there might be a problem with the cron job or service. Make sure the cron job exists under /etc/cron.d/OMSConsistencyInvoker. If needed, run the following commands to create the cron job:
mkdir -p /etc/cron.d/
echo "*/15 * * * * omsagent /opt/omi/bin/OMSConsistencyInvoker >/dev/null 2>&1" | sudo tee /etc/cron.d/OMSConsistencyInvoker
Also, make sure the cron service is running. You can use service cron status with Debian, Ubuntu, and SUSE or service crond status with RHEL, CentOS, and Oracle Linux to check the status of this service. If the service doesn't exist, you can install the binaries and start the service by using the following instructions:
Ubuntu/Debian
# To install the service binaries
sudo apt-get install -y cron
# To start the service
sudo service cron start
SUSE
# To install the service binaries
sudo zypper in cron -y
# To start the service
sudo systemctl enable cron
sudo systemctl start cron
RHEL/CentOS
# To install the service binaries (the cronie package provides the crond service)
sudo yum install -y cronie
# To start the service
sudo service crond start
Oracle Linux
# To install the service binaries
sudo yum install -y cronie
# To start the service
sudo service crond start
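The frozen-OMI symptom described in this section (no new lines in omsagent.log while the agent process is still running) can be approximated by checking how long ago the log file was last modified. This is a heuristic sketch rather than an official check: the function name `log_is_stale` is hypothetical, any threshold you pass is your own choice, and a stale log is consistent with a frozen OMI but doesn't prove it.

```shell
#!/bin/sh
# Succeed (exit status 0) when the file was last modified more than the given
# number of minutes ago. $1 = path to the log file, $2 = age limit in minutes.
log_is_stale() {
    log_file=$1
    max_age_minutes=$2
    # find prints the path only when the modification time is older than the limit.
    [ -n "$(find "$log_file" -mmin "+$max_age_minutes" 2>/dev/null)" ]
}
```

For example, `log_is_stale "/var/opt/microsoft/omsagent/<workspace id>/log/omsagent.log" 30 && echo "no agent log activity for over 30 minutes"` could run from a monitoring script before you decide to restart the OMI.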
Issue: When you configure collection from the portal for Syslog or Linux performance counters, the settings aren't applied
Probable causes
- The Log Analytics agent for Linux didn't pick up the latest configuration.
- The changed settings in the portal weren't applied.
Resolution
Background: omsconfig is the Log Analytics agent for Linux configuration agent that looks for new portal-side configuration every five minutes. This configuration is then applied to the Log Analytics agent for Linux configuration files located at /etc/opt/microsoft/omsagent/conf/omsagent.conf.
In some cases, the Log Analytics agent for Linux configuration agent can't communicate with the portal configuration service. This scenario results in the latest configuration not being applied.
Check that the omsconfig agent is installed by running dpkg --list omsconfig or rpm -qi omsconfig. If it's not installed, reinstall the latest version of the Log Analytics agent for Linux.
Check that the omsconfig agent can communicate with Azure Monitor by running the following command: sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/GetDscConfiguration.py'. This command returns the configuration that the agent receives from the service, including Syslog settings, Linux performance counters, and custom logs. If this command fails, run the following command: sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/PerformRequiredConfigurationChecks.py'. This command forces the omsconfig agent to talk to Azure Monitor and retrieve the latest configuration.
Issue: You aren't seeing any custom log data
Probable causes
- Onboarding to Azure Monitor failed.
- The setting Apply the following configuration to my Linux Servers wasn't selected.
- omsconfig didn't pick up the latest custom log configuration from the service.
- The Log Analytics agent for Linux user omsagent can't access the custom log due to permissions or the log isn't found. You might see the following errors:
[DATETIME] [warn]: file not found. Continuing without tailing it.
[DATETIME] [error]: file not accessible by omsagent.
- Known issue with race condition fixed in Log Analytics agent for Linux version 1.1.0-217.
Resolution
Verify onboarding to Azure Monitor was successful by checking if the following file exists: /etc/opt/microsoft/omsagent/<workspace id>/conf/omsadmin.conf. If not, either:
- Reonboard by using the omsadmin.sh command line instructions.
- Under Advanced Settings in the Azure portal, ensure that the setting Apply the following configuration to my Linux Servers is enabled.
Check that the omsconfig agent can communicate with Azure Monitor by running the following command: sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/GetDscConfiguration.py'. This command returns the configuration that the agent receives from the service, including Syslog settings, Linux performance counters, and custom logs. If this command fails, run the following command: sudo su omsagent -c 'python /opt/microsoft/omsconfig/Scripts/PerformRequiredConfigurationChecks.py'. This command forces the omsconfig agent to talk to Azure Monitor and retrieve the latest configuration.
Background: Instead of the Log Analytics agent for Linux running as a privileged user - root, the agent runs as the omsagent user. In most cases, you must grant explicit permission to this user for certain files to be read. To grant permission to omsagent user, run the following commands:
- Add the omsagent user to the specific group: sudo usermod -a -G <GROUPNAME> <USERNAME>.
- Grant universal read access to the required file: sudo chmod -R ugo+rx <FILE DIRECTORY>.
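A custom log can stay unreadable when a parent directory lacks read or execute permission for the omsagent user, even if the file itself is readable. The following sketch lists the permissions of a path and each of its parent directories so that you can spot where access stops. The function name `path_permissions` is hypothetical, and interpreting the output for the omsagent user and its groups is left to you.

```shell
#!/bin/sh
# Print "ls -ld" output for a path and for each of its parent directories,
# starting at the path itself and ending at "/" (or "." for a relative path).
path_permissions() {
    current=$1
    while :; do
        ls -ld -- "$current"
        case "$current" in
            /|.) break ;;
        esac
        current=$(dirname -- "$current")
    done
}
```

For example, `path_permissions /var/log/myapp/app.log` (an illustrative path) prints one line per path component, with owner, group, and mode for each.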
There's a known issue with a race condition with the Log Analytics agent for Linux version earlier than 1.1.0-217. After you update to the latest agent, run the following command to get the latest version of the output plug-in: sudo cp /etc/opt/microsoft/omsagent/sysconf/omsagent.conf /etc/opt/microsoft/omsagent/<workspace id>/conf/omsagent.conf.
Issue: You're trying to reonboard to a new workspace
When you try to reonboard an agent to a new workspace, the Log Analytics agent configuration needs to be cleaned up before reonboarding. To clean up old configuration from the agent, run the shell bundle with --purge:
sudo sh ./omsagent-*.universal.x64.sh --purge
Or
sudo sh ./onboard_agent.sh --purge
You can continue to reonboard after you use the --purge option.
Issue: Log Analytics agent extension in the Azure portal is marked with a failed state: Provisioning failed
Probable causes
- You removed the Log Analytics agent from the operating system.
- The Log Analytics agent service is down, disabled, or not configured.
Resolution
- Remove the extension from the Azure portal.
- Install the agent by following the instructions.
- Restart the agent by running the following command: sudo /opt/microsoft/omsagent/bin/service_control restart.
- Wait several minutes until the provisioning state changes to Provisioning succeeded.
Issue: The Log Analytics agent upgrade on-demand
Probable causes
The Log Analytics agent packages on the host are outdated.
Resolution
Check for the latest release on this GitHub page.
Download the installation script (1.4.2-124 is an example version):
wget https://github.com/Microsoft/OMS-Agent-for-Linux/releases/download/OMSAgent_GA_v1.4.2-124/omsagent-1.4.2-124.universal.x64.sh
Upgrade packages by executing sudo sh ./omsagent-*.universal.x64.sh --upgrade.
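If you script the download, the URL can be built from the version string. This sketch assumes that other releases follow the same tag and file naming as the 1.4.2-124 example above, which might not hold for every release, so confirm the actual asset name on the GitHub releases page. The function name `bundle_url` is hypothetical.

```shell
#!/bin/sh
# Build the download URL of the universal x64 shell bundle for a version.
# Assumption: the release tag and file name follow the 1.4.2-124 example.
bundle_url() {
    version=$1
    echo "https://github.com/Microsoft/OMS-Agent-for-Linux/releases/download/OMSAgent_GA_v${version}/omsagent-${version}.universal.x64.sh"
}
```

For example, `wget "$(bundle_url 1.4.2-124)"` downloads the same file as the command shown above.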
Issue: Installation fails and says Python2 can't support ctypes, even though Python3 is being used
Probable causes
For this known issue, if the VM's language isn't English, a check fails when verifying which version of Python is used. This issue leads to the agent always assuming Python 2 is used and failing if there's no Python 2.
Resolution
Change the VM's environmental language to English:
export LANG=en_US.UTF-8