Issue:
The Gateway console shows warnings posted by the ClockDriftDetector logger, with the following message: "Clock drift, degraded performance, or pause-the-world detected"; the message also contains information about last and current times and the deviation from expected delta.
Background
The Clock Drift Detector is a task in the Ignition Gateway that is scheduled to run every 1 second. Every time this task runs, it checks how long it has been since the last time it ran. If it's been longer than 2 seconds (1 second scheduled delay plus 1 second or more of unexpected delay), the Clock Drift Detector logs a warning with the delta between now and the last run.
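The check described above can be sketched in a few lines. This is a minimal illustration in Python, not Ignition's actual implementation (which runs inside the Gateway's JVM). The function name `check_drift` and its parameters are hypothetical, and the sketch covers only the "later than expected" case described in this article.

```python
EXPECTED_PERIOD_S = 1.0  # scheduled delay between runs, as described above
MAX_DEVIATION_S = 1.0    # unexpected delay tolerated before a warning is logged


def check_drift(last_run_s, now_s,
                expected_period_s=EXPECTED_PERIOD_S,
                max_deviation_s=MAX_DEVIATION_S):
    """Return (warn, deviation_s) for one run of a periodic check.

    deviation_s is the elapsed time minus the expected period, i.e. how
    late this run is compared with the schedule.
    """
    elapsed = now_s - last_run_s
    deviation = elapsed - expected_period_s
    # Warn only when the run is more than max_deviation_s late
    # (more than 2 seconds elapsed with the default values).
    warn = deviation > max_deviation_s
    return warn, deviation
```

With the defaults, a run that happens 2.5 seconds after the previous one yields a deviation of 1.5 seconds and a warning, while a run that happens exactly 1 second after the previous one yields no deviation and no warning.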
If the warnings are rare and the deviation (delta) they post is negligible, the warning is worth noting, but is not necessarily a cause for concern.
If the warnings occur frequently, on the other hand, they deserve attention: Ignition has many components that run on timers with periods much shorter than 1 second, and if a one-second task is not being executed on time, the other periodic tasks most likely are not either. For example, issues like connection timeouts could have the same root cause as ClockDrift warnings.
Common Causes
- System clock changing
If something is controlling the system clock in a way that will make it run unreliably, or if something is resetting the clock, Ignition will detect the unexpected changes and log a ClockDrift warning. The cause can be anything from the CMOS battery dying on an old machine, to time synchronization not being set up correctly.
- JVM (Java Virtual Machine) pauses
Java uses built-in automatic garbage collection: the process of identifying and deleting objects in heap memory that are no longer in use, so that the memory used by unreferenced objects can be reclaimed. Sometimes the JVM needs to be paused to complete garbage collection. Normally this pause is not long enough to be noticed or logged, but some circumstances, such as insufficient heap size or other concurrent operating system activities, can extend the pause enough to cause a ClockDrift warning to be logged.
- Pauses or delays in execution caused by the underlying operating system
One common cause for such pauses or delays is the system running out of memory. When RAM fills up, the computer uses a special space on the hard drive (called a page file or swap space) to store data that cannot be held in its random-access memory (RAM). A hard drive is inherently much slower to access than RAM, so running programs slow down.
This is common when running in a virtual machine or when the system is resource-starved.
Troubleshooting ClockDriftDetector warnings
- Make sure the system clock is correct
It is a good idea to keep the system clock synchronized to a time server. To do so, use the utilities your operating system offers.
For example, to synchronize the clock on Windows 7, go to the Control Panel and select the "Date and Time" section. In the "Date and Time" window that opens, switch to the "Internet Time" tab, click the "Change Settings" button, check the "Synchronize with an Internet time server" box, and use the dropdown to select the server to synchronize with. Click "OK" to save the changes.
To synchronize the clock on a typical Ubuntu installation with graphical user interface, use your preferred method to open the "Time and Date" window and select the "Automatically from the Internet" radio button in the "Set the time" section.
- Make sure your Ignition Gateway is allocated sufficient memory resources
Check the CPU / Memory usage as reported by the Gateway’s Configure / System / Status page.
If you see that the memory usage reported is chronically high (the ratio reported on the "Memory used/max" line is high, like in the image above), the Gateway may need more memory allocated to it.
To allocate more memory to your Ignition Gateway, edit the wrapper.java.maxmemory property in the ignition.conf file:
On a typical Windows installation, the ignition.conf file is located in the C:\Program Files\Inductive Automation\Ignition\data\ directory. On a typical Linux installation, it is located in the /etc/ignition/ directory.
The value for the wrapper.java.maxmemory property is specified in megabytes. By default it is 2048 for a 64-bit installation, and 1024 for a 32-bit installation. Change the value and save the changes. A Gateway restart will be required to load the new configuration.
Warning: Be careful increasing the wrapper.java.maxmemory property value: make sure your operating system has enough free memory to accommodate this change, otherwise the Gateway will fail to start.
Warning: Be careful increasing the wrapper.java.maxmemory property value on a 32-bit Ignition installation: the maximum possible heap size for 32-bit JVM is around 1.4 GB (the exact value depends on the system). Setting the maximum heap size for a 32-bit JVM to a greater value will cause the Gateway to fail to start.
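The edit itself is a one-line change to the property value. As an illustration, the following Python sketch rewrites that value in the text of a configuration file. The helper name `set_max_memory` is hypothetical, and the sketch assumes the property appears on its own uncommented line in the form `wrapper.java.maxmemory=<number>`. It works on text only, so reading the file, backing it up, writing it back, and restarting the Gateway are left to you.

```python
import re

# Matches an uncommented "wrapper.java.maxmemory=<digits>" line and captures
# everything up to the number so the original spacing is preserved.
_MAXMEMORY = re.compile(
    r"^([ \t]*wrapper\.java\.maxmemory[ \t]*=[ \t]*)\d+[ \t]*$",
    re.MULTILINE,
)


def set_max_memory(conf_text, megabytes):
    """Return conf_text with wrapper.java.maxmemory set to `megabytes` (MB)."""
    if megabytes <= 0:
        raise ValueError("megabytes must be a positive number of MB")
    new_text, count = _MAXMEMORY.subn(
        lambda m: m.group(1) + str(megabytes), conf_text
    )
    if count == 0:
        # Refuse to guess where the property should go if it is missing.
        raise KeyError("wrapper.java.maxmemory not found in configuration text")
    return new_text
```

For example, passing the text `wrapper.java.maxmemory=2048` and the value 4096 returns `wrapper.java.maxmemory=4096`. The two warnings above still apply to whatever value you choose.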
- Make sure the operating system has sufficient resources
Check the total CPU and RAM usage as reported by the operating system (make sure you are checking it on the computer where Ignition is installed).
To check resource usage on a Windows operating system, use your preferred method to open the Task Manager. When the Task Manager window opens, switch to the "Performance" tab to see the system's CPU and Memory usage graphs. On Windows 7 and earlier, you can also look at the bottom bar to see the percentage values.
In the image above, the Task Manager is reporting memory usage of over 90%. This system does not have enough RAM to run all of the processes it is running.
To check resource usage on a typical Ubuntu installation with graphical user interface, use your preferred method to open the "System Monitor" window and switch to the Resources tab to see the system's CPU and Memory usage graphs. You can also look at the legend under each graph to see the percentage values.
In the image above, the System Monitor is reporting memory usage of under 10%. This system is not resource-starved.
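On a Linux system without a graphical user interface, a comparable memory figure can be estimated from the contents of /proc/meminfo. The sketch below is one way to do that in Python. The function name `memory_used_percent` is hypothetical, and the sketch assumes a kernel recent enough to report the `MemAvailable` field. The result may differ somewhat from what a graphical monitor displays, since tools account for caches differently.

```python
def memory_used_percent(meminfo_text):
    """Estimate the percentage of RAM in use from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            # The first field after the colon is the numeric value (usually in kB).
            values[key.strip()] = int(fields[0])
    total = values["MemTotal"]
    available = values["MemAvailable"]  # memory usable without swapping
    return 100.0 * (total - available) / total
```

To use it on the machine where Ignition is installed, read the file and pass its contents, for example `memory_used_percent(open("/proc/meminfo").read())`.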
If the memory usage reported by the operating system is high, the system does not have enough RAM to run all of the processes it is running. An Ignition Gateway on such a system will almost certainly experience clock drift, and time-sensitive tasks are most likely not executing on time. To correct this issue, either remove other applications that consume significant amounts of memory, or install more RAM. If the maximum Java heap size for Ignition is set to a very high value, it may also be useful to re-evaluate the need for such a large heap. This is a case where more is not always better, especially if the "used" value reported on the "Memory used / max" line of the Gateway’s Configure / System / Status page never comes close to the "max" value.
The same is true for high CPU usage. If the CPU usage on the system is chronically high, it means that the system does not have a sufficiently powerful processor for what it is running. If Ignition is using most of the CPU time, either consider what may be consuming it (for example, audit the scripts running in the Gateway scope, consider subscriptions and database connections, etc.), or move Ignition to a more powerful system. If another application is consuming a large percentage of CPU time, remove that application.
- If Ignition is running in a Virtual Machine, make sure to check resource usage for both the Guest and the Host operating system.
A virtual machine (VM) is a software implementation of a machine (for example, a computer) that executes programs like a physical machine. The discussion of virtualization technologies and products is beyond the scope of this article, but the important points to keep in mind are that multiple virtual machines (guests) can be running simultaneously on the same physical machine (host) and that guest machines all share the resources available on the host. This has a number of implications when running Ignition on a VM, for example:
- The memory and CPU resources available to the guest operating system running Ignition are not the same as the resources the physical machine has.
For example, while the server (host) may have 32 GB RAM and 16 CPU cores, the VM running Ignition may be allocated only 4 GB RAM and 2 CPU cores. In this case, Ignition will be running on an operating system that meets only the minimum hardware requirements, which may become insufficient as the system expands, even though the server itself has sufficient resources to run a much larger system. One of the most common causes of ClockDrift warnings when running Ignition in such a setup is insufficient CPU and RAM allocated to the VM. A good place to start troubleshooting is to allocate more RAM and more CPU cores to the virtual machine running Ignition. For details on how to do that, refer to your VM software documentation or consult a system administrator.
- The memory and CPU resources for the guest may be allocated statically or dynamically.
It is common for VM resources to be dynamically allocated, which in general allows the system administrator to run more VMs on the same host, sharing limited resources between the VMs as needed. On the other hand, this means that any given VM is likely not to actually have 100% of the resources (e.g. CPU time and RAM) that it reports. As resource consumption by the VM grows, it is given access to more RAM and CPU time as needed, provided such resources are available at the time. If additional resources are not readily available, the VM, and consequently its applications, including Ignition, will run resource-starved, which will make Ignition logs report clock drift and JVM pauses. For best performance, a VM running Ignition should be given dedicated (statically allocated) RAM and CPU cores.
- Overallocating resources to the VM can adversely affect performance of the host, which will make the VM's performance worse, not better.
If it has been determined that the VM running Ignition needs more RAM and / or CPU cores, it is important to first consider whether the host running that VM has sufficient additional resources. For example, suppose the VM was originally given 4 GB RAM and now needs an additional 4 GB, but the host has only 16 GB RAM and also runs two other VMs, each of which is given 4 GB RAM. Adding up the memory allocated to all three VMs after the proposed increase (8 GB for the Ignition VM + 2 * 4 GB for the other two VMs = 16 GB) shows that the host will not have enough RAM to run all three VMs simultaneously and perform its own tasks. Proceeding with the proposed memory increase for the Ignition VM will cause severe performance issues for the host and, consequently, for the Ignition VM, or the Ignition VM will simply fail to start after the change is made.
Virtual environment resource planning is a complex topic, and the above should be treated as a simplified example. For detailed information and recommendations, please refer to the VM software documentation or consult a system administrator.
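The arithmetic in the example above can be written as a small check. The following Python sketch is a simplified illustration only. The function name `host_can_fit` and the 2 GB reserved for the host's own tasks are assumptions made for this example, not recommendations from any VM vendor.

```python
def host_can_fit(host_ram_gb, vm_allocations_gb, host_reserve_gb):
    """Return (fits, headroom_gb) for statically allocated VM memory on one host.

    headroom_gb is the RAM left over after all VM allocations and the host's
    own reserve are subtracted; a negative value means the host is overcommitted.
    """
    committed = sum(vm_allocations_gb) + host_reserve_gb
    headroom = host_ram_gb - committed
    return headroom >= 0, headroom


# The scenario above: a 16 GB host, the Ignition VM raised to 8 GB, two other
# VMs at 4 GB each, and an assumed 2 GB kept for the host itself.
fits, headroom = host_can_fit(16, [8, 4, 4], host_reserve_gb=2)
```

In this scenario `fits` is False and `headroom` is -2, meaning the host would be short by 2 GB under the assumed reserve.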
Comments
Very helpful article. I would like to mention a newbie mistake I made. The VM guest was set to synch time with the host. The VM Guest used NTP to synch time. At some point the host was not able to access the Network and the guest could. I think the time was being constantly changed - the host setting the guest time, then the guest synching via NTP. Once I disabled synch time with host the issue ceased.