Virtual machines (VMs) are a popular choice for developing and deploying Ignition systems. Ignition will run well in a VM environment but users should be aware of some guidelines that are specific to VMs. This article covers VM terminology, advantages and drawbacks of VMs, and specific issues that users may encounter when hosting Ignition in a VM.

What is a VM?

A VM is a virtual computer run by a physical computer. The physical computer emulates the hardware and operating system of the virtual computer. The physical computer "shares" its resources with the virtual computer. Users can customize their VM's virtual hardware to meet their needs.

Note: In this case we are talking about hardware-level virtual machines. This type of virtual machine is different from OS-level virtual machines. A Docker container is an example of an OS-level virtual machine. This article will not cover Docker containers.

Here is some terminology often used when talking about VMs:

Host:

The physical, bare-metal, “real-world” computer running the virtual machine. A host could be a traditional rack server, a desktop PC, a laptop, etc.

Guest:

The virtual computer that is emulated by the host. One host can emulate several guests and each guest can have different virtual hardware or operating systems.

Cluster:

A group of hosts. Whenever you add a new host to a cluster, the cluster gains its computing resources. The cluster is the combined computing resources of all its hosts.

Hypervisor:

The computer software, firmware, or hardware that manages the operations of virtual machines. Hypervisors can divide the hosts' resources among many guest operating systems. They can provision new virtual machines or change existing virtual machines. They also track the resource usage and performance of each virtual machine. Hypervisors are sometimes known as ‘virtual machine monitors’ for this reason.

Types of Hypervisors

There are two categories of hypervisor: Type 1 and Type 2.

Type 1 hypervisors run directly on the physical hardware of the host machine and are sometimes called “bare-metal” hypervisors. Type 2 hypervisors run as an application on top of the host's operating system and are sometimes called “hosted” hypervisors.

Type 1 hypervisors are often considered more efficient and secure. The tradeoff is that they are less flexible when it comes to hardware support. By contrast, Type 2 hypervisors can support a wider range of hardware but are less efficient than Type 1 hypervisors.

Why use a VM instead of physical hardware?

Here are some key reasons you may opt to use virtual machines for your Ignition servers:

Recovery after failure:

If the underlying host machine fails, a virtual machine can migrate to a new host in the cluster. This offers protection from data loss if a physical machine fails.

Ease of maintenance:

Maintenance operations have minimal impact on production environments using VMs. There is usually no need for downtime when performing maintenance. It can also be simpler to test and develop systems in VMs for that reason.

Ease of deployment/portability:

You can save the current state of a VM (called a “snapshot”) and then move or copy the VM state as a file. Since VMs are not tied down to a specific server, they are easy to migrate to any other server using this snapshot. VMs can also migrate from one host to another while the instance is still running.

Scalability:

You can move a virtual machine from an on-premise server to a cloud-hosted solution as needed. Cloud-hosted solutions are a cluster of host machines. You can rent computing resources from this cluster to meet your system's demands. Cloud-hosted solutions offer more flexibility than on-premise servers because you can increase or decrease the computing resources as demand changes.

Reduced costs:

Since a single physical server can host multiple virtual machines, you can reduce the amount of required hardware.

VM Software and Examples

There are many options when it comes to VM hypervisor software. Ignition can run on any hypervisor that emulates hardware and an operating system. Some examples are listed below.

Note: Cloud-hosted VM solutions like AWS, Azure, and Google Cloud are not included in this list.

VMware Workstation Pro:

Hypervisor: Type 2

Host operating systems: x64 versions of Microsoft Windows and Linux

Guest operating systems: Microsoft Windows, Linux

https://www.vmware.com/products/workstation-pro.html

VMware Fusion:

Hypervisor: Type 2

Host operating systems: Intel-based macOS

Guest operating systems: Microsoft Windows, Linux, or macOS

https://www.vmware.com/products/fusion.html

Microsoft Hyper-V:

Hypervisor: Type 1

Host operating systems: x64 versions of Microsoft Windows

Guest operating systems: Microsoft Windows, Linux

https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/about/

Oracle VM VirtualBox:

Hypervisor: Type 2

Host operating systems: Microsoft Windows, macOS, Linux

Guest operating systems: Microsoft Windows, Linux

https://www.virtualbox.org/

Parallels Desktop:

Hypervisor: Type 2

Host operating systems: Intel-based macOS

Guest operating systems: x86 versions of Microsoft Windows, macOS, Linux

Considerations for VMs Hosting Ignition

Provisioning the VM

Virtual Hardware Specs

When setting up a VM to host Ignition, you may be thinking to yourself “what virtual hardware specs will my VM need?” A good place to start is the system requirements for your Ignition version. You can find them on the Inductive Automation Downloads page: https://inductiveautomation.com/downloads/

Your actual system requirements will always vary by usage. There are major differences in the hardware required to run a small Ignition system vs. a large one. For example, a small Ignition system may need 4 vCPUs and 8GB of RAM. A very large system may need 32 vCPUs and 64GB of RAM, or more.

You should:

- Perform tests to verify that your VM can handle the Ignition Gateway’s computing tasks.
- Observe your server's performance in a testing environment while it is being developed.
- Make adjustments to the architecture and system specs as needed.
- Deploy the VM(s) in a production setting after you complete the steps above.
- Continue to monitor server performance as the Ignition system grows.

Whoever develops the system will determine the VM’s allocated CPU and memory, not anyone at Inductive Automation. For general guidelines, you can consult the Server Sizing and Architecture Guide. If you need a more tailored hardware or architecture recommendation, you can reach out to the Sales Department to consult with one of our knowledgeable Sales Engineers.

Note: Please remember that a VM is inherently slower than a physical machine since the hypervisor requires some additional computing overhead.

Resource Allocation

Dedicated vs. Shared Resources: Ignition performs best when the VM hosting it has dedicated resources rather than shared resources.

Dedicated resources:
- The hypervisor guarantees that the VM will have its computing resources available 100% of the time.
- Each virtual CPU is granted exclusive access to a physical core.
Shared resources:
- The hypervisor can provision VMs with more computing resources in total compared to what actually exists on the host.
- The hypervisor can provision resources to VMs

When VMs are given shared resources, the host is “over-provisioned”. This can be desirable if none of the guest OSs perform CPU- or memory-intensive tasks. It can also be desirable if their periods of peak activity are at different times during the day.

If you have a small Ignition system with only a handful of clients and tags, then shared resources may not cause any issues for you. However, if several VMs have spikes of heavy CPU and memory usage at the same time, the hypervisor can struggle to share the computing resources among the guest OSs. This may lead to severe slowdowns, or the VMs could simply shut down or restart. To avoid this, any VM that runs critical applications under significant load should always be allocated dedicated resources.

Note: The actual settings to determine whether a VM has “shared resources” or “dedicated resources” will vary depending on your choice of hypervisor. “Overprovisioning” is also called “Over-Subscribing” or “Overcommitting”.
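To get a rough idea of whether a host is over-provisioned, you can compare the resources allocated to its guests against what physically exists. The sketch below is only an illustration: the host size, the guest names, and their allocations are made-up values, and it counts physical cores without considering hyperthreading or hypervisor overhead.

```python
from dataclasses import dataclass


@dataclass
class GuestVM:
    name: str
    vcpus: int
    memory_gb: int


def overcommit_ratios(host_cores: int, host_memory_gb: int,
                      guests: list[GuestVM]) -> tuple[float, float]:
    """Return (vCPU ratio, memory ratio) of allocated to physical resources."""
    total_vcpus = sum(g.vcpus for g in guests)
    total_memory = sum(g.memory_gb for g in guests)
    return total_vcpus / host_cores, total_memory / host_memory_gb


# Hypothetical host with 16 physical cores and 64 GB of RAM.
guests = [
    GuestVM("ignition-gateway", vcpus=8, memory_gb=32),
    GuestVM("database", vcpus=8, memory_gb=32),
    GuestVM("test-server", vcpus=4, memory_gb=16),
]
cpu_ratio, mem_ratio = overcommit_ratios(16, 64, guests)
print(f"vCPU ratio: {cpu_ratio:.2f}, memory ratio: {mem_ratio:.2f}")
```

A ratio above 1.0 means more of that resource has been handed out than the host actually has, so the guests cannot all have dedicated resources.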

VMware Example

In VMware vSphere, you can set “Latency Sensitivity” to “High” to ensure that a VM has dedicated resources. Latency Sensitivity is an advanced VM attribute used to optimize the scheduling delay for latency-sensitive applications. If Latency Sensitivity is set to “Normal”, the hypervisor will provision the virtual CPUs of the VM differently depending on the system load.

After configuring this, check to make sure the vCPU allocation is 100% dedicated to the Ignition VM. If it’s not, it means the VM host is over-provisioned.

Other tips for improving VM performance:

- Make sure that the host machine is not responsible for running resource-intensive applications like SQL databases.
- Make sure that the OS storage drive or partition on the host machine is clean and defragmented.
- If given a choice of virtual hard disk, use a fixed disk instead of a dynamically-expanding disk. A fixed disk typically performs better than other virtual disk files.
- Make sure that the host machine's network interface can handle all of the network traffic for itself and the VMs it is hosting. VMs share networking resources with the host. Too much network traffic on a single network interface can affect the performance of all VMs.
- Make sure that the host’s antivirus software is configured to exclude virtual machine files from its scans.
- Disable the CPU power management features on the host machine. These features can cause CPU performance issues on the virtual machines.

Viewing Resource Usage

There are several places where you will be able to inspect the performance metrics like CPU and memory usage for your Ignition system. It can be confusing when the numbers reported by each performance monitor don’t line up as you expect. Let’s look at each of those performance monitors and see how they differ.

Below is a visualization to help understand the relationship between the levels of virtualization and performance metrics. Ignition’s Java process is nested at the deepest level and only “sees” the resource usage from within the Java Virtual Machine (JVM). The Guest OS “sees” the resources of the JVM and all other processes running on the VM. The hypervisor will “see” all resources used across every VM running on the Host OS. The Host OS will “see” the resource usage of the hypervisor and all guest OSs, which includes the JVM process for Ignition within the VM.

Ignition Gateway Web Interface

Performance metrics can be found on the Status > Systems > Performance page.
This page shows the CPU and memory usage for Ignition’s Java process over time, as reported by the Java Virtual Machine.
A value of 100% means that all CPUs were actively running threads from the JVM during the period being observed.
- This includes the Ignition application threads as well as the JVM’s internal threads.
- The value is reported by the underlying operating system on which the Java virtual machine is running.
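One way to read a percentage like this is as CPU time used divided by the CPU time that was available during the interval. The sketch below illustrates that reading only; the function name and sample numbers are hypothetical, and it is not taken from how the Gateway actually computes the value.

```python
def cpu_percent(cpu_seconds_used: float, wall_seconds: float, cpu_count: int) -> float:
    """Percent of the total CPU capacity a process used during an interval."""
    if wall_seconds <= 0 or cpu_count <= 0:
        raise ValueError("wall_seconds and cpu_count must be positive")
    # Available CPU time is the interval length multiplied by the number of CPUs.
    return 100.0 * cpu_seconds_used / (wall_seconds * cpu_count)


# Hypothetical VM with 4 vCPUs, observed for 10 seconds.
print(cpu_percent(40.0, 10.0, 4))  # 100.0: every CPU was busy with JVM threads
print(cpu_percent(10.0, 10.0, 4))  # 25.0: the equivalent of one CPU was busy
```

Keep in mind that other monitors may normalize differently. For example, top on Linux reports per-process CPU relative to a single CPU by default, so a busy multi-threaded process can show more than 100% there.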

Guest OS (VM) Performance Monitor

Note: The performance monitors available at the Guest OS and Host OS level will often be the same unless the guest and host are running different operating systems.

Windows: Task Manager (taskmgr)
- General Info
  - It shows all applications (one or more processes running within a single application context) and their state, all processes and some of their most frequently used performance measures, and some general system performance measurements.
  - All measurements are made by directly calling functions in the operating system to retrieve system counters.
- Processes Tab
  - The Processes tab shows the current memory and percentage of CPU usage of every process running on the computer as well as the total CPU and memory usage of the system.
  - While this is not a lot of information, it is a very good first indicator when a process is taking too much of the CPU or has a memory leak.
  - On the Processes tab, Ignition will show up as java.exe with the description “Zulu Platform x64 Architecture”:

- Performance Tab
  - The Performance tab in Task Manager provides a top level view of the system state in terms of CPU and memory usage.
  - A short history is shown on a graph, but none of this data is logged for comprehensive analysis.
  - The CPU usage % is the total processor utilization across all CPU cores.
  - On the Performance tab, this will show the overall % CPU utilization over the last 60 seconds for the entire OS.
  - The Memory tab will show the memory composition (in-use, modified, standby, and free memory).
  - These CPU and memory metrics will always be different from what is reported on the Ignition Gateway’s Performance page.
    - That is because these metrics include all other applications and tasks performed on the OS, not just the Java Virtual Machine.
- Details Tab
  - This will show the process name and process ID, along with the CPU usage and “physical memory” in use by the process that can’t be used by other processes.
  - Even though it says “physical memory”, these are computing resources that are emulated by the host machine.


Linux/Unix: top command
- The top command is a performance monitoring program used to monitor processes and system resource usage in Linux.
- It can display CPU usage, memory usage, swap memory, cache size, buffer size, process PID, user, commands, and other information.

top also provides some general system information and performance measures including:
- how long the system has been running
- what the average load has been
- a summary of the running processes and their states
- average percentages of CPU time in each state
- memory statistics
- swap statistics

The process information top provides is presented in a table with a large number of possible fields (the exact field names vary between versions of top) including:
- PID - Process ID
- USER - User that started the process
- PRI - Process priority
- NI - Process nice value
- SIZE - Memory used by process (including code size)
- STATE - Current process state
- TIME - Total CPU time used by process
- %CPU - Percent CPU used by process
- %MEM - Percent memory used by process
- COMMAND - Process's name
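If you want to log these values rather than watch them interactively, top can be run in batch mode (for example `top -b -n 1` on Linux) and its table parsed by a script. The sketch below parses a made-up sample of such output; the sample text, the process names, and the column names (which follow a common Linux layout and may differ on your system) are illustrative assumptions.

```python
def parse_top_table(text: str) -> list[dict[str, str]]:
    """Turn the process table of batch-mode top output into a list of dicts."""
    lines = text.splitlines()
    # The process table starts at the row whose first column is PID.
    start = next(i for i, line in enumerate(lines) if line.split()[:1] == ["PID"])
    headers = lines[start].split()
    rows = []
    for line in lines[start + 1:]:
        if not line.strip():
            continue
        # Limit the split so a COMMAND containing spaces stays in one field.
        fields = line.strip().split(None, len(headers) - 1)
        rows.append(dict(zip(headers, fields)))
    return rows


# Hypothetical sample output, not captured from a real system.
SAMPLE = """\
top - 10:15:01 up 12 days,  3:02,  1 user,  load average: 0.42, 0.35, 0.30
Tasks: 120 total,   1 running, 119 sleeping,   0 stopped,   0 zombie

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   1234 ignition  20   0 6543210   2.1g  30000 S  35.0  26.3 101:22.10 java
    987 root      20   0  123456   4321   3210 S   0.3   0.1   0:12.34 sshd
"""

rows = parse_top_table(SAMPLE)
java_rows = [row for row in rows if row["COMMAND"] == "java"]
print(java_rows[0]["PID"], java_rows[0]["%CPU"], java_rows[0]["%MEM"])
```

Filtering on the java command is a simple way to pick out the Ignition process, assuming it is the only Java process on the VM.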

Hypervisor Performance Monitor

As discussed earlier, hypervisors are sometimes called “virtual machine monitors”, as one of their features is tracking the resource usage and performance of each virtual machine.

Depending on your hypervisor, you will have different tools available for performance monitoring. Some features are not enabled by default or are limited to specific hardware. For example, if you are using VMware Workstation or VMware Player, you can enable a feature called virtual performance monitoring counters (vPMCs). This allows software running inside the virtual machine to access the physical CPUs’ performance metrics through the hypervisor. However, for this to work, the CPU architecture must be compatible between the physical and virtual CPUs.

Consult the documentation for your specific hypervisor to see what options are available to you.

Host OS Performance Monitor

The performance monitors available at the Guest OS and Host OS levels (e.g. Windows Task Manager) will often be the same, unless the guest and host are running different operating systems.

Please note that Type 2 hypervisors run as an application on the host machine’s OS, so you will see the CPU and memory being used by the hypervisor itself.

Timekeeping in VMs

The following sources were used for this section:

(https://www.vmware.com/files/pdf/techpaper/Timekeeping-In-VirtualMachines.pdf)

(https://kb.meinbergglobal.com/kb/time_sync/time_synchronization_in_virtual_machines)

Timekeeping in a VM works differently than on a physical machine. VMs work by time-sharing the host's physical hardware, which means they cannot exactly duplicate the timing activity of physical machines. VMs usually use several techniques to minimize and conceal differences in timing performance, but the differences can still cause timekeeping inaccuracies and other software issues. Most VM software includes synchronization tools that keep the guest's clock in sync with the host machine. Timekeeping itself relies on timer interrupts. A timer interrupt can be thought of as a clock counting down a fixed interval and raising an interrupt event when it reaches the end. The interrupt event notifies the VM’s operating system that a unit of time has passed.

Timekeeping in VMs can be pretty tricky. If timer tick interrupts are virtualized, then the interrupt handler in a particular VM can be delayed when the physical machine is busy (e.g. with other VMs). The next time a physical host is idle, several interrupts might be handled in a batch to catch up with the current time.

This means that if the time in a VM is compared to a real external reference clock during regular intervals, the time difference observed in subsequent comparisons can jump back and forth even if the reference clock is stable. For example, if 1 second seems to have passed inside a VM, the time of a real clock may have gained 1.1 seconds because the timer updates in the VM were delayed, or only 0.9 seconds if the timer updates were batched to catch up.
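The delay-then-catch-up behavior described above can be illustrated with a small simulation. This is a simplified model, not how any particular hypervisor works: it assumes a 100 ms tick, assumes a tick is queued whenever the host is busy, and assumes all queued ticks are delivered together the next time the host is idle.

```python
TICK = 0.1  # seconds of guest time credited per timer interrupt (assumed value)


def simulate(host_busy: list[bool]) -> list[tuple[float, float]]:
    """Return (reference_time, guest_time) after each tick interval.

    host_busy[i] is True when the host cannot deliver the guest's timer
    interrupt during interval i, so the tick is queued instead.
    """
    pending = 0
    guest_ticks = 0
    samples = []
    for i, busy in enumerate(host_busy, start=1):
        pending += 1  # one tick becomes due every interval
        if not busy:
            guest_ticks += pending  # deliver everything queued, catching up
            pending = 0
        samples.append((round(i * TICK, 3), round(guest_ticks * TICK, 3)))
    return samples


# The host is busy with other VMs during intervals 3 to 5.
samples = simulate([False, False, True, True, True, False, False])
for reference, guest in samples:
    print(f"reference={reference:.1f}s guest={guest:.1f}s "
          f"behind={reference - guest:.1f}s")
```

In this run the guest clock stalls at 0.2 s while the reference clock moves on to 0.5 s, then jumps straight to 0.6 s once the queued ticks are delivered. An external observer comparing the two clocks would see the offset grow and then snap back, even though the reference clock never changed its rate.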

How accurately any synchronization tool can discipline the time in a virtual machine depends on how well the virtualization software ensures that the timer interrupts in each VM are delivered promptly whenever the associated timer tick interval has expired.

With that in mind, the VM software will usually do its best to keep the time synchronized using its built-in tools. However, some best practices are:

- Limit the number of VMs on a given host machine.
- Keep the time of the VM the same as the host time. Do not try to change the time in the VM itself.

Getting Clock Drifts When OS Time Within the VM is Off Due to Timekeeping Issues

Proper timekeeping is important not only for keeping accurate timestamps in the guest operating systems but also for maintaining synchronization between the host system and the VMs. It's important to note that different virtualization platforms and hypervisors may implement timekeeping mechanisms differently. System administrators and virtualization architects need to understand the timekeeping features of their chosen platform to ensure accurate and synchronized timekeeping across their virtualized environment. In most cases, it is highly recommended to keep the host operating system's time and the VM's time the same to avoid future issues.

VM Scheduled Tasks

VM scheduled tasks are automated actions or processes configured to run at specific times or intervals within a virtualized environment. These tasks are typically managed and executed by the virtualization platform's management tools or by external orchestration systems. Scheduled tasks help streamline administrative and operational activities, improve resource utilization, and maintain the health and performance of virtual machines.

To mitigate the potential impact of VM backups on JVMs and applications, consider the following best practices:

- Schedule backups during periods of lower application workload to minimize resource contention.
- Monitor system performance during backup processes to identify resource bottlenecks or contention issues.
- Use backup solutions that are compatible with your virtualization platform and have been tested in similar environments.
- Ensure that your VMs and applications are designed to handle temporary resource spikes or contention gracefully.
- Regularly test backups and perform disaster recovery drills to ensure that backup and restore processes are reliable and do not negatively impact JVMs or applications.
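As an illustration of the first practice, if you have recorded average CPU usage per hour, a short script can suggest the quietest window for a backup. The hourly figures below are invented for the example, and CPU usage alone is an assumption about what "lower workload" means for your system; you may also want to consider disk and network activity.

```python
def quietest_window(hourly_cpu: list[float], window_hours: int) -> int:
    """Return the start hour of the window with the lowest total CPU usage.

    The search wraps around midnight, so a window may start late in the day
    and end early the next day.
    """
    hours = len(hourly_cpu)
    best_start, best_total = 0, float("inf")
    for start in range(hours):
        total = sum(hourly_cpu[(start + k) % hours] for k in range(window_hours))
        if total < best_total:
            best_start, best_total = start, total
    return best_start


# Hypothetical average CPU usage (%) for hours 0 through 23.
hourly_cpu = [12, 10, 8, 7, 9, 15, 30, 55, 70, 75, 72, 68,
              65, 70, 74, 71, 60, 45, 35, 28, 22, 18, 15, 13]
start_hour = quietest_window(hourly_cpu, window_hours=2)
print(f"Quietest 2-hour window starts at {start_hour:02d}:00")
```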

In summary, while VM backups should not directly cause a JVM to crash, it's essential to consider the potential indirect effects of backup processes on system resources and application performance. Proper planning, monitoring, and testing can help minimize any impact on JVMs and ensure the overall stability of your virtualized environment.

Clock drifts or Gateway crashing at consistent times due to VM snapshots being taken

Whenever a scheduled task such as a backup or snapshot runs against a VM, it can disrupt the JVM's timekeeping. When a snapshot is created, the hypervisor freezes every process running inside the VM. VM backups almost always cause clock drift for this reason: the entire VM is paused while a copy of it is generated, and Ignition has no way of running while the VM is 'stuck' during that time.
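A pause like this can be spotted from inside the guest by sleeping for a fixed interval and checking how much time actually passed. The sketch below shows the general idea only and is not Ignition's implementation. The function name, interval, and threshold are hypothetical, and which clock reveals a pause (and when) depends on your guest OS and hypervisor, so the clock is passed in as a parameter.

```python
import time
from typing import Callable


def watch_for_pauses(
    interval_s: float,
    max_deviation_s: float,
    cycles: int,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> list[float]:
    """Sleep for interval_s repeatedly; return every deviation above the limit."""
    deviations = []
    last = clock()
    for _ in range(cycles):
        sleep(interval_s)
        now = clock()
        # How much longer (or shorter) the cycle took than the requested sleep.
        deviation = (now - last) - interval_s
        if abs(deviation) > max_deviation_s:
            deviations.append(deviation)
        last = now
    return deviations


# Simulated run: a fake clock stands in for a VM that was frozen for 4.5 s
# during the third cycle. No real sleeping happens here.
fake_times = iter([0.0, 1.0, 2.0, 7.5, 8.5])
result = watch_for_pauses(
    interval_s=1.0,
    max_deviation_s=0.5,
    cycles=4,
    clock=lambda: next(fake_times),
    sleep=lambda seconds: None,
)
print(result)  # [4.5]
```

If deviations like this show up at the same time every day, compare the timestamps against your snapshot and backup schedule.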

VM Snapshots and Backups

For more information on specific scheduled tasks, such as VM snapshots and backups:

Migrating VMs

Migrating a VM involves moving a running or powered-off VM instance from one physical host or hypervisor to another. VM migration is used to provide benefits such as load balancing, resource optimization, hardware maintenance, and high availability. There are different migration methods and technologies available, depending on the virtualization platform being used.

Different types of VM migration methods:

Live Migrations:

Live migration allows a VM to be moved between hosts without disrupting its operation. This is typically achieved using technologies like VMware vMotion, Microsoft Hyper-V Live Migration, or XenMotion.
The source host and destination host establish a communication channel to transfer memory, CPU state, and device states. The VM's memory pages are copied from the source host to the destination host while the VM continues to run. During this process, the source host tracks memory changes and sends any updated pages to the destination host. Once the memory migration is complete, the VM's execution is transferred to the destination host.
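The copy-and-resend process described above is often called iterative pre-copy. The sketch below models it with page counts only. It is a simplification under stated assumptions: the number of pages the running VM dirties during a round is taken to be a fixed percentage of the pages sent in that round, and the page counts and threshold are made-up values. Real hypervisors use their own, more sophisticated logic.

```python
def precopy_rounds(total_pages: int, dirty_percent: int, stop_threshold: int,
                   max_rounds: int = 30) -> tuple[list[int], int]:
    """Return (pages sent in each live round, pages left for the final pause)."""
    remaining = total_pages  # the first round copies every page
    sent_per_round = []
    while remaining > stop_threshold and len(sent_per_round) < max_rounds:
        sent_per_round.append(remaining)
        # While this round was being copied, the running VM changed some
        # pages; those must be sent again in the next round.
        remaining = remaining * dirty_percent // 100
    # Whatever is left is sent while the VM is briefly paused, just before
    # execution is transferred to the destination host.
    return sent_per_round, remaining


sent, final_pages = precopy_rounds(total_pages=10_000, dirty_percent=20,
                                   stop_threshold=50)
print(sent)         # [10000, 2000, 400, 80]
print(final_pages)  # 16
```

The model also shows why a VM that changes memory very quickly is harder to migrate live: if the percentage of dirtied pages is high, the remaining set shrinks slowly and the final pause has more to copy.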

Cold Migrations:

Cold migration involves shutting down the VM on the source host and copying its virtual disk files to the destination host.
Cold migration requires a brief downtime during the shutdown and startup phases.

Storage migrations:

Storage migration involves moving the VM's virtual disk files (VMDK, VHD, etc.) to a different storage location while keeping the VM running on the same host.
Once the storage migration is complete, the VM's data is accessed from the new storage location.

Consider the following post migration steps

- Update any DNS records or load balancer settings to reflect the new location of the migrated VM.
- Monitor the VM's performance and connectivity on the new host to ensure a smooth transition.

Considerations and Best Practices for VMs Hosting Ignition
