The Importance of I/O Efficiency
IOcm is a major technology of the sKVM hypervisor developed in Mikelangelo, providing virtual I/O performance superior to that of the vanilla KVM hypervisor. We’ve explained the importance of efficient I/O for virtual machines in the following two blogs:
New developments in the areas of IoT, Big Data and HPC further increase the importance of I/O, even within the HPC world, where compute power has historically been perceived as the dominant and almost the only factor. New scenarios are emerging in which millions of devices send metrics to computation centers, generating Big Data on which HPC applications work to provide insights. In such scenarios, the ability to consume and process this vast number of messages is highly dependent on efficient I/O.
In this document we describe some of the technical aspects of the IOcm technology.
IOcm introduces efficiency improvements to the I/O subsystem of the hypervisor. By using a shared I/O processing thread, the hypervisor takes control of the scheduling policy for I/O processing. Moving this responsibility from the general Linux thread scheduler to the hypervisor gives the system a higher level of control over I/O traffic and reduces the overhead wasted on thread context switching.
The shared I/O processing thread lets the hypervisor control processing per virtual device very precisely and make rapid scheduling decisions in response to a changing environment. This alleviates thread starvation and avoids threads that hold the CPU despite having no active I/O requests.
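To make the idea of a shared I/O processing thread more concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not vhost code: the `VirtualDevice` class, the `run_shared_io_thread` function and the per-visit budget are hypothetical names invented for illustration. One loop serves several virtual devices in turn, gives each a bounded amount of work per visit, and never spends time on devices with no pending requests.

```python
from collections import deque


class VirtualDevice:
    """Toy stand-in for a virtual device queue (hypothetical, not a vhost structure)."""

    def __init__(self, name, pending=0):
        self.name = name
        self.pending = pending  # number of queued I/O requests


def run_shared_io_thread(devices, budget_per_visit=2):
    """Serve all devices from a single loop.

    Each visit handles at most `budget_per_visit` requests, so a busy device
    cannot starve the others. Idle devices are never placed on the ready queue.
    Returns the service order as a list of (device_name, requests_served).
    """
    ready = deque(d for d in devices if d.pending > 0)
    served = []
    while ready:
        dev = ready.popleft()
        n = min(dev.pending, budget_per_visit)
        dev.pending -= n
        served.append((dev.name, n))
        if dev.pending > 0:
            ready.append(dev)  # still has work: go to the back of the queue
    return served


if __name__ == "__main__":
    devs = [VirtualDevice("net0", 5), VirtualDevice("blk0", 1), VirtualDevice("net1", 0)]
    for name, n in run_shared_io_thread(devs):
        print(name, n)
```

With the three devices in the example, the busy device `net0` is interrupted after two requests so that `blk0` is served promptly, and the idle `net1` is never scheduled at all.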
IOcm provides a mechanism to control the aforementioned shared I/O threads. It enables low-level functionality such as creating and destroying I/O threads (vhost), and migrating devices between I/O threads.
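The three low-level operations named above (create, destroy, migrate) can be pictured with a small in-memory model. This is a hedged sketch, not the IOcm implementation: the `IOThreadManager` class and its method names are hypothetical, and the rule that a thread must be empty before it is destroyed is an assumption made for this example.

```python
class IOThreadManager:
    """In-memory model of I/O threads and the devices assigned to them (hypothetical)."""

    def __init__(self):
        self._threads = {}  # thread id -> set of device names
        self._next_id = 0

    def create_thread(self):
        """Create an empty I/O thread and return its id."""
        tid = self._next_id
        self._next_id += 1
        self._threads[tid] = set()
        return tid

    def attach(self, device, tid):
        """Assign a device to an existing I/O thread."""
        self._threads[tid].add(device)

    def migrate(self, device, src, dst):
        """Move a device from thread `src` to thread `dst`."""
        if dst not in self._threads:
            raise KeyError(f"no such I/O thread: {dst}")
        if device not in self._threads[src]:
            raise ValueError(f"{device} is not served by thread {src}")
        self._threads[src].remove(device)
        self._threads[dst].add(device)

    def destroy_thread(self, tid):
        """Destroy a thread; assumed here to require that it serves no devices."""
        if self._threads[tid]:
            raise RuntimeError("migrate devices away before destroying the thread")
        del self._threads[tid]

    def layout(self):
        """Return the current assignment as {thread id: sorted device names}."""
        return {tid: sorted(devs) for tid, devs in self._threads.items()}


if __name__ == "__main__":
    mgr = IOThreadManager()
    a, b = mgr.create_thread(), mgr.create_thread()
    mgr.attach("net0", a)
    mgr.attach("blk0", a)
    mgr.migrate("blk0", a, b)  # spread the load over two threads
    print(mgr.layout())
```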
At a higher level, the main component of IOcm is the monitor, which makes decisions about resource allocation to ensure maximum I/O performance.
The monitor takes a system-wide view for balancing between the I/O and the computation requirements. It periodically reads statistics such as the throughput, and uses these statistics to determine the optimum configuration of the I/O subsystem of the hypervisor (vhost thread).
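The monitor's periodic decision can be illustrated with a simple threshold policy. The text above does not describe the actual policy, so everything here is an assumption for illustration: the function names, the use of a single I/O utilization figure between 0 and 1, and the `high` and `low` thresholds are all hypothetical.

```python
def decide_io_cores(current, io_utilization, min_cores=1, max_cores=4,
                    high=0.8, low=0.3):
    """Return the number of I/O cores to use for the next period.

    Assumed policy: add a core when the I/O cores are nearly saturated,
    release one when they are mostly idle, otherwise keep the configuration.
    """
    if io_utilization > high and current < max_cores:
        return current + 1
    if io_utilization < low and current > min_cores:
        return current - 1
    return current


def monitor(samples, start_cores=1, **policy):
    """Apply the policy to a sequence of utilization samples.

    Each sample stands for one monitoring period. Returns the number of
    I/O cores chosen after each period.
    """
    cores = start_cores
    history = []
    for utilization in samples:
        cores = decide_io_cores(cores, utilization, **policy)
        history.append(cores)
    return history


if __name__ == "__main__":
    # A burst of I/O followed by a quiet phase.
    print(monitor([0.9, 0.95, 0.5, 0.1, 0.1, 0.1]))  # [2, 3, 3, 2, 1, 1]
```

In this sketch, cores are handed back to computation as soon as the I/O load drops, which is the kind of trade-off between I/O and compute that a static assignment cannot make.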
As you can see in the experiment results for running Apache bench and a more dynamic workload, dynamic management of the side cores is clearly important. In the lower graph there is even a point where a static core assignment falls below the baseline (vanilla KVM), because it does not adjust to the changing resource requirements of the workload.
IOcm is implemented within a kernel module called vhost. The vhost module is controlled from user space through an API implemented using sysfs. Sysfs is a common mechanism in the Linux kernel for providing information and control points to user space. Specific data is exposed through a set of virtual files which can be used to understand the state of vhost and to modify its behaviour accordingly. The files exposed through sysfs appear in the file system as regular files owned by ’root’, and are subject to the same treatment as other files (ownership, access controls, etc.). An application in user space can communicate with the vhost module by reading and writing these sysfs files. Together, these files constitute the vhost IOcm interface (API).
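Because sysfs attributes behave like ordinary files, a user-space client needs little more than file reads and writes. The sketch below shows that pattern with a hypothetical `SysfsControl` helper. The actual directory and file names of the vhost IOcm interface are not given in this text, so the root directory is passed in by the caller, and the attribute name `example_attr` in the demo is a placeholder rather than a real vhost file. The demo runs against a temporary directory so that it works on any machine.

```python
import tempfile
from pathlib import Path


class SysfsControl:
    """Read and write attribute files under a sysfs-style directory (hypothetical helper)."""

    def __init__(self, root):
        self.root = Path(root)

    def read(self, name):
        """Return the attribute's content with surrounding whitespace removed."""
        return (self.root / name).read_text().strip()

    def write(self, name, value):
        """Write a value to the attribute, followed by a newline."""
        (self.root / name).write_text(f"{value}\n")


if __name__ == "__main__":
    # Stand-in for the real sysfs directory; on a live system the files are
    # owned by root, so writing normally requires root privileges.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "example_attr").write_text("0\n")  # placeholder attribute
        ctl = SysfsControl(tmp)
        print("before:", ctl.read("example_attr"))
        ctl.write("example_attr", 1)
        print("after:", ctl.read("example_attr"))
```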
IOcm is a technology that increases the efficiency of I/O-intensive workloads running inside virtual machines. It relieves part of the I/O processing overhead that results from mixing I/O and compute threads by assigning dedicated cores to I/O, and it dynamically adjusts the number of I/O cores according to the workload’s changing resource requirements. Experimental results show promising improvements for various types of workloads.