OpenStack meets MIKELANGELO and OSv - Mikelangelo - Horizon 2020 Project on Virtualization, Cloud Computing, and HPC

MIKELANGELO was at the OpenStack Summit, presenting the talk "HPC, Unikernels and OpenStack" as part of the HPC Research track. The session focused on lessons learned while integrating HPC workloads with unikernels on top of OpenStack. This blog post summarizes the presentation given by the project’s coordinators, Daniel Vladušič and Gregor Berginc.

HPC workloads are traditionally executed in general-purpose operating systems running directly on the host. To improve the manageability and, above all, the flexibility of the infrastructure, container technology is sometimes used, typically in deployments called HPC clouds. However, while containers offer density and rapid deployment, they fall short on security. To address this, we use OSv, a lightweight general-purpose unikernel. This approach provides a bare-minimum yet fully virtualized, and thus more secure, execution environment.

Within the high-level architecture of the MIKELANGELO project, the presentation focused on execution, on application deployment in OSv, and then on integration with OpenStack.

To put OSv in context, we presented other unikernels. The field is currently lively, with many highly specialized, language-specific unikernels emerging (e.g. MirageOS, HaLVM, Clive). OSv, on the other hand, is a general-purpose unikernel; the only similar unikernel is Rumprun. As we need a versatile, drop-in replacement for Linux, OSv is the best choice for us: it can run Java, C, C++, Ruby, Perl, Go and more. It is also standards-compliant (POSIX, Linux, stdlib), offers KVM support, has demonstrated superior network performance, and has numerous existing apps and community backing.

The advantages of a unikernel start with its small image footprint, which results in network and energy efficiency. It also boots extremely fast, flattens execution and permission layers (single address space, single user), offers isolation and security, and has a small attack surface. However, these properties come at a price. A single address space means running a single process (no forking). And because the developer base is rather small (unikernels are still a niche technology), there is always a concern about the level of performance, as some functionality is not as well developed as in more widespread operating systems.

Our experience with OSv confirms both the general advantages and the disadvantages. Since only multi-threaded applications are allowed, we had to work around forks in the kernel, libraries and applications, and we extended the threading support in OSv (all changes were accepted into the OSv source). A typical HPC application requires MPI, for which an OSv-specific extension of Open MPI has been implemented (processes are replaced with threads, and remote access is replaced as well). More generally, we observed that some system calls and standard library functions were missing when running applications. Sometimes stubbing them worked, but not always. The typical changes required to run an application under OSv are recompiling the source code, removing unavailable concepts (e.g. fork), and then a trial-and-error approach.
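
The idea of replacing processes with threads can be illustrated with a minimal sketch. This is not OSv or Open MPI code; it is plain Python, and the names `simulate_rank` and `run_ranks` are hypothetical, chosen only for illustration. It shows the general pattern: workers that would normally be separate processes become threads inside one process, so they share a single address space and can hand over results through shared memory.

```python
import threading


def simulate_rank(rank, size, results, lock):
    """Hypothetical worker: each 'rank' sums its own slice of 0..99."""
    partial = sum(range(rank, 100, size))
    # All threads share one address space, so the result is written
    # directly into a shared dict; the lock guards the shared structure.
    with lock:
        results[rank] = partial


def run_ranks(size):
    """Start one thread per rank instead of forking one process per rank."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=simulate_rank, args=(r, size, results, lock))
        for r in range(size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    partials = run_ranks(4)
    print(sum(partials.values()))  # prints 4950, the sum of 0..99
```

In a fork-based design each worker would get its own copy of memory and would have to send its partial result back over a pipe or socket. With threads that step disappears, but shared data has to be protected explicitly, which is why the sketch uses a lock.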

Once we are able to run an application in OSv, we need to package it appropriately. Several tools are available for this. Historically, we have had Capstan, which has been heavily updated in the last year. It now offers package management (initialisation, collection, composition, execution and run configurations), basic OpenStack integration (image creation, instance execution) and a package hub (a package repository that allows an application to be composed from several smaller packages). In addition, as part of the larger effort in the unikernel area, we have contributed an OpenStack provider for UniK, which supports image and instance management as well as networking. Either of these two tools allows OSv applications to be managed under OpenStack. Strictly speaking, OSv is now supported by the Glance, Nova, Neutron, Heat and Horizon components.
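
Composing an application from several smaller packages comes down to resolving which packages an application needs and in what order. The sketch below is a generic illustration in Python, not Capstan's actual implementation: the function `resolve_packages`, the manifest layout and the package names are all assumptions made for this example.

```python
def resolve_packages(root, manifests):
    """Return the packages needed for `root`, dependencies first.

    `manifests` maps a package name to the list of packages it requires
    (a hypothetical, simplified stand-in for real package metadata).
    """
    ordered = []
    visiting = set()

    def visit(name):
        if name in ordered:
            return  # already resolved
        if name in visiting:
            raise ValueError(f"dependency cycle at {name!r}")
        if name not in manifests:
            raise KeyError(f"unknown package {name!r}")
        visiting.add(name)
        for dep in manifests[name]:
            visit(dep)
        visiting.discard(name)
        ordered.append(name)

    visit(root)
    return ordered


# Hypothetical package names, for illustration only.
EXAMPLE_MANIFESTS = {
    "openfoam-app": ["openfoam-core", "openmpi"],
    "openfoam-core": ["openmpi"],
    "openmpi": [],
}

if __name__ == "__main__":
    print(resolve_packages("openfoam-app", EXAMPLE_MANIFESTS))
    # prints ['openmpi', 'openfoam-core', 'openfoam-app']
```

Listing dependencies first means that a package's own files are layered on after the files of everything it builds on, and a shared dependency such as the MPI package is included only once.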

Let us now turn to the real-life applications we tested and deployed. The first is an aerodynamics problem, which requires OpenFOAM and Open MPI. OpenFOAM was recompiled with no changes to the source code, but with some small changes to the build system. We compiled it into shared objects with build dependencies and had to provide a few new functions in OSv and stdlib. The result is an OpenFOAM system that runs unmodified commands, with several pre-built application packages already provided.

Another example is a proprietary Fortran simulation code. We only had to make a few modifications to the source code and recompile it; it is now able to run unmodified commands.

To conclude, the MIKELANGELO project covers a complete stack, from the unikernel to OpenStack management of that unikernel. Inside the unikernel, we can run much more than we could a year ago, and we can offer this efficient solution to HPC providers.