The MIKELANGELO project proudly presents the overall MIKELANGELO technology stack.
We plan to release technology updates in six-month intervals – this means you know when to expect the new flavours of the MIKELANGELO components and when to schedule internal testing of our releases. As of today, three releases have been published:
- MIKELANGELO Stack 3.0 (December 2017)
- MIKELANGELO Stack 2.0 (December 2016)
- MIKELANGELO Stack 1.0 (June 2016)
MIKELANGELO Stack 3.0:
These release notes summarise the enhancements, resolved bugs and additional components introduced in the final release of the MIKELANGELO Stack 3.0. All components are released as open source as part of the MIKELANGELO GitHub project.
IOcm – IO core manager
I/O Core Manager monitors the system and automatically adjusts the number of I/O cores accordingly. As I/O load rises, additional cores are dedicated to handle the I/O processing, so as to allow the VMs to operate more efficiently. The Manager continues to monitor I/O load vs CPU load, and can reallocate CPU cores as needed to most efficiently match the workload.
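The core-allocation policy can be pictured as a simple feedback rule: grow the pool of dedicated I/O cores when they are saturated, shrink it when they sit idle. The sketch below is only a Python illustration of that idea; the function name, thresholds and the 0–1 load metric are invented for the example and are not part of the actual IOcm implementation.

```python
def plan_io_cores(io_load, total_cores, current_io_cores,
                  high_water=0.8, low_water=0.3, max_ratio=0.5):
    """Decide how many cores to dedicate to I/O processing.

    io_load is the utilisation of the current I/O cores (0.0-1.0);
    a real manager samples this from the hypervisor, here it is an input.
    """
    # Never hand more than a fixed share of the machine to I/O.
    max_io_cores = max(1, int(total_cores * max_ratio))
    if io_load > high_water and current_io_cores < max_io_cores:
        return current_io_cores + 1   # I/O pressure: grow the I/O core pool
    if io_load < low_water and current_io_cores > 1:
        return current_io_cores - 1   # mostly idle: give a core back to the VMs
    return current_io_cores           # load is in range: keep the allocation
```

The real IOcm samples I/O statistics continuously and applies a decision like this on every monitoring interval.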
The source code and instructions on how to run the I/O Core Manager are available from the GitHub repository.
vRDMA – Virtualised RDMA
The vRDMA release includes the backend and frontend drivers for both Linux and OSv guests, user- and kernel-level library support on OSv, and several applications and benchmarks that can run on OSv.
- vRDMA backend driver
- Complete hypercall support for handling the RDMA verbs calls from the user application running with frontend driver on the guest.
- RDMA Communication Manager (CM) API support has been implemented for applications that prefer RDMA CM mode for RDMA initialisation (e.g. Open MPI), in order to achieve better performance than the out-of-band socket API.
- Fast routing and address translation mechanism between guest and host for mapping the actual target and virtual target as needed.
- vRDMA frontend driver
- Complete hypercall support for applications implemented with RDMA verbs.
- RDMA CM API support for user applications on the guest, using asynchronous hypercalls to communicate with the backend driver.
- Applications supported for OSv
- PerfTest: A pure RDMA verbs based benchmark, including read/write bandwidth and latency tests.
- Open MPI: The foundation for running MPI applications on HPC or cloud platforms. The OOB and RDMA CM modules were extended and enhanced to run Open MPI applications on OSv.
- NetPIPE: A scientific benchmark that offers several communication modes, e.g. TCP, RDMA verbs and MPI. NetPIPE is now supported on OSv.
vRDMA is available from the GitHub repository.
SCAM – Side-Channel Attack Mitigation & Monitoring
The Side-Channel Attack Mitigation & Monitoring module (SCAM) is a user-space application that automatically detects cache-based side-channel attacks on co-located virtual machines. When an attack is detected, the module performs a mitigation process that adds noise to the attacker's measurements of cache activity. Via the configuration file, users can select from various modes of operation and tune parameters according to the required level of security and the constraints on available resources.
The release contains the implementation of all components of SCAM, along with a tutorial on how to run the various modules.
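To illustrate the detection side, a minimal anomaly check over cache-miss samples might look like the following. This is a hypothetical sketch, not SCAM's actual detector: the function, the sampling window and the z-score threshold are assumptions made for the example.

```python
from statistics import mean, stdev

def detect_attack(samples, window=20, threshold=3.0):
    """Flag suspicious cache activity with a simple z-score rule.

    samples: cache-miss counts per monitoring interval, oldest first.
    The most recent `window` samples are compared against the baseline;
    a strong upward deviation suggests a cache-probing attacker.
    (Illustrative only; SCAM's real detector is more elaborate.)
    """
    if len(samples) <= window:
        return False                      # not enough history yet
    baseline, recent = samples[:-window], samples[-window:]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        sigma = 1e-9                      # avoid division by zero
    return (mean(recent) - mu) / sigma > threshold
```

In the real module, a positive detection would trigger the mitigation process that pollutes the cache with noise.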
UNCLOT – UNikernel Cross Layer OpTimisations
UNikernel Cross-Layer OpTimisation (UNCLOT) is an OSv extension that uses memory shared by the hypervisor to optimise the networking stack when collocated virtual machines communicate with each other. It is built into OSv and automatically establishes a communication channel that completely bypasses the standard TCP/IP networking stack for collocated VMs.
This release brings the initial proof of concept implementation along with the tutorial describing the steps to run the demonstrator.
OSv – Operating system for the Cloud
MIKELANGELO replaces the Linux kernel and system libraries with OSv, a new operating system designed specifically to run a single application efficiently on a virtual machine (i.e., it is a unikernel), while remaining capable of running existing Linux applications with certain limitations. Compared to Linux, OSv has a significantly smaller disk footprint, a smaller memory footprint, faster boot times (sub-second), fewer run-time overheads, faster networking and simpler configuration management.
The work on OSv this year focused on getting more applications, runtime environments and programming languages to run – correctly and efficiently – on OSv. We implemented missing or broken system calls and C library functions, fixed various bugs, and tested support for additional runtime environments (e.g., Python, Go and Java 9). We also worked on making OSv build properly with newer compilers and distributions (up to gcc 7) as well as older compilers (back to gcc 4.8), so that everyone can build OSv regardless of their build environment. We continue to improve OSv's support for different clouds, with Hyper-V support added this year, and we also improved cloud-init support and support for immutable (state-less) images.
Seastar – C++ framework for high-performance server applications
While OSv can run existing Linux applications, certain Linux APIs, including the socket API, and certain programming habits make applications that use them inefficient on modern hardware. OSv improves the performance of such applications to some degree, but rewriting an application against new, non-Linux APIs can bring even better performance. We therefore designed and implemented Seastar, a new API for writing highly efficient asynchronous network applications that are significantly faster than traditional ones.
Our work on Seastar this year included a large number of improvements to this young project, improving both its usability (new features and documentation) and its performance. We focused on features needed by the Cassandra re-implementation used in the Cloud Bursting use case, especially the CPU scheduler, whose goal is to reduce the latency of latency-sensitive requests by isolating the different components of the application. Additional improvements this year include expanded documentation (both API documentation and a tutorial), cleaned-up APIs, performance improvements, notably continuation batching (SEDA), and improved monitoring capabilities.
LEET – Lightweight Execution Environment Toolbox
Lightweight Execution Environment Toolbox (LEET) is a set of tools that simplify management of application packages, OSv-based unikernel composition, provisioning and service orchestration. This release includes enhancements made to Capstan and Virtlet open-source projects:
- Improved customisation of run configurations, including full inheritance of application packages. This long-standing feature now fully supports reuse of packages along with their predefined run scripts.
- Package versioning, including support for updating local packages.
- Built-in support for additional block storage. It’s now possible to attach multiple block devices to OSv unikernels when provisioned via Capstan.
- Remote package composition. This feature supports composition of unikernels in their target environment.
- Updated runtime environments for native, Java, Node.js and Python.
- Virtual machine logging. With this extension of Virtlet, it is possible to review live logs from a running unikernel.
- Numerous updates to the OSv contextualisation allowing Virtlet to customise the unikernel prior to running.
Self-sustained VMs have been prepared containing all the necessary tools of LEET as well as prepackaged Apache Spark and OpenFOAM applications. Images in QCOW2 and VMDK format can be downloaded here.
Snap – Telemetry framework
This MIKELANGELO release includes several newly developed components for the snap telemetry framework.
- Snap-deploy automates provisioning and configuration management of the Snap framework and plugins. It provides a convenient and consistent way of configuring, re-configuring and managing Snap deployments. Snap-deploy is a binary application and includes the following functionality:
- Deploy – configure and deploy the Snap framework with appropriate plugins
- Redeploy – reconfigure and re-deploy the Snap framework with appropriate plugins
- Download – download Snap framework binaries and appropriate plugins
- Kill – kill the Snap service process along with plugins
- Start – start the Snap process and configured tasks
- Generate task – create a task manifest and export as a json file
- Help – print user help
- Snap-plugin-collector-openvswitch allows gathering of metrics from the Open vSwitch database. This multilayer virtual switch implementation is the default networking solution used by both OpenStack and vTorque. All standard network interface metrics are exposed, including the number of flows and the packets and bytes transmitted per flow and per bridge.
- Numerous enhancements have been contributed to the broader Snap framework, including upgrades to the numerous Snap collector, processor and publisher plugins previously contributed by MIKELANGELO. These all now work with Snap 2.0, support the Swagger API, and include integrated plugin diagnostics.
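For reference, a Snap task manifest – the artefact produced by snap-deploy's Generate task command – wires a collector to a publisher on a schedule. The fragment below is an illustrative example; the metric namespace and the plugin configuration values are assumptions and should be adapted to the actual plugins in use.

```json
{
  "version": 1,
  "schedule": {
    "type": "simple",
    "interval": "10s"
  },
  "workflow": {
    "collect": {
      "metrics": {
        "/intel/ovs/*": {}
      },
      "publish": [
        {
          "plugin_name": "influxdb",
          "config": {
            "host": "127.0.0.1",
            "port": 8086,
            "database": "snap"
          }
        }
      ]
    }
  }
}
```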
Scotty – Continuous experimentation
Scotty implements continuous experimentation, allowing the management of infrastructure experiments. Scotty provides facilities to define infrastructure resources, workloads and experiments. Experiments contain multiple workloads, which in turn run on resources; Scotty thus enables the automated execution of experiments. Further features include a set of infrastructure integrations for resource deployment, resource configuration and data collection. Resource deployment builds on the integration with OpenStack Heat and OpenStack Nova, resource configuration uses the integration with actuator.py, and data collection is based on integrations with MongoDB, InfluxDB and ownCloud.
This release includes features related to the parallel execution of resources and workloads, data export to ownCloud, and assorted bug fixes. The model of parallel execution has been changed from multi-threading to parallel processes, because the multithreaded execution turned out to cause unintended resource sharing during resource deployment and workload execution. The data export to ownCloud now allows researchers to store workload data in a local folder, which is then transferred to a pre-configured ownCloud directory, enabling direct data archival for experiments.
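The switch from threads to processes can be sketched in a few lines: each workload runs in its own OS process, so state set up by one workload cannot leak into another. This is a minimal Python illustration, not Scotty's actual code; the function names are invented.

```python
from multiprocessing import Process

def run_workload(name):
    # Placeholder for a real workload run. Because each workload lives in
    # its own process, its resource setup is isolated from the others.
    print(f"running workload {name}")

def run_experiment(workloads):
    """Run each workload in an isolated OS process and wait for all of them."""
    procs = [Process(target=run_workload, args=(w,)) for w in workloads]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]
```

With threads, all workloads would share the interpreter's global state; separate processes give each experiment run a clean slate at the cost of slightly more startup overhead.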
MCM – MIKELANGELO Cloud Manager
The MIKELANGELO Cloud Manager is a live-resource manager for cloud computing. MCM currently integrates with OpenStack, actuator.py, and InfluxDB. OpenStack is used to obtain information about the deployment of VMs, to migrate VMs, and to scale VMs. InfluxDB is used to feed MCM with live data from the infrastructure.
This release provides an integration with actuator.py. This integration allows MCM to leverage the infrastructure control features in actuator.py. Thus, in addition to migration and scaling via OpenStack, MCM can now use all control mechanisms provided by actuator.py. These features include, among others, changing the ioscheduler in Linux, controlling TurboBoost, and controlling DVFS.
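As an illustration of the kind of control actuator.py exposes, changing the Linux I/O scheduler boils down to writing the scheduler name to a sysfs file. The helpers below are a hypothetical sketch (actuator.py's real API may differ); the sysfs_root parameter exists only to make the example testable outside a real system.

```python
def get_io_scheduler(device, sysfs_root="/sys"):
    """Return the active I/O scheduler for a block device.

    The kernel marks the active scheduler in brackets,
    e.g. "noop [deadline] cfq".
    """
    with open(f"{sysfs_root}/block/{device}/queue/scheduler") as f:
        line = f.read()
    return line[line.index("[") + 1:line.index("]")]

def set_io_scheduler(device, scheduler, sysfs_root="/sys"):
    """Select the I/O scheduler by writing its name to sysfs.

    On a real system this requires root privileges; the kernel
    rejects names that are not compiled-in schedulers.
    """
    with open(f"{sysfs_root}/block/{device}/queue/scheduler", "w") as f:
        f.write(scheduler)
```

TurboBoost and DVFS controls follow the same pattern, writing to MSRs or to the cpufreq sysfs interface instead.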
OpenFOAM Cloud – Aerodynamic use case
OpenFOAM Cloud showcases the MIKELANGELO technology stack using the Aerodynamics use case. It integrates Capstan to dynamically compose OSv-based virtual machines and deploy them onto OpenStack and Amazon cloud infrastructures. Snap is used to collect data from different layers. Core functionality is offered as a standalone backend service to control the simulations. Built-in scheduler dynamically distributes the requested workloads depending on the capacity of target infrastructures. The user interface is provided as an OpenStack Horizon dashboard allowing users to control and monitor OpenFOAM scientific experiments.
The primary focus of this release was on ease of installation, use and stability improvements. The scheduler improves the integration with OpenStack (tested with up to OpenStack Newton) and Amazon Web Services.
Both the backend and the frontend are offered as pip packages.
MIKELANGELO Stack 2.0:
Over the last six months, MIKELANGELO has improved a large number of components to deliver the MIKELANGELO Stack 2.0. Please find all detailed improvements below.
- sKVM has been improved in particular through IOcm, vRDMA, and SCAM.
- IOcm has been extended by a dynamic IO core manager. The dynamic IO core manager samples I/O statistics, estimates the I/O pressure, and allocates the right amount of I/O cores.
- A second prototype of vRDMA has been implemented. Compared to the first prototype, the frontend driver has been improved; it now allows the use of a verbs-based API for improved performance.
- SCAM has been extended by monitoring, profiling, and initial mitigation of attacks.
- In the guest OS layer, OSv and Seastar have been extended.
- OSv has been improved for better application support, including additional system calls and C library functions, new build environments, a better NFS client, isolated threads, partial NUMA support, cloud-init support, improved DHCP, better file system performance, and assorted bug fixes.
- Seastar improvements cover improved RPC, I/O scheduling, IOtune, CPU scheduling, monitoring, ext4 support, a log-structured memory allocator, and DPDK support.
- The middleware extensions cover the MIKELANGELO Package Manager (MPM), Snap for monitoring, Scotty for system testing, vTorque for virtual HPC, and the MIKELANGELO Cloud Manager (MCM).
- The MPM has been extended with a Docker container to sever the dependency on KVM during VM composition. Furthermore, MPM has been integrated with UniK for better cloud integration.
- Snap has been extended by ten new collector plugins and two new processor plugins. Most of those plugins relate directly to MIKELANGELO-specific components and some of them have already been further enhanced by the snap community.
- Scotty has been developed from scratch to allow automated experimentation with virtual infrastructures (cloud and HPC). The initial version of Scotty is shipped with Puppet-based deployment scripts.
- For HPC, vTorque has been developed as an extension to Torque to allow the deployment of virtual machines. The current release allows the transparent provisioning of Linux and OSv instances for virtual workloads.
- For the cloud, MCM has been developed as an extension to OpenStack. MCM allows live-scheduling of resources in OpenStack. MCM integrates with OpenStack’s APIs and InfluxDB to collect monitoring data. The main goal of the MCM development is to foster the development of cloud resource management algorithms.
The individual components have been tested in two integrated testbeds. The first testbed uses OpenStack for a cloud deployment and integrates OSv, IOcm, Snap, MCM, Scotty, and MPM. The second testbed targets HPC and integrates OSv, IOcm, Snap, vRDMA, vTorque, and Scotty. The project's use cases have used the integrated testbeds to validate the components.
MIKELANGELO Stack 1.0:
This is the first release of the MIKELANGELO project and represents the first step towards the fully integrated and demonstrated MIKELANGELO technology stack. This release is the result of the ramp-up period, followed by intensive development in the first 18 project months. You can see the release as an indication of the directions the project is about to take (as we prepare the ground to introduce new packages, techniques, and more).
We prepared three extensive documents for everyone who wishes to fully check and understand the work on different components, how they fit together and how well they perform.
The first document is about the overall architecture of the MIKELANGELO technology stack and the individual components – The intermediate MIKELANGELO architecture. In short, skimming through the document will show you the components developed and the outlook for new features, and will also provide information on how to run them and what to expect.
Understanding the technology is a good way to start assessing whether it is a good fit for your particular use case. In MIKELANGELO, we have four use cases, ranging from HPC to cloud – these are all described in the Use Cases Implementation Strategy. You will see what our use cases are and how we tackled them.
Finally, as MIKELANGELO is about improving performance in the virtualised technology stack, increasing flexibility and unifying the HPC and cloud software stacks, we put these goals to the test in The Architecture and Implementation Evaluation. This document shows where we stand at M18.