MIKELANGELO has adopted and extended snap, Intel’s open-source telemetry framework, to deliver full-stack instrumentation and monitoring across all of the MIKELANGELO use cases.
Snap is a framework that allows data center owners to dynamically instrument cloud-scale data centers. Precise, custom, complex flows of telemetry can be easily constructed and managed. Snap collector plugins capture data from hardware and software sources, in-band or out-of-band, local or remote. Captured data can be passed through local filters, known as snap processor plugins, that analyse the data and act on it. Snap publisher plugins then publish the processed data to arbitrary destinations, including SQL and NoSQL databases, message queues, and analytics engines such as Intel’s open-source Trusted Analytics Platform.
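The collect, process, publish flow described above can be pictured with a few lines of Python. This is only a toy model of the idea, not snap's plugin API: the function names, the metric namespaces and the in-memory "database" are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterable, List

# A reading is modelled as a plain dict (hypothetical shape, not snap's metric type).
Metric = Dict[str, object]


def collect() -> List[Metric]:
    """Hypothetical collector: returns fixed readings instead of probing hardware."""
    return [
        {"namespace": "/example/cpu/load", "value": 0.42},
        {"namespace": "/example/mem/used_mb", "value": 2048},
    ]


def tag(metrics: Iterable[Metric]) -> List[Metric]:
    """Hypothetical processor: attaches a metadata tag to every reading."""
    return [{**m, "tags": {"experiment": "demo"}} for m in metrics]


def make_publisher(sink: List[Metric]) -> Callable[[Iterable[Metric]], None]:
    """Hypothetical publisher: appends to a list that stands in for a database."""
    def publish(metrics: Iterable[Metric]) -> None:
        sink.extend(metrics)
    return publish


def run_once(collector, processors, publishers) -> None:
    """Run one pass: collect, apply each processor in order, then publish."""
    data = collector()
    for process in processors:
        data = process(data)
    for publish in publishers:
        publish(data)


store: List[Metric] = []
run_once(collect, [tag], [make_publisher(store)])
```

Each stage only sees the output of the previous one, which is what lets collectors, processors and publishers be swapped independently.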
Snap has been developed from the ground up to be trustworthy, performant, dynamic, scalable and highly extensible. Snap includes:
- a daemon that runs on the nodes that collect, process and/or publish data. Data can be collected from the local node or from remote nodes.
- a dynamic catalogue of metrics, based on currently loaded plugins
- highly configurable telemetry workflows, known as tasks
- a command line interface that allows metrics, plugins and tasks to be manipulated
- a RESTful API for remote management
- simplified cluster-aware management via tribe
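To make the idea of a task more concrete, the Python sketch below builds a small task description and serialises it to JSON. The layout (a schedule plus a collect, process, publish workflow) reflects our understanding of snap's task manifests, but the field names, plugin names, metric namespace and file path here are assumptions for illustration and should be checked against the snap documentation before use.

```python
import json

# Assumed manifest layout: one schedule plus a nested
# collect -> process -> publish workflow. All names and values below
# are illustrative and not taken from a MIKELANGELO deployment.
task_manifest = {
    "version": 1,
    "schedule": {"type": "simple", "interval": "1s"},
    "workflow": {
        "collect": {
            # Metrics to gather, keyed by namespace (assumed example).
            "metrics": {"/intel/mock/foo": {}},
            "process": [
                {
                    "plugin_name": "passthru",
                    "publish": [
                        {
                            "plugin_name": "file",
                            "config": {"file": "/tmp/published_metrics"},
                        }
                    ],
                }
            ],
        }
    },
}

manifest_json = json.dumps(task_manifest, indent=2)
print(manifest_json)
```

The nesting mirrors the data flow: publishers hang off processors, which hang off the collection step, so a single document describes the whole workflow.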
For a complete introduction to snap in MIKELANGELO see our blog post “Full-stack cloud-scale instrumentation? It’s a snap…”.
Achievements and Results
To date, MIKELANGELO has developed and open-sourced plugins to
- collect data from Libvirt, OSv and OpenFOAM,
- inject meta-data tags to facilitate offline analysis,
- publish telemetry to PostgreSQL.
MIKELANGELO has also demonstrated snap running on a 500-node cluster, and shown that the MIKELANGELO-enhanced ScyllaDB rewrite of Apache Cassandra can be employed as a back-end data store for snap-gathered telemetry.
Development is nearing completion on plugins to automatically reduce data resolution when metric readings are steady, and to gather utilisation, saturation and error data for common subsystems.
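One simple way to reduce data resolution while readings are steady is a deadband filter with a heartbeat: a reading is kept only if it differs from the last kept value by more than a tolerance, or if too much time has passed since the last kept reading. The Python sketch below illustrates that general technique. It is not the algorithm used by the MIKELANGELO plugin, and the function name, the `tolerance` and `max_gap` parameters and the sample data are all hypothetical.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, value)


def reduce_when_steady(
    readings: Iterable[Reading], tolerance: float, max_gap: float
) -> List[Reading]:
    """Keep a reading only if it changed by more than `tolerance` since the
    last kept reading, or if at least `max_gap` seconds have elapsed."""
    kept: List[Reading] = []
    for ts, value in readings:
        if not kept:
            kept.append((ts, value))  # always keep the first reading
            continue
        last_ts, last_value = kept[-1]
        changed = abs(value - last_value) > tolerance
        stale = ts - last_ts >= max_gap
        if changed or stale:
            kept.append((ts, value))
    return kept


# Hypothetical samples: steady near 10, then a jump to about 15.
samples = [(0, 10.0), (1, 10.1), (2, 10.0), (3, 10.05),
           (4, 15.0), (5, 15.1), (6, 15.0)]
reduced = reduce_when_steady(samples, tolerance=0.5, max_gap=3)
```

The heartbeat (`max_gap`) guarantees that a flat metric still produces an occasional data point, so a steady signal can be told apart from a collector that has stopped reporting.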
Snap can already collect data from OpenStack Cinder, Glance, Keystone, Neutron and Nova. In future work, we will explore whether integration with OpenStack Ceilometer would be beneficial. We will also investigate techniques to simplify analysis of the data captured by snap.
Here are pointers to key snap resources from both our project and the community that you may find useful. Enjoy!
- D5.7 First report on the Instrumentation and Monitoring of the complete MIKELANGELO software stack – published January 2016. Introduces the monitoring requirements of MIKELANGELO, the state-of-the-art, the adoption of snap by the project, and our contributions by the end of 2015.
- D6.1 First report on the Architecture and Implementation Evaluation – published July 2016. Section 2.6 includes an evaluation of the snap framework in MIKELANGELO from architectural, performance and implementation points of view.
- Full-stack cloud-scale instrumentation? It’s a snap… – published July 2016. Introduces snap in MIKELANGELO, summarising all contributions to snap to date, contributions that are imminent, and results of integration work between snap and ScyllaDB.
- snap collector plugins
- snap processor plugins
- snap publisher plugins
- snap home page – start here!
- snap on GitHub – get all the code, log suggestions, contribute
- snap blog posts – technical insights and articles to get you up and running
- snap team on slack – chat with snap developers
- snap videos – hear the thinking behind snap, see it in action