Wednesday, 31 May 2017

Difference between MILEPOST GCC (machine learning based self-tuning compiler) and Collective Knowledge Framework

I recently received several questions about the differences between the MILEPOST GCC compiler and the Collective Knowledge framework. This motivated me to write this slightly nostalgic post about the R&D history behind MILEPOST GCC and our CK framework.

MILEPOST GCC is an extended GCC which includes:

1) Interactive Compilation Interface aka ICI - a plugin-based framework to expose or change various information and optimization decisions inside compilers at a fine-grained level via external plugins. I originally developed it for Open64 and later collaborated with Zbigniew Chamski and colleagues from Google and Mozilla to make it a standard plugin framework for GCC.

2) A feature extractor developed by Mircea Namolaru from IBM as an ICI plugin to expose low-level program features at the function level (see available features here). It was also extended by Jeremy Singer (ft57–65).

However, to keep the complexity of the MILEPOST project under control, I decided to separate MILEPOST GCC from the infrastructure needed to auto-tune workloads, build models and use them to predict optimizations. Therefore, I developed the first version of the cTuning framework to let users auto-tune GCC flags for shared benchmarks and data sets, use MILEPOST GCC to extract features for these benchmarks, build predictive models (possibly on the fly, i.e. via active learning), and then use them to predict optimizations for previously unseen programs (using ICI to change optimizations).
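To make this workflow a bit more concrete, here is a minimal sketch of the last step: predicting flags for a previously unseen program from its features. It assumes a tiny hand-made training set and uses a plain nearest-neighbour lookup as a stand-in model; the program names, feature values and the `predict_flags` helper are all hypothetical and are not taken from MILEPOST GCC or cTuning.

```python
# Hypothetical training set: each entry pairs a program's feature vector
# with the best flag combination found by auto-tuning it.
# All names and numbers below are made up for illustration.
TRAINING = [
    {"program": "prog_a", "features": [10.0, 200.0, 15.0], "best_flags": "-O3 -funroll-loops"},
    {"program": "prog_b", "features": [3.0, 40.0, 2.0], "best_flags": "-O2"},
    {"program": "prog_c", "features": [50.0, 900.0, 120.0], "best_flags": "-O3 -fno-inline"},
]


def feature_ranges(entries):
    """Return per-feature (min, span) pairs used to scale features to [0, 1]."""
    columns = list(zip(*(e["features"] for e in entries)))
    # A zero span would divide by zero, so fall back to 1.0 for constant features.
    return [(min(c), (max(c) - min(c)) or 1.0) for c in columns]


def scale(features, ranges):
    """Scale a raw feature vector with the ranges computed from the training set."""
    return [(x - lo) / span for x, (lo, span) in zip(features, ranges)]


def predict_flags(features, training=TRAINING):
    """Return the best flags of the training program closest in feature space."""
    ranges = feature_ranges(training)
    query = scale(features, ranges)

    def distance(entry):
        other = scale(entry["features"], ranges)
        return sum((a - b) ** 2 for a, b in zip(query, other))

    return min(training, key=distance)["best_flags"]


if __name__ == "__main__":
    # Features of a hypothetical, previously unseen program.
    print(predict_flags([4.0, 50.0, 3.0]))
```

In a real setup the feature vectors would come from the feature extractor and the flag combinations from auto-tuning runs, and the nearest-neighbour lookup could be swapped for any other model.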

Since it was still taking far too long to train models (my PhD students, Yuriy Kashnikov and Abdul Memon, spent 5 months preparing experiments in 2010 for our MILEPOST GCC paper), we decided to crowdsource autotuning via a common repository across diverse hardware provided by volunteers and thus dramatically speed up the training process. Accelerating the training process and improving the diversity of the training set are the main practical reasons why my autotuning frameworks use crowdtuning mode by default nowadays ;) …

The first cTuning framework turned out to be very heavy and difficult to install and port (David Del Vento and his interns from NCAR used it in 2010 to tune their workloads and provided lots of useful feedback - thanks guys!). This motivated me to develop a common research SDK (Collective Knowledge aka CK) to simplify, unify and automate general experiments in computer engineering.

The CK framework lets the community share their artifacts (benchmarks, data sets, tools, models, experimental results) as customizable and reusable Python components with a JSON API. So, you can take advantage of already shared components to quickly prototype your own research workflows such as benchmarking, multi-objective autotuning, machine-learning based optimization, run-time adaptation, etc. That is, rather than rebuilding numerous ad-hoc in-house tools or scripts for autotuning and machine-learning based optimization, which rarely survive after PhD students are gone, you can now participate in collaborative and open research with the community, reproduce and improve collaborative experiments, and build upon them ;) … That’s why ACM is now considering using CK for unified artifact sharing (see CK on the ACM DL front page).

You can also take advantage of the integrated, cross-platform CK package manager, which can prepare your workflow and install missing dependencies on Linux, Windows, macOS and Android.

For example, see the highest-ranked artifact from CGO’17, shared as a customizable and portable CK workflow on GitHub.

To conclude my nostalgic overview of the MILEPOST project and CK ;) - MILEPOST GCC has now been added to CK as a unified workflow, and it takes advantage of a growing number of shared benchmarks, data sets, and optimization statistics (see CK GitHub repo).

I just didn’t have time to provide all the ML gluing, i.e. building models from all the optimization statistics and features shared by the community at cKnowledge.org/repo. But it should be quite straightforward, so I hope our community will eventually help implement it. We are now particularly interested in checking the prediction accuracy of different models (SVM, KNN, DNN, etc.) and in finding extra features which improve optimization prediction.
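As a rough illustration of how such an accuracy check could look, here is a small sketch that compares two toy predictors with leave-one-out cross-validation. Everything in it is an assumption made for the example: the samples are invented, and the two "models" (a majority-vote baseline and a nearest-neighbour lookup) are simple stand-ins for the SVM, KNN or DNN models one would actually plug in.

```python
from collections import Counter

# Hypothetical samples: (feature vector, best optimization label).
# The values are invented purely to exercise the evaluation loop.
SAMPLES = [
    ([1.0, 1.0], "O2"),
    ([1.2, 0.9], "O2"),
    ([0.9, 1.1], "O2"),
    ([1.1, 1.0], "O2"),
    ([5.0, 5.0], "O3"),
    ([5.2, 4.8], "O3"),
]


def majority_model(train, query):
    """Baseline: ignore the features and predict the most common training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def nearest_neighbor_model(train, query):
    """Predict the label of the training sample closest to the query."""
    def squared_distance(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], query))

    return min(train, key=squared_distance)[1]


def leave_one_out_accuracy(model, samples):
    """Hold out each sample in turn, train on the rest, and count correct predictions."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if model(train, features) == label:
            hits += 1
    return hits / len(samples)


if __name__ == "__main__":
    for name, model in [("majority", majority_model),
                        ("nearest neighbour", nearest_neighbor_model)]:
        print(name, leave_one_out_accuracy(model, SAMPLES))
```

The same loop can be reused to test whether an extra feature helps: add it to every feature vector and check whether the held-out accuracy goes up.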