The vast number of transistors available through modern fabrication technology gives architects an unprecedented amount of freedom in chip-multiprocessor (CMP) designs. However, such freedom translates into a design space that is impossible to explore fully, or even to any significant fraction. In this paper we propose to address this problem using predictive modelling, a well-established machine learning technique. More specifically, we build models that, trained on only a minute fraction of the design space, accurately predict the behaviour of the remaining designs orders of magnitude faster than simulating them.
Our models accurately predict both energy and execution time using only a very small number of training points. More importantly, in contrast to previous work, our models can predict these metrics not only for unseen CMP configurations of a given application, but also for configurations of a previously unseen application, given only a small number of results (2 to 32 configurations) for that new application.
We perform extensive experiments to show the efficacy of the technique for exploring the design space of CMPs running parallel applications. Using both explicitly parallel applications and applications parallelized with the thread-level speculation (TLS) approach, we evaluate performance on a CMP design space with about 95 million points, using 18 benchmarks with up to 1,000 training points each. For predicting the Energy Delay metric, prediction errors range from 2.4% to 4.6% for single applications and from 3.1% to 4.9% for cross-application predictions.