CacheScouts: Fine-Grain Monitoring of Shared Caches in CMP Platforms

Li Zhao, Ravi Iyer, Ramesh Illikkal, Jaideep Moses, Don Newell, Srihari Makineni


As multi-core architectures flourish in the marketplace, operating systems (OS) and virtual machine monitors (VMM) attempt to make efficient use of the available cores by scheduling for maximum performance. However, in CMP architectures, the cores share platform resources (cache, memory and I/O) whose usage is invisible to the execution environment (OS/VMM). In this paper, we focus on the shared cache, since recent studies have shown that contention for this resource can cause significant loss in performance, determinism and QoS. In today’s platforms, it is impossible to tell accurately which of the applications running on the platform are occupying shared cache space and causing significant contention. We investigate mechanisms for monitoring the use of cache resources along three vectors: (a) occupancy: how much space is being used and by whom, (b) interference: how much contention is present and who is being affected, and (c) sharing: how threads are cooperating. We propose novel tagging mechanisms (based on ASIDs) and sampling mechanisms (based on set sampling) that drastically reduce the overhead of this monitoring. We also show why these low-overhead mechanisms are critical to operating systems and virtual machine monitors through case studies of their use for (a) scheduling and performance management, (b) quality of service guarantees and (c) metering and chargeback of platform usage. In summary, our findings show that shared cache monitoring can be achieved on a per-application basis at low overhead (<0.1%) and with very little loss of accuracy (<5%). We expect that these mechanisms can be easily implemented in future platforms and will help the execution environment improve application performance and quality of service significantly.
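
To make the occupancy vector concrete, the following is a minimal software sketch of the tagging and set-sampling idea described above. It is not the hardware design proposed in the paper. The class name `TaggedCache`, its parameters, the LRU replacement policy, and the rule that a line stays attributed to the ASID that filled it are all assumptions made for illustration. Each line carries the ASID of the application that brought it in, counters are maintained only for every `sample_every`-th set, and the per-ASID occupancy of the whole cache is estimated by scaling the sampled count.

```python
from collections import Counter, OrderedDict


class TaggedCache:
    """Toy set-associative LRU cache with ASID-tagged lines (illustrative only)."""

    def __init__(self, num_sets=64, ways=4, line_size=64, sample_every=8):
        # Assumption: num_sets is a multiple of sample_every, so scaling is exact.
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        self.sample_every = sample_every
        # One OrderedDict per set: tag -> owning ASID, oldest entry first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        # Lines held per ASID, counted in the sampled sets only.
        self.sampled_lines = Counter()

    def _is_sampled(self, index):
        return index % self.sample_every == 0

    def access(self, asid, addr):
        """Simulate one access; return True on a hit, False on a miss."""
        block = addr // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)  # refresh LRU position; owner is unchanged
            return True
        if len(lines) >= self.ways:
            _, victim_asid = lines.popitem(last=False)  # evict the LRU line
            if self._is_sampled(index):
                self.sampled_lines[victim_asid] -= 1
        lines[tag] = asid  # tag the new line with the filling ASID
        if self._is_sampled(index):
            self.sampled_lines[asid] += 1
        return False

    def estimated_occupancy(self, asid):
        """Estimate lines owned by asid by scaling up the sampled-set count."""
        return self.sampled_lines[asid] * self.sample_every

    def exact_occupancy(self, asid):
        """Count lines owned by asid across every set (full monitoring)."""
        return sum(1 for s in self.sets for owner in s.values() if owner == asid)
```

Comparing `estimated_occupancy` with `exact_occupancy` for a given ASID shows the trade-off: the sampled counters touch only one set in `sample_every`, at the cost of an estimation error that depends on how evenly the workload spreads its lines across sets. For perfectly uniform access patterns the two agree; for skewed patterns they generally will not.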