Performance clarity as a first-class design principle.

Title: Performance clarity as a first-class design principle.
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Ousterhout, K., Canel, C., Wolffe, M., Ratnasamy, S., & Shenker, S.
Published in: Proceedings of the 16th Workshop on Hot Topics in Operating Systems (HotOS '17)

Users often struggle to reason about the performance of today's systems. Without understanding which factors matter most to performance, users do not know how to tune their system's hardware and software configuration to improve it. We argue that performance clarity -- making it easy to understand where bottlenecks lie and the performance implications of various system changes -- should be a first-class design goal. To illustrate that this is possible, we propose an architecture for data analytics frameworks in which jobs are decomposed into schedulable units called monotasks that each consume a single resource. By untangling the use of different resources, monotasks allow the system to trivially report the time spent on each resource and identify the bottleneck resource. Our prototype implementation of monotasks for Apache Spark is API-compatible with Spark, achieves performance parity with it, and yields a simple performance model that can predict the effects of future hardware and software changes.
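
As a rough illustration of why single-resource units make this kind of reporting easy, the sketch below sums monotask time per resource, names the bottleneck, and estimates job time after a hardware change. It is not the authors' implementation: the `Monotask` class, the function names, the `parallelism` map, and the rule "job time is the largest per-resource time" are all assumptions made for this example, and the numbers are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Monotask:
    """Hypothetical unit of work that uses exactly one resource."""
    resource: str   # e.g. "cpu", "disk", "network"
    seconds: float  # time the monotask spent using that resource


def ideal_times(monotasks, parallelism):
    """Per-resource completion time, assuming perfect parallelism.

    `parallelism` maps a resource name to the number of units that can
    run concurrently (cores, disks, links). This is an assumed model.
    """
    totals = defaultdict(float)
    for task in monotasks:
        totals[task.resource] += task.seconds
    return {res: total / parallelism[res] for res, total in totals.items()}


def bottleneck(monotasks, parallelism):
    """Return (resource, seconds) for the resource with the largest time."""
    times = ideal_times(monotasks, parallelism)
    name = max(times, key=times.get)
    return name, times[name]


def predict(monotasks, parallelism, resource, speedup):
    """Estimate job time if `resource` became `speedup` times faster."""
    times = ideal_times(monotasks, parallelism)
    times[resource] /= speedup
    return max(times.values())


# Invented measurements for one job on a machine with 4 cores and 1 disk.
tasks = [
    Monotask("cpu", 8.0),
    Monotask("cpu", 8.0),
    Monotask("disk", 12.0),
    Monotask("network", 3.0),
]
parallelism = {"cpu": 4, "disk": 1, "network": 1}

print(bottleneck(tasks, parallelism))            # ('disk', 12.0)
print(predict(tasks, parallelism, "disk", 2.0))  # 6.0
print(predict(tasks, parallelism, "disk", 4.0))  # 4.0, the CPU now limits the job
```

In this toy model a disk that is four times faster does not make the job four times faster, because the CPU becomes the limiting resource. Answering that kind of question is what the abstract means by predicting the effects of hardware changes.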


We thank Radhika Mittal, John Ousterhout, Aurojit Panda, and Patrick Wendell for helpful comments on earlier drafts of this paper. We are appreciative of Shivaram Venkataraman for discussions during the tiny tasks project [8] that led to the idea of breaking jobs into small, single-resource units of work. This research was supported in part by a Hertz Foundation Fellowship, a Google PhD Fellowship, and Intel and other sponsors of UC Berkeley’s NetSys Lab.

ICSI Research Group: Networking and Security