PACMan: Coordinated Memory Caching for Parallel Jobs
Title | PACMan: Coordinated Memory Caching for Parallel Jobs |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Ananthanarayanan, G., Ghodsi A., Wang A., Borthakur D., Kandula S., Shenker S. J., & Stoica I. |
Page(s) | 1-14 |
Other Numbers | 3381 |
Abstract | Data-intensive analytics on large clusters is important for modern Internet services. As machines in these clusters have large memories, in-memory caching of inputs is an effective way to speed up these analytics jobs. The key challenge, however, is that these jobs run multiple tasks in parallel and a job is sped up only when inputs of all such parallel tasks are cached. Indeed, a single task whose input is not cached can slow down the entire job. To meet this all-or-nothing property, we have built PACMan, a caching service that coordinates access to the distributed caches. This coordination is essential to improve job completion times and cluster efficiency. To this end, we have implemented two cache replacement policies on top of PACMan's coordinated infrastructure: LIFE that minimizes average completion time by evicting large incomplete inputs, and LFU-F that maximizes cluster efficiency by evicting less frequently accessed inputs. Evaluations on production workloads from Facebook and Microsoft Bing show that PACMan reduces average |
URL | http://www.icsi.berkeley.edu/pubs/networking/ICSI_pacmancoordinatedmemory12.pdf |
Bibliographic Notes | Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI '12), pp. 1-14, San Jose, California |
Abbreviated Authors | G. Ananthanarayanan, A. Ghodsi, A. Wang, D. Borthakur, S. Kandula, S. Shenker, and I. Stoica |
ICSI Research Group | Networking and Security |
ICSI Publication Type | Article in conference proceedings |
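
The abstract names two replacement policies, LIFE and LFU-F, and an all-or-nothing property. The sketch below is one possible reading of those sentences, not the paper's implementation: the `CachedInput` record, its fields, both function names, and the fallback to complete inputs when no incomplete input exists are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedInput:
    """Hypothetical record for one job input in the distributed cache."""
    name: str
    total_blocks: int   # number of blocks the input consists of
    cached_blocks: int  # number of those blocks currently in memory
    access_count: int   # how many times the input has been read

    @property
    def complete(self) -> bool:
        # All-or-nothing: an input only speeds up a job if every block is cached.
        return self.cached_blocks == self.total_blocks


def pick_victim_life(inputs):
    """LIFE-style choice (assumed): evict from the largest incomplete input."""
    if not inputs:
        raise ValueError("nothing to evict")
    incomplete = [f for f in inputs if not f.complete]
    # Assumption: fall back to complete inputs only when none is incomplete.
    candidates = incomplete or inputs
    return max(candidates, key=lambda f: f.total_blocks)


def pick_victim_lfu_f(inputs):
    """LFU-F-style choice (assumed): evict the least frequently accessed input."""
    if not inputs:
        raise ValueError("nothing to evict")
    incomplete = [f for f in inputs if not f.complete]
    # Assumption: incomplete inputs are considered first, as in the LIFE sketch.
    candidates = incomplete or inputs
    return min(candidates, key=lambda f: f.access_count)


# Made-up cache contents, for illustration only.
cache = [
    CachedInput("a", total_blocks=10, cached_blocks=10, access_count=5),
    CachedInput("b", total_blocks=8, cached_blocks=3, access_count=1),
    CachedInput("c", total_blocks=20, cached_blocks=4, access_count=9),
    CachedInput("d", total_blocks=2, cached_blocks=2, access_count=0),
]
print(pick_victim_life(cache).name)
print(pick_victim_lfu_f(cache).name)
```

Both functions look at incomplete inputs first because, under the all-or-nothing property, a partially cached input gives its job no speedup, so evicting from it costs nothing in completion time.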