IOCost: block IO control for containers in datacenters
Published in Architectural Support for Programming Languages and Operating Systems (ASPLOS '22), 2022
Resource isolation is a fundamental requirement in datacenter environments. However, our production experience in Meta’s large-scale datacenters shows that existing IO control mechanisms for block storage are inadequate in containerized environments. IO control needs to provide proportional resources to containers while taking into account the hardware heterogeneity of storage devices and the idiosyncrasies of the workloads deployed in datacenters. The speed of modern SSDs requires IO control to execute with low overhead. Furthermore, IO control should strive for work conservation, take into account the interactions with the memory management subsystem, and avoid priority inversions that lead to isolation failures. To address these challenges, this paper presents IOCost, an IO control solution that is designed for containerized environments and provides scalable, work-conserving, and low-overhead IO control for heterogeneous storage devices and diverse workloads in datacenters. IOCost performs offline profiling to build a device model and uses it to estimate the device occupancy of each IO request. To minimize runtime overhead, it separates IO control into a fast per-IO issue path and a slower periodic planning path. A novel work-conserving budget donation algorithm enables containers to dynamically share unused budget. We have deployed IOCost across the entirety of Meta’s datacenters, which comprise millions of machines, upstreamed IOCost to the Linux kernel, and open-sourced our device-profiling tools. IOCost has been running in production for two years, providing IO control for Meta’s fleet. We describe the design of IOCost and share our experience deploying it at scale.
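To make the idea of a device model that estimates the occupancy of each IO request concrete, the following is a minimal sketch of one possible linear cost model. It is an illustration under stated assumptions, not the implementation described in the paper or found in the kernel: the class `LinearDeviceModel`, the function `occupancy_seconds`, the 4 KiB page size, and all parameter values are hypothetical.

```python
from dataclasses import dataclass

PAGE_BYTES = 4096  # assumed accounting granularity (hypothetical)


@dataclass(frozen=True)
class LinearDeviceModel:
    """Hypothetical profiled parameters for one IO direction (read or write)."""
    bytes_per_sec: float  # sustained throughput measured by offline profiling
    seq_iops: float       # sequential page-sized IOs per second
    rand_iops: float      # random page-sized IOs per second


def occupancy_seconds(model: LinearDeviceModel, size_bytes: int, sequential: bool) -> float:
    """Estimate how long one request occupies the device, in seconds.

    Cost = fixed per-IO cost + per-page transfer cost. The per-IO cost is
    whatever remains of a single page-sized IO's time after subtracting the
    time spent transferring that one page.
    """
    per_page = PAGE_BYTES / model.bytes_per_sec
    iops = model.seq_iops if sequential else model.rand_iops
    per_io = max(1.0 / iops - per_page, 0.0)
    pages = -(-size_bytes // PAGE_BYTES)  # ceiling division
    return per_io + pages * per_page


# Example with made-up numbers: a slow device that moves 1000 pages per second.
example_model = LinearDeviceModel(bytes_per_sec=4096 * 1000, seq_iops=500, rand_iops=100)
seq_cost = occupancy_seconds(example_model, 4096, sequential=True)
rand_cost = occupancy_seconds(example_model, 4096, sequential=False)
big_seq_cost = occupancy_seconds(example_model, 8192, sequential=True)
```

Under this sketch a random request costs more than a sequential one of the same size, and larger requests cost proportionally more, so a controller could charge each container's budget by estimated occupancy rather than by raw IOPS or bytes.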
Recommended citation: Heo Tejun, Schatzberg Dan, Newell Andrew, Liu Song, Dhakshinamurthy Saravanan, Narayanan Iyswarya, Bacik Josef, Mason Chris, Tang Chunqiang, and Skarlatos Dimitrios. 2022. IOCost: Block IO control for containers in datacenters. In Proceedings of the ACM ASPLOS. 595–608. https://doi.org/10.1145/3503222.3507727