Towards Better Memory Management in Hosted Linux Systems

Hubertus Franke (IBM T.J. Watson Research Center)


Hubertus Franke, Martin Schwidefsky, Ray Mansell, Himanshu Raj, Damian Osisek, JongHuyk Choi (IBM Corporation)

In this presentation we will introduce collaborative memory management (CMM), a novel approach to memory management for Linux running virtualized in a hosted environment. CMM targets hosted environments where significant memory over-commitment is desired.

Traditionally, in hosted/hypervised systems such as VMware and Xen, this problem has been solved by dynamically adjusting the effective real memory sizes of the Linux guests through memory ballooning. However, this approach requires working set size estimates for each guest OS, as well as frequent interactions with the guest OS to trigger changes and exert pressure on the guest to run its page eviction algorithms (LRUs). In systems where the host supports paging (VMware, zSeries z/VM), the host can use paging to provide memory over-commitment to its guests.

In over-committed memory scenarios, both approaches can introduce significant overhead. Ballooning does not scale well with the number of guests, while host paging can introduce significant I/O activity. With host paging, the host deploys its own global page eviction algorithm (LRU). The overhead originates from the fact that the host has no knowledge about how a guest uses a page, and as a result it must save the content of every evicted guest page to the host swap area.

CMM provides a facility that enables guest operating systems and hosts to share page usage and status information. This information is used by both the host and the guest to coordinate and optimize their paging behavior. The primary goal is to help identify pages that are either unused (free) or backed by storage from which Linux can reread them (e.g. read-only file pages). Such pages can simply be discarded by the host without the need to swap them out and without any involvement by the guest. When the guest subsequently addresses such a page again, a special page fault is sent to the guest so that it reloads the content of the page from the backing storage. This reduces the delays a guest experiences due to host paging, and it also reduces host paging activity.

CMM has been prototyped for the virtualization stack of the z9, IBM's newest z/Architecture mainframe, i.e. its z/VM hypervisor/host operating system and the Linux guest operating system. The page status information sharing is implemented as a z/Architecture millicode instruction. Linux was modified to track all its page state changes and communicate them to z/VM using said instruction, and z/VM uses this state information during its paging operation. We will show that under tight memory constraints this approach improves overall system performance.
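The host-side decision described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the z/VM or Linux implementation: the state names `UNUSED`, `DISCARDABLE` and `STABLE`, the `Host` and `Guest` classes, and the dictionaries standing in for memory, swap and backing storage are all hypothetical and chosen for illustration. It shows that only pages whose sole copy lives in memory cost a host swap write, and that a discarded page is restored by the guest through a special fault.

```python
from enum import Enum, auto


class PageState(Enum):
    """Illustrative guest-reported page states (names are assumptions)."""
    UNUSED = auto()       # free in the guest; content does not matter
    DISCARDABLE = auto()  # clean copy exists on guest backing storage
    STABLE = auto()       # only copy is in memory; host must preserve it


class Host:
    """Toy host that consults guest page states when evicting."""

    def __init__(self):
        self.resident = {}      # page number -> content held in host memory
        self.swap = {}          # page number -> content in the host swap area
        self.discarded = set()  # pages dropped without being written to swap
        self.swap_writes = 0    # host I/O operations caused by eviction

    def evict(self, page, state):
        content = self.resident.pop(page)
        if state is PageState.STABLE:
            # No other copy exists, so the host has to pay for a swap write.
            self.swap[page] = content
            self.swap_writes += 1
        elif state is PageState.DISCARDABLE:
            # The guest can reread the page, so the host just drops it.
            self.discarded.add(page)
        # UNUSED pages are dropped with nothing to remember.


class Guest:
    """Toy guest that shares page states and handles discard faults."""

    def __init__(self, host, backing_store):
        self.host = host
        self.backing_store = backing_store  # page number -> content on disk
        self.states = {}
        self.discard_faults = 0

    def map_page(self, page, content, state):
        self.host.resident[page] = content
        self.states[page] = state

    def read(self, page):
        host = self.host
        if page in host.discarded:
            # Special fault: the guest reloads the content itself.
            self.discard_faults += 1
            host.discarded.remove(page)
            host.resident[page] = self.backing_store[page]
        elif page in host.swap:
            # Ordinary host page-in, transparent to the guest.
            host.resident[page] = host.swap.pop(page)
        return host.resident[page]


host = Host()
guest = Guest(host, backing_store={1: "file data"})
guest.map_page(0, "stale", PageState.UNUSED)
guest.map_page(1, "file data", PageState.DISCARDABLE)
guest.map_page(2, "heap data", PageState.STABLE)

# The host evicts all three pages under memory pressure.
for page in (0, 1, 2):
    host.evict(page, guest.states[page])

print(host.swap_writes)  # counts swap writes; only the STABLE page needs one
print(guest.read(1))     # restored by the guest after a discard fault
print(guest.read(2))     # restored by an ordinary host page-in
```

Without the shared state information, the host in this model would have to treat every page as `STABLE` and write all three to swap, which is the I/O overhead the abstract attributes to plain host paging.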

About the author Hubertus Franke: Dr. Hubertus Franke is a Research Staff Member at the IBM T.J. Watson Research Center, Yorktown Heights, NY, where he currently manages the Enterprise Linux Group. His group's primary objective is to drive enterprise-level functionality into the Linux kernel. His technical interests are operating systems, computer architecture and distributed systems. In previous assignments at IBM Research he contributed to the IBM SP2 supercomputer system through the implementation of the MPI message passing layer and the gang scheduling system. He received a Diplom Informatik degree from the Technical University of Karlsruhe in 1987 and a Ph.D. in Electrical Engineering from Vanderbilt University in 1992.