OCFS2: Native Linux Cluster Filesystem

Philip Copeland (Oracle)


This talk will review the various components of the OCFS2 stack, with a focus on the file system and its clustering aspects. OCFS2 extends many local file system features to the cluster; some of the more interesting are POSIX unlink semantics, data consistency, and shared readable mmap. To support these features, OCFS2 logically separates cluster access into multiple layers. An overview of the low-level DLM layer will be given. The higher-level file system locking will be described in detail, including a walk-through of inode locking and messaging for various operations. Caching and consistency strategies will be discussed.

Metadata journalling is done on a per-node basis with JBD, and our reasoning behind that choice will be described. OCFS2 provides robust and performant recovery on node death. We will walk through the typical recovery process, including journal replay, recovery of orphaned inodes, and recovery of cached metadata allocations.

Allocation areas in OCFS2 are broken up into groups, which are arranged in self-optimizing "chains". The chain allocators allow OCFS2 to search quickly for free space and to deallocate in constant time. Detail on the layout and use of chain allocators will be given.

Disk space is broken up into clusters, which can range in size from 4 kilobytes to 1 megabyte. File data is allocated in extents of clusters, which gives OCFS2 a large amount of flexibility in file allocation. File metadata is allocated in blocks via a sub-allocation mechanism. All block allocators in OCFS2 grow dynamically. Most notably, this allows OCFS2 to grow inode allocation on demand.
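
To make the cluster arithmetic above concrete, here is a minimal sketch of how the choice of cluster size affects the number of clusters a file occupies and the space left unused in its last cluster. It is an illustration only, not OCFS2 code: the function names are hypothetical, and the restriction to power-of-two cluster sizes is an assumption that the abstract does not state.

```python
KIB = 1024
MIN_CLUSTER = 4 * KIB       # smallest cluster size named in the abstract
MAX_CLUSTER = 1024 * KIB    # largest cluster size named in the abstract (1 MB)


def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Return how many whole clusters are needed to hold file_size bytes.

    Hypothetical helper for illustration; not part of OCFS2.
    """
    if not MIN_CLUSTER <= cluster_size <= MAX_CLUSTER:
        raise ValueError("cluster size must be between 4 KB and 1 MB")
    # Assumption: cluster sizes are powers of two.
    if cluster_size & (cluster_size - 1):
        raise ValueError("cluster size is assumed to be a power of two")
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    # Ceiling division: a partially filled last cluster still counts.
    return (file_size + cluster_size - 1) // cluster_size


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Return the bytes allocated but unused in the file's last cluster."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


if __name__ == "__main__":
    size = 10 * 1024 * KIB + 1  # one byte more than 10 MB
    for cs in (4 * KIB, 64 * KIB, 1024 * KIB):
        print(cs, clusters_needed(size, cs), slack_bytes(size, cs))
```

Under these assumptions, small clusters waste little space per file but require more clusters to be tracked, while large clusters do the opposite; allocating file data in extents of contiguous clusters keeps the bookkeeping compact in either case.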

About the author, Philip Copeland: Philip Copeland is a senior software developer in Oracle's Linux Engineering group and has been working with open source software (or designing and testing software) for more than 10 years. Originally from Northern Ireland, he studied at the University of the West of England and gained a BSc (Hons) in Computing for Real Time Systems. He worked in IBM Global Services Networking for EMEA for 4 years, then spent 3 years at Red Hat (North Carolina HQ), working specifically on clustering/high availability and the Alpha processor distribution. He then moved back to Northern Ireland and has worked for Oracle in the Open Source Systems group (under Wim Coekaerts) for almost 3 years.