BeeGFS (Admin Guide)
This article explains how the file system in Bielefeld is set up. It is divided into two parts: the first describes the system specification and the user documentation, while the second focuses on BeeGFS tuning.
The Bielefeld GPU Cluster file system is built from two basic building blocks, BeeGFS and ZFS. BeeGFS provides a high-performance parallel file system and is combined with ZFS, which provides a local file system and logical volume manager. Both building blocks are described in detail in the next section, “BeeGFS Tuning”. The user-specific information is given in this article.
BeeGFS itself has three main components: the metadata service, the storage service and the client service. In addition, there are two further components: the management service and the optional administration and monitoring service (Admon). Any client that wants to access the file system, locally or remotely, talks to either the metadata service or the storage service. This has no impact on the user experience, since the client service provides a mount point through which the file system can be accessed in the standard way. BeeGFS strictly separates metadata from storage. On the one hand, when a client performs a file I/O operation, the client service communicates directly with the storage service, which performs the actual file I/O and stores the file striped across different storage servers. On the other hand, the metadata service coordinates file placement and striping. To keep metadata access latency as low as possible, BeeGFS allows the metadata service to be distributed across multiple servers, so that each metadata service holds a part of the global file system namespace.
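The striping idea described above can be illustrated with a minimal sketch. It is a simplified round-robin model, not the BeeGFS implementation: the chunk size, the number of targets and both function names are assumptions chosen for illustration only.

```python
from collections import Counter

# Illustrative assumptions, not values read from a real BeeGFS installation.
CHUNK_SIZE = 512 * 1024  # bytes per chunk
NUM_TARGETS = 4          # storage targets a file is striped across


def target_for_offset(offset: int, chunk_size: int = CHUNK_SIZE,
                      num_targets: int = NUM_TARGETS) -> int:
    """Return the index of the storage target holding the byte at `offset`."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    chunk_index = offset // chunk_size   # which chunk the byte falls into
    return chunk_index % num_targets     # chunks are assigned round-robin


def chunks_per_target(file_size: int, chunk_size: int = CHUNK_SIZE,
                      num_targets: int = NUM_TARGETS) -> Counter:
    """Count how many chunks of a file of `file_size` bytes land on each target."""
    num_chunks = -(-file_size // chunk_size)  # ceiling division
    return Counter(i % num_targets for i in range(num_chunks))


if __name__ == "__main__":
    # A 3 MiB file is split into six 512 KiB chunks spread over four targets.
    print(dict(chunks_per_target(3 * 1024 * 1024)))
```

In this model a client reading a large file touches several storage servers in turn, which is why the data path can bypass the metadata service once the stripe pattern is known.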
Further information can be found at
ZFS is a combined file system and logical volume manager designed for scalability, high performance and large storage capacities. It integrates efficient data compression, snapshots and copy-on-write clones. Continuous integrity checking with automatic repair provides strong protection against data corruption. ZFS was first designed by Sun Microsystems for the Solaris operating system, and its code was released as part of OpenSolaris. This led to two major implementations, one by Oracle and one by the OpenZFS project, which make ZFS widely available on Unix-like systems as well as on Windows. A more detailed description, especially of the data compression, can be found in the “Backups und Archivierung” (Backups and Archiving) sections. TODO: link the articles (coming soon)
Further information can be found at
The most important part of BeeGFS tuning on the Bielefeld GPU Cluster is the setup of the ZFS system. Once that is in place, a few parameters can be set. A detailed explanation can be found in the BeeGFS installation guide, in the section “Tuning and Advanced Configuration”. The runtime configuration can be set using the BeeGFS control tool beegfs-ctl or the config file beegfs-client.conf. The static configuration can be set in the configuration files
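To review which settings a configuration file such as beegfs-client.conf currently carries, a small parser can help. The following is a minimal sketch that assumes the file uses plain `key = value` lines with `#` comments. The helper name `parse_beegfs_conf` is hypothetical, and the sample keys and values are illustrative and should be checked against the file actually installed on the cluster.

```python
from typing import Dict


def parse_beegfs_conf(text: str) -> Dict[str, str]:
    """Parse `key = value` lines, ignoring blank lines and `#` comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Made-up excerpt in the style of beegfs-client.conf (assumed keys and values).
SAMPLE = """\
# connection and tuning settings
sysMgmtdHost        = mgmt01.example.org
connMaxInternodeNum = 12
tuneFileCacheType   = buffered
"""

if __name__ == "__main__":
    for key, value in parse_beegfs_conf(SAMPLE).items():
        print(f"{key}: {value}")
```

On a real host, the text could be read from the configuration file on disk and compared against the intended tuning values before a service restart.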
Running a ZFS system beneath BeeGFS requires somewhat more CPU power and RAM than a more standard installation, but it is indispensable on the Bielefeld GPU Cluster. As described in the software RAID section of the BeeGFS Wiki, it is important to be economical with ZFS options so as not to overwhelm the resources the cluster has. This can be handled by setting parameters such as zfs_max_recordsize appropriately in the ZFS system, which is also described in the software RAID section of the BeeGFS Wiki.
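As a minimal sketch of how such a parameter could be inspected, the snippet below reads zfs_max_recordsize and checks whether a planned dataset record size fits under it. It assumes OpenZFS on Linux, where module parameters are exposed under /sys/module/zfs/parameters. The helper names are hypothetical, and the validity rule used here (a power of two between 512 bytes and the limit) is stated as an assumption about how ZFS treats record sizes.

```python
from pathlib import Path

# Assumed location of OpenZFS module parameters on Linux.
ZFS_PARAM_DIR = Path("/sys/module/zfs/parameters")


def read_zfs_param(name: str, param_dir: Path = ZFS_PARAM_DIR) -> int:
    """Read an integer-valued ZFS module parameter from `param_dir`."""
    return int((Path(param_dir) / name).read_text().strip())


def is_valid_recordsize(recordsize: int, max_recordsize: int) -> bool:
    """Check that `recordsize` is a power of two within [512, max_recordsize]."""
    is_power_of_two = recordsize > 0 and (recordsize & (recordsize - 1)) == 0
    return is_power_of_two and 512 <= recordsize <= max_recordsize


if __name__ == "__main__":
    try:
        limit = read_zfs_param("zfs_max_recordsize")
    except OSError:
        print("ZFS module parameters not found on this host")
    else:
        print("1 MiB recordsize allowed:", is_valid_recordsize(1024 * 1024, limit))
```

The parameter directory is passed as an argument so the check can also be run against a copy of the parameters taken from a storage server.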