Distributed Image Management for Linux Clusters
A scalable image management tool that allows blades to run a Linux distribution over the network without a local disk.
Date Posted: April 24, 2007
Update: October 25, 2007
New version supports blades for IBM BladeCenter with the Cell Broadband Engine processor, including the QS20 and recently announced QS21.
What is Distributed Image Management for Linux Clusters?
Distributed Image Management for Linux® Clusters was developed as a scalable image management tool that allows blades to run a Linux distribution over the network without a local disk. No modifications to the image are required in order to operate Distributed Image Management for Linux Clusters. Changes to the image for traditional maintenance are incrementally replicated to thousands of image replicas in seconds. This tool fills a critical need not met by existing cluster management suites. Distributed Image Management for Linux Clusters was first developed for use in IBM®'s MareNostrum supercomputer, Europe's most powerful, at the Barcelona Supercomputing Center. MareNostrum consists of 2560 IBM JS21 blades with four CPUs each, for a total of 10240 CPUs in a single, very large Linux cluster.
Distributed Image Management for Linux Clusters is also used for fast incremental maintenance of images. Changes such as new user IDs, changed passwords, or new RPMs can be replicated across a cluster of thousands of nodes in seconds. Smart incremental Linux replication tools are able to sense exactly where changes were made in the master image, regardless of the type of change. Distributed Image Management for Linux Clusters is configured with keep-out directories or files in order to avoid unwanted replication from the master image to the replicas.
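As a rough illustration of keep-out paths, the sketch below builds an rsync command line that mirrors a master image into one replica while excluding a list of protected paths. This is not the tool's actual invocation, which is not documented here. The function name, the image directories, and the keep-out entries are all hypothetical; only the rsync options `-a`, `--delete`, and `--exclude` are standard.

```python
from typing import List, Sequence


def build_rsync_command(master: str, replica: str, keep_out: Sequence[str]) -> List[str]:
    """Return an rsync argument list that mirrors master into replica.

    keep_out holds patterns, anchored at the image root, that must never be
    overwritten or deleted on the replica (hypothetical examples below).
    """
    # -a preserves permissions, ownership, times, and symlinks;
    # --delete removes replica files that no longer exist on the master.
    cmd = ["rsync", "-a", "--delete"]
    for pattern in keep_out:
        # Excluded paths are skipped on the sender and, without
        # --delete-excluded, also protected from deletion on the receiver.
        cmd.append(f"--exclude={pattern}")
    # A trailing slash on the source makes rsync copy the directory's contents.
    cmd += [master.rstrip("/") + "/", replica.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_rsync_command(
        "/images/master", "/images/node0001", ["/etc/hostname", "/var/log/"])))
```

Because rsync transfers only the files that differ, running such a command once per replica keeps each update proportional to the size of the change rather than the size of the image.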
Distributed Image Management for Linux Clusters also provides an XML file that describes the cluster network and naming taxonomy. Network IP configuration and DHCP configuration are automated from this XML file.
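To make the idea of deriving DHCP configuration from a cluster description concrete, here is a minimal sketch. The XML layout (a `cluster` element with `node` children carrying `name`, `mac`, and `ip` attributes) is an assumption invented for this example and is not the tool's real schema. The output uses the host-declaration syntax of the ISC DHCP server.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster description; the real schema is not shown in this article.
CLUSTER_XML = """
<cluster domain="cluster.example">
  <node name="node0001" mac="00:11:22:33:44:01" ip="10.0.0.1"/>
  <node name="node0002" mac="00:11:22:33:44:02" ip="10.0.0.2"/>
</cluster>
"""


def dhcp_host_entries(xml_text: str) -> str:
    """Render one ISC dhcpd 'host' declaration per node in the description."""
    root = ET.fromstring(xml_text.strip())
    domain = root.get("domain", "")
    stanzas = []
    for node in root.iter("node"):
        name = node.get("name")
        fqdn = f"{name}.{domain}" if domain else name
        stanzas.append(
            f"host {name} {{\n"
            f"  hardware ethernet {node.get('mac')};\n"
            f"  fixed-address {node.get('ip')};\n"
            f"  option host-name \"{fqdn}\";\n"
            f"}}"
        )
    return "\n".join(stanzas)


if __name__ == "__main__":
    print(dhcp_host_entries(CLUSTER_XML))
```

Keeping names and addresses in one description means the DHCP entries never have to be edited by hand when nodes are added or renamed.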
Distributed Image Management for Linux Clusters is a cluster image management utility. It does not contain tools for cluster monitoring, event management, or remote console management. Those tools can be obtained from cluster management suites such as xCAT (which is also available here at alphaWorks®). Utilities from xCAT can complement the capabilities of Distributed Image Management for Linux Clusters. xCAT provides an alternative open-source image management process called Warewulf. However, Warewulf is stateless and uses a RAM-resident root file system with a shared read-only file system. Distributed Image Management for Linux Clusters instead preserves state by managing individual read/write images for each node as well as a replicated, shared read-only file system for efficiency.
How does it work?
One or more image servers provide the configuration, set-up, and provisioning for all nodes in the cluster. No modifications are required to the client image in order to support Distributed Image Management for Linux Clusters. A single, regular disk-based installation is first required. This master image is then cloned by the image server. A two-level hierarchy of image servers allows thousands of nodes to be easily maintained.
Distributed Image Management for Linux Clusters clones a traditional enterprise-class Linux image that may be customized as necessary. This image is called the master. No additional software need be installed on the master in order to support Distributed Image Management for Linux Clusters. Distributed Image Management for Linux Clusters distributes this image to one or more image servers. The image servers act as network boot servers and root file system servers for the nodes of the cluster. All nodes in the cluster are set for network boot. The network boot image contains a RAM disk that has a modified linuxrc routine; this is the first user process run by Linux. This routine sets up the Distributed Image Management for Linux Clusters network file system and changes root (chroot) to the network root file system. At this point, all the traditional Linux init processes and the unmodified Linux init scripts are executed as if the node were booting from a hard disk. Upon the first boot from the image server, the node is fully operational, thereby eliminating the need for node installation.
Distributed Image Management for Linux Clusters can manage different distributions or different personalities of a distribution so that nodes of a cluster can change distributions with a reboot. A different master is required for each distribution. Distributed Image Management for Linux Clusters makes extensive use of the Linux rsync facility in order to allow for incremental maintenance. This facility allows changes to a set of files on the master to be distributed to thousands of nodes in seconds regardless of whether the node is operational or offline. For added efficiency, Distributed Image Management for Linux Clusters separates the image into read/write and shared read-only components.
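The split between read/write and shared read-only components can be pictured as a simple classification of paths in the image. The sketch below is only illustrative: the article does not say which directories fall on which side, so the prefix list and the function name are assumptions.

```python
from pathlib import PurePosixPath

# Hypothetical choice of directories that each node needs its own writable copy of.
READ_WRITE_PREFIXES = (
    PurePosixPath("/etc"),
    PurePosixPath("/var"),
    PurePosixPath("/tmp"),
    PurePosixPath("/root"),
)


def classify(path: str) -> str:
    """Return 'read-write' for per-node paths, else 'shared-read-only'."""
    p = PurePosixPath(path)
    for prefix in READ_WRITE_PREFIXES:
        # Match the prefix itself or anything beneath it, comparing whole
        # path components so that /variable is not mistaken for /var.
        if p == prefix or prefix in p.parents:
            return "read-write"
    return "shared-read-only"


if __name__ == "__main__":
    for example in ("/var/log/messages", "/usr/bin/ls", "/etc/passwd"):
        print(example, "->", classify(example))
```

Under such a split, only the read/write portion has to be stored once per node, while the read-only portion is stored once and served to every node.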

About the technology author(s): Gregory Rodgers, Ph.D., is the creator and chief architect of the MareNostrum supercomputer. He joined IBM in 1981 and is currently the Program Director of Next Generation Cluster Systems. He held various technical leadership positions at IBM and created the design environment for IBM POWER™ microprocessors. In 1999 he led the IBM team that ported Linux to POWER processors. While on location he joined the Australian open source development lab (Ozlabs) to the IBM Linux Technology Center. In 2004, he created the Three Rivers Linux cluster architecture with IBM BladeCenter® and JS20 blade servers. This architecture was used for MareNostrum and the Indiana University Big Red Supercomputer. He is the architect for the Distributed Image Management for Linux Clusters system used on several large supercomputers. Today he works on the Next Generation Systems Development team.
Peter Morjan is the lead programmer for the Distributed Image Management for Linux Clusters system. He joined IBM Germany in 1998 and started developing and maintaining online trading systems in the finance business. Since 2003 he has worked at the IBM development lab in Boeblingen. He works on Distributed Image Management for Linux Clusters and other software development projects. He also installs large clusters of Cell Broadband Engine™-based Blades and Power PC®-based Blades for IBM customers.
Cell Broadband Engine and Cell/B.E. are trademarks of Sony Computer Entertainment, Inc., in the United States, other countries, or both and are used under license therefrom.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
IBM, alphaWorks, POWER, and Power PC are trademarks of IBM Corporation in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.