Distributed Image Management for Linux Clusters

A scalable image management tool that allows blades to run a Linux distribution over the network without a local disk.

Date Posted: April 24, 2007


Update (October 25, 2007): The new version supports blades for IBM BladeCenter with the Cell Broadband Engine processor, including the QS20 and the recently announced QS21.

1. What are the key functionalities of Distributed Image Management for Linux® Clusters?

Distributed Image Management for Linux Clusters provides all tools necessary for setting up and maintaining large and very large clusters, especially clusters based on (but not limited to) IBM® BladeCenter®. The main subsystems managed by Distributed Image Management for Linux Clusters are the following:

  • IP addresses
  • DHCP
  • NFS
  • file system images
  • network boot images (BOOTP/PXE)
  • node remote control

2. Is Distributed Image Management for Linux Clusters a comprehensive cluster management tool?

No. The primary focus of Distributed Image Management for Linux Clusters is on managing the Linux distribution image for all the nodes of the cluster. It turns out that this is often one of the most difficult cluster management processes. Distributed Image Management for Linux Clusters can be complemented by other cluster management suites.

3. Is this technology diskless?

Not exactly. The nodes in the cluster do not require disks. However, your images must be stored on the disk of the image server.

4. If the node is diskless, does Distributed Image Management for Linux Clusters use a RAM disk for the root file system and thus have only minimal function in the distribution?

No. A RAM disk is only used for booting. Early in the boot process, Linux will change root (chroot) to a comprehensive, network-based root file system. Therefore, Distributed Image Management for Linux Clusters is unlike embedded Linux installations with minimal function. On the contrary, users of Distributed Image Management for Linux Clusters often have large RPM installations in order to support many functions of a wide user base.

5. Is this technology stateless?

Stateless operation is a desirable cluster policy for many reasons. However, Distributed Image Management for Linux Clusters will preserve state as necessary. Therefore, a stateless policy is not required. That is, changes to the read/write portion of the root file system for any particular node will be preserved between boots of that node when the node is booted to the same image.

6. If the nodes do not need disks, what happens if the image server goes down?

Currently, all the nodes connected to this image server will stop and must be rebooted.

7. Are there some cool features in Distributed Image Management for Linux Clusters?

  • Distributed Image Management for Linux Clusters provides a very flexible mechanism for defining the IP address layout of thousands of nodes in a few lines of XML.
  • Distributed Image Management for Linux Clusters also has some useful command-line tools for administering blades in IBM BladeCenters or rack-mounted servers via the Remote Supervisor Adapter (RSA) and via SNMP.
  • In Distributed Image Management for Linux Clusters, one can set up multiple images for every node in parallel. For example, one can set up images with RHEL and others with SLES9, even in advance without having the nodes currently available. Switching between these images takes only two Distributed Image Management for Linux Clusters commands: one to change the MAC addresses in DHCP and one to reboot the nodes.
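
To illustrate why a compact layout description can cover thousands of nodes, here is a minimal Python sketch that expands a few parameters into per-node addresses. It does not reproduce the XML format used by Distributed Image Management for Linux Clusters; the function name `expand_layout`, the one-/24-subnet-per-rack scheme, and the hostname pattern are all assumptions made for this example.

```python
import ipaddress
from itertools import islice


def expand_layout(base_network, racks, nodes_per_rack, name_prefix="node"):
    """Expand a compact description into (hostname, ip) pairs.

    Hypothetical scheme (not the tool's own): each rack gets its own /24
    carved out of base_network, and slot k in a rack gets host address k.
    """
    if not 1 <= nodes_per_rack <= 254:
        raise ValueError("a /24 holds between 1 and 254 usable host addresses")
    network = ipaddress.ip_network(base_network)
    subnets = list(islice(network.subnets(new_prefix=24), racks))
    if len(subnets) < racks:
        raise ValueError("base network is too small for the requested racks")

    layout = []
    for rack_index, subnet in enumerate(subnets, start=1):
        for slot in range(1, nodes_per_rack + 1):
            hostname = f"{name_prefix}-r{rack_index:02d}-n{slot:02d}"
            layout.append((hostname, str(subnet.network_address + slot)))
    return layout


if __name__ == "__main__":
    # Three parameters describe 14 * 180 = 2520 nodes.
    nodes = expand_layout("10.0.0.0/16", racks=180, nodes_per_rack=14)
    print(len(nodes), nodes[0], nodes[-1])
```

The point of the sketch is only that a rule-based description scales with the number of rules, not with the number of nodes; the actual layout syntax is shown in the examples shipped with the tool.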

8. What is the maximum number of nodes in a cluster managed by Distributed Image Management for Linux Clusters?

Currently, there is no limit. Distributed Image Management for Linux Clusters scales very well because of its modular design. Today the largest cluster managed by Distributed Image Management for Linux Clusters has 2500+ nodes; however, there is no known reason why it could not go well beyond this number.

9. How long does it take to maintain the Linux images -- for example, installing new files on a large cluster?

Depending on the number and size of the files, it takes anywhere from a few seconds to just under one minute.

10. How difficult is it to use Distributed Image Management for Linux Clusters?

The initial setup might be more difficult than with other cluster management systems. Distributed Image Management for Linux Clusters has been designed for large and very large clusters and cannot fulfill all possible requirements. For example, every new Linux distribution requires some adjustments to the Distributed Image Management for Linux Clusters code. In return, you get a very efficient and scalable system. Skills in a scripting language such as Perl or Bash are definitely helpful.

11. Does Distributed Image Management for Linux Clusters require any other cluster management tool, such as CSM, xCAT, etc.?

No.

12. Where is Distributed Image Management for Linux Clusters headed?

We are planning a fault-tolerant image server so that if the image server goes down, the nodes will remain active.

13. What do I need in order to get started?

You need Red Hat, SLES, or Fedora Core installed on one node in your cluster. This installation is called the master image. You also need an image server with enough disk space to contain at least two copies of the image. We recommend at least 20 GB + 0.3 GB per node of free disk space on its own partition. For very large clusters, you will need to plan the number of nodes you want to support and the number of versions of a distribution you want to maintain. You should also have a plan for how you want to organize IP addresses in your cluster. See the examples of how to define IP addresses in the config directory (/opt/dim/config).
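
As a quick planning aid, the recommendation of 20 GB + 0.3 GB per node can be worked out for a few cluster sizes. The sketch below is plain arithmetic in Python, not part of the tool; the function name `recommended_free_space_gb` is made up for this example, and the figures cover only the stated recommendation, not additional distribution versions you may want to keep.

```python
def recommended_free_space_gb(node_count, base_gb=20.0, per_node_gb=0.3):
    """Return the recommended free disk space on the image server, in GB.

    Implements the rule of thumb from the text: 20 GB + 0.3 GB per node.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    return base_gb + per_node_gb * node_count


if __name__ == "__main__":
    # Example cluster sizes chosen for illustration only.
    for nodes in (14, 100, 2500):
        space = recommended_free_space_gb(nodes)
        print(f"{nodes:5d} nodes: about {space:.1f} GB free")
```

For a cluster the size of the largest one mentioned above (2500 nodes), this works out to roughly 770 GB on the image server partition.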
