By Steve Jones | March 17, 2006 08:15 AM EST | Reads: 21,033
After building a number of clusters from the ground up, including one that made it to the Top500 Supercomputer list, I decided to try a service that many vendors now offer: having a system racked and stacked at the factory, then shipped to us. Such a service saves a huge amount of time, not to mention my back, since we don't have to build the cluster and cable all the equipment together ourselves. I've been a fan of well-cabled systems and have found the quality control to be acceptable. The key component is agreeing on the pre-build requirements and verifying them before the system is built. This ensures that the system shipped is what you expect when it arrives at your front door. If you have a multi-rack configuration, there can still be a fair amount of cabling to do once it arrives, but it's usually limited to plugging in the system's power and public network.
Once this is done, the fun begins...
I've tried a few cluster distribution toolkits, and the one that works for me is the Rocks Cluster Distribution from the San Diego Supercomputer Center. I came across the package in a simple Google search in 2002 and was immediately sold on it. I use the term "sold" loosely, since it's available for download under an open source BSD-style license and is supported by a broad range of technical people who answer most questions on the Rocks user list. I've found support on the list to be better than that of most commercial distributions, but this may be because there are over 500 registered systems on the Rocks Register.
Here's how simple it is: insert the boot CD, complete a few screens' worth of configuration data, and grab a coffee, because it's a fairly simple base installation. The Rocks solution is extensible, with a mechanism for users and software vendors to ensure customizations are correctly installed on the system at setup. The mechanism is called a Roll.
The Roll typically consists of packages (RPMS/SRPMS/source) that have to be installed and scripts that are needed to ensure the packages are properly installed and distributed on the cluster. The Rocks team has extensive documentation for the Roll developer in the user manual.
Rocks 4.0.0 is a "cluster on a CD" set. That is, it contains all the bits and configuration needed to build a cluster from "naked" hardware. The core OS bundled with Rocks is CentOS 4, which is a freely downloadable rebuild of Red Hat Enterprise Linux 4. As a side note, in Rocks CentOS 4 is encapsulated as the "OS Roll," and this OS Roll can be substituted with any Red Hat Enterprise Linux 4 rebuild (e.g., Scientific Linux), including the official bits from Red Hat. Rolls are used in Rocks to customize your cluster. For example, the HPC Roll contains cluster-specific packages, such as an MPI environment for developing and running parallel programs. Two other examples are the Ganglia Roll, which provides cluster-monitoring tools, and the Area51 Roll, which provides security tools such as Tripwire and chkrootkit.
The Software
The core OS we used for the cluster in this article is CentOS 4.0, and the rolls we used to customize the cluster to our needs were the Compute Roll and the PBS Roll from the University of Tromso in Norway.
The Hardware
- 1 - Front-end node - a Dell PowerEdge 2850 with dual 3.6GHz Intel Xeon EM64T processors and 4GB RAM
- 48 - Compute nodes - Dell PowerEdge SC 1425s with dual 3.4GHz Intel Xeon EM64T processors, 2GB RAM and a Topspin PCI-X Infiniband HCA card
- 1 - Topspin 270 Infiniband chassis with modules
- 4 - Dell PowerConnect 5324 Gigabit Ethernet switches
- 1 - Panasas Storage Cluster with one DirectorBlade and 10 StorageBlades
- 2 - Dell 19-inch racks
Start the build process ***time 0:00:00***
Setting up the front-end:
- Insert Compute Roll and boot the system
- Select hpc, kernel, ganglia, base, java, and area51 as the rolls to install
- Select "Yes" for additional roll
- Insert CentOS disk 1
- Select "Yes" for additional roll
- Insert CentOS disk 2
- Select "Yes" for additional roll
- Insert PBS roll
- Select "No" for additional rolls
- Input data on the configuration screen (e.g., fully qualified domain name, root password, IP addresses)
- Select "Disk Druid" to create partitions
- Create / partition, ext3, 64GB
- Create swap partition, 4GB
- Create /export partition, 64GB
- Insert CDs as requested to merge them into the distribution
After the front-end installation completes, the site-specific customization of the front-end starts. The base installation of CentOS 4.0 x86_64 has the 2.6.9-5.0.5.ELsmp kernel, and we need the 2.6.9-11.ELsmp kernel for many of the packages that will be included with our cluster. Below we'll describe how we do this key upgrade, then continue with the package and mount point customizations.
Customization of the front-end:
The first step is to apply the updated kernel packages:
- # rpm -ivh kernel-smp-2.6.9-11.EL.x86_64.rpm
- # rpm -ivh kernel-smp-devel-2.6.9-11.EL.x86_64.rpm
- # rpm -ivh kernel-sourcecode-2.6.9-11.EL.x86_64.rpm
I always check /boot/grub/grub.conf to be sure the system is booting from the proper kernel after an update.
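That grub.conf check is easy to script. The following is a minimal sketch, assuming the GRUB legacy file format with a numeric default= line and one title line per boot entry. The function names default_grub_title and boots_expected_kernel are made up for this illustration and are not part of Rocks or GRUB.

```shell
#!/bin/sh
# Sketch only: assumes GRUB legacy syntax ("default=N" plus "title ..." lines).

# Print the title of the entry GRUB will boot by default.
# Usage: default_grub_title [path-to-grub.conf]
default_grub_title() {
    conf="${1:-/boot/grub/grub.conf}"
    awk '
        BEGIN { want = 0; count = 0 }
        # Remember the zero-based index of the default entry.
        /^[ \t]*default[ \t]*=/ { split($0, kv, "="); want = kv[2] + 0 }
        # Collect every boot entry title in file order.
        /^[ \t]*title[ \t]/ { titles[count++] = $0 }
        END {
            if (want >= count) exit 1
            t = titles[want]
            sub(/^[ \t]*title[ \t]+/, "", t)
            print t
        }
    ' "$conf"
}

# Succeed only when the default entry mentions the expected kernel version.
# Usage: boots_expected_kernel KERNEL_VERSION [path-to-grub.conf]
boots_expected_kernel() {
    expected="$1"
    case "$(default_grub_title "${2:-/boot/grub/grub.conf}")" in
        *"$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the front-end this could be run as boots_expected_kernel 2.6.9-11.ELsmp before rebooting; after the reboot, uname -r reports the kernel that is actually running.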
Then apply an RPM to resolve a known (to us) library issue:
- # rpm -ivh compat-libstdc++-33-3.2.3-47.3.i386.rpm
Prepare for Panasas Storage Cluster and Topspin integration on the front-end:
- # rpm -ivh panfs-2.6.9-11.EM-mp-2.2.3-166499.27.rhel_4_amd64.rpm
- # rpm -ivh topspin-ib-rhel4-3.1.0-113.x86_64.rpm
- # rpm -ivh topspin-ib-mod-rhel4-2.6.9-11.ELsmp-3.1.0-113.x86_64.rpm
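Since these driver packages are built against a specific kernel, it can help to confirm that the file names match the kernel just installed before running rpm. Here is a small sketch of such a check. It assumes, as is the case for the file names above, that kernel-bound packages embed the kernel version in their names. The helper name kernel_mismatches is hypothetical.

```shell
#!/bin/sh
# Sketch only: a plain file-name check, not a substitute for rpm's own
# dependency resolution.

# Print every file name that does not mention the target kernel version.
# Usage: kernel_mismatches KERNEL_VERSION FILE...
kernel_mismatches() {
    kver="$1"
    shift
    for pkg in "$@"; do
        case "$pkg" in
            *"$kver"*) ;;                  # name mentions the kernel: fine
            *) printf '%s\n' "$pkg" ;;     # report the mismatch
        esac
    done
}
```

An empty result from kernel_mismatches 2.6.9-11 followed by the panfs and topspin-ib-mod file names means the names line up. For a fuller check, rpm's --test option can be combined with -ivh to report conflicts without installing anything.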
Time for a break. ***time 1:05:00***
Create a second partition for applications:
# fdisk /dev/sdb
(400GB single partition on our system)
Create the file system and mount point:
# mkfs -t ext3 /dev/sdb1
# mkdir /space
Modify /etc/fstab to include the mount point then mount it:
# mount /space
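For reference, the /etc/fstab change can be sketched as follows. This is an illustration rather than the exact entry we used: the mount options (defaults) and the dump and fsck fields (1 2) are assumed typical values for a non-root ext3 file system, and add_fstab_entry is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: "defaults 1 2" are assumed values, adjust for your site.

# Append an fstab entry for a mount point unless one is already present.
# Usage: add_fstab_entry DEVICE MOUNTPOINT FSTYPE [FSTAB]
add_fstab_entry() {
    dev="$1"; mnt="$2"; fstype="$3"; fstab="${4:-/etc/fstab}"
    # Field 2 of each non-comment line is the mount point.
    if awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { found = 1 } END { exit !found }' "$fstab"; then
        return 0    # already listed, nothing to do
    fi
    printf '%s\t%s\t%s\tdefaults\t1 2\n' "$dev" "$mnt" "$fstype" >> "$fstab"
}
```

Run as root, add_fstab_entry /dev/sdb1 /space ext3 would add the line once, and running it a second time leaves the file unchanged.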
Now let's start adding some of the goods...
- Install Portland Group compilers in /space/apps/pgi/
- Install Intel 9.0 compilers in /space/apps/pgi
- Install OSU MVAPICH 0.95 in /space/apps/mvapich
- Build a version of MVAPICH for each compiler (Intel, GNU, and PGI)
- Install our own version of Python in /space/apps/python64
- Install f2py, Numeric, pyMPI built against our vanilla version of python64
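With software installed under /space/apps rather than in the system directories, users need those locations on their PATH. The snippet below is one way to do that, offered as a sketch: the bin/ subdirectory under /space/apps/python64 is an assumption about the install layout, and prepend_path is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: the bin/ subdirectory is an assumed layout.

# Prepend a directory to PATH unless it is already listed.
# Usage: prepend_path DIRECTORY
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path /space/apps/python64/bin
export PATH
```

Because the helper skips directories that are already listed, sourcing the snippet more than once does not grow PATH.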
Copyright © 2006 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
About Steve Jones
Steve Jones is currently the technology operations manager at the Institute for Computational and Mathematical Engineering at Stanford University. Steve designed and administered a Top500 Supercomputer and speaks regularly about the design and management of High Performance Computing Clusters, most recently as a keynote speaker at the annual Rocks-a-Palooza conference at the San Diego Supercomputer Center. His free time is spent with his significant other, Leilani, far away from a keyboard. More information about Steve can be found at http://www.hpcclusters.org.
clusteradmin.net 02/18/08 06:17:49 PM EST
For those who came here searching for cluster resources, you may consider visiting my blog (http://clusteradmin.net) about cluster administration. Some introductory stuff, load-balancing guide, monitoring and other articles. Thanks, -marek

Grid 04/01/06 10:38:44 AM EST
Seems like SGE was not mentioned: