By Steve Jones
March 17, 2006 08:15 AM EST
After building a number of clusters from the ground up, including one that made it to the Top500 Supercomputer list, I decided to try a service that many vendors now offer: having a system racked and stacked at the factory, then shipped to us. Such a service saves a huge amount of time, not to mention my back, since I don't have to build the cluster and cable all the equipment together. I've been a fan of well-cabled systems and have found the quality control to be acceptable. The key component is agreeing on the pre-build requirements and verifying them before the system is built. This ensures that the system shipped is what you expect when it arrives at your front door. If you have a multi-rack configuration, there can still be a fair amount of cabling to do once it arrives, but it's usually limited to plugging in the system's power and public network.
Once this is done, the fun begins...
I've tried a few cluster distribution toolkits, and the one that works for me is the Rocks Cluster Distribution from the San Diego Supercomputer Center. I came across the package in a simple Google search in 2002 and was immediately sold on it. I use the term "sold" loosely, since it's available for download under an open source BSD-style license and is supported by a broad range of technical people who answer most questions on the Rocks user list. I've found support on the list to be better than that of most commercial distributions, but this may be because there are over 500 registered systems on the Rocks Register.
Here's how simple it is: insert the boot CD, complete a few screens' worth of configuration data, and grab a coffee, because it's a fairly simple base installation. The Rocks solution is extensible, with a mechanism for users and software vendors to ensure customizations are correctly installed on the system at setup. The mechanism is called a Roll.
The Roll typically consists of packages (RPMS/SRPMS/source) that have to be installed and scripts that are needed to ensure the packages are properly installed and distributed on the cluster. The Rocks team has extensive documentation for the Roll developer in the user manual.
Rocks 4.0.0 is a "cluster on a CD" set. That is, it contains all the bits and configuration needed to build a cluster from "naked" hardware. The core OS bundled with Rocks is CentOS 4, which is a freely downloadable rebuild of Red Hat Enterprise Linux 4. As a side note, in Rocks, CentOS 4 is encapsulated as the "OS Roll," and this OS Roll can be substituted with any Red Hat Enterprise Linux 4 rebuild (e.g., Scientific Linux), including the official bits from Red Hat. Rolls are used in Rocks to customize your cluster. For example, the HPC Roll contains cluster-specific packages, such as an MPI environment for developing and running parallel programs. Two other examples are the Ganglia Roll, which provides cluster-monitoring tools, and the Area51 Roll, which provides security tools such as Tripwire and chkrootkit.
The Software
The core OS we used for the cluster in this article is CentOS 4.0 and the rolls we used to customize the cluster to our needs were the Compute Roll and the PBS Roll from University of Tromso in Norway.
The Hardware
- 1 - Front-end node - a Dell PowerEdge 2850 with dual 3.6GHz Intel Xeon EM64T processors and 4GB RAM
- 48 - Compute nodes - Dell PowerEdge SC 1425s with dual 3.4GHz Intel Xeon EM64T processors, 2GB RAM and a Topspin PCI-X Infiniband HCA card
- 1 - Topspin 270 Infiniband chassis with modules
- 4 - Dell PowerConnect 5324 Gigabit Ethernet switches
- 1 - Panasas Storage Cluster with one DirectorBlade and 10 StorageBlades
- 2 - Dell 19-inch racks
Start the build process ***time 0:00:00***
Setting up the front-end:
- Insert Compute Roll and boot the system
- Select hpc, kernel, ganglia, base, java, and area51 as the rolls to install
- Select "Yes" for additional roll
- Insert CentOS disk 1
- Select "Yes" for additional roll
- Insert CentOS disk 2
- Select "Yes" for additional roll
- Insert PBS roll
- Select "No" for additional rolls
- Input data on the configuration screen (e.g., fully qualified domain name, root password, IP addresses)
- Select "Disk Druid" to create partitions
- Create / partition, ext3, 64GB
- Create swap partition, 4GB
- Create /export partition, 64GB
- Insert CDs as requested to merge them into the distribution
After the front-end installation completes, the site-specific customization of the front-end starts. The base installation of CentOS 4.0 x86_64 has the 2.6.9-5.0.5.ELsmp kernel, and we need 2.6.9-11.ELsmp for many of the packages that will be included with our cluster. Below we describe how we do this key upgrade, then continue with a number of package and mount point customizations.
Customization of the front-end:
The first step is to apply the updated kernel packages:
- # rpm -ivh kernel-smp-2.6.9-11.EL.x86_64.rpm
- # rpm -ivh kernel-smp-devel-2.6.9-11.EL.x86_64.rpm
- # rpm -ivh kernel-sourcecode-2.6.9-11.EL.x86_64.rpm
I always check /boot/grub/grub.conf to be sure the system is booting from the proper kernel after an update.
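The check can be scripted. The following is a minimal sketch, assuming a GRUB legacy configuration with a numeric `default=N` line and one `title` line per boot entry; the function name `default_kernel_title` is hypothetical and not part of Rocks or GRUB.

```shell
#!/bin/sh
# Print the title of the GRUB (legacy) entry selected by "default=N".
# Usage: default_kernel_title [path-to-grub.conf]
# Assumption: "default" is numeric; entries are counted from 0 in file order.
default_kernel_title() {
    conf="${1:-/boot/grub/grub.conf}"
    awk '
        BEGIN { want = 0; n = 0 }          # GRUB falls back to entry 0
        /^[ \t]*default[ \t]*=/ {          # e.g. "default=0"
            split($0, kv, "=")
            want = kv[2] + 0
        }
        /^[ \t]*title[ \t]/ {              # one title line per boot entry
            titles[n++] = $0
        }
        END {
            if (n > want) {
                sub(/^[ \t]*title[ \t]+/, "", titles[want])
                print titles[want]
            }
        }
    ' "$conf"
}
```

Calling `default_kernel_title` with no argument reads /boot/grub/grub.conf; the printed title should name the 2.6.9-11.ELsmp kernel before you reboot.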
Then apply an RPM to resolve a known (to us) library issue:
- # rpm -ivh compat-libstdc++-33-3.2.3-47.3.i386.rpm
Prepare for Panasas Storage Cluster and Topspin integration on the front-end:
- # rpm -ivh panfs-2.6.9-11.EM-mp-2.2.3-166499.27.rhel_4_amd64.rpm
- # rpm -ivh topspin-ib-rhel4-3.1.0-113.x86_64.rpm
- # rpm -ivh topspin-ib-mod-rhel4-2.6.9-11.ELsmp-3.1.0-113.x86_64.rpm
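The Topspin module package above is named for the 2.6.9-11.ELsmp kernel, so it is presumably only usable once the front-end has rebooted into that kernel. A small sketch of a guard for that, where the helper name `kernel_matches` is hypothetical:

```shell
#!/bin/sh
# Succeed only if the running kernel release equals the expected one.
# Usage: kernel_matches EXPECTED [ACTUAL]
# ACTUAL defaults to the output of `uname -r`; passing it explicitly is
# only meant for trying the function out on another machine.
kernel_matches() {
    expected="$1"
    actual="${2:-$(uname -r)}"
    if [ "$actual" = "$expected" ]; then
        echo "ok: running $actual"
    else
        echo "mismatch: running $actual, modules built for $expected" >&2
        return 1
    fi
}

# Example (not run here):
#   kernel_matches 2.6.9-11.ELsmp || echo "reboot into the new kernel first"
```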
Time for a break. ***time 1:05:00***
Create a second partition for applications:
# fdisk /dev/sdb
(400GB single partition on our system)
Create the file system and mount point:
# mkfs -t ext3 /dev/sdb1
# mkdir /space
Modify /etc/fstab to include the mount point then mount it:
# mount /space
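The /etc/fstab change is a single line. As a sketch, assuming the default mount options and the usual dump and fsck fields (`defaults 1 2`, which the article does not specify), a hypothetical helper `add_fstab_entry` could append it only when the mount point is not already listed:

```shell
#!/bin/sh
# Append an fstab entry for a mount point unless one is already present.
# Usage: add_fstab_entry DEVICE MOUNTPOINT FSTYPE [FSTAB]
add_fstab_entry() {
    dev="$1"; mnt="$2"; fstype="$3"
    fstab="${4:-/etc/fstab}"
    # Field 2 of each fstab line is the mount point; comment lines are skipped.
    if awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { found = 1 } END { exit !found }' "$fstab"; then
        echo "$mnt already listed in $fstab"
        return 0
    fi
    # Assumed fields: default options, included in dump, fsck'd after root.
    printf '%s\t%s\t%s\tdefaults\t1 2\n' "$dev" "$mnt" "$fstype" >> "$fstab"
}

# Example (as root, not run here):
#   add_fstab_entry /dev/sdb1 /space ext3 && mount /space
```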
Now let's start adding some of the goods...
- Install Portland Group compilers in /space/apps/pgi/
- Install Intel 9.0 compilers in /space/apps/pgi
- Install OSU MVAPICH 0.95 in /space/apps/mvapich
- Build a version of MVAPICH for each compiler suite (Intel, GNU, and PGI)
- Install our own version of Python in /space/apps/python64
- Install f2py, Numeric, pyMPI built against our vanilla version of python64
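With the tools installed under /space/apps, users need them on their PATH. One way this might look is the sketch below; the `bin` subdirectories and the helper `prepend_path` are assumptions for illustration, not something the installers above are known to create.

```shell
#!/bin/sh
# Prepend a directory to a colon-separated path list, skipping duplicates.
# Usage: NEWPATH=$(prepend_path DIR "$PATH")
prepend_path() {
    dir="$1"; list="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;                 # already present
        *)          printf '%s\n' "$dir${list:+:$list}" ;;   # add at the front
    esac
}

# Hypothetical bin/ directories under the install prefixes used above.
for d in /space/apps/python64/bin /space/apps/mvapich/bin; do
    PATH=$(prepend_path "$d" "$PATH")
done
export PATH
```

Sourcing a snippet like this from a login profile keeps the change in one place, and the duplicate check means it is safe to source more than once.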
Copyright © 2006 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Steve Jones
Steve Jones is currently the technology operations manager at the Institute for Computational and Mathematical Engineering at Stanford University. Steve designed and administered a Top500 supercomputer and speaks regularly about the design and management of high-performance computing clusters, most recently as a keynote speaker at the annual Rocks-a-Palooza conference at the San Diego Supercomputer Center. His free time is spent with his significant other, Leilani, far away from a keyboard. More information about Steve can be found at http://www.hpcclusters.org.
clusteradmin.net, 02/18/08 06:17:49 PM EST
For those who came here searching for cluster resources, you may consider visiting my blog (http://clusteradmin.net) about cluster administration. Some introductory stuff, a load-balancing guide, monitoring and other articles. Thanks, -marek
Grid, 04/01/06 10:38:44 AM EST
Seems like SGE was not mentioned: