By Steve Jones
March 17, 2006 08:15 AM EST | Reads: 21,740
At this point, we can run the following on the front-end:
# config_panfs -r [ip_address_of_shelf]
Restart the PanFS service on all compute nodes:
# cluster-fork 'service panfs restart'
(We included an IP in the extend-compute.xml for the shelf in advance.)
The cluster is ready to run jobs and you can read/write data to the shelf.
End of the basic installation for the cluster ***time 7:35:00***
So, you've seen the basics of setting up a cluster. The thing to remember is that Rocks is the middleware component that gives you the tools to do anything you want with the system, within reason. We make many customizations to the cluster to support a variety of user requests. One example is enabling MPI-based applications to leverage Panasas-specific parallel I/O extensions.
Panasas offers an SDK that includes modifications for ROMIO, the component that implements the MPI-IO (parallel I/O) layer in MPICH, one of the more popular MPI implementations. The SDK contains a patch for MPICH (we run MVAPICH from OSU, which is based on MPICH and MVICH) that enables the Panasas-specific features.
Required items
- Panasas DirectFLOW client software (installed during the initial cluster configuration)
- Panasas SDK
- Source code for a ROMIO-based MPI implementation
Unpack the source and apply the patch:
# tar zxvf mvapich.tgz
# cd mpich
# patch -Np1 < ~/romio.patch
Configure, make, and install MVAPICH with the following options (only PanFS flags shown):
# export CFLAGS="-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE \
    -I/opt/panfs/include/ -D__linux__=1"
# ./configure --with-romio --file-system=ufs+nfs+panfs
# make
# make install
The PanFS patch implements several MPI hints to specify data storage layout and/or concurrent write access when opening a file on a Panasas Storage Cluster.
Here's a sample PanFS-specific hint: panfs_concurrent_write. If the value of this hint is "1", the file is opened in concurrent write mode. If the value is "0" or the hint is missing, the file is opened in standard (non-concurrent write) mode.
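Applications normally pass a hint like this through an MPI_Info object when they open the file. As a rough sketch of another route, newer ROMIO releases can also read hints from a file named by the ROMIO_HINTS environment variable, with one "key value" pair per line. Whether the MVAPICH build described here supports that file is an assumption, and the file location below is hypothetical.

```shell
#!/bin/bash
# Sketch: write a ROMIO hints file that asks for PanFS concurrent write mode.
# Assumption: the ROMIO build in use reads the file named by ROMIO_HINTS.
# The file location is hypothetical and only used for illustration.

write_panfs_hints() {
    local hints_file="$1"
    # One "key value" pair per line, separated by whitespace.
    printf '%s %s\n' panfs_concurrent_write 1 > "$hints_file"
}

hints_file="${TMPDIR:-/tmp}/panfs.hints"
write_panfs_hints "$hints_file"

# Export the variable so MPI processes started from this shell inherit it.
export ROMIO_HINTS="$hints_file"
cat "$ROMIO_HINTS"
```

If the build ignores the hints file, the same key and value would have to be set in the application code instead.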
So, now you have a much better idea of what's possible with the configuration of the system. Let's try a simple compile and run.
We use a simple program named "bounce" as the first benchmark for every system we build. This program times blocking send/receive calls and reports the latency and bandwidth of the communication system. It's a great tool for us because it's small, portable, and tests our MPI F90 capability, something important to us. Here's how simple it is to use.
Compile:
$ mpif90 -o bounce bounce.f
Run:
$ qsub -I -lnodes=4:ppn=2
waiting for job.x to start
$ mpirun -ssh -np 8 -hostfile $PBS_NODEFILE ./bounce >& \
$PBS_O_WORKDIR/log.bounce
$ tail $PBS_O_WORKDIR/log.bounce
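The source of bounce is not shown here, so the following is only a sketch of the arithmetic a ping-pong test of this kind usually performs. It assumes the common convention that one-way latency is half the average round-trip time and that bandwidth is the message size divided by that one-way time. The function name and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# pingpong_stats BYTES ROUND_TRIPS ELAPSED_SECONDS
# Prints the one-way latency in microseconds and the bandwidth in MB/s
# (1 MB = 10^6 bytes) for a blocking send/receive ping-pong test.
# Hypothetical helper, not part of bounce.
pingpong_stats() {
    awk -v bytes="$1" -v trips="$2" -v secs="$3" 'BEGIN {
        # Each round trip carries the message there and back,
        # so one one-way transfer takes secs / (2 * trips).
        one_way = secs / (2 * trips)
        printf "latency_us=%.2f bandwidth_MBps=%.2f\n",
               one_way * 1e6, bytes / one_way / 1e6
    }'
}

# Example: 1 MiB messages, 1000 round trips, 2.0 seconds in total.
pingpong_stats 1048576 1000 2.0
```

Small messages are the interesting case for latency and large messages for bandwidth, which is why such tests are normally run over a range of sizes.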
Start adding users:
# useradd alice
# passwd alice
What I typically do at this point is create an e-mail distribution list for the cluster for announcements of additions, changes, and maintenance windows. I'll include simple instructions on compiling, the location and naming conventions of the compilers, and a sample PBS script. This usually answers most of the questions users have when first logging onto the cluster.
Here is a sample PBS script:
#!/bin/bash
#PBS -N BOUNCE
#PBS -e BOUNCE.err
#PBS -o BOUNCE.out
#PBS -m aeb
#PBS -M alice@cluster.com
#PBS -l nodes=8:ppn=1
#PBS -l walltime=30:00:00
PBS_O_WORKDIR='/home/alice/bounce'
export PBS_O_WORKDIR
### ---------------------------------------
### BEGINNING OF EXECUTION
### ---------------------------------------
echo The master node of this job is `hostname`
echo The working directory is `echo $PBS_O_WORKDIR`
echo This job runs on the following nodes:
echo `cat $PBS_NODEFILE`
### End of information preamble
cd $PBS_O_WORKDIR
cmd="/apps/bin/mpirun -ssh -np 8 -hostfile $PBS_NODEFILE \
$PBS_O_WORKDIR/bounce"
echo "running mpirun with: $cmd in directory "`pwd`
$cmd >& $PBS_O_WORKDIR/log.bounce
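The sample script hardcodes -np 8 to match the nodes=8:ppn=1 request. One possible variation, sketched below, derives the process count from the node file instead, on the assumption that PBS writes one line per allocated processor slot to $PBS_NODEFILE. The count_slots helper and the stand-in host names are hypothetical. Outside a job, the sketch falls back to a temporary demo file so it can be tried on any machine.

```shell
#!/bin/bash
# count_slots NODEFILE
# Prints the number of non-empty lines in the node file, which is taken
# here to be the number of processor slots granted to the job.
# Hypothetical helper for illustration.
count_slots() {
    grep -c . "$1"
}

# Stand-in node file (4 slots on 2 hosts) used when PBS_NODEFILE is not set.
demo_nodefile=$(mktemp)
printf 'compute-0-0\ncompute-0-0\ncompute-0-1\ncompute-0-1\n' > "$demo_nodefile"

nodefile="${PBS_NODEFILE:-$demo_nodefile}"
np=$(count_slots "$nodefile")

# Only print the command line; inside a real job it would be executed.
echo "mpirun -ssh -np $np -hostfile $nodefile ./bounce"

rm -f "$demo_nodefile"
```

With this approach, changing the nodes or ppn values in the #PBS -l line does not require editing the mpirun command as well.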
Why did I select the solutions above? The Rocks Cluster Distribution is widely adopted, with over 500 locations running it. Many companies offer fee-based support for the package. Dell even ships clusters with Rocks pre-installed.
Panasas was a blessing. Our researchers were finding it difficult to function on one system with 164 processors and a RAID SATA Disk Linux NFS server solution. I followed all the recommended solutions for optimizing NFS on Linux servers, but it was still easy to saturate. The first thing I tried was splitting users across multiple NFS servers, but this quickly became an issue since the users just scaled up their I/O traffic to run faster, which eventually created the same bottleneck - the NFS server.
I found Panasas through a contact at the AMD Developer Center who manages their clusters and decided to give it a try. I was impressed at the ease of installation on our cluster and the results I was able to achieve. I decided to run a benchmark on the Panasas system against our NFS servers. "bonnie++" was the tool I decided to use, and I also decided to put the executable in the shared location and run the tests through the queue.
Copyright © 2006 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Steve Jones
Steve Jones is currently the technology operations manager at the Institute for Computational and Mathematical Engineering at Stanford University. Steve designed and administered a Top 500 Supercomputer and speaks regularly about the design and management of High Performance Computing Clusters, most recently as a keynote speaker at the annual Rocks-a-Palooza conference at the San Diego Supercomputing Center. His free time is spent with his significant other, Leilani, far away from a keyboard. More information about Steve can be found at http://www.hpcclusters.org.
clusteradmin.net, 02/18/08 06:17:49 PM EST
For those who came here searching for cluster resources, you may consider visiting my blog (http://clusteradmin.net) about cluster administration. Some introductory stuff, load-balancing guide, monitoring and other articles. Thanks, -marek
Grid, 04/01/06 10:38:44 AM EST
Seems like SGE was not mentioned:
SYS-CON Belgium News Desk, 03/17/06 09:36:01 AM EST
After building a number of clusters from the ground up - including one that made it to the Top500 Supercomputer list - I decided to try a service that many vendors now offer - having a system racked and stacked at the factory then shipped to us. Such a service saves a huge amount of time, not to mention my back, not having to build the cluster and cable all the equipment together. I've been a fan of well-cabled systems and have found the quality control to be acceptable. The key component is the pre-build requirements and verification before the system is built. This will ensure the system shipped is what is expected when it arrives at your front door. There can still be a fair amount of cabling that has to be done once it arrives, if you have a multi-rack configuration, but it's usually limited to plugging in the system's power and public network.