Introduction
Welcome, intrepid cluster enthusiast! You’re about to embark on a journey worthy of digital pioneers everywhere. Today we’ll install the Rocks Cluster Distribution, a specialized Linux OS designed to turn a pile of commodity hardware into a high-performance computing (HPC) powerhouse. We’ll keep things serious, detailed, and—because life without humor is like a Raspberry Pi without GPIO pins—lighthearted.
By the end of this guide you’ll have a functioning cluster ready to tackle workloads from scientific simulations to big data analysis. Let’s rock!
System Requirements
First things first: gather your hardware and ensure you meet the minimum requirements. Note that real performance scales with more RAM, faster CPUs, and low-latency network interconnects.
| Component | Minimum | Recommended |
|---|---|---|
| Head node CPU | 2 cores @ 2GHz | 4 cores @ 3GHz |
| Compute node CPU | 2 cores @ 2GHz | 8 cores @ 2.5GHz |
| RAM (per node) | 4 GiB | 16 GiB |
| Storage (per node) | 50 GiB HDD | 120 GiB SSD or NVMe |
| Network | 1 GbE | 10 GbE or InfiniBand |
You’ll also need:
- A reliable power supply (no magic wands included).
- Switch or router capable of handling internal network traffic.
- A workstation to serve as a console (or just your laptop, if you prefer).
1. Downloading Rocks
Head over to the official Rocks website and grab the ISO. You can use wget if you’re already on a Linux machine:
wget -O rocks.iso http://rollsite.rocksclusters.org/rocks6/isos/latest/rocks-6.2.iso
Pro tip: If your download fails at 99.9%, blame your ISP, curse gently, and retry. Persistence pays off!
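Before going further, verify the image. A quick sketch, assuming the download page publishes a SHA-256 checksum alongside the ISO:
# Compare this output by eye against the published checksum.
sha256sum rocks.iso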
2. Preparing Installation Media
Burn the ISO to a DVD or create a bootable USB stick. Here’s the USB method:
- Identify your USB drive: lsblk
- Write the ISO: dd if=rocks.iso of=/dev/sdX bs=4M status=progress && sync
Replace /dev/sdX with your USB device (e.g. /dev/sdb).
Double-check you didn’t choose your system disk—unless you want a surprise cluster on your desktop.
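For extra assurance, you can compare the device against the ISO byte for byte before booting from it:
# Compare only the first <ISO size> bytes, since the stick is larger
# than the image; no output and exit status 0 means they match.
cmp -n "$(stat -c %s rocks.iso)" rocks.iso /dev/sdX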
3. Network Planning
A well-behaved cluster needs a neat IP scheme. Typically:
- Head node: 192.168.1.1
- Compute nodes: 192.168.1.2–192.168.1.254
- Netmask: 255.255.255.0
Configure your switch and DHCP (if using) accordingly. Rocks can run its own DHCP server, but you can also integrate with your existing infrastructure.
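If you plan to let Rocks run DHCP, first make sure nothing else is answering on the internal segment. One hedged way to check, assuming nmap is available on your workstation:
# Broadcast a DHCP discover and print any server that responds.
sudo nmap --script broadcast-dhcp-discover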
4. Installing the Head Node
4.1 Boot Installer
Insert the media into your prospective head node. Boot from it, and you’ll see the Rocks installer menu.
4.2 Basic Configuration
- Select Install.
- Choose your keyboard layout and locale.
- Set the root password. (Hint: not password.)
- Partition disks—defaults usually work well.
4.3 Network Settings
When prompted, specify:
- IP address: 192.168.1.1
- Netmask: 255.255.255.0
- Gateway: (your external gateway, if any)
- DNS: (your preferred DNS servers)
4.4 Selecting Roll CDs
Rocks uses “rolls” (modular package collections). At minimum, include:
- base (core OS)
- hpc (MPI, batch scheduler, etc.)
If you need other software—say, R or TensorFlow—you can install additional rolls post-install.
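For reference, the post-install workflow looks like the sketch below; extra-roll.iso and the roll name extra are hypothetical placeholders:
# Add the roll to the head node's repository and enable it
# (the real roll name is whatever "rocks list roll" reports after adding).
rocks add roll extra-roll.iso
rocks enable roll extra
# Rebuild the local distribution so nodes pick up the new packages.
cd /export/rocks/install
rocks create distro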
4.5 Finalize Installation
Sit back, grab a ☕, and let the installer do its thing. When finished, reboot into your new head node.
5. Post-Install Configuration on the Head Node
5.1 Logging In
Log in as root with the password you set.
5.2 Verify Services
- Check Rocks daemons: rocks status
- Ensure DHCP, DNS, and NFS are running. (Rocks 6 is built on CentOS 6, which predates systemd, so use service rather than systemctl.)
service dhcpd status
service named status
service nfs status
5.3 User Management
Create your cluster user:
useradd -m clusteruser
passwd clusteruser
Add them to wheel for sudo privileges if desired.
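Compute nodes get their account information from the head node, so new users must be pushed out after creation. A minimal sketch, assuming the wheel entry is enabled in /etc/sudoers:
# Grant sudo via the wheel group, then propagate the account
# to the compute nodes (Rocks distributes it via its 411 service).
usermod -aG wheel clusteruser
rocks sync users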
6. Adding Compute Nodes
Now the fun multiplies—by the number of nodes. Connect each compute node to the network and ensure it boots from the network (PXE).
6.1 PXE Boot
- Enable PXE in BIOS/UEFI.
- Boot, and the node should fetch an OS image from the head node’s TFTP server.
6.2 Auto-Discovery
With insert-ethers running on the head node (see the sketch below), Rocks will discover new nodes as they PXE-boot and assign them names like compute-0-0 (rack-rank numbering). The installation proceeds unattended.
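The usual workflow on the head node:
# Start the discovery tool before powering on the new nodes; pick
# "Compute" as the appliance type, then boot each node in turn.
insert-ethers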
6.3 Verification
rocks list host
rocks list host run
You should see your new compute nodes listed as running.
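You can also fire a quick command at the nodes to confirm they respond; a sketch using rocks run host (adjust host names to whatever discovery assigned):
# Run uptime on a single node, then on every compute appliance.
rocks run host compute-0-0 command="uptime"
rocks run host compute command="uptime"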
7. Testing Your Cluster
Confirm that MPI and your scheduler work properly.
7.1 Simple MPI Test
mpirun -np 4 hostname
Each MPI rank should print the hostname of the node it landed on (without a hostfile or a scheduler allocation, all ranks may land on the node you launched from). If you get timeouts, check network connectivity and firewall rules.
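For a slightly more convincing test, compile and run a tiny MPI program across the nodes. A minimal sketch, assuming the hpc roll put mpicc and mpirun on your PATH; mynodes is a hypothetical hostfile with one compute-node name per line:
# Write a minimal MPI "hello", compile it, and run 4 ranks.
cat > hello.c << 'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
mpicc hello.c -o hello
mpirun -np 4 -machinefile mynodes ./hello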
7.2 Batch Scheduler Test
qsub -l nodes=2:ppn=2 << 'EOF'
#!/bin/bash
echo Running on:
cat $PBS_NODEFILE
EOF
Verify that the job runs across two nodes as requested. (The quoted 'EOF' keeps the submitting shell from expanding $PBS_NODEFILE before the scheduler sees it.)
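While the job sits in the queue or runs, the standard Torque/PBS client tools will show its progress:
# List all jobs, then the state of every node the scheduler knows about.
qstat -a
pbsnodes -a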
8. Advanced Configuration
8.1 Parallel File System
For large-scale I/O, consider deploying BeeGFS or Lustre; Rocks supports rolls for these.
8.2 GPFS Integration
If you have IBM GPFS, get the roll from GitHub and install:
rocks add roll gpfs-roll.iso
8.3 Monitoring
Consider Ganglia or Prometheus to monitor cluster health:
- rocks add roll ganglia-roll.iso
- Configure the web interface under /var/www/html/ganglia
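Once the Ganglia roll is installed, a quick sanity check of the stock daemons; the dashboard is then typically at http://<head-node>/ganglia:
# Confirm the Ganglia collector and aggregator are running on the head node.
service gmond status
service gmetad status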
9. Troubleshooting
- Node not PXE-booting: Check BIOS settings, the network cable, and the TFTP logs (/var/log/xinetd.log).
- DHCP conflict: Ensure no other DHCP server is running on the same subnet.
- Slow NFS: Tune /etc/exports and the mount options (rsize, wsize).
- MPI hangs: Confirm that Open MPI versions and InfiniBand drivers match across nodes.
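For the PXE case in particular, watching the wire from the head node settles whether the node's request ever arrives. A sketch, assuming eth0 is the internal interface (adjust to yours):
# Watch for DHCP/PXE traffic from the booting node.
tcpdump -i eth0 -n port 67 or port 68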
Conclusion
Congratulations—you’re now the proud operator of a Rocks cluster! Whether you’re crunching numbers for scientific research or just flexing your nerd muscles, you’ve earned this achievement.
Remember to back up your head node configuration, keep your rolls up to date, and share tips with the community at rocksclusters.org. Now go forth and compute at scale!