Introduction to Merlin
The ARCCA SRIF-3 Cluster
These two schematics summarise the Merlin system, which comprises:
- 256 compute nodes (128 1U Chassis), containing 2048 Intel Xeon (Harpertown / Seaburg) 3.0GHz cores
- 4 x SMP compute nodes, with a total of 64 Intel Tigerton 2.93GHz cores, each node containing > 1TB RAID-5 internal storage
- Redundant Administration nodes
- 4 Front-end login nodes for users to access the cluster
- InfiniBand (ConnectX) network infrastructure across the entire system (20Gbps HS/LL DDR, 1.4μsec latency)
- "Fabric A" is a 1Gbps and 10Gbps I/O network for mounting NFS & CFS volumes
- "Fabric B" is a 1Gbps network dedicated to cluster management traffic
- 38.5TB useable Lustre Cluster File System with:
- 2 x Lustre MetaData Servers (MDS)
- 2 x Lustre Object Storage Servers (OSS) nodes
- 59TB useable resilient disk storage (SATA)
- 320TB Tape Storage
- Redundant NFS Storage Access Nodes
The 256 compute nodes are Bull NovaScale R422 systems, which accommodate two servers in a single 1U, 19" chassis; the two servers in each chassis share a single, 92% efficient power supply. Each server has two CPU sockets, each holding a quad-core Intel Xeon (Harpertown) 3.0GHz processor on the Seaburg platform, giving 8 cores per server, with 2GB of fully buffered DDR2 RAM per core, a 160GB SATA disk and a ConnectX InfiniBand interface.
The 4 SMP compute nodes are Bull NovaScale R480E servers: 4U, 4-socket servers based on the Intel Caneland platform with Tigerton 2.93GHz processors.
| Qty | Make & Description | Details |
| --- | --- | --- |
| 256 | Standard Compute Node | 2x Xeon E5472 3.0GHz, 1600MHz FSB, 16GB RAM, 160GB 7.2k RPM SATA HDD. Adapters: ConnectX IB HCA |
| 4 | SMP Compute Node | 4x Xeon X7350 2.93GHz, 32GB RAM; 2x 146GB 15k RPM SAS HDD in RAID-1 (OS) + 5x 300GB 10k RPM SAS HDD in RAID-5 (scratch). Adapters: ConnectX IB HCA, dual-Ethernet controller |
| — | — | 2x Xeon E5472 3.0GHz, 1600MHz FSB, 32GB RAM; 4x 146GB 15k RPM SAS HDD in RAID-1. Adapters: ConnectX IB HCA, dual-Ethernet controller, LSI MegaRAID ZCR PCI-X |
The high-speed, low-latency (HS/LL) high-performance interconnect is an InfiniBand 4x DDR network, provided principally by a single 288-port Voltaire GridDirector ISR2012 switch.
The fully non-blocking topology enables collision-less switching of both MPI and I/O traffic. A ConnectX dual-port 4x DDR HCA (host channel adapter) is provided in each compute node.
Key features of the network:
| Feature | Value |
| --- | --- |
| Links | 4x DDR InfiniBand (20Gbps) |
| Number of ports | Up to 288 |
| Latency | 140 to 420 nanoseconds (DDR, 2 crossbar levels fabric) |
| MTBF | > 220,000 hours |
ConnectX HCAs provide leading-edge performance:
| Measurement | Value |
| --- | --- |
| Latency (MPI PingPong) | 1.40μs |
| Latency (MPI PingPong, 8 to 8) | 1.67μs |
Memory usage optimizations like shared send queues and shared reliably connected (SRC) protocols dramatically improve the resource availability for applications.
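The MPI PingPong figures above are point-to-point latency measurements between a pair of MPI processes. As a minimal sketch of how such a number is typically measured (not the actual benchmark used to produce the figures above), the following C/MPI program times repeated 1-byte round trips between two ranks and reports the average one-way latency; the iteration count and message size are arbitrary illustrative choices.

```c
/* Minimal MPI ping-pong latency sketch (illustrative only). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    char byte = 0;
    const int iterations = 10000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) fprintf(stderr, "Run with at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double start = MPI_Wtime();

    /* Rank 0 and rank 1 bounce a 1-byte message back and forth. */
    for (int i = 0; i < iterations; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double elapsed = MPI_Wtime() - start;
    if (rank == 0) {
        /* One-way latency is half the round-trip time per iteration. */
        printf("Average one-way latency: %.2f us\n",
               elapsed / iterations / 2.0 * 1.0e6);
    }

    MPI_Finalize();
    return 0;
}
```

To exercise the InfiniBand interconnect rather than shared memory, the two ranks should be placed on different compute nodes.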
In this configuration all nodes (including the service nodes) are connected with full non-blocking bisection bandwidth.
On Fabric A, all nodes are directly connected at 1Gbps. In addition, there are two direct attachments to the NFS storage access servers using 10 Gigabit Ethernet links (via a Foundry Networks RX4 switch).
There are two main storage sub-systems:
- Fast 38.5TB cluster file system based on Lustre software from Cluster File Systems, Inc. (CFS)
- Redundant NFS system of 59TB useable RAID-6 disk.
The Cluster File System is Lustre, the scalable parallel file system from CFS. It provides high-performance, fault-tolerant I/O servers for data storage and metadata storage, built around a modular "I/O cell" concept. A single I/O cell comprises 2 Bull NovaScale R460 servers and 2 EMC CLARiiON CX3-40F disk arrays connected to both the InfiniBand network and Fabric A. The EMC CLARiiONs are high-end disk arrays equipped with 83x 300GB FibreChannel disks, providing a net capacity of 38TB of storage (which can readily be expanded later). Thanks to the balanced architecture across the disk arrays, I/O connectivity, server nodes and interconnect network, the sustained throughput of the Lustre file system is 2GB/s.
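Applications usually see this kind of bandwidth when many processes perform large, parallel I/O, for example through MPI-IO with each rank writing its own contiguous region of a shared file. The sketch below is illustrative only: the path /lustre/scratch/example.dat and the 8MB-per-rank block size are assumptions, not Merlin-specific values.

```c
/* Illustrative MPI-IO sketch: each rank writes its own contiguous block
 * of a single shared file, an access pattern a parallel file system such
 * as Lustre is designed to serve well. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_DOUBLES (1 << 20)   /* 2^20 doubles = 8 MB per rank */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    double *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(BLOCK_DOUBLES * sizeof(double));
    for (int i = 0; i < BLOCK_DOUBLES; i++)
        buf[i] = (double)rank;

    /* Hypothetical path; replace with a directory on the real Lustre mount. */
    if (MPI_File_open(MPI_COMM_WORLD, "/lustre/scratch/example.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh) != MPI_SUCCESS) {
        if (rank == 0) fprintf(stderr, "Could not open file\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Each rank writes at an offset determined by its rank, collectively. */
    MPI_Offset offset = (MPI_Offset)rank * BLOCK_DOUBLES * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, BLOCK_DOUBLES, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```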
The NFS storage is provided by an EMC CLARiiON CX3-40F disk array. This disk array can deliver up to 1GB/s of raw bandwidth and provides 59TB of useable disk. Equipped with dual controllers and 4 redundant back-end loops, it provides reliable and fast storage (which again can be extended as necessary).
The main software components on the system are:
- Operating System: RHEL 5
- Job Scheduler: PBS Pro
- Cluster Management Tools: NovaScale Master HPC Edition
- Cluster File System: Lustre
Software tools include a range of libraries, compilers and applications (this list is still under revision):
- Intel® C++ Compiler for Linux (Floating Academic 5 Seat Pack (ESD))
- Intel® Fortran Compiler for Linux (Floating Academic 5 Seat Pack (ESD))
- Coming soon: Portland Group Compiler Suite
- Coming soon: PathScale Compiler Suite
- Intel Math Kernel Library - Cluster Edition Medium Cluster License for Linux
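As a minimal sketch of calling the Intel Math Kernel Library from C (the matrix size and values are arbitrary, and the build line will depend on the installed MKL and compiler versions), the following program uses MKL's CBLAS interface to perform a double-precision matrix multiply (DGEMM).

```c
/* Illustrative MKL CBLAS example: C = alpha*A*B + beta*C via cblas_dgemm. */
#include <stdio.h>
#include <stdlib.h>
#include <mkl.h>

int main(void)
{
    const MKL_INT n = 512;
    double *A = malloc(n * n * sizeof(double));
    double *B = malloc(n * n * sizeof(double));
    double *C = malloc(n * n * sizeof(double));

    /* Fill A and B with simple values; C must be initialised because
     * DGEMM accumulates into it. */
    for (MKL_INT i = 0; i < n * n; i++) {
        A[i] = 1.0;
        B[i] = 2.0;
        C[i] = 0.0;
    }

    /* Row-major n x n matrix multiply with alpha = 1, beta = 0. */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);

    printf("C[0] = %f (expected %f)\n", C[0], 2.0 * n);

    free(A); free(B); free(C);
    return 0;
}
```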
Analysers, Profilers & Debuggers
- Intel® VTune™ Performance Analyzer for Linux - Floating Academic 1 Seat Pack (ESD)
- Intel® Trace Analyzer & Collector (ITA / ITC), Large Cluster System License, Single Cluster, unlimited Developers, Academic
- Allinea: DDT (Distributed Debugging Tool)
- Coming Soon: Allinea OPT