Introduction to Merlin

The ARCCA SRIF-3 Cluster

PR Photograph of the ARCCA High Performance Computer supplied by Bull UK.

Overview of Merlin

Schematic diagram of the ARCCA SuperComputer, Merlin, designed and delivered by Bull Group.

 

Merlin Schematic with detailed component information - designed and delivered by Bull Group.

 

These two schematics summarise the Merlin system, which comprises:

  • 256 compute nodes (128 1U Chassis), containing 2048 Intel Xeon (Harpertown / Seaburg) 3.0GHz cores
  • 4 x SMP compute nodes, with a total of 64 Intel Tigerton 2.93GHz cores, each node containing > 1TB RAID-5 internal storage
  • Redundant Administration nodes
  • 4 Front-end login nodes for users to access the cluster
  • InfiniBand (ConnectX) network infrastructure across the entire system (20Gbps HS/LL DDR, 1.4μs latency)
  • "Fabric A" is a 1Gbps and 10Gbps I/O network for mounting NFS & CFS volumes
  • "Fabric B" is a 1Gbps network dedicated to cluster management traffic
  • 38.5TB useable Lustre Cluster File System with:
    • 2 x Lustre MetaData Servers (MDS)
    • 2 x Lustre Object Storage Servers (OSS) nodes
  • 59TB useable resilient disk storage (SATA)
  • 320TB Tape Storage
  • Redundant NFS Storage Access Nodes

Compute Nodes

The 256 compute nodes are Bull NovaScale R422 servers, housed two to a single 1U, 19" chassis. The two servers in each chassis share a single, 92% efficient power supply. Each server has two CPU sockets, each holding an Intel 3.0GHz Harpertown processor (4 cores per socket; Seaburg platform), giving 8 cores per server, with 2GB (FB-DIMM DDR2) RAM per core, a 160GB SATA disk and a ConnectX InfiniBand interface.

The 4 SMP compute nodes are Bull NovaScale R480E servers. These are 4U 4-socket servers using the Intel Caneland platform containing Tigerton 2.93GHz processors.
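The node counts quoted above can be cross-checked with a little arithmetic. The sketch below recomputes the core and memory totals from the per-server figures in the text. The peak floating-point estimate rests on an assumption that is not stated in this document: that each Harpertown core retires 4 double-precision operations per clock cycle. The 4 cores per Tigerton socket is likewise inferred from the 64-core total rather than stated directly.

```python
# Figures taken from the text above.
STANDARD_SERVERS = 256      # Bull NovaScale R422 servers
SOCKETS_PER_SERVER = 2
CORES_PER_SOCKET = 4
RAM_PER_CORE_GB = 2
CLOCK_GHZ = 3.0

SMP_SERVERS = 4             # Bull NovaScale R480E servers
SMP_SOCKETS_PER_SERVER = 4
SMP_CORES_PER_SOCKET = 4    # inferred from the 64-core total, not stated

# Assumption (not from the text): double-precision FLOPs per cycle per core.
FLOPS_PER_CYCLE = 4


def standard_partition():
    """Return derived totals for the 256 standard compute nodes."""
    cores_per_server = SOCKETS_PER_SERVER * CORES_PER_SOCKET
    total_cores = STANDARD_SERVERS * cores_per_server
    return {
        "chassis": STANDARD_SERVERS // 2,  # two servers per 1U chassis
        "cores_per_server": cores_per_server,
        "total_cores": total_cores,
        "ram_per_server_gb": cores_per_server * RAM_PER_CORE_GB,
        "total_ram_gb": total_cores * RAM_PER_CORE_GB,
        # Theoretical peak only; sustained performance will be lower.
        "peak_tflops": total_cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000,
    }


def smp_total_cores():
    """Return the total core count across the 4 SMP nodes."""
    return SMP_SERVERS * SMP_SOCKETS_PER_SERVER * SMP_CORES_PER_SOCKET


if __name__ == "__main__":
    for key, value in standard_partition().items():
        print(f"{key}: {value}")
    print(f"smp_total_cores: {smp_total_cores()}")
```

The totals agree with the overview list: 128 chassis, 2048 cores in the standard partition and 64 cores across the SMP nodes.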

 

| Qty | Make & Description | Details |
| --- | --- | --- |
| 256 | Standard Compute Node, Model: R422 | 2x Xeon E5472 3.0GHz, 1600 FSB, 16GB RAM, 160GB 7.2k RPM SATA HDD. Adapters: ConnectX IB HCA |
| 4 | SMP Compute Node, Model: R480-E1 | 4x Xeon X7350 2.93GHz, 32GB RAM. 2x 146GB 15k RPM RAID-1 (OS) + 5x 300GB 10k RPM RAID-5 (scratch) SAS HDD. Adapters: ConnectX IB HCA, dual-ethernet controller |
| 4 | Login Nodes, Model: R460-SAS | 2x Xeon E5472 3.0GHz, 1600 FSB, 32GB RAM. 4x 146GB 15k RPM SAS HDD (RAID-1). Adapters: ConnectX IB HCA, dual-ethernet controller, LSI MegaRAID ZCR PCI-X |

 

Interconnect

The High Speed, Low Latency (HS,LL), high performance interconnect is provided by an InfiniBand 4x DDR network. This is principally provided by a single 288-port Voltaire GridDirector ISR2012 switch.

The 288-port Voltaire switch at the centre of the Merlin supercomputing communications infrastructure. Voltaire InfiniBand cables connecting the Merlin SuperComputer, providing the communication fabric across the entire system.

The fully non-blocking topology enables collision-less switching of both MPI and I/O traffic. A ConnectX dual-port 4x DDR HCA (Host Channel Adapter) is provided in each compute node.

Key features of the network:

| Feature | Description |
| --- | --- |
| Links | 4x DDR InfiniBand (20Gbps) |
| Number of ports | Up to 288 |
| Aggregate Bandwidth | 11.52Tb/s |
| Latency | 140 to 420 nanoseconds (DDR2 crossbar levels fabric) |
| MTBF | > 220000 hours |
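The link and aggregate figures in this table are related by simple arithmetic. The sketch below shows one reading that reproduces them: a 4x DDR link is 4 lanes at 5Gbps signalling each, and the aggregate figure matches 288 ports counted in both directions. The per-lane rate and the 8b/10b line encoding are taken from general InfiniBand DDR behaviour, not from this document, and the table itself does not say how the 11.52Tb/s value was derived.

```python
# Values from the table above.
PORTS = 288
LINK_RATE_GBPS = 20.0        # quoted 4x DDR link rate

# Assumptions from general InfiniBand DDR behaviour (not stated here).
LANES_PER_LINK = 4           # the "4x" in "4x DDR"
LANE_RATE_GBPS = 5.0         # DDR signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10 # 8b/10b line encoding


def link_signalling_gbps():
    """Raw signalling rate of one 4x DDR link, in Gbps."""
    return LANES_PER_LINK * LANE_RATE_GBPS


def link_data_gbytes_per_s():
    """Usable data rate of one link after 8b/10b encoding, in GB/s."""
    return link_signalling_gbps() * ENCODING_EFFICIENCY / 8


def aggregate_tbps():
    """Switch aggregate bandwidth, counting both directions, in Tb/s."""
    return PORTS * LINK_RATE_GBPS * 2 / 1000


if __name__ == "__main__":
    print(f"link signalling rate: {link_signalling_gbps():.1f} Gbps")
    print(f"link data rate:       {link_data_gbytes_per_s():.1f} GB/s")
    print(f"aggregate bandwidth:  {aggregate_tbps():.2f} Tb/s")
```

Under these assumptions the usable data rate per link and direction comes to 2GB/s, which sets the ceiling for any measured MPI bandwidth over a single port.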

 

ConnectX HCAs provide leading-edge performance:

| Characteristic | Performance |
| --- | --- |
| Latency (MPI PingPong) | 1.40μs |
| Latency (MPI PingPong 8 to 8) | 1.67μs |
| Unidirectional Bandwidth | 1.8GB/s |
| Bidirectional Bandwidth | 3.6GB/s |
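To see what these latency and bandwidth figures mean for real message sizes, a common first-order approach is the linear model t(n) = latency + n / bandwidth. The sketch below applies that model to the PingPong latency and unidirectional bandwidth quoted above. The model is an illustration introduced here, not a measurement from Merlin, and it assumes 1GB/s means 10^9 bytes per second.

```python
# Values from the table above.
LATENCY_S = 1.40e-6          # MPI PingPong latency, seconds
BANDWIDTH_BYTES_S = 1.8e9    # unidirectional bandwidth (assumes decimal GB)


def transfer_time(nbytes):
    """Estimated one-way time for a message of nbytes, in seconds."""
    return LATENCY_S + nbytes / BANDWIDTH_BYTES_S


def effective_bandwidth(nbytes):
    """Bandwidth actually achieved for a message of nbytes, in bytes/s."""
    return nbytes / transfer_time(nbytes)


def half_bandwidth_size():
    """Message size at which half of the peak bandwidth is reached."""
    return LATENCY_S * BANDWIDTH_BYTES_S


if __name__ == "__main__":
    for size in (8, 1024, 64 * 1024, 1024 * 1024):
        rate = effective_bandwidth(size) / 1e9
        print(f"{size:>8} bytes: {transfer_time(size) * 1e6:8.2f} us, {rate:.3f} GB/s")
    print(f"half-bandwidth message size: {half_bandwidth_size():.0f} bytes")
```

Under this model, messages of a few kilobytes or less are dominated by latency, so the quoted 1.8GB/s is only approached for messages well above roughly 2.5kB.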

Memory usage optimizations like shared send queues and shared reliably connected (SRC) protocols dramatically improve the resource availability for applications.

In this configuration all nodes, including the service nodes, are connected with full non-blocking bisection bandwidth.

On Fabric A, all nodes are directly connected at 1Gbps. There are also 2 direct attachments to the NFS storage access servers over 10 Gigabit Ethernet links (using a Foundry Networks RX4 switch).

 

Storage & Cluster File System

There are two main storage sub-systems:

  • Fast 38.5TB cluster file system based on Lustre software from Cluster File Systems, Inc. (CFS)
  • Redundant NFS system of 59TB useable RAID-6 disk.

The cluster file system is Lustre, the scalable parallel file system from CFS. It provides high-performance, fault-tolerant I/O servers for data and metadata storage, and is built around a modular "I/O cell". A single I/O cell comprises 2 Bull NovaScale R460 servers and 2 EMC CLARiiON CX3-40F arrays, connected to both the InfiniBand network and Fabric A. The EMC CLARiiON arrays are high-end disk arrays equipped with 83x 300GB Fibre Channel disks, which provide a net capacity of 38TB of storage (and can be readily expanded later). Because the disk arrays, I/O connectivity, server nodes and interconnect network are balanced, the sustained throughput from the Lustre file system is 2GB/s.

The NFS storage is provided by an EMC CLARiiON CX3-40F disk array. This disk array can deliver up to 1GB/s of raw bandwidth and has 59TB of useable disk. Equipped with dual controllers and 4 redundant back-end loops, it provides reliable and fast storage (which again can be extended as necessary).
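The throughput figures for the two storage systems translate directly into rough streaming times. The sketch below estimates how long it would take to write a given volume of data at the quoted rates: 2GB/s sustained for Lustre and up to 1GB/s raw for the NFS array. The example sizes (one server's 16GB of RAM, and the 4096GB total across the 256 standard compute nodes) are illustrative choices, and real rates depend on access pattern and contention.

```python
# Rates and capacity quoted in the text above.
LUSTRE_RATE_GB_S = 2.0       # sustained Lustre throughput
NFS_RATE_GB_S = 1.0          # raw NFS array bandwidth (upper bound)
LUSTRE_CAPACITY_GB = 38500   # 38.5TB useable


def stream_seconds(size_gb, rate_gb_s):
    """Best-case time to stream size_gb at rate_gb_s, in seconds."""
    if rate_gb_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_s


def capacity_fraction(size_gb, capacity_gb=LUSTRE_CAPACITY_GB):
    """Fraction of the file system that size_gb would occupy."""
    return size_gb / capacity_gb


if __name__ == "__main__":
    # Illustrative sizes: one server's RAM, and all standard-node RAM.
    for label, size_gb in (("one server (16GB)", 16), ("all nodes (4096GB)", 4096)):
        lustre = stream_seconds(size_gb, LUSTRE_RATE_GB_S)
        nfs = stream_seconds(size_gb, NFS_RATE_GB_S)
        share = capacity_fraction(size_gb) * 100
        print(f"{label}: Lustre {lustre:.0f}s, NFS {nfs:.0f}s, {share:.1f}% of Lustre")
```

These are lower bounds on elapsed time, since they ignore metadata operations and competing jobs.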

 

Software

The main software components on the system are:

  • Operating System: RHEL 5 
  • Job Scheduler: PBS Pro
  • Cluster Management Tools: NovaScale Master HPC Edition
  • Cluster File System: Lustre

Software tools include a range of libraries, compilers and applications (still under revision at the moment):

Compilers:

  • Intel® C++ Compiler for Linux (Floating Academic 5 Seat Pack (ESD))
  • Intel® Fortran Compiler for Linux (Floating Academic 5 Seat Pack (ESD))
  • Coming soon: Portland Group Compiler Suite
  • Coming soon: PathScale Compiler Suite

Libraries:

  • Intel Math Kernel Library - Cluster Edition Medium Cluster License for Linux
  • FFTW
  • HDF5
  • netCDF
  • GSL

Analysers, Profilers & Debuggers:

  • Intel® VTune™ Performance Analyzer for Linux - Floating Academic 1 Seat Pack (ESD)
  • Intel® Trace Analyzer & Collector (ITA / ITC), Large Cluster System License, Single Cluster, unlimited Developers, Academic
  • Allinea: DDT (Distributed Debugging Tool)
  • Coming Soon: Allinea OPT