Clusters are presented in chronological order according to when they were purchased and installed, which roughly corresponds to CPU clock and speed as well. Aggregate MHz is used as a rough measure of total cluster capacity, as clock speed is generally the strongest single indicator of CPU-bound performance. Each cluster name link connects to a short blurb on the cluster itself below or (in the case of the original Brahma link) to a page on the history of cluster computing in the Duke Physics department.
|Cluster||Dates||Architecture||Number of Nodes||Number of CPUs||Aggregate MHz|
|Brahma||1996-2001 (retired)||Dual 200 MHz Intel PPro, mixed||21 (peak)||30 (peak)||7000 (peak)|
|Brahma 2||1998-present||Dual 400 MHz Intel PIII||12 (still functional)||24||9600|
|Brahma 3||2001-present||Dual 933 MHz Intel PIII||7||14||13060|
|QCD||2001-present||Alpha 666 MHz EV67||4||4||2664|
|Ganesh||2000-present||Athlon 1300 MHz||16||16||20833|
|Champ (Athlon)||2002-present||Dual Athlon 1533 MHz||23||46||70518|
|Champ (P4)||2002-present||P4 2000 MHz||4||4||8000|
|Rama||2002-present||Dual Athlon 1600 MHz||16||32||51200|
|Nano||2002-present||Dual Athlon 1600 MHz||32||64||102400|
It is interesting to note the steady effects of Moore's Law and the increase in investment in cluster resources in the department.
The aggregate cycle capacity of the physics department cluster (including various desktop "nodes" that can add to the racked/shelved/named cluster capacity) is thus in the ballpark of 300 GHz. Naturally, performance on particular codes as a function of CPU clock varies significantly across the various architectures, but by any measure this is a lot of compute power.
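As a rough check on the figure above, the table's rows for the clusters still in service can be totalled directly. The sketch below copies the CPU counts and aggregate MHz values from the table; the names `CLUSTERS`, `total_cpus`, and `aggregate_ghz` are illustrative only. The retired Brahma cluster is left out, and the desktop "nodes" are not itemized anywhere in the table, so they are not included either.

```python
# Rows copied from the cluster table (April 2003 figures).
# Each entry: (cluster name, number of CPUs, aggregate MHz as listed).
# The retired Brahma cluster and the desktop "nodes" are deliberately
# omitted; the latter are not itemized in the table.
CLUSTERS = [
    ("Brahma 2", 24, 9600),
    ("Brahma 3", 14, 13060),
    ("QCD", 4, 2664),
    ("Ganesh", 16, 20833),
    ("Champ (Athlon)", 46, 70518),
    ("Champ (P4)", 4, 8000),
    ("Rama", 32, 51200),
    ("Nano", 64, 102400),
]


def total_cpus(rows):
    """Sum the CPU counts over all listed clusters."""
    return sum(cpus for _name, cpus, _mhz in rows)


def aggregate_ghz(rows):
    """Sum the listed aggregate MHz values and convert to GHz."""
    return sum(mhz for _name, _cpus, mhz in rows) / 1000.0


if __name__ == "__main__":
    print(f"CPUs in named clusters: {total_cpus(CLUSTERS)}")
    print(f"Aggregate clock: {aggregate_ghz(CLUSTERS):.1f} GHz")
```

The named clusters alone come to roughly 278 GHz across 204 CPUs, which is consistent with the "ballpark of 300 GHz" once the desktop nodes are counted.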
(These numbers were last updated as of April, 2003, and are subject to change as clusters are retired and new cluster generations are added.)
Brahma 2 was generously donated by Intel as part of an Intel equipment grant to the University. In the first phase of the grant, the original, ageing Brahma cluster was augmented by 16 dual processor 400 MHz PIII systems, which at the time were bleeding-edge machines. The actual systems themselves were Dell PowerEdge 2300 servers. These were not the most convenient form factor for cluster nodes, as the cases were designed with departmental or corporate server requirements in mind: they were big and heavy, and featured things like a snap-in hard disk bus that (however lovely) were overkill for compute nodes with little need for local disk.
They were fast, though, for their time, and were put in immediate service. During their first three years of service these nodes were kept in nearly continuous use at 100% of capacity -- the cumulative duty cycle of the nodes was easily 90% or greater -- working on problems in condensed matter theory (Brown and Ciftan) and nuclear theory (Mueller). Many papers were published in Physical Review and elsewhere on the basis of the work completed on these systems.
In addition, Robert G. Brown helped organize the Extreme Linux track of the 1999 Linux Expo, and used several of the PowerEdge 2300s to construct a small cluster that was one of several demonstrated at the Expo.
As time has passed and Moore's Law has inexorably advanced, the usage of these second-generation brahma nodes has somewhat diminished, but they are still doing quite a bit of valuable work and will likely remain in service for another year or more, as long as the hardware itself holds out. It is very likely that they will be honorably retired by early 2005, if not before, as then-current systems will be able to do more work, faster, for what the electricity and cooling alone now cost for the older nodes.
Brahma 3 consists of the final eight systems, now 933 MHz dual PIII's, donated to the department by Intel as a part of the final phase of an Intel equipment grant to the University. The systems were placed into almost immediate service and are still in very heavy use today, primarily working on problems in nuclear theory and condensed matter theory.
These systems, like Brahma 2, are shelf mounted tower units from Dell, but they are in a more or less standard mid-size tower this time and hence are much more convenient to shelve and physically move when required.
The QCD mini-cluster was obtained by Shailesh Chandrasekharan with start-up funds. It is our only DEC/Compaq Alpha cluster. Although the Alpha performs better on numerical code per MHz of CPU clock than the Intel and Athlon CPUs, it proved to be "expensive" performance in many ways. The Alpha required significantly more systems administration effort to install and maintain a Linux distribution, it runs quite hot (and hence is expensive to operate), and it cost more per FLOP than Intel or Athlon alternatives, which were also more cost-effective to install and operate. Finally, the Alpha architecture was not helped by the constant travails of Digital, then Compaq -- Alpha as a CPU architecture seems to have no future.
Consequently, we have more or less abandoned it, although naturally we continue to operate the QCD mini-cluster until we can sensibly retire it.
Ganesh was the department's first Athlon cluster, consisting of fifteen 1300 MHz Athlon client nodes in mid-sized towers plus a 1333 MHz server, also in a mid-sized tower (for 16 nodes total). It was also the first mini-cluster not named or considered a part of brahma, as it was purchased by Brown and Ciftan to work on specific problems in condensed matter physics. This cluster cost approximately $15,000 including its switch, wiring, and shelving, making it an extremely cost-effective cluster for the time.
We had long adopted the sensible practice of naming brahma cluster nodes by a simple schema such as b1, b2, b3, but (as one notes above) had failed to give the successive generations a different name (and hence letter). This proved to be a modest mistake, as one had to "remember" which nodes were the faster b-nodes and which nodes were the slower b-nodes. Naming the g-nodes g00, g01, ... eliminated this problem for this new mini-cluster -- the name prefix letter uniquely determined both architecture and processor/memory generation and configuration, as well as (in this case) cluster ownership.
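The prefix-letter convention can be illustrated with a short sketch that maps a node name to its cluster. This is only an illustration under stated assumptions, not the department's actual tooling: the `PREFIXES` table, `classify`, and `node_names` are hypothetical names, the descriptions are paraphrased from this page, and the exact hostname format of the CHAMP c-nodes and p-nodes is assumed to follow the same letter-plus-number pattern.

```python
import re

# Hypothetical lookup table built from the prefix letters mentioned on
# this page. Note that "b" covers several brahma generations, which is
# exactly the ambiguity the later prefixes were meant to avoid.
PREFIXES = {
    "b": "brahma (Intel PPro/PIII, several generations)",
    "g": "ganesh (Athlon 1300 MHz)",
    "c": "CHAMP c-nodes (dual Athlon 1800+)",
    "p": "CHAMP p-nodes (P4 2.0 GHz)",
}

# One lowercase letter followed by one or more digits, e.g. "b2" or "g03".
_NODE_RE = re.compile(r"^([a-z])(\d+)$")


def classify(hostname):
    """Return (prefix, index, description) for a node name such as 'g03'."""
    match = _NODE_RE.match(hostname)
    if match is None:
        raise ValueError(f"not a node name: {hostname!r}")
    prefix, digits = match.groups()
    if prefix not in PREFIXES:
        raise ValueError(f"unknown cluster prefix: {prefix!r}")
    return prefix, int(digits), PREFIXES[prefix]


def node_names(prefix, count, width=2):
    """Generate zero-padded names in the g00, g01, ... style."""
    return [f"{prefix}{i:0{width}d}" for i in range(count)]
```

For example, `node_names("g", 16)` would produce the sixteen names g00 through g15, and `classify("g03")` identifies node 3 of ganesh from the prefix alone.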
The cluster remains in active use, still working on problems in critical phenomena for the group of Brown and Ciftan and being shared with other brahma users during the brief times it would otherwise be idle due to a pause in the computational schedule of Brown and Ciftan.
Ganesh is the last cluster purchased in a shelf mount/tower unit form factor for the physics department. This is largely because it became apparent that physical space was about to become an important issue as more and more groups in the department started cluster projects of their own and the total number of nodes and processors started to skyrocket.
To accommodate the new clusters (and the many older nodes of the existing clusters, which were not terribly small) the University funded the extensive renovation of a new cluster/server room for the department. This space is adequate to hold hundreds of rackmount nodes and is not quite half full, but it could not hold even its present number of CPUs if they were all in a shelf/tower form factor. Consequently, all the newer cluster nodes below are rackmounted, and older shelf mount cluster nodes will be phased out and retired over time and replaced with rackmount clusters as funding and opportunity permit.
The tremendous success obtained by the nuclear theory groups of Mueller, Bass, and later Chandrasekharan using the various brahma nodes inspired them to seek DOE support for a larger cluster to be dedicated towards nuclear theory. Grant proposals were submitted and funded, and the CHAMP (Computer-cluster for Hadronic and Many-Body Physics) was born.
CHAMP consists of 64 processors, but to simplify and optimize access those processors are named according to system architecture. There are 23 c-nodes, which are rackmount dual Athlon 1800+ systems (1533 MHz clock) for a total of 46 processors. There are four p-nodes (single CPU 2.0 GHz P4 systems) for four more processors. Finally, seven of the dual 933 MHz PIII systems in the brahma 3 cluster (14 processors) have been dedicated to the nuclear theory work as part of CHAMP to bring the total to 64 processors.
In actuality, there are even more, because of resource sharing between groups and because of small clusters such as the QCD cluster and even single systems that have been purchased with faculty startup funds or by the University. This flexibility in cluster resource allocation within our department has proven to be very useful when various groups have arrived at a "crunch time" where they are preparing a time-sensitive draft of a paper or are getting ready to depart for a conference and need to rapidly complete some last-minute computations. It is a good example of how a well organized cluster group (like Duke Physics' brahma) can enhance the research potential of all the participants in unexpected ways.
At this point CHAMP is being very heavily used indeed by Bass, Chandrasekharan, and Mueller and their various postdocs and students.
Rama is a cluster consisting of 16 rackmount dual processor Athlon 1900+ systems (1600 MHz). It was purchased by the group of Brown and Ciftan to continue their work on problems of interest in condensed matter theory. As always, the nodes have been kept hard at work since their commissioning in the middle of 2002.
The nano cluster is (currently) the newest cluster in the department. Purchased by the group of Harold Baranger (the department chair) for work in condensed matter and (as the cluster name suggests) nanoscale physics, nano is made up of 32 dual Athlon 1900+'s (1600 MHz), more or less identical in node architecture to the rama cluster. Baranger has a very extensive group of associates, postdocs and students hard at work using this cluster, and has generously shared idle cycles with the rest of the brahma groups as is our departmental custom.