
Historical Perspective and Religious Homage

The concept underlying the beowulf-style compute cluster is not new, and was not invented by any one group at any one time (including the NASA group headed by Sterling and Becker that coined the name ``beowulf''). Rather, it was an idea that developed over a long period and that grew along with a set of open source tools capable of supporting it (primarily PVM at first, and later MPI). Note that this is not an attempt to devalue the contributions of Sterling and Becker in any way; it is simply a fact.

However, Thomas Sterling and Don Becker at NASA-Goddard (CESDIS) were, as far as I know, the first group to conceive of making a dedicated-function supercomputer out of commodity components running entirely open source software. Don Becker, especially, has devoted a huge fraction of his life to the development of the open source software drivers required to make such a vision reality. Don actually wrote most of the ethernet device drivers in use in Linux today, which are the sine qua non of any kind of networked parallel computing[1.23]. The NASA group also made specific modifications to the Linux kernel to support beowulf design (like channel bonding) that are worthy of mention. Most recently Don Becker, Erik Hendriks, and others from the original NASA-Goddard beowulf team have formed Scyld.com[1.24], which both maintains the beowulf list and beowulf website and has produced a ``true beowulf in a box'' - the Scyld Beowulf CD - that can be used to transform any pile of PCs into a beowulf in literally minutes.

By providing the sexy name, a useful website, and the related mailing list they formed a nucleation point for all the users of PVM and MPI who were tired of programming in parallel on networks of expensive hardware with proprietary and expensive operating systems (like those offered at the time by IBM, DEC, SGI, Sun Microsystems, and Hewlett-Packard) only to have to buy the whole thing over and over again at very high cost as the hardware evolved. Once Linux had a reasonably reliable network and Intel finally managed to produce a mass-market processor with decent and cost-beneficial numerical performance (the P6), those PVM/MPI users, including myself, rejected those expensive, proprietary systems like radioactive waste and joined with others of a like mind on the beowulf list. This began an open source development/user support cycle that persists and is amazingly effective today.

It is this last contribution, the clear articulation of the idea of the Linux-based beowulf and the focusing of previously disparate energies onto its collaborative development, that is likely to be the most important in the long run, as it transcends any particular architectural contributions made in association with the original project. It is an idea that is finally coming to a long-awaited maturity - it appears that a number of Linux distributions are going to be providing integrated beowulf/cluster software support ``out of the box'' in their standard distributions quite soon (really, they largely have for some time, although there have been a few missing pieces). The Scyld beowulf is just the first, and most deeply integrated, of what I expect to become many attempts to make network parallel computing a fully integrated feature of everyday Linux rather than something even remotely exotic.

Beowulfs have always been built from M$^2$COTS hardware, which is by definition readily available. Soon beowulf support will similarly be in M$^2$COTS box-set Linux distributions (instead of being scattered hither-and-yon across the web). That takes care of the hardware and software side of things. All that's missing is the knowledge of how to put the two together to make beowulfs work for you, a hole that I'm shamelessly hoping to exploit, errrm, uh, ``fill'' with this book[1.25]. With all this to further support parallel development, can commercial-grade parallel software be far behind?

With imitation being the most sincere form of flattery, it is amusing that the beowulf concept has been transported by name to other architectures, some of them most definitely not open source on COTS hardware (there are FreeBSD beowulfs, NT based ``beowulfs'', Solaris based ``beowulfs'', and so forth). I quite deliberately put the term beowulf in each of these latter cases within quotes to indicate my skepticism that - with the exception of the FreeBSD efforts - the clusters in question could truly qualify as beowulfs[1.26].

Not that I'm totally religious about this - a lot of the clusters I'll discuss below, although COTS and open source, are not really beowulfs either, although they function about the same way. I am fairly religious about the open source part; it is a True Fact that nobody sane would consider building a high performance beowulf without the full source of all its software components, especially the kernel. I also really, really like Linux. However, even ignoring the historical association of beowulfery and Linux, there are tremendous practical advantages associated with access to the full operating system source even for people with mundane needs.

Issues of control, repair, improvement, cost, or just plain understandability all come down strongly in favor of open source solutions to complex problems of any sort. Not to mention scalability and reliability. This is true in spades for beowulfery, which tends to nonlinearly magnify any small instability in its component platforms into horrible problems when jobs are run over lots of nodes.

If you are foolish enough to buy into the notion that WinNT or Win2K (for example) can be used to build a ``beowulf'' that will somehow be more stable than or outperform a Linux-based beowulf, you're paying good money[1.27] for an illusion, as you will realize very painfully the first time your systems misbehave and Microsoft claims that it Isn't Their Fault. They could even be right. It wouldn't matter. It's out of your control and you'll likely never know, since long before you find out your patience will be exhausted and you'll go right out and reinstall Linux on the hardware (for free), do a recompile, and live happily ever after[1.28]. Use the NT CDs (however much they cost originally) for frisbees with your dog, or as coasters for your coffee cup[1.29].

At this point in time beowulfs (both ``true beowulfs'' and beowulf-style M$^2$COTS clusters of all sorts) are proven technology and can easily be shown to utterly overwhelm any other computing model in cost-benefit for all but a handful of very difficult bleeding edge computational problems. A beowulf-style cluster can often equal or even beat a ``big iron'' parallel supercomputer in performance while costing a tiny fraction as much to build or run[1.30]. The following is a guide on how to analyze your own situation and needs in order to design a beowulf or beowulf-style cluster that meets them at the lowest possible cost. Enjoy.

Robert G. Brown 2003-05-12