Beowulf Papers

Daniel Ridge, Donald Becker, Phillip Merkey, Thomas Sterling
Beowulf: Harnessing the Power of Parallelism in a Pile-of-PCs
Proceedings, IEEE Aerospace, 1997

Abstract

The rapid increase in performance of mass market commodity microprocessors and significant disparity in pricing between PCs and scientific workstations has provided an opportunity for substantial gains in performance to cost by harnessing PC technology in parallel ensembles to provide high end capability for scientific and engineering applications. The Beowulf project is a NASA initiative sponsored by the HPCC program to explore the potential of Pile-of-PCs and to develop the necessary methodologies to apply these low cost system configurations to NASA computational requirements in the Earth and space sciences. Recently, a 16 processor Beowulf costing less than $50,000 sustained 1.25 Gigaflops on a gravitational N-body simulation of 10 million particles with a Tree code algorithm using standard commodity hardware and software components. This paper describes the technologies and methodologies employed to achieve this breakthrough. Both opportunities afforded by this approach and the challenges confronting its application to real-world problems are discussed in the framework of hardware and software systems as well as the results from benchmarking experiments. Finally, near term technology trends and future directions of the Pile-of-PCs concept are considered.
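
The price-performance claim in this abstract can be made concrete with a little arithmetic. The sketch below uses only the three figures quoted above (16 processors, under $50,000, 1.25 Gigaflops sustained); the variable names are illustrative, and because the cost is given only as an upper bound, the cost per Gigaflops is a ceiling rather than a measured value.

```python
# Figures quoted in the abstract; the cost is stated only as "less than
# $50,000", so it is treated here as a ceiling (an assumption of this sketch).
processors = 16
system_cost_usd = 50_000
sustained_gflops = 1.25

# Upper bound on dollars spent per sustained Gigaflops.
cost_per_gflops = system_cost_usd / sustained_gflops

# Average sustained rate contributed by each processor, in Megaflops.
mflops_per_processor = sustained_gflops * 1000 / processors

print(f"at most ${cost_per_gflops:,.0f} per sustained Gflops")
print(f"about {mflops_per_processor:.1f} Mflops sustained per processor")
```

Under these figures the system comes in at no more than $40,000 per sustained Gigaflops, with each processor averaging roughly 78 Megaflops on the N-body run.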

PostScript


Chance Reschke, Thomas Sterling, Daniel Ridge, Daniel Savarese, Donald Becker, Phillip Merkey
A Design Study of Alternative Network Topologies for the Beowulf Parallel Workstation
Proceedings, High Performance and Distributed Computing, 1996

Abstract

Coupling PC-based commodity technology with distributed computing methodologies provides an important advance in the development of single-user dedicated systems. Beowulf is a class of experimental parallel workstations developed to evaluate and characterize the design space of this new operating point in price-performance. A key factor determining the realizable performance under real-world workloads is the means devised for interprocessor communications. A study has been performed to characterize the design parameters of a family of interconnect topologies feasible with low cost mass market network technologies. Findings are presented which compare the advantage of complex segmented topologies over earlier parallel ``channel bonded'' schemes. Behavior sensitivities to packet size and traffic density are determined. It is shown that under many circumstances the more complex topologies result in better performance, and under favorable circumstances software routing techniques experience little performance degradation when compared to more expensive hardware switch mechanisms.

HTML
PostScript


Thomas Sterling, Donald J. Becker, Daniel Savarese, Michael R. Berry, Chance Res
Achieving a Balanced Low-Cost Architecture for Mass Storage Management through Multiple Fast Ethernet Channels on the Beowulf Parallel Workstation
Proceedings, International Parallel Processing Symposium, 1996

Abstract

Network-of-Workstations (NOW) systems seek to leverage commercial workstation technology to produce high performance computing systems at costs appreciably lower than parallel computers specifically designed for that purpose. The capabilities of technologies emerging from the PC commodity mass market are rapidly evolving to converge with those of workstations at significantly lower cost. A new operating point in the price-performance design space of parallel system architecture may be derived through parallelism of PC subsystems. The Pile-of-PCs, PopC (pronounced ``pop-see''), approach is being explored through the Beowulf Parallel Workstation, developed to provide order-of-magnitude increases in disk capacity and bandwidth for a single user environment at costs commensurate with conventional high-end workstations. This paper explores a critical aspect of the architecture trade-off space for Beowulf associated with the balance of parallel disk throughput and internal network bandwidth. The findings presented demonstrate that parallel channels of commodity 100 Mbps Ethernet are both necessary and sufficient to support the data rates of multiple concurrent file transfers on a sixteen processor Beowulf parallel workstation.

HTML
PostScript


Donald J. Becker, Thomas Sterling, Daniel Savarese, Bruce Fryxell, Kevin Olson
Communication Overhead for Space Science Applications on the Beowulf Parallel Workstation
Proceedings, High Performance and Distributed Computing, 1995

Abstract

The Beowulf parallel workstation combines 16 PC-compatible processing subsystems and disk drives using dual Ethernet networks to provide a single-user environment with 1 Gops peak performance, half a Gbyte of disk storage, and up to 8 times the disk I/O bandwidth of conventional workstations. The Beowulf architecture establishes a new operating point in price-performance for single-user environments requiring high disk capacity and bandwidth. The Beowulf research project is investigating the feasibility of exploiting mass market commodity computing elements in support of Earth and space science requirements for large data-set browsing and visualization, simulation of natural physical processes, and assimilation of remote sensing data. This paper reports the findings from a series of experiments for characterizing the Beowulf dual channel communication overhead. It is shown that dual networks can sustain 70% greater throughput than a single network alone, but that the bandwidth achieved is more sensitive to message size than to the number of messages at peak demand. While overhead is shown to be high for global synchronization, its overall impact on the scalability of real-world applications for computational fluid dynamics and N-body gravitational simulation is shown to be modest.
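
One way to read the dual-network result is as a scaling efficiency. The sketch below takes the single quoted figure (70% greater throughput with two networks than with one) and compares it with the ideal case in which two channels would double throughput. The variable names and the notion of "efficiency" used here are illustrative assumptions, not quantities reported in the abstract.

```python
# Figure quoted in the abstract: two networks sustain 70% more throughput
# than one. Everything derived below is an illustration, not a reported result.
channels = 2
throughput_gain = 0.70

# Throughput of the dual-network system relative to a single network.
speedup = 1 + throughput_gain

# Fraction of the ideal (linear) gain from adding a second channel.
efficiency = speedup / channels

print(f"speedup over one network: {speedup:.2f}x")
print(f"fraction of ideal two-channel throughput: {efficiency:.0%}")
```

Read this way, the second channel delivers about 85% of the throughput that perfect linear scaling would give.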

HTML
PostScript


Donald J. Becker, Thomas Sterling, Daniel Savarese, John E. Dorband, Udaya A. Ranawake, Charles V. Packer
BEOWULF: A PARALLEL WORKSTATION FOR SCIENTIFIC COMPUTATION
Proceedings, International Conference on Parallel Processing, 1995

Abstract

Network-of-Workstations technology is applied to the challenge of implementing very high performance workstations for Earth and space science applications. The Beowulf parallel workstation employs 16 PC-based processing modules integrated with multiple Ethernet networks. Large disk capacity and high disk-to-memory bandwidth are achieved through the use of a hard disk and controller for each processing module, supporting up to 16-way concurrent accesses. The paper presents results from a series of experiments that measure the scaling characteristics of Beowulf in terms of communication bandwidth, file transfer rates, and processing performance. The evaluation includes a computational fluid dynamics code and an N-body gravitational simulation program. It is shown that the Beowulf architecture provides a new operating point in performance to cost for high performance workstations, especially for file transfers under favorable conditions.

HTML
PostScript