Award Finalist/Winner

SC Conference - Activity Details



GrayWulf: Scalable Clustered Architecture for Data Intensive Computing

Team Members:
Alexander Szalay  (Johns Hopkins University)
Maria Nieto-Santisteban  (Johns Hopkins University)
Jan Vandenberg  (Johns Hopkins University)
Alainna Wonders  (Johns Hopkins University)
Randal Burns  (Johns Hopkins University)
Eric Perlman  (Johns Hopkins University)
Ani Thakar  (Johns Hopkins University)
Mike McCarty  (Johns Hopkins University)
Dean Zariello  (Johns Hopkins University)
Gordon Bell  (Microsoft Research)
Tony Hey  (Microsoft Research)
Roger Barga  (Microsoft Research)
Yogesh Simmhan  (Microsoft Research)
Michael Thomassy  (Microsoft Corporation)
Lubor Kollar  (Microsoft Corporation)
Catherine Van Ingen  (Microsoft Research)
Robert Grossman  (University of Illinois at Chicago)
David Hanley  (University of Illinois at Chicago)
Yunhong Gu  (University of Illinois at Chicago)
Michael Sabala  (University of Illinois at Chicago)
Jim Heasley  (University of Hawaii)
Tim Carrol  (Dell Inc.)
Eric Barnes  (Dell Inc.)
Mike Rowland  (Dell Inc.)
Challenges Session
SC08 Storage Challenge
Tuesday, 11:00AM - 11:30AM
Room 17A/17B
Abstract:
Data-intensive computing presents a significant challenge for traditional supercomputing architectures that maximize FLOPS, since CPU speed has surpassed the I/O capabilities of HPC systems and Beowulf clusters. We present the architecture for a three-tier commodity-component cluster designed for a range of data-intensive computations operating on petascale data sets. The design goal is a system balanced in terms of I/O performance and memory size, according to Amdahl’s Laws. GrayWulf pays tribute to Jim Gray, who stimulated the system and its design. The hardware currently installed at JHU exceeds one petabyte of storage and has 0.5 bytes/sec of I/O and 1 byte of memory for each CPU cycle. GrayWulf provides almost an order of magnitude better balance than existing systems. Our benchmarks are based on data from the petascale Pan-STARRS project, which is building the largest sky survey to date. The benchmarks involve sequential searches over hundreds of terabytes.
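The balance figures quoted in the abstract are ratios of I/O bandwidth and memory size to CPU cycle rate. A minimal Python sketch of that arithmetic follows. The function names `balance_ratios` and `scan_time_hours` and all input values are hypothetical, chosen only for illustration; they are not GrayWulf measurements or the team's benchmark code.

```python
def balance_ratios(cpu_cycles_per_sec, io_bytes_per_sec, memory_bytes):
    """Return (I/O bytes/sec per CPU cycle/sec, memory bytes per CPU cycle/sec).

    Hypothetical helper: expresses system balance as the two ratios
    mentioned in the abstract.
    """
    if cpu_cycles_per_sec <= 0:
        raise ValueError("cpu_cycles_per_sec must be positive")
    io_ratio = io_bytes_per_sec / cpu_cycles_per_sec
    memory_ratio = memory_bytes / cpu_cycles_per_sec
    return io_ratio, memory_ratio


def scan_time_hours(dataset_bytes, io_bytes_per_sec):
    """Estimate hours needed for one sequential pass over a data set.

    Assumes the aggregate I/O bandwidth is sustained for the whole scan.
    """
    if io_bytes_per_sec <= 0:
        raise ValueError("io_bytes_per_sec must be positive")
    return dataset_bytes / io_bytes_per_sec / 3600.0


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from the abstract).
    cycles = 10**12           # aggregate CPU cycles per second
    io_bandwidth = 5 * 10**11 # aggregate sequential I/O in bytes per second
    memory = 10**12           # total memory in bytes

    io_ratio, memory_ratio = balance_ratios(cycles, io_bandwidth, memory)
    print(f"I/O balance: {io_ratio} bytes/sec per cycle/sec")
    print(f"Memory balance: {memory_ratio} bytes per cycle/sec")
    print(f"Scan time: {scan_time_hours(100 * 10**12, io_bandwidth)} hours")
```

With ratios like these, the time for a sequential search is set by aggregate I/O bandwidth rather than by peak FLOPS, which is the sense in which the abstract calls a system balanced.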
IEEE Computer Society / ACM   20 Years - Unleashing the Power of HPC