SC Conference - Activity Details

Automatic Linear Transform Program Generation for the Cell BE

Srinivas Chellappa (Carnegie Mellon University)
Franz Franchetti (Carnegie Mellon University)
Markus Pueschel (Carnegie Mellon University)
Posters Session
Tuesday, 05:15PM - 07:00PM
Room Rotunda Lobby
The complexity of the Cell BE's architecture makes it difficult and time-consuming to develop multithreaded, vectorized, high-performance numerical libraries for it. Our approach to this problem is to use Spiral, a program generation system, to automatically generate and optimize linear transform libraries for the Cell. We first extend the Spiral framework to the Cell architecture so that it automatically generates high-performance discrete Fourier, cosine, and sine transform kernels that run on a single SPE. The performance of our kernels is comparable to hand-tuned code where available, and reaches 14 to 20 Gflop/s on a single SPE for input vectors resident in the local memories. Next, we produce optimized multithreaded code for the discrete Fourier transform that runs on multiple SPEs and obtains up to a 2x speed-up on 4 SPEs. Our poster will provide details of the formal methods used by our approach and will present our latest performance results.
IEEE Computer Society / ACM - 20 Years - Unleashing the Power of HPC