SC Conference - Activity Details

Parallel Program Analysis and Optimization for High-Performance Computing

Seung-Jai Min (Purdue University)
Doctoral Research Showcase Session
Thursday, 03:30PM - 03:45PM
Room 17A/17B
To increase the productivity of parallel programming in today's diverse and heterogeneous programming environments, it is beneficial to develop an automatic program translation framework that converts programs written in one parallel programming model into forms that expose the most efficient parallelism of the target architecture. We have developed two source-to-source translation systems: the Lean Distributed Shared Memory (LDSM) system, which executes OpenMP programs on distributed-memory clusters, and the OpenMP-to-GPU translator, which converts OpenMP programs into CUDA GPU programs. The goal of this research is to extend the ease of shared-memory parallel programming to distributed-memory clusters and GPUs. In this talk, we present translation methods that efficiently convert OpenMP programs into MPI and CUDA GPU programs, and we discuss optimization techniques that improve the performance of the translated programs for both regular and irregular applications. We evaluate the translation framework using OpenMP programs from the SPEC OMP2001 and NAS Parallel Benchmarks suites.
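As a rough illustration of the kind of transformation a source-to-source translator of this sort performs, the sketch below contrasts an OpenMP work-sharing loop with a hand-written SPMD version in which each process runs only its own block of iterations. It is not the LDSM or OpenMP-to-GPU output. The names saxpy_openmp, saxpy_spmd, and block_range are hypothetical, and the block partitioning is an assumed scheme chosen for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Shared-memory original: the OpenMP runtime divides the iterations
 * among threads. Without OpenMP enabled the pragma is ignored and the
 * loop runs serially with the same result. */
void saxpy_openmp(size_t n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical helper: compute the contiguous block [*lo, *hi) of the
 * iteration space 0..n-1 owned by `rank` out of `nprocs` processes.
 * The first (n % nprocs) ranks receive one extra iteration. */
void block_range(size_t n, int rank, int nprocs, size_t *lo, size_t *hi)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    size_t base = n / (size_t)nprocs;
    size_t extra = n % (size_t)nprocs;
    size_t r = (size_t)rank;
    *lo = r * base + (r < extra ? r : extra);
    *hi = *lo + base + (r < extra ? 1 : 0);
}

/* SPMD form of the same loop: every process calls this function with
 * its own rank and updates only the iterations it owns. In an actual
 * MPI program, rank and nprocs would come from MPI_Comm_rank and
 * MPI_Comm_size, and any data read by other processes would have to be
 * communicated explicitly. */
void saxpy_spmd(size_t n, float a, const float *x, float *y,
                int rank, int nprocs)
{
    size_t lo, hi;
    block_range(n, rank, nprocs, &lo, &hi);
    for (size_t i = lo; i < hi; i++)
        y[i] = a * x[i] + y[i];
}
```

Calling saxpy_spmd once for each rank from 0 to nprocs - 1 covers every iteration exactly once, so the combined result matches saxpy_openmp. A GPU version would follow the same idea at a finer grain, with each CUDA thread typically handling one index or a small stride of indices.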
IEEE Computer Society / ACM   20 YEARS - UNLEASHING THE POWER OF HPC