Student Contribution
SC Conference - Activity Details
Massively Parallel Genomic Sequence Search on the Blue Gene/P Architecture
Authors:
Heshan Lin (North Carolina State University)
Pavan Balaji (Argonne National Laboratory)
Ruth Poole (IBM Corporation)
Carlos Sosa (IBM Corporation)
Xiaosong Ma (North Carolina State University)
Wu-chun Feng (Virginia Tech)
Papers Session: Biomedical Informatics
Wednesday, 03:30PM - 04:00PM
Room: Ballroom E
Abstract:
This paper presents our first experiences in mapping
and optimizing genomic sequence search onto the massively
parallel IBM Blue Gene/P (BG/P) platform. Specifically, we
performed our work on mpiBLAST, a parallel sequence-search
code that has been optimized on numerous supercomputing
environments. In doing so, we identify several critical performance
issues. Consequently, we propose and study different
approaches for mapping sequence-search and parallel I/O tasks
on such massively parallel architectures. We demonstrate that our
optimizations can deliver nearly linear scaling (93% efficiency)
on up to 32,768 cores of BG/P. In addition, we show that such
scalability enables us to complete a large-scale bioinformatics
problem (searching a microbial genome database
against itself to support the discovery of missing genes in genomes)
in only a few hours on BG/P. Previously, this problem was
viewed as computationally intractable in practice.