SC Conference - Activity Details
50.5 Mflops/dollar and 8.5 Tflops Cosmological N-body Simulation on a GPU Cluster
Authors:
Tsuyoshi Hamada (Nagasaki University)
Keigo Nitadori (University of Tokyo)
Tomonari Masada (Nagasaki University)
Makoto Taiji (RIKEN)
Posters Session
Tuesday, 05:15PM - 07:00PM
Room Rotunda Lobby
Abstract:
Many GPGPU studies have been reported in astrophysical N-body
simulation, which is one of the grand-challenge problems in the
computational sciences. However, most of these studies are based on the
simple O(N^2) algorithm, and their performance was no higher than that
of conventional CPUs running O(N log N) algorithms such as the tree or
Particle-Particle Particle-Mesh algorithms. Because these algorithms are
difficult to implement efficiently on GPUs, a GPU cluster has had no
practical advantage over general PC clusters for N-body simulation. In
this paper, we report a new parallel implementation of the tree
algorithm that works with high efficiency on GPUs. Our novel tree code
achieves N-body simulation on a GPU cluster at higher performance than
on general PC clusters. In practice, we performed a cosmological
simulation with 562 million particles on a GPU cluster using 128
GeForce 8800 GTS GPUs at a cost of 168,172 dollars. The sustained
performance was 20.1 Tflops, which is equivalent to 8.50 Tflops on a
general CPU. The achieved cost/performance was 50.5 Mflops/dollar.
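
The headline figures can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the authors' code. It assumes the 50.5 Mflops/dollar figure is the 8.50 Tflops CPU-equivalent rate divided by the 168,172 dollar cost, which matches the pairing in the title. The function names are hypothetical. The operation-count comparison ignores all constant factors and uses a base-2 logarithm, so it gives only a rough sense of why an O(N^2) code struggles against an O(N log N) one at this particle count.

```python
import math

# Figures quoted in the abstract.
N_PARTICLES = 562_000_000
COST_DOLLARS = 168_172
SUSTAINED_TFLOPS = 20.1       # sustained rate reported for the GPU cluster
CPU_EQUIVALENT_TFLOPS = 8.50  # stated equivalent on a general CPU


def mflops_per_dollar(tflops: float, dollars: float) -> float:
    """Convert Tflops to Mflops (x 1e6) and divide by the hardware cost."""
    return tflops * 1e6 / dollars


def direct_to_tree_ratio(n: int) -> float:
    """Rough ratio of N^2 to N*log2(N) operation counts.

    Assumption: constant factors are ignored and the logarithm is base 2,
    so this is an order-of-magnitude illustration, not a measured speedup.
    """
    return (n * n) / (n * math.log2(n))


if __name__ == "__main__":
    ratio = mflops_per_dollar(CPU_EQUIVALENT_TFLOPS, COST_DOLLARS)
    print(f"cost/performance: {ratio:.1f} Mflops/dollar")  # 50.5
    print(f"N^2 vs N log N at N = {N_PARTICLES}: "
          f"about {direct_to_tree_ratio(N_PARTICLES):.2e}x more operations")
```

Using the 20.1 Tflops sustained figure in place of the CPU-equivalent one would give a larger ratio; the abstract does not state how the 8.50 Tflops equivalent was derived, so the sketch takes it as given.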