SC Conference - Activity Details

Coordinated Fault Tolerance in High-end Computing Environments

Primary Session Leader: Pete Beckman (Argonne National Laboratory)
Secondary Session Leaders:

The Coordinated Infrastructure for Fault Tolerant Systems (CIFTS) initiative
provides a standard framework, the Fault Tolerance Backplane (FTB), through
which any component of the software stack can report faults or be notified of
them via a common interface, enabling coordinated fault tolerance and
recovery. At SC'07, an enthusiastic audience from industry, academia, and
research institutions participated in the CIFTS BOF.
Building on that success, the objectives of the SC'08 BOF are:
1. Discuss the experiences gained and challenges faced in comprehensive fault
management on petascale leadership machines, and the impact of the CIFTS
framework in this environment. Teams developing FTB-enabled software, such as
MVAPICH2, MPICH2, Open MPI, Cobalt, and others, will share their experiences.
2. Discuss the recent enhancements and planned developments for CIFTS and
solicit audience feedback.
3. Bring together individuals who are responsible for high-end, petascale
computing infrastructures and who have an interest in developing fault
tolerance specifically for these environments.