Award Finalist/Winner
Student Contribution

SC Conference - Activity Details

Policy-driven Data Management for Distributed Scientific Collaborations using a Rule Engine

Sara Alspaugh (University of Virginia)
ACM Student Competition Session
Tuesday, 05:15PM - 07:00PM
Room Rotunda Lobby
Data-intensive distributed science applications depend on efficient access to data sets and to high performance computational resources. Data sets generated on an experimental apparatus or on computational resources must be distributed widely to scientists in the collaboration, often according to policies set by the collaborating institutions. These policies pertain to the dissemination, security, and reliability of data sets. In this work, we integrate an open source rule engine with existing services for grid data management to perform policy-driven data distribution. We implement and evaluate two realistic distribution policies for distributed science applications. The first policy specifies a tier-based pattern to distribute published data products in a manner similar to that used in high energy physics applications. The second policy maintains a specified number of replicas for each file. Our initial results indicate that a rule engine is well-suited to the problem of policy-based data management for distributed science applications.
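As an illustration of the two policies described above, the following is a minimal sketch in Python, not the rule engine or grid services used in this work (the abstract does not name them). The in-memory `Catalog`, the function names `replication_rule`, `tier_rule`, and `run_until_stable`, and all site names are hypothetical. Each rule inspects the catalog and proposes transfers, and the rules are re-evaluated until none of them fires, which loosely mimics how a forward-chaining rule engine reacts to changes in its facts.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical in-memory replica catalog: file name -> set of sites with a copy."""
    replicas: dict = field(default_factory=dict)

    def sites_for(self, name):
        return self.replicas.setdefault(name, set())


def replication_rule(catalog, name, sites, min_replicas):
    """Propose transfers so that `name` ends up with at least `min_replicas` copies."""
    held = catalog.sites_for(name)
    missing = max(0, min_replicas - len(held))
    candidates = [s for s in sites if s not in held]
    return [(name, s) for s in candidates[:missing]]


def tier_rule(catalog, name, tiers):
    """Propose transfers that push `name` from each site to its child sites.

    `tiers` maps a parent site to the sites one tier below it.
    """
    held = catalog.sites_for(name)
    transfers = []
    for parent, children in tiers.items():
        if parent in held:
            transfers += [(name, c) for c in children if c not in held]
    return transfers


def apply_transfers(catalog, transfers):
    """Stand-in for real data movement: record the new replicas in the catalog."""
    for name, site in transfers:
        catalog.sites_for(name).add(site)


def run_until_stable(catalog, rules):
    """Re-evaluate every rule until no rule proposes a transfer."""
    while True:
        transfers = [t for rule in rules for t in rule(catalog)]
        if not transfers:
            return
        apply_transfers(catalog, transfers)
```

In this sketch, a file published at a top-tier site reaches the lower tiers over successive rule evaluations, because each pass only copies data one tier down. The replica-count rule fires again whenever a file's replica count falls below the threshold.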
IEEE Computer Society / ACM    20 YEARS - UNLEASHING THE POWER OF HPC