Rucio and CS3API to enable data management for the ScienceMesh cloud

Description

CERNBox is a sync-and-share collaborative cloud storage solution at CERN, used by more than 37,000 users and storing over 12 PB of data. The service has evolved from a simple sync-and-share platform into a collaboration hub that empowers scientists. It has been upgraded to include the Reva daemon at its core, implementing the vendor-neutral CS3APIs to bridge the gap between heterogeneous storage systems and a wide variety of application providers.

Rucio is an open-source software framework that enables scientific collaborations to organize, manage, monitor, and access their distributed data and dataflows across heterogeneous infrastructures. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and is continuously enhanced to support diverse scientific communities. Since 2016, Rucio has orchestrated multiple exabytes of data access and data transfers globally.

The goal of this proposal is to add support in Rucio for connecting to CS3-compliant clouds. Such an integration will enable Rucio to become a key player in the pan-European ScienceMesh federated cloud and a key technology for orchestrating data transfers among European institutions.

Tasks

The tasks are as follows:

Requirements

Expected results

By the end of GSoC’21, we expect the deliverable to be a Rucio plugin that can connect to CS3-compatible clouds, demonstrated by data transfers between two endpoints in ScienceMesh.
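
As a rough illustration of the expected demonstration, the sketch below models two storage endpoints behind a common interface and copies a file from one to the other. Every name in it (`StorageEndpoint`, `InMemoryEndpoint`, `transfer`) is hypothetical and chosen for illustration only; it uses neither the actual Rucio plugin interface nor the CS3APIs, and a real plugin would replace the in-memory stand-in with calls to a CS3-compatible cloud.

```python
from abc import ABC, abstractmethod


class StorageEndpoint(ABC):
    """Hypothetical minimal interface a storage plugin might expose."""

    @abstractmethod
    def exists(self, path: str) -> bool:
        """Return True if a file is stored at `path`."""

    @abstractmethod
    def get(self, path: str) -> bytes:
        """Return the content stored at `path`."""

    @abstractmethod
    def put(self, path: str, data: bytes) -> None:
        """Store `data` at `path`, overwriting any existing content."""


class InMemoryEndpoint(StorageEndpoint):
    """Stand-in for a CS3-compatible cloud (assumption: not a real client)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._files: dict[str, bytes] = {}

    def exists(self, path: str) -> bool:
        return path in self._files

    def get(self, path: str) -> bytes:
        return self._files[path]

    def put(self, path: str, data: bytes) -> None:
        self._files[path] = data


def transfer(src: StorageEndpoint, dst: StorageEndpoint, path: str) -> int:
    """Copy `path` from `src` to `dst` and return the number of bytes copied."""
    if not src.exists(path):
        raise FileNotFoundError(f"{path} not found on source endpoint")
    data = src.get(path)
    dst.put(path, data)
    # Read the file back to confirm the destination holds identical content.
    if dst.get(path) != data:
        raise IOError(f"verification failed for {path}")
    return len(data)


if __name__ == "__main__":
    site_a = InMemoryEndpoint("site-a")
    site_b = InMemoryEndpoint("site-b")
    site_a.put("/data/run1.root", b"payload")
    copied = transfer(site_a, site_b, "/data/run1.root")
    print(f"copied {copied} bytes from {site_a.name} to {site_b.name}")
```

Keeping the endpoints behind one small interface means the transfer logic stays independent of the storage backend, which is the property the plugin would need in order to work across heterogeneous ScienceMesh sites.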

Mentors

  1. Rucio GitHub
  2. Rucio Documentation
  3. Rucio system overview journal article (Springer)
  4. Rucio operational experience article (IEEE Computer Society)

Corresponding Project

Participating Organizations