PODIO, or plain-old-data I/O, is a C++ library to support the creation and handling of data models in particle physics. It is based on the idea of employing plain-old-data (POD) data structures wherever possible, avoiding deep object hierarchies and virtual inheritance. This improves runtime performance and simplifies the implementation of persistency services (writing the in-memory objects to disk). To support the use of modern software technologies, PODIO was written with concurrency in mind and gives basic support for vectorisation technologies. PODIO is already actively used by the Future Circular Collider (FCC) project and is under evaluation by the Linear Collider community (ILC).
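To illustrate why POD structures simplify persistency, here is a minimal sketch in plain C++. It is not PODIO code: the `HitData` struct and the `toBytes`/`fromBytes` helpers are hypothetical names chosen for this example. It only shows that a collection of trivially copyable structs can be written to and read back from a flat byte buffer without any per-object serialisation logic, which is the property a persistency backend can build on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical POD type for illustration; not a class generated by PODIO.
struct HitData {
  double x;
  double y;
  double z;
  float energy;
  int cellID;
};

// A POD type has no virtual functions and a fixed, predictable layout,
// so its bytes can be copied directly.
static_assert(std::is_trivially_copyable<HitData>::value,
              "HitData must be trivially copyable");
static_assert(std::is_standard_layout<HitData>::value,
              "HitData must have standard layout");

// Copy a whole collection into a flat byte buffer in one step.
std::vector<unsigned char> toBytes(const std::vector<HitData>& hits) {
  std::vector<unsigned char> buffer(hits.size() * sizeof(HitData));
  if (!hits.empty()) {
    std::memcpy(buffer.data(), hits.data(), buffer.size());
  }
  return buffer;
}

// Rebuild the collection from the byte buffer.
std::vector<HitData> fromBytes(const std::vector<unsigned char>& buffer) {
  // The buffer must hold a whole number of HitData records.
  assert(buffer.size() % sizeof(HitData) == 0);
  std::vector<HitData> hits(buffer.size() / sizeof(HitData));
  if (!hits.empty()) {
    std::memcpy(hits.data(), buffer.data(), buffer.size());
  }
  return hits;
}
```

A real backend would additionally have to record the struct layout and handle relations between objects; this sketch covers only the raw data copy.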
PODIO already has a persistency implementation based on ROOT I/O, but is written in such a way that adding other persistency backends is straightforward.
The Hierarchical Data Format, version 5 (HDF5), is a generic data format used for scientific data in many fields. It enjoys wide support, with many different implementations and optimisations, which makes it well suited to storing PODIO data.
This GSoC task is to implement an HDF5 backend for PODIO.
There is an evaluation exercise for this project that we ask all prospective students to complete.