Dataset-manipulation tools for simulated collider events


Simulated collider events are an essential tool in particle physics: most physics analyses are built around comparisons of real collider data with simulations produced by “Monte Carlo (MC) generator” programs, as are studies to design new analyses, detectors, or colliders. They are also an essential means of communication between experimental and theoretical physicists.

At present there are two main data formats for MC event datasets: HepMC for full-detail events and LHE for the much smaller “hard process” events. There is also a newer HDF5-based format for hard-process calculations on massively parallel high-performance computing (HPC) facilities. But while programming interfaces exist for these (in particular the main formats), there is a lack of user-friendly command-line tools for event-file inspection, visualisation, and manipulation.

Task ideas

This project will provide a coherent set of command-line tools for inspecting, manipulating, and displaying MC event datasets, bringing together existing code snippets into an official set of standard commands to be used across particle physics.
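
As a rough illustration of the kind of inspection such a tool might perform, the sketch below counts the events in LHE-formatted text and tallies the final-state particle species. It is only a minimal sketch under stated assumptions, not part of any existing package: the function name summarise_lhe and the SAMPLE record are hypothetical, the sample numbers are made up and not physically meaningful, and the parsing relies only on the standard LHE layout (an event header line beginning with the particle count, followed by one line per particle whose first two fields are the PDG ID and the status code).

```python
import re
from collections import Counter

# Matches the body of each <event> ... </event> block in LHE text.
EVENT_RE = re.compile(r"<event(?:\s[^>]*)?>(.*?)</event>", re.DOTALL)


def summarise_lhe(text):
    """Return (number of events, Counter of final-state PDG IDs).

    Hypothetical helper for illustration only.
    """
    n_events = 0
    final_state = Counter()
    for match in EVENT_RE.finditer(text):
        n_events += 1
        rows = [ln.split() for ln in match.group(1).strip().splitlines()]
        rows = [r for r in rows if r]  # drop blank lines
        if not rows:
            continue
        n_particles = int(rows[0][0])  # first header field: particle count
        for fields in rows[1:1 + n_particles]:
            pdg_id, status = int(fields[0]), int(fields[1])
            if status == 1:  # status 1 marks outgoing final-state particles
                final_state[pdg_id] += 1
    return n_events, final_state


# Made-up two-event sample, used only to exercise the parser.
SAMPLE = """<LesHouchesEvents version="1.0">
<init>
</init>
<event>
 4 1 1.0 91.2 0.0078 0.118
 2 -1 0 0 501 0 0.0 0.0 45.6 45.6 0.0 0.0 9.0
 -2 -1 0 0 0 501 0.0 0.0 -45.6 45.6 0.0 0.0 9.0
 11 1 1 2 0 0 10.0 0.0 44.5 45.6 0.0 0.0 9.0
 -11 1 1 2 0 0 -10.0 0.0 -44.5 45.6 0.0 0.0 9.0
</event>
<event>
 4 1 1.0 91.2 0.0078 0.118
 1 -1 0 0 502 0 0.0 0.0 45.6 45.6 0.0 0.0 9.0
 -1 -1 0 0 0 502 0.0 0.0 -45.6 45.6 0.0 0.0 9.0
 13 1 1 2 0 0 5.0 0.0 45.3 45.6 0.1 0.0 9.0
 -13 1 1 2 0 0 -5.0 0.0 -45.3 45.6 0.1 0.0 9.0
</event>
</LesHouchesEvents>
"""

n_events, final_state = summarise_lhe(SAMPLE)
print(f"events: {n_events}")
for pdg_id, count in sorted(final_state.items()):
    print(f"  PDG {pdg_id:>4}: {count}")
```

A real command-line tool would presumably read (possibly compressed) files given as arguments and use the existing HepMC and LHE programming interfaces rather than a regular expression; the regex approach here is just a simplification to keep the example self-contained.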

Familiarity with C++, Python, bash, and git is essential; Graphviz and HDF5 can be learned during the project.


