A long-standing mystery of fundamental physics is the existence of dark matter (DM), a type of matter that interacts very little with ordinary matter, yet is supported by a range of astrophysical and cosmological observations and is roughly five times more abundant than ordinary matter in the universe. Several Large Hadron Collider (LHC) experiments are conducting searches aimed at detecting dark matter. Unsupervised and semi-supervised outlier detection techniques are well suited to these searches because they cast a wide net over the many ways dark matter might manifest: rather than relying on the details of a specific physics model, they learn to separate rare signals from the background they have been trained on. Developing innovative search techniques for probing dark matter signatures is crucial for broadening the DM search program at the LHC. BEAD is a Python package that uses deep-learning-based methods for anomaly detection in high-energy physics (HEP) data for such new physics searches. BEAD has been designed with modularity in mind, so that various unsupervised latent variable models can be used for any task.
BEAD has five main running modes:
Data handling: Handles the supported file types and the conversions between them, and pre-processes the data for input to the deep learning (DL) models.
Training: Trains a model to learn implicit representations of background data that may come from multiple sources (generators), yielding a single, enriched latent representation of the background.
Inference: Using a model trained on an enriched background, the user can feed in samples in which to search for anomalies.
Plotting: After running Inference or Training, one can generate plots. These include performance plots as well as various visualizations of the learned data.
Diagnostics: Enabling this mode runs profilers that measure a host of metrics on the usage of the compute node (CPU and GPU metrics) to help optimize the code.
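The Training and Inference modes can be pictured with a toy sketch. It is not BEAD's API and not one of its models: it is a standard-library stand-in in which a one-dimensional linear latent model, fitted in closed form with PCA (which coincides with the optimum of a linear autoencoder), takes the place of the deep networks that BEAD trains. All names (`fit_background`, `encode`, `decode`, `anomaly_score`, `toy_generator`), the toy data, and the 99th-percentile threshold are assumptions made for illustration only.

```python
import math
import random


def fit_background(samples):
    """Toy 'training': fit a 1-D linear latent model to 2-D background samples.

    Closed-form PCA is used; it coincides with the optimum of a linear
    autoencoder that has a single latent unit.
    """
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    cxx = sum((x - mx) ** 2 for x, _ in samples) / n
    cyy = sum((y - my) ** 2 for _, y in samples) / n
    cxy = sum((x - mx) * (y - my) for x, y in samples) / n
    # Angle of the leading principal axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return {"mean": (mx, my), "axis": (math.cos(theta), math.sin(theta))}


def encode(model, sample):
    """Project a sample onto the learned axis to get its latent coordinate."""
    (mx, my), (ax, ay) = model["mean"], model["axis"]
    return (sample[0] - mx) * ax + (sample[1] - my) * ay


def decode(model, z):
    """Map a latent coordinate back to the 2-D feature space."""
    (mx, my), (ax, ay) = model["mean"], model["axis"]
    return (mx + z * ax, my + z * ay)


def anomaly_score(model, sample):
    """Squared reconstruction error: large when a sample is unlike the background."""
    rx, ry = decode(model, encode(model, sample))
    return (sample[0] - rx) ** 2 + (sample[1] - ry) ** 2


def toy_generator(rng, n, noise):
    """Hypothetical background generator: points scattered around the line y = 2x."""
    points = []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        points.append((t, 2.0 * t + rng.gauss(0.0, noise)))
    return points


rng = random.Random(0)
# Background merged from two toy "generators" with different noise levels.
background = toy_generator(rng, 500, 0.1) + toy_generator(rng, 500, 0.2)

# Training: learn the latent representation from background only.
model = fit_background(background)

# Threshold taken as the 99th percentile of the background scores (an assumption).
background_scores = sorted(anomaly_score(model, s) for s in background)
threshold = background_scores[(99 * len(background_scores)) // 100]

# Inference: score unseen samples and flag those above the threshold.
candidates = [(1.0, 2.0), (2.0, -4.0)]
flags = [anomaly_score(model, s) > threshold for s in candidates]
print(flags)
```

The first candidate lies on the background line and reconstructs well, while the second lies far from it and receives a large score. In a deep latent variable model the same pattern holds, with the closed-form fit replaced by gradient-based training and the projection replaced by learned encoder and decoder networks.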
The package is under active development. The student on this project will work on the machine learning models available in BEAD and implement new models to perform anomaly detection, initially on simulated data.
Possible projects include:
Ideas from the student working on this project are also welcome.
Improved performance of selected models, with documentation and figures of merit that may include:
PyTorch