Geant4-FastSim - Transformer-based architecture for fast shower simulation

Description

In Large Hadron Collider (LHC) experiments, at CERN in Geneva, the calorimeter is a detector that measures the energy of particles. These particles interact electromagnetically and/or hadronically with the material of the calorimeter, creating cascades of secondary particles or showers. Describing the showering process relies on simulation methods that precisely define all particle interactions with matter. A detailed and accurate simulation is based on the Geant4 toolkit. The simulation of showers, with large amounts of particles created and tracked, is inherently slow. With the upcoming high luminosity upgrade of the LHC with more complex events and a much increased trigger rate, the amount of required simulated events will increase. Machine Learning (ML) techniques such as generative models are used as fast simulation alternatives to learn to generate showers in a calorimeter, i.e. simulating the calorimeter response to certain particles.

Considering the recent advancements brought by foundation models (GPT-3, Dall-E-2, etc.) in AI, the idea is to study the applicability of a large-scale transformer-based model for fast simulation. In this regard, we already have a model in development with an architecture similar to ViT (Vision Transformers). Although a transformer is supposed to be a generalized architecture in terms of how it can handle any type of data (text, images, audio, etc.), its downside is having a tradeoff between the computation requirements and the generalizability of the architecture. As generalizability decreases, inductive bias increases, and hence it gets computationally efficient and vice-versa. There are a lot of successors to ViT e.g., Swin Transformer that emphasises on incorporating inductive biases for images to make the architecture computationally efficient. However, particle showers tend to be considerably different from images in terms of sparsity, range of values, dimensions, etc. Hence, the goal of this project would be to explore transformer-based architectures with custom attention, patching, positional encodings, and the overall design.

Furthermore, a byproduct of this project is that the student will get to work with generative transformers which are currently at the forefront of AI research, and learn to use them in the context of high-granularity shower data.

First Steps

Understand how transformers work and how the original transformer architecture is adapted for images.
Understand the shower data.
Propose modifications that can bring benefit when using transformers for shower data.

Project Milestones

Survey promising transformer-based architectures.
Implement the transformer-based architecture.
Perform ablation studies around its hyperparameters.

Expected Results

A custom transformer-based architecture specifically suited for particle showers.

Requirements

Solid knowledge of ML, DL, and Transformers
Strong Python, TensorFlow/PyTorch skills

Evaluation Tasks

Python and ML exercises.

Mentors

Dalila Salamani (CERN)
Piyush Raikwar (CERN)

Additional Information

Difficulty level (low / medium / high): medium
Duration: 350 hours
Mentor availability: June-October

Corresponding Project

Geant4

Participating Organizations

CERN