TMVA SOFIE - GPU Support for Machine Learning Inference

Description

SOFIE (System for Optimized Fast Inference code Emit) is a Machine Learning inference engine within TMVA (Toolkit for Multivariate Data Analysis) in ROOT. SOFIE offers a parser capable of converting ML models trained in Keras or PyTorch, or saved in the ONNX format, into its own Intermediate Representation (IR), and it generates C++ functions that can be easily invoked for fast inference of trained neural networks. From the IR, SOFIE produces C++ header files that can be seamlessly included and used in a ‘plug-and-go’ style.
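
For concreteness, the snippet below follows the pattern shown in the ROOT SOFIE tutorials: an ONNX model is parsed into the IR, a header is generated, and the generated Session class is used for inference. The model name Linear_16 and the input size are placeholders taken from the tutorials, not fixed parts of the API.

```cpp
// Generation step: parse an ONNX model into SOFIE's IR and emit a C++ header.
#include "TMVA/RModel.hxx"
#include "TMVA/RModelParser_ONNX.hxx"

void generate() {
   TMVA::Experimental::SOFIE::RModelParser_ONNX parser;
   TMVA::Experimental::SOFIE::RModel model = parser.Parse("Linear_16.onnx");
   model.Generate();                        // build the inference code in memory
   model.OutputGenerated("Linear_16.hxx");  // write the header and weight file
}
```

The generated header can then be compiled into any C++ application:

```cpp
// Inference step: include the generated header and run the model.
#include "Linear_16.hxx"
#include <vector>

int main() {
   std::vector<float> input(100, 1.f);    // placeholder input size
   TMVA_SOFIE_Linear_16::Session session; // loads the stored weights
   std::vector<float> output = session.infer(input.data());
   return 0;
}
```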

SOFIE currently supports a wide range of Machine Learning operators defined by the ONNX standard, as well as a Graph Neural Network (GNN) implementation: it can parse and run inference on GNNs trained with DeepMind's Graph Nets library.

As SOFIE continues to evolve, there is a need to enable inference on GPUs as well. This project aims to explore different GPU stacks (such as CUDA, ROCm, and alpaka) and to implement GPU-based inference functionality in SOFIE. A SYCL implementation of SOFIE, developed in 2023, already exists and can serve as a reference for future development.
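
To make the target concrete, the following is a minimal, purely illustrative CUDA sketch (not existing SOFIE code) of what an offloaded operator could look like: a naive kernel for the dense-layer computation y = W*x + b, which SOFIE's generated CPU code typically delegates to BLAS. All names here are hypothetical.

```cpp
#include <cuda_runtime.h>

// Hypothetical kernel: one thread computes one element of y = W * x + b,
// with W of shape (rows x cols) stored row-major, x of length cols,
// and y, b of length rows.
__global__ void dense_forward(const float* W, const float* x, const float* b,
                              float* y, int rows, int cols) {
   int row = blockIdx.x * blockDim.x + threadIdx.x;
   if (row < rows) {
      float acc = b[row];
      for (int c = 0; c < cols; ++c)
         acc += W[row * cols + c] * x[c];
      y[row] = acc;
   }
}
```

A production version would more likely rely on vendor libraries such as cuBLAS or rocBLAS, or on a portability layer like alpaka or SYCL; choosing among these options is exactly the design space this project is meant to explore.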

Task ideas

In this project, the contributor will gain experience with GPU programming and its role in Machine Learning inference. They will start by understanding SOFIE and running inference on CPUs. After researching GPU stacks and how they can be integrated with SOFIE, the contributor will implement GPU support for inference, ensuring that the code is efficient and well integrated with the chosen GPU technology.
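
One possible integration direction (a sketch under assumptions, not the project's prescribed design) is for the generated code to manage device memory the same way the existing Session class manages host buffers: upload weights once at construction and copy only inputs and outputs per call, as in the hypothetical CUDA example below, which reuses the dense_forward kernel sketched earlier.

```cpp
#include <cuda_runtime.h>
#include <vector>

// Kernel from the earlier sketch (hypothetical, not part of SOFIE).
__global__ void dense_forward(const float* W, const float* x, const float* b,
                              float* y, int rows, int cols);

// Hypothetical GPU-aware session: weights live on the device for the
// lifetime of the object; each infer() call copies only input and output.
class GpuSession {
   float *fW, *fB, *fX, *fY;
   int fRows, fCols;
public:
   GpuSession(const std::vector<float>& W, const std::vector<float>& b,
              int rows, int cols) : fRows(rows), fCols(cols) {
      cudaMalloc(&fW, W.size() * sizeof(float));
      cudaMalloc(&fB, b.size() * sizeof(float));
      cudaMalloc(&fX, cols * sizeof(float));
      cudaMalloc(&fY, rows * sizeof(float));
      cudaMemcpy(fW, W.data(), W.size() * sizeof(float), cudaMemcpyHostToDevice);
      cudaMemcpy(fB, b.data(), b.size() * sizeof(float), cudaMemcpyHostToDevice);
   }
   ~GpuSession() { cudaFree(fW); cudaFree(fB); cudaFree(fX); cudaFree(fY); }

   std::vector<float> infer(const float* x) {
      cudaMemcpy(fX, x, fCols * sizeof(float), cudaMemcpyHostToDevice);
      int threads = 256;
      int blocks = (fRows + threads - 1) / threads;
      dense_forward<<<blocks, threads>>>(fW, fX, fB, fY, fRows, fCols);
      std::vector<float> y(fRows);
      cudaMemcpy(y.data(), fY, fRows * sizeof(float), cudaMemcpyDeviceToHost);
      return y;
   }
};
```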

Expected results and milestones

Requirements

Mentors

Additional Information

Corresponding Project

Participating Organizations