Multi-thread safe resource monitoring infrastructure for the ATLAS experiment

Introduction

ATLAS is one of the four major experiments at the Large Hadron Collider (LHC) at CERN. It is a general-purpose particle physics experiment run by an international collaboration and is designed to exploit the full discovery potential and the huge range of physics opportunities that the LHC provides.

Athena is the ATLAS software framework that manages almost all ATLAS bulk production workflows: the simulation of LHC events inside the ATLAS detector, the reconstruction of physics quantities from the detector’s output, and the preparation of slimmed-down files that physicists use for their analyses. It is written in C++ and Python and is based on the Gaudi framework, which is also used by the LHCb experiment. As part of the preparation for LHC Run-3 data-taking, which is expected to begin in 2021, the Athena framework is being upgraded to run in a multi-threaded (MT) environment. This project is named AthenaMT.

Description

The performance of the ATLAS simulation and reconstruction code is critical for processing ever-growing datasets within constrained computing resources, even more so as we look forward to data from the next stage of the LHC, the high-luminosity upgrade (HL-LHC). Understanding Athena’s performance helps developers improve the quality of the software and ensures that the ATLAS production workflows run smoothly on the existing computing infrastructure. It is essential to closely monitor CPU, memory, and disk usage, to identify bottlenecks as they appear, and to fix any issues rapidly. Without this monitoring, the repercussions can be as severe as losing data.

The existing Athena infrastructure for monitoring the resource usage of ATLAS jobs can only handle single-threaded tasks. The goal of this project is to design and implement a new Athena performance monitoring infrastructure that handles MT jobs natively. Although the monitoring software is closely tied to the Gaudi/Athena framework, the project is fairly self-contained and requires only a minimal understanding of the underlying code base. The code development will predominantly be in C++.
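
To make the thread-safety requirement concrete, the following is a minimal sketch of what MT-safe bookkeeping could look like. It is not Athena or Gaudi code: the names ResourceMonitor, ScopedTimer, Measurement, simulateEventLoop and the component label "DummyAlg" are hypothetical, and the sketch only accumulates wall-clock time and call counts, whereas the real infrastructure would also need CPU, memory, and disk measurements.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// All names here are hypothetical; none of this is Athena or Gaudi API.

// Accumulated statistics for one monitored component.
struct Measurement {
  std::uint64_t calls = 0;
  std::chrono::nanoseconds wallTime{0};
};

// Collects measurements from many threads; a single mutex guards the map.
class ResourceMonitor {
public:
  void record(const std::string& component, std::chrono::nanoseconds elapsed) {
    std::lock_guard<std::mutex> lock(m_mutex);
    Measurement& m = m_data[component];
    ++m.calls;
    m.wallTime += elapsed;
  }

  // Returns a copy so the caller never touches the map without the lock.
  Measurement get(const std::string& component) const {
    std::lock_guard<std::mutex> lock(m_mutex);
    auto it = m_data.find(component);
    return it == m_data.end() ? Measurement{} : it->second;
  }

private:
  mutable std::mutex m_mutex;
  std::map<std::string, Measurement> m_data;
};

// RAII helper: measures the wall time between construction and destruction.
class ScopedTimer {
public:
  ScopedTimer(ResourceMonitor& monitor, std::string component)
      : m_monitor(monitor),
        m_component(std::move(component)),
        m_start(std::chrono::steady_clock::now()) {}

  ~ScopedTimer() {
    const auto elapsed = std::chrono::steady_clock::now() - m_start;
    m_monitor.record(
        m_component,
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
  }

  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
  ResourceMonitor& m_monitor;
  std::string m_component;
  std::chrono::steady_clock::time_point m_start;
};

// Stand-in for an MT event loop: each worker thread times callsPerThread
// executions of a dummy component. Returns the total recorded call count.
std::uint64_t simulateEventLoop(ResourceMonitor& monitor, unsigned nThreads,
                                unsigned callsPerThread) {
  assert(nThreads > 0);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < nThreads; ++t) {
    workers.emplace_back([&monitor, callsPerThread] {
      for (unsigned i = 0; i < callsPerThread; ++i) {
        ScopedTimer timer(monitor, "DummyAlg");  // records on scope exit
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return monitor.get("DummyAlg").calls;
}
```

A single mutex is the simplest way to keep the counters consistent, but it may become a point of contention when many threads report frequently. Accumulating per-thread statistics and merging them at the end of the job is one possible alternative that the actual design might prefer.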

Task ideas

Expected results

Requirements

Mentors

Corresponding Project

Participating Organizations