Advanced Belle II Software Validation

The Belle II software suite is the custom software that handles the simulation, reconstruction and analysis of data from the Belle II detector. To keep pace with current research and technological developments, the code changes constantly, both to optimize the algorithms for better physics performance and to ensure resource efficiency.

Several monitoring schemes have been implemented for the Belle II software so far, along with a high-level validation tool. This tool runs on existing large datasets, produces distribution plots of quantities relevant for physics analyses and displays the results on a webpage. This project seeks to enhance the user experience of the Belle II software validation page and to make workflows more efficient with appropriate technical solutions.

Problem Outline

The following aspects of the existing system can be improved.

Track the degradation history properly with the help of bookkeeping software.
Include references inside the plots generated by the validation system.
Make it easier to find associated artefacts to encourage reuse.
Provide developers with more information about failures found to help with root cause analysis.

Project Progression

Local base

The first days were spent setting up the basf2 software on my local system in order to run the validation tool, host a local validation server and develop the code. The data files that certain validation modules need as input were also obtained. Finally, a GitLab project was set up to mimic the production project and to serve as a playground for testing initial implementations.

Code review

While the local setup was being built, a review of the existing implementation was underway in parallel. The first aim was to get a bird's-eye view of the overall system, then to identify the modules of interest and dive deeper into them. Time was also spent familiarizing myself with the software development and coding conventions.

CherryPy meets GitLab

Two key technologies that feature throughout the work are CherryPy and GitLab, so a good amount of time was spent understanding them and then getting hands-on experience testing their integration, which underpins the bulk of the planned improvements. With that, the first task undertaken was to extend the details displayed alongside the plots of validation runs to include a list of relevant issues. A couple of features have also been added that allow the reviewer of the results to create issues directly from the plot display window or to update existing relevant issues in GitLab.
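To give a flavour of the GitLab side of this integration, here is a minimal sketch that only builds the HTTP requests for listing and creating issues through the GitLab REST API (v4). It is not the code used in the validation server: the instance address, the function names and the choice to search issues by plot name are all assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical GitLab instance; replace with the real server address.
API = "https://gitlab.example.org/api/v4"


def list_issues_request(project_id, search_term, token):
    """Build a GET request for open issues whose text matches search_term."""
    query = urllib.parse.urlencode({"state": "opened", "search": search_term})
    url = f"{API}/projects/{project_id}/issues?{query}"
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})


def create_issue_request(project_id, title, description, token):
    """Build a POST request that opens a new issue, e.g. for a failing plot."""
    body = json.dumps({"title": title, "description": description}).encode()
    return urllib.request.Request(
        f"{API}/projects/{project_id}/issues",
        data=body,
        method="POST",
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )
```

Either request could then be sent with urllib.request.urlopen from inside a server-side handler, so that the access token never reaches the browser.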

[Screenshot: linked issue update]

Plot/logfile endpoints

As part of processing the results of a validation run, emails are sent to the respective module owners for failed scripts and plots. Apart from a brief summary of the plot or script responsible for the failure, these emails only provided a link to the validation homepage, forcing people to scour through the results to find the image of the failed plot or the log file of the failed script. With the modifications made to the mail-utils, the mail-bot now includes direct links to the relevant plots and script log files, making it easier for module owners to inspect and analyse what went wrong.
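The idea of putting direct links into the notification mail can be sketched as follows. The server address, the URL layout (/plots/... and /logs/...) and the function names are hypothetical and chosen only for illustration; the actual endpoints of the validation server may look different.

```python
from urllib.parse import quote

# Hypothetical address of the validation server.
BASE_URL = "https://validation.example.org"


def plot_link(revision, package, plot_name):
    """Return a direct link to a single plot (illustrative URL layout)."""
    return f"{BASE_URL}/plots/{quote(revision)}/{quote(package)}/{quote(plot_name)}"


def log_link(revision, package, script_name):
    """Return a direct link to the log file of a script (illustrative URL layout)."""
    return f"{BASE_URL}/logs/{quote(revision)}/{quote(package)}/{quote(script_name)}.log"


def failure_mail_body(revision, package, failed_plots, failed_scripts):
    """Compose a plain-text mail body with one direct link per failure."""
    lines = [f"Validation failures in package '{package}' (revision {revision}):", ""]
    for plot in failed_plots:
        lines.append(f"  plot   {plot}: {plot_link(revision, package, plot)}")
    for script in failed_scripts:
        lines.append(f"  script {script}: {log_link(revision, package, script)}")
    return "\n".join(lines)
```

Quoting each path component keeps the links valid even when a plot or script name contains spaces or other special characters.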

Consolidate and display datafiles

Many of the validation scripts depend on simulated event data files, which require effort, time and space to produce and store. While datafiles are reused within a module, there may still be some redundancy across the full set of datafiles used by all modules. A new page has now been added to the validation server that displays all the datafiles produced by the validation steering scripts of all modules. The datafiles can be downloaded from the validation page, and the metadata of each file can also be viewed. Having this, together with information about the steering file responsible for generating each datafile, available in a single place should help the reuse of existing datafiles across different modules.
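A rough sketch of how such a consolidated listing could be assembled is shown below. It assumes, purely for illustration, that datafiles are .root files stored in one sub-directory per package under a common results directory; the function names and the returned fields are hypothetical and do not mirror the actual implementation.

```python
from pathlib import Path


def collect_datafiles(results_dir, suffix=".root"):
    """List all datafiles below results_dir with some basic metadata."""
    root = Path(results_dir)
    entries = []
    for path in sorted(root.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        entries.append({
            # Assumption: the first directory component is the package name.
            "package": rel.parts[0] if len(rel.parts) > 1 else "",
            "name": path.name,
            "relative_path": rel.as_posix(),
            "size_bytes": path.stat().st_size,
        })
    return entries


def find_duplicate_names(entries):
    """Map each file name that occurs in several packages to those packages."""
    seen = {}
    for entry in entries:
        seen.setdefault(entry["name"], []).append(entry["package"])
    return {name: pkgs for name, pkgs in seen.items() if len(pkgs) > 1}
```

A listing like this could feed the table on the page, while the second helper hints at candidates for reuse across modules.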

[Screenshot: datafiles page]

Tools Used

Python 3
CherryPy
Ractive.js
GitLab API

Lessons Learnt

Identify any new external dependencies your code changes require and communicate them to the responsible people in the early stages of development, rather than waiting until you are done.
Reviewing and testing code thoroughly for backward compatibility is an additional way to catch bugs and exceptions.

Future Work

Assign issues created in GitLab to the responsible person.
Assess if search functionality could further help enhance the validation page.
Track the number of days an issue has been open and how the respective plots have changed in the subsequent validation runs.
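As a possible starting point for the last item, the age of an issue can be derived from the created_at timestamp that the GitLab API returns for each issue. The helper below is a sketch under that assumption; the function name is made up, and nothing like it exists in the validation code yet.

```python
from datetime import datetime, timezone


def days_open(created_at, now=None):
    """Return the number of whole days since an issue was created.

    created_at is an ISO 8601 string such as "2024-01-01T12:00:00.000Z".
    """
    # Replace the trailing "Z" so that older Python versions can parse it.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - created).days
```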

GSoC Experience

The entire experience has been a learning journey that I have thoroughly enjoyed, and I am looking forward to making more meaningful contributions, both to Belle II and to the Open Source community. I am extremely grateful for all the support I have received from my mentors, who gave me the platform to explore new ideas. An added bonus was being able to take part in the regular software development meetings, where I got to know about the other work being done and could present my implementation and receive feedback on it. I faced no hiccups, with my mentors taking care of all my administrative needs as well as any questions I had. Thanks to my main mentor scheduling regular meetings, I was able to stay on top of my work and never got stuck on any roadblocks.