Large-scale computing backend for Jupyter notebooks - HTCondor batch job submission and monitoring using the Ganga toolkit

Description

Data analysis is increasingly moving towards web-based interactive workflows, in which the user works with an online service through a web browser. Jupyter notebooks are one of the preferred interfaces for this kind of analysis.

CERN provides a notebook service, called SWAN, where users can create notebooks on demand and access the software and data they need to run them. This model works well for small, exploratory computations, where the interactivity of the notebook is a natural fit, but physicists often also need to submit more expensive computations to larger resources. Traditionally, this has been done at CERN by submitting batch jobs (using open-source toolkits such as Ganga) that operate on a set of input data and produce some results.

In order to integrate the interactive mode with the batch mode, this project aims to develop a plugin for Jupyter notebooks that allows users to easily offload big computations to external resources and monitor them.
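As a rough illustration of what offloading a computation to HTCondor involves, the sketch below builds the text of a minimal HTCondor submit description for a script. The function name `build_submit_description`, the script name `analysis.sh` and the output file names are hypothetical and chosen only for this example. The project itself intends to go through Ganga rather than hand-written submit files, so this shows the underlying idea, not the plugin's design.

```python
def build_submit_description(executable, arguments=(), job_name="notebook_job"):
    """Return the text of a minimal HTCondor submit description.

    All names here are illustrative assumptions, not part of the project.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
    ]
    if arguments:
        # HTCondor takes the arguments as a single space-separated string.
        lines.append("arguments = " + " ".join(arguments))
    lines += [
        f"output = {job_name}.out",  # captured standard output
        f"error = {job_name}.err",   # captured standard error
        f"log = {job_name}.log",     # HTCondor's event log for the job
        "queue",                     # submit one instance of the job
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_submit_description("analysis.sh", ["--events", "1000"]))
```

The resulting text could be written to a file and passed to `condor_submit`. A notebook plugin would hide this step behind a button or a cell magic.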

Task ideas

Expected results

A working implementation of a Python/JavaScript library that allows users to easily submit and monitor batch jobs from a Jupyter notebook.
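The monitoring half of such a library could, under some assumptions, be as simple as polling a job's status until it reaches a final state. The sketch below uses a hypothetical `FakeJob` class as a stand-in for a real batch job so that it runs without any batch system. The status names and the `monitor` helper are assumptions made for illustration and are not taken from Ganga or HTCondor.

```python
import time

# Assumed set of final states; a real backend may use different names.
TERMINAL_STATES = {"completed", "failed", "killed"}


class FakeJob:
    """Hypothetical stand-in for a batch job; not a Ganga object."""

    def __init__(self, states):
        self._states = iter(states)  # scripted sequence of future statuses
        self.status = "new"

    def submit(self):
        self.status = "submitted"

    def refresh(self):
        # Advance to the next scripted status, or keep the current one.
        self.status = next(self._states, self.status)


def monitor(job, poll_interval=0.0, max_polls=100):
    """Poll the job until it reaches a terminal state; return the distinct statuses seen."""
    history = [job.status]
    for _ in range(max_polls):
        if job.status in TERMINAL_STATES:
            break
        time.sleep(poll_interval)
        job.refresh()
        if job.status != history[-1]:
            history.append(job.status)
    return history


if __name__ == "__main__":
    job = FakeJob(["submitted", "running", "running", "completed"])
    job.submit()
    print(monitor(job))
```

In a notebook, the JavaScript side would typically display this status history in a widget, while the Python side performs the polling in the kernel.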

Requirements

Python, JavaScript, Jupyter notebooks

Mentors

GSoC 2018 Projects Related To Large-scale computing backend for Jupyter notebooks - HTCondor batch job submission and monitoring using the Ganga toolkit
