Proto-programme for a HEP and Data Science event at Institut Pascal

High Energy Physics (HEP) is a very complex field dealing with large data sets of several petabytes (for example from the LHC). Since the turn of the century, with the advent of the internet giants, Data Science has emerged as a science in itself. There is growing interest in importing Data Science algorithms and techniques into High Energy Physics, and in having professional Data Scientists collaborate with HEP physicists on novel ways to solve HEP problems. Paris-Saclay pioneered this with the Higgs Machine Learning Challenge in 2014, whose team forms the nucleus of the organisation of the event proposed here.

The event will have three tracks of one to two weeks, each dealing with one significant HEP problem and bringing together about 30-40 people from both High Energy Physics and Data Science. We anticipate that some of the more versatile computer scientists will stay for the whole duration of the event and help cross-fertilization between tracks. If the time frame is favorable, the event will end with the Data Science @ HEP workshop, a new yearly conference series started in 2015 at CERN (some of us are part of the organising committee), which can be organised in the Paris area with an expected audience of about 250 people. The event shall take place in 2019 or 2020.

Each successive track will have a similar layout, alternating formal presentations, brainstorming sessions and hackathons, with ample free time for unorganised interactions.


!! First track starting 15 July 2019: the First Real Time Analysis Workshop !!



The goals of the Learning to Discover Program are:

● expand the group of researchers who are conversant in Machine Learning, Statistics and HEP in order to create a permanent network.

● elaborate new, well-formalized problems that faithfully reflect some important HEP challenges.

● present and discuss state-of-the-art results in the above-mentioned fields that should be relevant for these problems.

The structure of each track will be the following:

● Monday morning: long introductory talks and a long round table (each participant will have contributed a one-slide presentation)

● Tuesday morning: launch of the hackathon, its scope and possible approaches

● Following mornings: formal presentations

● Last day: hackathon wrap-up (what we have learned) and workshop summaries from different angles

● Afternoons are left open for improvised discussions. Each morning, a minute taker is appointed to record all discussions.

For each track, the following outreach actions are anticipated:

● A conference for the general public

● A special hackathon session for Paris-Saclay students

Contact will be established with the relevant Écoles Doctorales so that the student hackathon session can be validated as part of doctoral training.


Organizing Committee
Cécile Germain (UPSud)
Isabelle Guyon (UPSud)
Vava Gligorov (LPNHE/CNRS)
Balazs Kegl (LAL/CNRS)
David Rousseau (LAL/CNRS)