Learning to Discover

The first segment was held from the 15th to the 26th of July 2019: The First Real Time Analysis Workshop

The second installment is from the 14th to the 25th of October 2019

The third installment is scheduled for the 20th to the 31st of July 2020


A proto-program for a HEP and Data Science event at Institut Pascal.

High Energy Physics is a very complex field dealing with large data sets of several petabytes (for example from the LHC). Since the turn of the century, with the advent of internet giants, Data Science has emerged as a science in itself. There is growing interest in importing Data Science algorithms and techniques into High Energy Physics, and in having professional data scientists collaborate with HEP physicists on novel ways to solve HEP problems. Paris-Saclay pioneered this with the Higgs Machine Learning Challenge in 2014 (http://higgsml.lal.in2p3.fr), whose team forms the nucleus of the organisation of the event proposed here.

The event will have three segments lasting one to two weeks, each one dealing with one significant HEP problem, with about 30-40 participants from both High Energy Physics and Data Science.

We anticipate that some of the more versatile computer scientists will attend the whole duration of the event, to assist with integration. If the time frame is favorable, the event will end with the Data Science @ HEP workshop, a yearly conference series at CERN that started in 2015 (some of us are part of the organising committee).

Each successive track will have a similar layout, alternating formal presentations, brainstorming sessions and hackathons, with ample free time for informal interactions.


The goals of the Learning to Discover Program are to:
  • expand the group of researchers that are conversant in Machine Learning, Statistics and HEP in order to create a solid network.
  • elaborate new, well-formalized problems that faithfully reflect some important HEP challenges.
  • present and discuss state-of-the-art results in the above-mentioned fields that should be relevant for these problems.
The structure of each segment will be the following:
  • Monday morning: Introductory long talks, and long round table (each participant will contribute a one-slide presentation)
  • Tuesday morning: Launch of hackathon, scope and possible approaches
  • Following mornings: Formal presentations
  • Final day: Hackathon: What we’ve learned, workshop summaries from different perspectives
  • Afternoons are left open for improvised discussions. Each morning, a minute taker is appointed to record all discussions.
For each segment, the following outreach actions are anticipated:
  • A conference for the general public
  • A special hackathon session will be organised for Paris-Saclay students.

Contact will be established with the relevant Écoles Doctorales so that the student hackathon session can be validated as part of doctoral training.


Organizing Committee

Cécile Germain (UPSud)
Isabelle Guyon (UPSud)
Vava Gligorov (LPNHE/CNRS)
Balazs Kegl (LAL/CNRS)
David Rousseau (LAL/CNRS)