General theme and interests
While the main theme of our research is the multidisciplinary topic of computational physics, at its core it is about processing large amounts of complex data to achieve particular goals for a unique use-case. This perspective leads to the following domains of interest:
- Automated ML model design
- Data-efficient ML models
- High-performance training and inference
- Data pipeline design
- Automated Design-Space Exploration (DSE)
- Data synthesis
- Complexity-reduced simulations
- ML model deployment
The offered student projects are closely embedded in existing and ongoing research. Successful projects will contribute directly to the state-of-the-art and are expected to generate reusable results. Every student project shall deliver a number of artefacts including, in most cases, a mini-survey, a code repository, a Minimal Reproducible Example (MRE) and possibly a reusable data set.
Students from both Physics and Computer Science disciplines can work on these topics. However, they are expected to possess the necessary knowledge of, and experience with, computer programming. These projects are available to students from the University of Twente (UTwente) and the University of Amsterdam (UvA).
In case of enquiries, you can reach us at: info@VirtualDetector.com
Projects
MP-SIM-01
Title: High-Performance Simulations for High-Energy Physics Experiments - Multiple Enhancements
Status: Open
Artefacts: n/a
Abstract
Simulation and synthetic data play a pivotal role in High-Energy Physics (HEP) research, offering physics-accurate but slow frameworks and faster alternatives balancing speed and accuracy. This project extends the REDVID simulation framework, incorporating features like interaction with Monte Carlo event generators, basic magnetic field effects, pile-up effects, and considerations for reproducibility. Aimed at facilitating Machine Learning (ML) model design research for particle track reconstruction at the HL-LHC, the enhanced REDVID enables the generation of training data.
MP-SIM-02
Title: High-Performance Simulations for High-Energy Physics Experiments - Timing
Status: Open
Artefacts: n/a
Abstract
Simulation and synthetic data are integral to High-Energy Physics (HEP) research, but physics-accurate frameworks are computationally demanding. Parametric and complexity-aware simulation tools reduce this complexity and generate manageable data sets. This project extends the REDVID simulation framework to incorporate a time dimension, which is crucial for tracking particle interactions in detectors. As part of ongoing efforts in Machine Learning (ML) model design for particle track reconstruction at the HL-LHC, the enhanced REDVID enables the generation of training data with time information, facilitating 4-dimensional tracking.
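As an illustration of what a time dimension could mean for a simulated hit, the sketch below computes the time of flight along a known path length using t = L / (beta * c). It is a minimal sketch only: the function name, the units (mm, ns, GeV) and the idea of attaching one time value per hit are assumptions made for this example and are not taken from REDVID.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in mm/ns


def hit_time_ns(path_length_mm, momentum_gev, mass_gev):
    """Time of flight along a path: t = L / (beta * c).

    beta = p / E with E = sqrt(p^2 + m^2), in natural units (GeV).
    Hypothetical helper for illustration, not part of REDVID.
    """
    energy_gev = math.hypot(momentum_gev, mass_gev)
    beta = momentum_gev / energy_gev
    return path_length_mm / (beta * C_MM_PER_NS)
```

For a fixed path length, a heavier or slower particle arrives later than a massless one, which is the information a 4-dimensional tracker can exploit.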
MP-SIM-03
Title: High-Performance Simulations for High-Energy Physics Experiments - Track Generation and Complexity Levels
Status: Open
Artefacts: n/a
Abstract
Simulation and synthetic data play a crucial role in High-Energy Physics (HEP) research, with physics-accurate frameworks offering realistic data but being computationally intensive. This project extends the REDVID simulation framework to generate data at various complexity levels, integrating features like spherical coordinates, sub-detector layers, and track randomisation protocols. By focusing on complexity-reduced simulations, the aim is to facilitate Machine Learning (ML) model design research for particle track reconstruction at the HL-LHC. The student will define simulation recipes for different complexity levels, contributing to the ongoing efforts in ML model development.
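For reference, the conversion between Cartesian and spherical coordinates mentioned above can be sketched as follows. The convention used here (polar angle theta measured from the +z axis, azimuth phi in the x-y plane) is an assumption for this example; the convention actually used in REDVID is not stated in this text.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi): radius, polar angle from +z, azimuth in the x-y plane."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; return zeros by convention.
        return 0.0, 0.0, 0.0
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi


def spherical_to_cartesian(r, theta, phi):
    """Inverse of cartesian_to_spherical for the same angle convention."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )
```

A round trip through both functions returns the original point up to floating-point error, which makes a convenient sanity check when adding a new coordinate system to a simulation.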
MP-SIM-04
Title: High-Performance Simulations for High-Energy Physics Experiments - Electron Simulation
Status: Open
Artefacts: n/a
Abstract
Simulation and synthetic data generation play a crucial role in High-Energy Physics (HEP) research, with physics-accurate frameworks providing realistic but slow data synthesis. Fast simulation frameworks balance speed and accuracy, enabling various applications, including Machine Learning (ML) model design research. This project extends the REDVID simulation framework to support different particles, especially electrons, which are essential for accurately simulating particle behaviour. The addition includes implementing electron interactions with matter, bremsstrahlung radiation, and energy loss effects, with a focus on their impact on ML model training. The improved REDVID aids in generating training data for ML models aimed at particle track reconstruction for the HL-LHC, or in investigating the impact of the new features on such models.
MP-SIM-05
Title: High-Performance Simulations for High-Energy Physics Experiments - Muon Simulation
Status: Open
Artefacts: n/a
Abstract
Simulation and synthetic data are integral to High-Energy Physics (HEP) research, offering both physics-accurate but slow frameworks and faster, simplified alternatives. This project aims to enhance the REDVID simulation framework by incorporating key features, notably support for muons. Muons possess distinct characteristics such as higher penetration power, instability leading to decay, interaction with matter, and susceptibility to magnetic fields. The goal is to implement these traits in REDVID, facilitating the generation of training data for Machine Learning (ML) models aimed at particle track reconstruction for the HL-LHC. The student shall study, select and implement a minimum set of distinguishing characteristics in REDVID.
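To make the "instability leading to decay" characteristic concrete, the sketch below computes the mean lab-frame decay length of a muon, L = beta * gamma * c * tau with beta * gamma = p / m, and the resulting survival probability over a given distance. The function names are hypothetical and not part of REDVID; the constants are the commonly quoted muon mass and mean lifetime, rounded.

```python
import math

MUON_MASS_GEV = 0.1056584    # muon rest mass in GeV/c^2 (rounded)
MUON_LIFETIME_S = 2.197e-6   # mean proper lifetime in seconds (rounded)
C_M_PER_S = 299_792_458.0    # speed of light in m/s


def mean_decay_length_m(momentum_gev):
    """Mean lab-frame decay length L = beta*gamma*c*tau, with beta*gamma = p/m."""
    beta_gamma = momentum_gev / MUON_MASS_GEV
    return beta_gamma * C_M_PER_S * MUON_LIFETIME_S


def survival_probability(momentum_gev, distance_m):
    """Probability that a muon has not decayed after travelling distance_m."""
    return math.exp(-distance_m / mean_decay_length_m(momentum_gev))
```

At a momentum of 1 GeV the mean decay length is already several kilometres, so over detector-scale distances of a few metres nearly all muons survive. This is one quantitative argument a student could use when deciding which characteristics belong in a minimum set.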
MP-PERF-01
Title: Performance Analysis and Benchmarking for Simulations
Status: Will follow MP-SIM-01, MP-SIM-02, MP-SIM-03, MP-SIM-04, MP-SIM-05
Artefacts: n/a
Abstract: n/a
MP-DATA-01
Title: Automated Data Distillation with Corner Case Preservation
Status: Open
Artefacts: n/a
Abstract: n/a
MP-NAS-P-01
Title: Neural Architecture Search - Optimisations for Model Parallelism
Status: Completed
Abstract
This project focuses on leveraging Neural Architecture Search (NAS) methodologies to prioritise model parallelism during the inference stage. The aim is to develop a NAS strategy that emphasises model parallelism as a key quality metric. The student will devise a generic and reusable measure of how suitable a model is for parallelism, automating the evaluation of this criterion for diverse models. Public image data sets and models from the ONNX repository, or other well-established models, will be considered for this study. As part of ongoing efforts in ML model development for particle track reconstruction at the HL-LHC, the project contributes to addressing high hardware resource demands by developing methodologies and tools for efficient model design.
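One possible way to quantify suitability for parallelism, shown here purely as an illustration and not as the measure developed in this project, is the classic work/span ratio of a computation graph: the total cost of all operators divided by the cost of the longest dependency chain. The graph representation (plain dictionaries of per-node costs and predecessor lists) is an assumption for this sketch; extracting such a graph from an ONNX model is not shown.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def average_parallelism(costs, deps):
    """Work/span ratio of a DAG.

    costs: {node: cost of executing the node}
    deps:  {node: iterable of predecessor nodes}; nodes without an entry have none
    Returns total work divided by the critical-path cost.
    """
    graph = {node: deps.get(node, ()) for node in costs}
    finish = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[node]), default=0.0)
        finish[node] = start + costs[node]
    work = sum(costs.values())
    span = max(finish.values())
    return work / span
```

A purely sequential chain scores 1.0, while a graph with independent branches scores higher. A NAS strategy could use such a score as one term of its objective, alongside accuracy.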
MP-NAS-D-01
Title: Neural Architecture Search - Optimisations for Data-Efficiency
Status: Will follow MP-NAS-P-01
Artefacts: n/a
Abstract: n/a
MP-HW-P-01
Title: Partitioned Model Deployment on FPGA Platforms - Latency Improvement
Status: Open
Artefacts: n/a
Abstract
Machine Learning (ML) models are becoming integral in scientific applications, notably in the upcoming High-Luminosity Large Hadron Collider (HL-LHC) upgrade at CERN. This project focuses on deploying ML models, specifically Transformer models, on FPGAs to improve the efficiency of subatomic particle trajectory reconstruction. Given the substantial data from HL-LHC, transforming this task into an online or pseudo-online process is crucial. The primary challenge lies in fitting these large models onto FPGA hardware, necessitating model partitioning. The student will explore state-of-the-art techniques and tools like hls4ml, aiming to create an automated workflow for FPGA deployment and benchmark its performance.
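The partitioning problem can be illustrated with a small sketch: given an estimated resource cost per layer and a per-device budget, group consecutive layers so that each group fits. Everything here is hypothetical, including the cost values, the single scalar budget and the function name; it does not use or represent the hls4ml API, and real FPGA resource estimates involve several resource types.

```python
def partition_layers(layer_costs, budget):
    """Greedily group consecutive layers so each group's total cost fits the budget.

    layer_costs: per-layer resource estimates in execution order (hypothetical units)
    budget:      resource capacity of a single device or partition
    Returns a list of partitions, each a list of layer indices.
    """
    partitions, current, used = [], [], 0
    for index, cost in enumerate(layer_costs):
        if cost > budget:
            raise ValueError(f"layer {index} alone exceeds the budget")
        if used + cost > budget:
            # Close the current partition and start a new one with this layer.
            partitions.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        partitions.append(current)
    return partitions


print(partition_layers([4, 3, 5, 2, 6], budget=8))  # [[0, 1], [2, 3], [4]]
```

Keeping layers in their original order means data only flows forward between partitions, which simplifies the latency analysis of the partitioned pipeline.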