REDVID Simulation Framework

Description

REDuced VIrtual Detector (REDVID) is a simulation framework and a synthetic data generator written in Python. As a reduced simulator, REDVID simulates the propagation of subatomic particles in a virtual detector model with a given geometry, inspired by the detectors installed at the Large Hadron Collider (LHC). The simulation model is complexity-reduced and is intended for generating source data to train Machine Learning (ML) algorithms and to perform ML-assisted solution exploration. The data is in the form of hit point coordinates in space and trajectory function parameters.

Sample events

A few events from simulations with varying recipes are shown for demonstration purposes. The plots below vary in track definitions. The track count is limited to five to improve legibility. From left to right, these plots depict the full event view, the hit points view and the tracks view, respectively. Note the incorporated detector model geometry as depicted in Figure 1.

Figure 1: An example virtual detector and its layers

The following plots show this virtual detector rotated by 90 degrees. Note that the Z-axis must pass through the detector. We keep the scale down in these examples for legibility, i.e., we generate a low number of tracks per event.

3D space, noisy hit coordinates, linear tracks

This sample event contains five linear tracks that start at the geometric origin and are randomly directed. The randomisation of these tracks follows the first track randomisation protocol, i.e., Protocol 1 - Last layer hit guarantee. Refer to our relevant publications for further details on track randomisation protocols. Different views for this event are depicted in Figure 2.

Figure 2: Views for full event, hit points and tracks
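As a rough illustration of what such an event involves, the sketch below samples straight tracks from the origin and intersects them with concentric cylindrical layers around the Z-axis. It is not REDVID's implementation. The function names, the layer radii, the detector half-length and the reading of the "last layer hit guarantee" (directions are drawn so that the track always crosses the outermost layer) are assumptions made for this example.

```python
import math
import random


def random_linear_track(last_radius, half_length, rng):
    """Sample a direction from the origin that is certain to cross the last layer.

    Assumption: the last layer is a cylinder of radius `last_radius` around the
    Z-axis, spanning z in [-half_length, half_length].
    """
    phi = rng.uniform(0.0, 2.0 * math.pi)            # azimuthal angle
    z_last = rng.uniform(-half_length, half_length)  # target z on the last layer
    theta = math.atan2(last_radius, z_last)          # polar angle from the Z-axis
    return theta, phi


def linear_hits(theta, phi, layer_radii):
    """Return exact (r, phi, z) hits of a straight track on concentric cylinders."""
    cot_theta = math.cos(theta) / math.sin(theta)  # sin(theta) > 0 for 0 < theta < pi
    return [(radius, phi, radius * cot_theta) for radius in layer_radii]


rng = random.Random(42)   # fixed seed for reproducibility
layers = [2.0, 4.0, 6.0]  # hypothetical layer radii
event = [linear_hits(*random_linear_track(layers[-1], 10.0, rng), layers)
         for _ in range(5)]  # five tracks, as in the sample event
```

Sampling the target z on the last layer, rather than the polar angle directly, is what ensures every track reaches that layer in this sketch.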

3D space, noisy hit coordinates, helical uniform tracks

This example event goes a step further in complexity compared to events with linear tracks. Helical uniform tracks do not occur in realistic settings. However, the generated data sets are of independent value for research. Figure 3 depicts such an event with five helical uniform tracks.

Figure 3: Views for full event, hit points and tracks
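To make the notion of a constant-radius helix concrete, the following sketch uses a generic textbook parametrisation of a helix that starts at the origin and advances along the Z-axis. How REDVID parametrises helical uniform tracks is not stated here, so the parameter names (`rho`, `alpha`, `pitch`) and the closed-form layer crossing are assumptions for illustration only.

```python
import math


def helix_point(rho, alpha, pitch, t):
    """Point on a constant-radius helix that starts at the origin.

    rho:   helix radius; the circle centre lies at (rho*cos(alpha), rho*sin(alpha))
    pitch: advance along Z per radian of turning
    t:     turning angle in radians (t = 0 is the origin)
    """
    x = rho * (math.cos(alpha) - math.cos(alpha + t))
    y = rho * (math.sin(alpha) - math.sin(alpha + t))
    z = pitch * t
    return x, y, z


def helix_layer_hit(rho, alpha, pitch, layer_radius):
    """First crossing of a cylinder of radius `layer_radius` around the Z-axis.

    The transverse distance from the origin is 2 * rho * sin(t / 2), so the
    layer can only be reached if layer_radius <= 2 * rho.
    """
    if layer_radius > 2.0 * rho:
        return None  # the helix curls back before reaching this layer
    t = 2.0 * math.asin(layer_radius / (2.0 * rho))
    return helix_point(rho, alpha, pitch, t)
```

For example, `helix_layer_hit(5.0, 0.3, 1.5, 4.0)` returns a point whose distance from the Z-axis is 4.0, while a layer with a radius larger than `2 * rho` yields `None`.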

3D space, noisy hit coordinates, helical expanding tracks

Helical expanding tracks are the closest type REDVID can generate to real-world tracks. Other complexity-increasing features do not directly influence a track's formation principles. For instance, all these examples have hit point coordinate smearing enabled. Figure 4 showcases an event with five helical expanding tracks, following the same track randomisation protocol as earlier.

Figure 4: Views for full event, hit points and tracks
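Hit point coordinate smearing, enabled in all examples above, can be pictured as adding small random offsets to the exact hit coordinates. The sketch below assumes independent zero-mean Gaussian noise with a single standard deviation `sigma`. The distribution, the parameter name and the sample values are assumptions for this example and not taken from REDVID's configuration.

```python
import random


def smear_hit(hit, sigma, rng):
    """Add independent zero-mean Gaussian noise to each coordinate of a hit."""
    return tuple(coord + rng.gauss(0.0, sigma) for coord in hit)


rng = random.Random(7)       # fixed seed for reproducibility
exact_hit = (1.0, 2.0, 3.0)  # hypothetical exact hit coordinates
noisy_hit = smear_hit(exact_hit, sigma=0.05, rng=rng)
```

With `sigma=0.0` the function returns the exact hit unchanged, which is a convenient way to switch smearing off in this sketch.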

Feature set

REDVID is highly configurable and many features available in the main configuration file can be tweaked according to user requirements. We provide the available and planned features in Table 1, without exhaustive descriptions. Current availability is indicated using status indicators:

Category                        Feature                                   Status

Execution features              Anchor path
                                Multiple output modes
                                Automated execution parallelism
                                Automated large job division
                                Automated batch processing
                                Automated batch processing parallelism
                                Performance monitoring
                                Visualisations
                                Import/load spawned detectors
                                Data set coordinate system

Experiment features             Custom/auto experiment ID
                                Event count
                                Fixed track count
                                Variable track count with range
                                Track direction, designated/random
                                Shift over the Z-axis

Experiment features: 2D tracks  Slope limits
                                y-intercept limits

Experiment features: 3D tracks  Track randomisation protocols
                                Sub-detector track aggregation
                                Track type: Linear
                                Track type: Helical uniform
                                Track type: Helical expanding
                                Track type: Multiple types
                                Track level: Primary tracks
                                Track level: Secondary tracks
                                Early terminating tracks
                                Jet track type: Linear
                                Jet track level: Primary jets
                                Jet track level: Secondary jets

Experiment features: Hits       Hit point calculation methods
                                Hit point smearing
                                Hit point recording probability
                                Holes (unrecorded hits)

Geometry features               Custom/auto detector ID
                                Dimension
                                Detector space Cartesian axis boundaries
                                Detector space Spherical boundaries

Geometry features: 2D           Origin coordinates: (x, y)
                                Sub-det. presence: Pixel, SS, LS
                                Sub-det. layer count, per type
                                Sub-det. centre coordinates
                                Sub-det. layer distance
                                Sub-det. outer radius
                                Sub-det. outer-inner radii delta

Geometry features: 3D           Origin smearing
                                Origin smearing type
                                Origin coordinates: (r, θ, z)
                                Sub-det. presence: Pixel, SS, LS, Barrel
                                Sub-det. layer count, per type
                                Sub-det. centre coordinates
                                Sub-det. layer distance
                                Sub-det. outer radius
                                Sub-det. outer-inner radii delta
                                Sub-det. end z
                                Sub-det. end-start z delta

Table 1: Configuration options and features supported by REDVID

Code repository

The code is open-source and publicly available. Refer to the included configuration file for a complete list of available parameters and their effect. To understand the overall functionality and usage of the tool, refer to the provided README, RELEASE NOTES, and documentation, as well as the related publication [1].

REDVID Simulation Framework:


Data sets

Collections of representative example data sets, generated using the REDVID simulation framework, contain complexity-reduced subatomic particle collision event data for linear [2] and helical [3] tracks. Particle trajectory information and hit coordinates from interactions with reduced-order virtual detector models are included. The data are generated in the 3D domain and follow the cylindrical coordinate system for hit point coordinates in space and trajectory function parameters.
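Since hit coordinates follow the cylindrical coordinate system, many downstream tools will need a conversion to Cartesian coordinates. The helpers below are a minimal sketch of the standard conversion, assuming coordinates ordered as (r, θ, z) with θ in radians. The function names are hypothetical, and the angle unit used in the data sets should be checked against the provided README.

```python
import math


def cylindrical_to_cartesian(r, theta, z):
    """Convert (r, theta, z) to (x, y, z); theta is assumed to be in radians."""
    return r * math.cos(theta), r * math.sin(theta), z


def cartesian_to_cylindrical(x, y, z):
    """Convert (x, y, z) to (r, theta, z) with theta in (-pi, pi]."""
    return math.hypot(x, y), math.atan2(y, x), z
```

Applying the two functions in sequence returns the original point up to floating-point rounding, as long as θ lies in (-π, π].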

Each of the five included tarballs corresponds to a different data generation recipe. While all recipes include 10000 collision events, the number of tracks per event varies from 1 to 10000. This is reflected in the tarball names.
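Before extracting a tarball, it can help to inspect its contents. The sketch below uses Python's standard `tarfile` module to list the regular files in an archive. The helper name is hypothetical, and no assumption is made about the file names or formats inside the REDVID tarballs.

```python
import tarfile


def list_tarball_members(path):
    """Return the names of the regular files stored in a tarball."""
    # Mode "r:*" lets tarfile detect the compression (gzip, bz2, xz or none).
    with tarfile.open(path, "r:*") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]
```

Calling `list_tarball_members(path)` with the path of a downloaded tarball returns its file names without unpacking anything to disk.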

The data set is intended to be used as synthesised input for research involving ML-assisted pipeline design exploration, as well as ML model design exploration, e.g., Neural Architecture Search (NAS). To understand the data and its generation in detail, refer to the provided README, as well as the related publication [1]. Further details regarding the ML research incorporating these data sets are available in our Connecting The Dots 2023 (CTD 2023) proceedings paper [5].

REDVID Collision Event Data – Linear Tracks and Hits:

REDVID Collision Event Data – Helical Tracks and Hits:


Authors and acknowledgement

The REDVID simulation framework, the generated data sets and the shared results are authored by:

The collaborating team includes:

Previous collaborating members:


Publications

Publications and contributions about REDVID

[1] Uraz Odyurt, Stephen Nicholas Swatman, Ana-Lucia Varbanescu, Sascha Caron. 2024. Computational Science – ICCS 2024. "Reduced Simulations for High-Energy Physics, a Middle Ground for Data-Driven Physics Research".
DOI: 10.1007/978-3-031-63751-3_6
DOI: 10.48550/arXiv.2309.03780
[2] Uraz Odyurt, Stephen Nicholas Swatman. 2023. "REDVID Collision Event Data – Linear Tracks and Hits".
[data set]
DOI: 10.5281/zenodo.8183750
[3] Uraz Odyurt. 2024. "REDVID Collision Event Data – Helical Tracks and Hits".
[data set]
DOI: 10.5281/zenodo.10514245
[4] Uraz Odyurt, Sascha Caron, Ana-Lucia Varbanescu. 2024. "Efficient Tracking Algorithm Evaluations through Multi-Level Reduced Simulations".
DOI: TBA
=> Accepted for CHEP 2024

Publications and contributions using REDVID

[5] Uraz Odyurt, Nadezhda Dobreva, Zef Wolffs, Yue Zhao, Antonio Ferrer Sánchez, Roberto Ruiz de Austri Bazan, José D. Martín-Guerrero, Ana-Lucia Varbanescu, Sascha Caron. 2023. In Proceedings of the Connecting The Dots (CTD 2023). "Novel Approaches for ML-Assisted Particle Track Reconstruction and Hit Clustering".
DOI: TBA
DOI: 10.48550/arXiv.2405.17325
[6] Zef Wolffs, Antonio Ferrer Sánchez, José D. Martín-Guerrero, Jose Salt, Matous Vozák, Nadezhda Dobreva, Roberto Ruiz de Austri Bazan, Sascha Caron, Uraz Odyurt, Yue Zhao. 2023. ML4Jets workshop. "Towards Novel Charged Particle Tracking Approaches with Transformer and U-Net Models".
[Talk]
[7] Nadezhda Dobreva, Yue Zhao, Zef Wolffs, Uraz Odyurt, Sascha Caron. 2024. 6th Inter-experiment Machine Learning Workshop. "Transformers for Particle Track Reconstruction and Hit Clustering".
[Poster]
[8] Yue Zhao, Nadezhda Dobreva, Zef Wolffs, Uraz Odyurt, Sascha Caron. 2024. European AI for Fundamental Physics Conference (EuCAIFCon 2024). "Transformer-inspired models for particle track reconstruction".
[Flashtalk with Poster]
[9] Sascha Caron, Nadezhda Dobreva, Antonio Ferrer Sánchez, Uraz Odyurt, Roberto Ruiz de Austri Bazan, Zef Wolffs, Yue Zhao. 2024. "Efficient ML-Assisted Particle Track Reconstruction Designs".
DOI: TBA
=> Accepted for CHEP 2024
[10] Sascha Caron, Nadezhda Dobreva, Antonio Ferrer Sánchez, José D. Martín-Guerrero, Uraz Odyurt, Roberto Ruiz de Austri Bazan, Zef Wolffs, Yue Zhao. 2024. "TrackFormers: In Search of Transformer-Based Particle Tracking for the High-Luminosity LHC Era".
DOI: TBA
DOI: 10.48550/arXiv.2407.07179