GNN-based pipeline

The GNN-based pipeline used in this repository is primarily based on the pipeline developed by the Exa.TrkX collaboration. It is tailored for track finding in the Vertex Locator (VeLo) tracking detector at LHCb. To improve its performance, several changes have been incorporated that account for the forward geometry of LHCb and the track topologies of interest.

The pipeline encompasses the following steps:

Preprocessing: The input .parquet.lz4 files are preprocessed first. At this stage, additional columns can be computed and selection criteria can be applied as needed. Each event is then saved in .parquet files, which serve as the input to the following steps.
Processing: The preprocessed files are read and converted into PyTorch Geometric data objects that contain only the essential information. The genuine edges are also computed at this stage.
Embedding + kNN: To build a rough graph, the hits are embedded into an \(n\)-dimensional space in which hits that are likely to be connected lie close to each other. A \(k\)-nearest neighbour (kNN) algorithm is then applied in this space to build a preliminary graph of candidate edges.
GNN + Track Building: A Graph Neural Network (GNN) classifies the edges of the graph, and the edges classified as fake are removed. A triplet graph is then constructed, and its edge-edge connections are classified. Fake edge-edge connections are filtered out, and the tracks are built with an algorithm based on connected components.
Evaluation: The final tracks are compared to the known particles, and the tracking performance is evaluated with the montetracko library.
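
To make the graph-building and track-building steps above more concrete, here is a minimal pure-Python sketch. It is not the code used in this repository. The function names `knn_edges` and `connected_components`, the toy 2-D embeddings and the distance threshold are all hypothetical. In particular, a simple distance cut stands in for the trained GNN edge classifier, and the triplet step is not modelled at all.

```python
import math
from collections import defaultdict


def knn_edges(embeddings, k):
    """Return directed edges (i, j) linking each hit i to its k nearest neighbours."""
    edges = []
    for i, point in enumerate(embeddings):
        # Brute-force distances to every other hit (fine for a toy example).
        distances = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        edges.extend((i, j) for _, j in distances[:k])
    return edges


def connected_components(n_hits, edges):
    """Group hits into track candidates with a union-find structure."""
    parent = list(range(n_hits))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)

    groups = defaultdict(list)
    for hit in range(n_hits):
        groups[find(hit)].append(hit)
    return sorted(groups.values())


# Hypothetical 2-D embeddings: hits 0-2 lie close together, as do hits 3-4.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0)]
candidate_edges = knn_edges(embeddings, k=2)

# ASSUMPTION: stand-in for the GNN edge classifier, keeping only short edges.
kept_edges = [
    (i, j)
    for i, j in candidate_edges
    if math.dist(embeddings[i], embeddings[j]) < 1.0
]
tracks = connected_components(len(embeddings), kept_edges)
print(tracks)
```

With `k=2`, the kNN step also proposes edges between the two groups of hits, for example from hit 3 to hit 2. Such edges are the kind of fake connection that the classification step is meant to remove before the connected components are formed.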