pipeline.TrackBuilding package#

This package handles the track building from the graph of edges, or the graph of connected edges.

pipeline.TrackBuilding.builder module#

A module that defines the various track builders.

class pipeline.TrackBuilding.builder.EdgeTrackBuilder(model, edge_score_cut)[source]#

Bases: ModelBuilderBase

construct_downstream(batch)[source]#

Run the inference on a PyTorch Data object, in place.

class pipeline.TrackBuilding.builder.TripletTrackBuilder(model, edge_score_cut, triplet_score_cut, single_edge_score_cut=None, strategy=None)[source]#

Bases: ModelBuilderBase

construct_downstream(batch)[source]#

Run the inference on a PyTorch Data object, in place.

load_batch(input_path)[source]#

Load a PyTorch Data object from its path. Might apply necessary pre-processing.

Return type:

Data

pipeline.TrackBuilding.builder.batch_from_edges_to_tracks(batch, edge_index)[source]#
pipeline.TrackBuilding.builder.batch_from_triplets_to_tracks(batch, triplet_indices, edge_index, triplet_scores=None, triplet_score_cut=None, edge_score=None, single_edge_score_cut=None, strategy=None)[source]#
Return type:

Data

pipeline.TrackBuilding.components module#

pipeline.TrackBuilding.components.connected_components(df_edges, max_node_idx)[source]#

Apply the connected component algorithm. If df_edges is on GPU, cugraph is used. Otherwise, scipy is used.

Parameters:
  • df_edges (TypeVar(DataFrame, DataFrame, DataFrame)) – Dataframe of edges, with columns hit_idx_left and hit_idx_right

  • max_node_idx (int) – Maximal node index

Return type:

TypeVar(DataFrame, DataFrame, DataFrame)

Returns:

Dataframe of connected components, with columns vertex and labels.
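As a rough illustration of what a connected component labelling produces, the sketch below uses a plain-Python union-find instead of cugraph or scipy. The function name connected_components_sketch, the list-of-pairs edge layout, and the dictionary output (vertex to label) are assumptions made for this example only; they are not the package's API.

```python
from typing import Dict, List, Tuple


def connected_components_sketch(
    edges: List[Tuple[int, int]], max_node_idx: int
) -> Dict[int, int]:
    """Label every node in 0..max_node_idx with a connected-component ID."""
    parent = list(range(max_node_idx + 1))

    def find(node: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for left, right in edges:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the smaller index as root so the result is deterministic.
            parent[max(root_left, root_right)] = min(root_left, root_right)

    # Relabel the roots as consecutive integers, in order of first appearance.
    root_to_label: Dict[int, int] = {}
    labels: Dict[int, int] = {}
    for vertex in range(max_node_idx + 1):
        root = find(vertex)
        labels[vertex] = root_to_label.setdefault(root, len(root_to_label))
    return labels


# Hits 0-1-2 form one chain, hits 3-4 another, and hit 5 is isolated.
labels = connected_components_sketch([(0, 1), (1, 2), (3, 4)], max_node_idx=5)
```

Each key plays the role of the vertex column and each value the role of the labels column; isolated hits end up in a component of their own.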

pipeline.TrackBuilding.components.cure_max_node_idx(df_labels, max_node_idx, node_column='vertex', label_column='labels')[source]#

Add the missing node indices in the dataframe returned by the cugraph weakly connected component algorithm.

Parameters:
  • df_labels (DataFrame) – Dataframe of connected components with columns node_column and label_column, which indicate which connected component (given by label_column) each node belongs to.

  • max_node_idx (int) – maximal node index. The node indices are assumed to range from 0 to this maximal value, inclusive.

  • node_column (str) – name of the node index column in df_labels

  • label_column (str) – name of the connected component label in df_labels

Return type:

DataFrame

Notes

The connected component algorithm of cugraph assumes that the maximal node index is connected to an edge. In order for this algorithm to properly return the labels of disconnected components, the maximal node index needs to be present in edge indices of the graph.
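The idea of completing the label table can be sketched in plain Python. This is only an illustration: the documentation does not state which labels the real function gives to the added nodes, so the choice below (one fresh label per missing node) is an assumption, as are the name add_missing_nodes_sketch and the dictionary representation.

```python
from typing import Dict


def add_missing_nodes_sketch(
    labels: Dict[int, int], max_node_idx: int
) -> Dict[int, int]:
    """Return a copy of `labels` that covers every node in 0..max_node_idx."""
    completed = dict(labels)
    # Assumption: each missing node becomes its own single-node component,
    # labelled with a fresh ID above every existing label.
    next_label = max(labels.values(), default=-1) + 1
    for vertex in range(max_node_idx + 1):
        if vertex not in completed:
            completed[vertex] = next_label
            next_label += 1
    return completed


# Nodes 3 and 4 touch no edge, so a table built from the edges alone omits them.
partial = {0: 0, 1: 0, 2: 0}
completed = add_missing_nodes_sketch(partial, max_node_idx=4)
```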

pipeline.TrackBuilding.edges2tracks module#

pipeline.TrackBuilding.edges2tracks.build_tracks_from_edges(edge_index, n_hits)[source]#
Return type:

Tensor

pipeline.TrackBuilding.perfect_trackbuilding module#

Define the best tracking performance we can get.

class pipeline.TrackBuilding.perfect_trackbuilding.PerfectTrackBuildingBuilder(builder)[source]#

Bases: BuilderBase

construct_downstream(batch)[source]#

Run the inference on a PyTorch Data object, in place.

load_batch(input_path)[source]#

Load a PyTorch Data object from its path. Might apply necessary pre-processing.

Return type:

Data

pipeline.TrackBuilding.triplets module#

A module that contains helper functions for building tracks from triplets.

pipeline.TrackBuilding.triplets.connected_edges_to_connected_hits(edge_index, df_connected_edges)[source]#

Turn connected components of edges into connected components of hits.

Parameters:
  • edge_index (Tensor) – tensor of edge indices

  • df_connected_edges (TypeVar(DataFrame, DataFrame, DataFrame)) – a dataframe with columns edge_idx and track_id, defining the connected components of edges.

Return type:

TypeVar(DataFrame, DataFrame, DataFrame)

Returns:

Dataframe with 2 columns hit_idx and track_id
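A minimal sketch of this conversion, using plain Python containers rather than tensors and dataframes. It assumes that edge_index can be read as one (left hit, right hit) pair per edge and that a hit shared by several edges of the same track should appear only once; the name edges_to_hits_sketch is hypothetical.

```python
from typing import List, Set, Tuple


def edges_to_hits_sketch(
    edge_index: List[Tuple[int, int]],
    edge_track_ids: List[Tuple[int, int]],
) -> List[Tuple[int, int]]:
    """Map (edge_idx, track_id) rows to unique (hit_idx, track_id) rows."""
    hit_rows: Set[Tuple[int, int]] = set()
    for edge_idx, track_id in edge_track_ids:
        # Both end points of the edge inherit the track ID of the edge.
        for hit_idx in edge_index[edge_idx]:
            hit_rows.add((hit_idx, track_id))
    return sorted(hit_rows)


# Edges 0 and 1 share hit 1 and belong to track 0; edge 2 belongs to track 1.
example_edge_index = [(0, 1), (1, 2), (3, 4)]
example_connected_edges = [(0, 0), (1, 0), (2, 1)]
hit_rows = edges_to_hits_sketch(example_edge_index, example_connected_edges)
```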

Form tracks from links between tracks corresponding to forks from one track to more than 1 track. Every track link is considered as a new track.

Parameters:
  • df_track_links_with_fork (TypeVar(DataFrame, DataFrame, DataFrame)) – dataframe with columns track_id_left and track_id_right that defines the tracks linked to one another

  • df_connected_edges (TypeVar(DataFrame, DataFrame, DataFrame)) – dataframe of connected components of edges, with columns edge_idx and track_id

Return type:

TypeVar(DataFrame, DataFrame, DataFrame)

Returns:

Dataframe of connected components of edges, with the new forked tracks

pipeline.TrackBuilding.triplets.get_filtered_triplet_indices(triplet_indices, triplet_scores, triplet_score_cut)[source]#

Filter out the triplets whose score is lower than the required minimal score.

Parameters:
  • triplet_indices (Dict[str, Tensor]) – dictionary that associates a type of triplet with the triplet indices

  • triplet_scores (Dict[str, Tensor]) – dictionary that associates a type of triplet with the scores of the triplets

  • triplet_score_cut (Union[float, Dict[str, float]]) – minimal triplet score required

Return type:

Dict[str, Tensor]

Returns:

dictionary that associates a type of triplet with the filtered triplet indices
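The filtering can be sketched with plain lists in place of tensors. Several points are assumptions made for illustration: the name filter_triplets_sketch, the representation of a triplet as a pair of edge indices, the triplet type names used as keys, and keeping triplets whose score is exactly equal to the cut.

```python
from typing import Dict, List, Tuple, Union

Triplet = Tuple[int, int]  # assumed layout: a pair of edge indices


def filter_triplets_sketch(
    triplet_indices: Dict[str, List[Triplet]],
    triplet_scores: Dict[str, List[float]],
    triplet_score_cut: Union[float, Dict[str, float]],
) -> Dict[str, List[Triplet]]:
    """Keep, for each triplet type, the triplets that pass the score cut."""
    filtered: Dict[str, List[Triplet]] = {}
    for triplet_type, triplets in triplet_indices.items():
        # The cut is either shared by all types or given per type.
        if isinstance(triplet_score_cut, dict):
            cut = triplet_score_cut[triplet_type]
        else:
            cut = triplet_score_cut
        scores = triplet_scores[triplet_type]
        filtered[triplet_type] = [
            triplet for triplet, score in zip(triplets, scores) if score >= cut
        ]
    return filtered


# Hypothetical triplet type names, used only for this example.
example_indices = {"articulation": [(0, 1), (1, 2)], "elbow_left": [(0, 3)]}
example_scores = {"articulation": [0.9, 0.2], "elbow_left": [0.5]}

kept_global = filter_triplets_sketch(example_indices, example_scores, 0.4)
kept_per_type = filter_triplets_sketch(
    example_indices, example_scores, {"articulation": 0.1, "elbow_left": 0.6}
)
```

Passing a dictionary as the cut applies a different threshold to each triplet type, which mirrors the Union[float, Dict[str, float]] annotation of triplet_score_cut.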

pipeline.TrackBuilding.triplets.update_dataframe_connected_edges(df_connected_edges, df_new_labels)[source]#
Return type:

TypeVar(DataFrame, DataFrame, DataFrame)

Update the dataframe of track links with a new definition of tracks. Tracks are here connected edges.

Parameters:
  • df_track_links (TypeVar(DataFrame, DataFrame, DataFrame)) – dataframe of track links with columns track_id_left and track_id_right

  • df_new_labels (TypeVar(DataFrame, DataFrame, DataFrame)) – dataframe that defines new connections between tracks, with columns old_track_id and track_id

Return type:

TypeVar(DataFrame, DataFrame, DataFrame)

Returns:

Updated dataframe of track links, where the old track IDs are replaced by the new ones, as defined in df_new_labels.
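The relabelling described here can be sketched with a list of pairs and a dictionary. The name update_track_links_sketch is hypothetical, and leaving unchanged any track ID that has no entry in the mapping is an assumption; the documentation does not say how the real function treats such IDs.

```python
from typing import Dict, List, Tuple


def update_track_links_sketch(
    track_links: List[Tuple[int, int]], new_labels: Dict[int, int]
) -> List[Tuple[int, int]]:
    """Replace old track IDs on both sides of every link."""
    updated = []
    for track_id_left, track_id_right in track_links:
        updated.append(
            (
                # Assumption: IDs without a new label are kept as they are.
                new_labels.get(track_id_left, track_id_left),
                new_labels.get(track_id_right, track_id_right),
            )
        )
    return updated


# Old tracks 0 and 1 were merged into track 10; old track 3 became track 11.
example_links = [(0, 2), (1, 4), (3, 5)]
example_new_labels = {0: 10, 1: 10, 3: 11}
updated_links = update_track_links_sketch(example_links, example_new_labels)
```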

pipeline.TrackBuilding.triplets2tracks module#

A module that defines functions to go from triplets to tracks.

pipeline.TrackBuilding.triplets2tracks.build_tracks_from_triplets_1cc(triplet_indices, edge_index, edge_score=None, single_edge_score_cut=None)[source]#
Return type:

tarray.DataFrame

pipeline.TrackBuilding.triplets2tracks.build_tracks_from_triplets_2cc(triplet_indices, edge_index, strategy=None)[source]#
Return type:

DataFrame

pipeline.TrackBuilding.triplets2tracks.connect_elbows(triplet_indices, n_edges)[source]#

Connect the left and right elbows.

Parameters:
  • triplet_indices (Dict[str, Tensor]) – dictionary that associates a type of triplet with the triplet indices

  • n_edges (int) – number of edges

Return type:

Tuple[TypeVar(DataFrame, DataFrame, DataFrame), Tensor]

Returns:

Tuple of a dataframe and a tensor. The dataframe defines the small connected components formed by connecting the elbows, through its two columns edge_idx and track_id. The tensor corresponds to the articulation link indices between these connected components.

pipeline.TrackBuilding.triplets2tracks.filter_single_edges(df_labels, edge_score, edge_score_cut=0.7)[source]#
pipeline.TrackBuilding.triplets2tracks.split_articulations(triplet_index)[source]#
Return type:

Tuple[Tensor, Tensor]

pipeline.TrackBuilding.triplets2tracks.split_triplet_indices_two_connected_components(triplet_indices, edge_index=None, strategy=None)[source]#

Split triplets into 2 categories:

  • Triplets to merge now with a first connected component algorithm

  • Triplets to merge after removing the duplicate triplets, after the first connected component algorithm.

Parameters:
  • triplet_indices (Dict[str, torch.Tensor]) – dictionary that associates a type of triplet with the triplet indices

  • edge_index (torch.Tensor | None) – edge indices. Only used in the no_multiple_central_hit strategy

  • strategy (str | None) – splitting strategy to use, which determines which articulations are merged in the first connected component algorithm. All the strategies are expected to produce the very same result. Default is no_multiple_edge.

Return type:

Tuple[torch.Tensor, torch.Tensor]

Returns:

Triplet indices to merge by the first connected component algorithm, and triplet indices to merge by the second one.