scripts folder#

This folder contains scripts to run various steps of the pipeline from the command line or via snakemake.

Subfolders#

collect_test_samples.py script#

A script that generates the test_samples.yaml file that defines the test samples used in this repository.

preprocess_test_sample.py script#

A script that runs the preprocessing of a test sample.

process_test_sample.py script#

A script that runs the processing of a test sample.

build_graph_using_embedding.py script#

A script that runs the graph building using the embedding model learnt at the previous stage.

scripts.build_graph_using_embedding.run(path_or_config, partitions=['train', 'val', 'test'], checkpoint=None, reproduce=True, use_gpu=True, suffix=None, **kwargs)[source]#

Run the inference of the metric learning stage.

Parameters:
  • path_or_config (str | dict) – configuration dictionary, or path to the YAML file that contains the configuration

  • partitions (List[str]) –

    Partitions to run the inference on:

    • train: train dataset

    • val: validation dataset

    • test: all the test datasets

    • A specific test dataset name

  • checkpoint (UnionType[EmbeddingBase, str, None]) – Model already loaded, or path to its checkpoint. If None, try to find it automatically in the artifact folder given the configuration.

  • reproduce (bool) – whether to delete an existing folder

  • use_gpu (bool) – whether to use the GPU (if available)

  • **kwargs – Other keyword arguments passed to the PyTorch Lightning LightningModule.load_from_checkpoint() class method
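The partition names listed above can be expanded into concrete dataset names before running the inference. The following is a minimal sketch of that expansion, not the repository's implementation: the helper resolve_partitions and its test_dataset_names argument are hypothetical names introduced for illustration.

```python
from typing import List, Sequence


def resolve_partitions(
    partitions: Sequence[str], test_dataset_names: Sequence[str]
) -> List[str]:
    """Expand partition keywords into dataset names (hypothetical helper)."""
    resolved: List[str] = []
    for partition in partitions:
        if partition in ("train", "val"):
            names = [partition]
        elif partition == "test":
            # "test" stands for all the test datasets
            names = list(test_dataset_names)
        elif partition in test_dataset_names:
            # A specific test dataset name is kept as is
            names = [partition]
        else:
            raise ValueError(f"Unknown partition: {partition!r}")
        # Preserve order and avoid running the same dataset twice
        for name in names:
            if name not in resolved:
                resolved.append(name)
    return resolved


# "sample_a" and "sample_b" are made-up test dataset names
print(resolve_partitions(["train", "test"], ["sample_a", "sample_b"]))
```

Deduplicating while preserving order means that passing both test and a specific test dataset name does not process that dataset twice.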

train_model.py script#

A script that runs the training of a model (embedding or GNN).

scripts.train_model.get_parsed_args()[source]#
Return type:

Namespace
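A function with this return type is typically a thin wrapper around argparse. The sketch below shows one way such a parser could look. The function name parse_training_args, the flag names (--config, --step, --identifier) and the optional argv argument are assumptions made for illustration; the actual command-line options of train_model.py are not documented here.

```python
import argparse
from typing import Optional, Sequence


def parse_training_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse command-line arguments for a training script (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Run the training of a model (embedding or GNN)."
    )
    # Flag names below are assumptions, not the script's documented interface
    parser.add_argument("-c", "--config", required=True,
                        help="Path to the pipeline configuration file.")
    parser.add_argument("-s", "--step", required=True,
                        help="Model step, such as embedding or gnn.")
    parser.add_argument("-i", "--identifier", default=None,
                        help="Identifier added at the end of the step name.")
    # argv=None makes argparse read sys.argv[1:], as on a real command line
    return parser.parse_args(argv)


args = parse_training_args(["--config", "pipeline.yaml", "--step", "gnn"])
print(args.config, args.step, args.identifier)
```

Accepting an explicit argv list keeps the parser easy to exercise from tests or notebooks without touching sys.argv.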

scripts.train_model.train_model(path_or_config, step, identifier=None)[source]#

Run the training of a model.

Parameters:
  • path_or_config (str | dict) – pipeline configuration or path to it.

  • step (str) – Model step, such as embedding or gnn.

  • identifier (Optional[str]) – Identifier added at the end of the step name.

Return type:

Tuple[Trainer, Module]

Returns:

Trainer and trained model.
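Several functions in this folder accept path_or_config as either a configuration dictionary or a path to a YAML file. A minimal sketch of how such an argument could be normalised into a dictionary is given below. The helper name load_config is hypothetical, and the sketch assumes that PyYAML is installed when a path is given; it is not taken from the repository.

```python
from typing import Any, Dict, Union


def load_config(path_or_config: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a configuration dictionary from a dict or a YAML path (hypothetical helper)."""
    if isinstance(path_or_config, dict):
        # Shallow copy so that the caller's dictionary is not modified later
        return dict(path_or_config)
    if isinstance(path_or_config, str):
        # Assumption: PyYAML is available; imported lazily so that the
        # dictionary branch works without it
        import yaml

        with open(path_or_config, "r") as config_file:
            config = yaml.safe_load(config_file)
        if not isinstance(config, dict):
            raise ValueError(f"{path_or_config} does not contain a YAML mapping")
        return config
    raise TypeError("path_or_config must be a str or a dict")


# The keys below are made up for the example
config = load_config({"common": {"experiment_name": "example"}})
print(config["common"]["experiment_name"])
```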

build_tracks.py script#

A script that runs the edge filtering, the triplet building and filtering, and the track building from triplets.

scripts.build_tracks.build(path_or_config, partitions=['train', 'val', 'test'], checkpoint=None, reproduce=True, edge_score_cut=None, triplet_score_cut=None, single_edge_score_cut=None, strategy=None, with_triplets=True, **kwargs)[source]#
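To illustrate what a score cut such as edge_score_cut does, the sketch below filters edges by score and then groups the connected hits into track candidates. This is a deliberately simplified stand-in: it uses connected components rather than the triplet-based track building of this script, and the helper names, the strict ">" comparison and the toy data are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]  # pair of hit indices


def filter_edges(
    edges: Sequence[Edge], scores: Sequence[float], score_cut: float
) -> List[Edge]:
    """Keep the edges whose score is above the cut (assumed strict comparison)."""
    if len(edges) != len(scores):
        raise ValueError("edges and scores must have the same length")
    return [edge for edge, score in zip(edges, scores) if score > score_cut]


def connected_components(edges: Sequence[Edge]) -> List[List[int]]:
    """Group hits linked by edges, using a small union-find structure."""
    parent: Dict[int, int] = {}

    def find(hit: int) -> int:
        parent.setdefault(hit, hit)
        while parent[hit] != hit:
            parent[hit] = parent[parent[hit]]  # path halving
            hit = parent[hit]
        return hit

    for hit_a, hit_b in edges:
        root_a, root_b = find(hit_a), find(hit_b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[int, List[int]] = {}
    for hit in parent:
        groups.setdefault(find(hit), []).append(hit)
    return sorted(sorted(group) for group in groups.values())


# Toy example: the low-score edge (2, 3) is removed, which splits the hits in two
edges = [(0, 1), (1, 2), (3, 4), (2, 3)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_edges(edges, scores, score_cut=0.5)
print(connected_components(kept))
```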

export_model_to_onnx.py script#

A Python script to export a model to an ONNX file.

scripts.export_model_to_onnx.export_model_to_onnx(path_or_config, step, mode=None, output_path=None, options=None, dummy=False)[source]#

Export a model of a pipeline to an ONNX file.

Parameters:
  • path_or_config (str | dict) – Path to the pipeline configuration file or the configuration dictionary.

  • step (Literal['embedding', 'gnn']) – Model step, such as embedding or gnn.

  • mode (Optional[str]) – Export mode.

  • output_path (Optional[str]) – Path where to save the .onnx file containing the model. If not provided, it is defined from the experiment name and step.

  • options (Optional[Iterable[str]]) – Export options.

Return type:

None
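The fallback described for output_path can be pictured as follows. This is only a sketch: the helper resolve_onnx_output_path, the output_dir argument and the "<experiment_name>_<step>.onnx" naming scheme are assumptions, since the actual default path is not specified here.

```python
import os
from typing import Optional


def resolve_onnx_output_path(
    experiment_name: str,
    step: str,
    output_path: Optional[str] = None,
    output_dir: str = ".",
) -> str:
    """Return the path of the .onnx file (hypothetical naming scheme)."""
    if output_path is not None:
        # An explicit path always takes precedence
        return output_path
    # Assumed default: derive the file name from the experiment name and step
    return os.path.join(output_dir, f"{experiment_name}_{step}.onnx")


print(resolve_onnx_output_path("example_experiment", "gnn"))
```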

scripts.export_model_to_onnx.get_parsed_args()[source]#
Return type:

Namespace