Skip to content

Latest commit

 

History

History
159 lines (49 loc) · 3.93 KB

File metadata and controls

159 lines (49 loc) · 3.93 KB
graph LR

    Data_Pipeline["Data Pipeline"]

    Model_Training_Evaluation["Model Training & Evaluation"]

    Model_Architecture["Model Architecture"]

    Configuration_Manager["Configuration Manager"]

    Experiment_Tracking_Checkpointing["Experiment Tracking & Checkpointing"]

    Data_Pipeline -- "provides data to" --> Model_Training_Evaluation

    Configuration_Manager -- "receives configuration from" --> Data_Pipeline

    Model_Training_Evaluation -- "consumes data from" --> Data_Pipeline

    Model_Training_Evaluation -- "uses" --> Model_Architecture

    Model_Training_Evaluation -- "receives parameters from" --> Configuration_Manager

    Model_Training_Evaluation -- "outputs to" --> Experiment_Tracking_Checkpointing

    Model_Architecture -- "is used by" --> Model_Training_Evaluation

    Model_Architecture -- "expects input from" --> Data_Pipeline

    Configuration_Manager -- "provides config to" --> Data_Pipeline

    Configuration_Manager -- "provides config to" --> Model_Training_Evaluation

    Configuration_Manager -- "provides config to" --> Model_Architecture

    Experiment_Tracking_Checkpointing -- "receives metrics and checkpoints from" --> Model_Training_Evaluation

    Experiment_Tracking_Checkpointing -- "provides checkpoints to" --> Model_Training_Evaluation

    click Data_Pipeline href "https://github.com/DeepGraphLearning/GearNet/blob/main/.codeboarding//Data_Pipeline.md" "Details"

Loading

CodeBoardingDemoContact

Details

This project is a Deep Learning Research Framework/Library for Protein Representation Learning. The Data Pipeline component is central to its functionality, handling all aspects of data preparation for protein representation learning, from raw data acquisition to structured protein graphs and dataset splitting.

Data Pipeline [Expand]

Manages the entire data lifecycle, including loading, preprocessing, featurization, and dataset splitting for protein data.

Related Classes/Methods:

Model Training & Evaluation

Orchestrates the training loops, model optimization, validation, and evaluation of protein representation models.

Related Classes/Methods:

Model Architecture

Defines the neural network architectures used for protein representation learning (e.g., graph neural networks).

Related Classes/Methods:

Configuration Manager

Handles loading, parsing, and managing project configurations (e.g., model hyperparameters, dataset paths, training settings) from YAML files.

Related Classes/Methods:

  • utils.config (1:1)

Experiment Tracking & Checkpointing

Manages the logging of training metrics, saving model checkpoints, and potentially resuming training.

Related Classes/Methods:

  • utils.checkpoint (1:1)

  • scripts.train (1:1)