graph LR
Data_Pipeline["Data Pipeline"]
Model_Training_Evaluation["Model Training & Evaluation"]
Model_Architecture["Model Architecture"]
Configuration_Manager["Configuration Manager"]
Experiment_Tracking_Checkpointing["Experiment Tracking & Checkpointing"]
Data_Pipeline -- "provides data to" --> Model_Training_Evaluation
Data_Pipeline -- "receives configuration from" --> Configuration_Manager
Model_Training_Evaluation -- "consumes data from" --> Data_Pipeline
Model_Training_Evaluation -- "uses" --> Model_Architecture
Model_Training_Evaluation -- "receives parameters from" --> Configuration_Manager
Model_Training_Evaluation -- "outputs to" --> Experiment_Tracking_Checkpointing
Model_Architecture -- "is used by" --> Model_Training_Evaluation
Model_Architecture -- "expects input from" --> Data_Pipeline
Configuration_Manager -- "provides config to" --> Data_Pipeline
Configuration_Manager -- "provides config to" --> Model_Training_Evaluation
Configuration_Manager -- "provides config to" --> Model_Architecture
Experiment_Tracking_Checkpointing -- "receives metrics and checkpoints from" --> Model_Training_Evaluation
Experiment_Tracking_Checkpointing -- "provides checkpoints to" --> Model_Training_Evaluation
click Data_Pipeline href "https://github.com/DeepGraphLearning/GearNet/blob/main/.codeboarding//Data_Pipeline.md" "Details"
This project is a deep learning research framework for protein representation learning. The Data Pipeline component is central to it: it handles every stage of data preparation, from raw data acquisition to structured protein graphs and dataset splitting.
Data Pipeline
Manages the entire data lifecycle, including loading, preprocessing, featurization, and dataset splitting for protein data.
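As a rough illustration of the featurization and splitting stages named above, the sketch below encodes a protein sequence as residue indices and splits a dataset reproducibly. It uses only the standard library. `featurize` and `split_dataset` are hypothetical helper names, not functions from this project, and the real pipeline produces structured protein graphs rather than the plain index lists used here.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
RESIDUE_TO_INDEX = {residue: i for i, residue in enumerate(AMINO_ACIDS)}


def featurize(sequence: str) -> List[int]:
    """Toy featurization: map each residue letter to an integer index.

    Raises KeyError for letters outside the 20 standard amino acids.
    """
    return [RESIDUE_TO_INDEX[residue] for residue in sequence.upper()]


def split_dataset(
    items: Sequence[T],
    fractions: Tuple[float, float, float] = (0.8, 0.1, 0.1),
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle with a fixed seed and cut into train/valid/test lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_train = int(fractions[0] * len(items))
    n_valid = int(fractions[1] * len(items))
    train = [items[i] for i in indices[:n_train]]
    valid = [items[i] for i in indices[n_train:n_train + n_valid]]
    test = [items[i] for i in indices[n_train + n_valid:]]  # remainder
    return train, valid, test
```

Fixing the seed keeps the split identical across runs, so validation and test metrics stay comparable between experiments.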
Related Classes/Methods:
Model Training & Evaluation
Orchestrates the training loops, model optimization, validation, and evaluation of protein representation models.
Related Classes/Methods:
- scripts.train(1:1)
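The train, validate, and keep-the-best pattern described for this component can be sketched without any deep learning framework. The example below is a minimal sketch, not the contents of `scripts.train`: it fits a one-parameter linear model with stochastic gradient descent, and the names `train` and `mean_squared_error` are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def mean_squared_error(w: float, data: List[Example]) -> float:
    """Average squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(
    train_set: List[Example],
    valid_set: List[Example],
    epochs: int = 50,
    lr: float = 0.01,
) -> Tuple[float, float]:
    """Run SGD and return the weight with the lowest validation loss."""
    w = 0.0
    best_w, best_loss = w, mean_squared_error(w, valid_set)
    for _ in range(epochs):
        # Optimization: one SGD update per training example.
        for x, y in train_set:
            grad = 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
        # Validation: track the best weight seen so far.
        valid_loss = mean_squared_error(w, valid_set)
        if valid_loss < best_loss:
            best_w, best_loss = w, valid_loss
    return best_w, best_loss
```

Selecting the weight by validation loss rather than by the final epoch is the same idea that motivates saving a "best" checkpoint during real training runs.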
Model Architecture
Defines the neural network architectures used for protein representation learning (e.g., graph neural networks).
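For readers unfamiliar with graph neural networks, the sketch below shows a single round of sum-aggregation message passing in plain Python. It is a generic illustration under simplifying assumptions (no learned weights, no edge types) and is not this project's architecture. `message_passing_step` is a hypothetical name.

```python
from typing import List, Tuple

Vector = List[float]


def message_passing_step(
    features: List[Vector],
    edges: List[Tuple[int, int]],
) -> List[Vector]:
    """One message-passing round: each node adds the features of its
    incoming neighbours to its own, then applies a ReLU.

    `edges` holds directed (source, destination) node-index pairs.
    """
    dim = len(features[0])
    aggregated = [[0.0] * dim for _ in features]
    for src, dst in edges:
        for k in range(dim):
            aggregated[dst][k] += features[src][k]  # message src -> dst
    return [
        [max(0.0, own[k] + msg[k]) for k in range(dim)]  # ReLU(own + messages)
        for own, msg in zip(features, aggregated)
    ]
```

In a protein graph the nodes would typically be residues or atoms; stacking several such rounds lets information propagate beyond immediate neighbours.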
Related Classes/Methods:
Configuration Manager
Handles loading, parsing, and managing project configurations (e.g., model hyperparameters, dataset paths, training settings) from YAML files.
Related Classes/Methods:
- utils.config(1:1)
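One common task for a configuration layer is applying per-experiment overrides on top of a base configuration. The sketch below assumes the YAML file has already been parsed into a nested dictionary (for example with PyYAML's `yaml.safe_load`). The helper `apply_overrides` and the keys used in the usage note are hypothetical and not taken from `utils.config`.

```python
import copy
from typing import Any, Dict


def apply_overrides(
    config: Dict[str, Any],
    overrides: Dict[str, Any],
) -> Dict[str, Any]:
    """Return a copy of `config` with dotted-key overrides applied.

    A key such as "optimizer.lr" sets config["optimizer"]["lr"];
    missing intermediate sections are created as empty dicts.
    """
    merged = copy.deepcopy(config)  # never mutate the base configuration
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return merged
```

For example, `apply_overrides({"optimizer": {"lr": 0.001}}, {"optimizer.lr": 0.0001})` yields a new dictionary with the lower learning rate while leaving the base dictionary unchanged.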
Experiment Tracking & Checkpointing
Manages the logging of training metrics, the saving of model checkpoints, and, potentially, the resumption of training from a saved checkpoint.
Related Classes/Methods:
- utils.checkpoint(1:1)
- scripts.train(1:1)
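The save-and-resume behaviour described for this component can be illustrated with a small standard-library sketch. It is not the implementation in `utils.checkpoint`: the function names and the JSON layout are assumptions, and a real training run would more likely serialize model tensors with its deep learning framework's own utilities.

```python
import json
import os
from typing import Any, Dict, Optional


def save_checkpoint(
    path: str,
    epoch: int,
    state: Dict[str, Any],
    metrics: Dict[str, float],
) -> None:
    """Write a checkpoint atomically so a crash never leaves a partial file."""
    payload = {"epoch": epoch, "state": state, "metrics": metrics}
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_path, path)  # atomic rename over any older checkpoint


def load_checkpoint(path: str) -> Optional[Dict[str, Any]]:
    """Return the saved checkpoint, or None when there is nothing to resume."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

A training script could call `load_checkpoint` at startup and, if it returns a dictionary, continue from `checkpoint["epoch"] + 1` instead of starting from scratch.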