scd2

End-to-end data pipeline system built as part of the Coursera open-source Data Engineering program. It unifies diverse data sources, implements SCD2 historical tracking, and orchestrates workflows using industry-standard tools.

spark apache-spark python3 dbt data-pipeline apache-airflow scd2 dbt-core data-pipeline-automation

Updated May 25, 2026
Python

Mairondc21 / pipeline_delta_s3

Pipeline 100% Open Source

docker airflow s3 pyspark cicd boto3 ruff datahub scd2 delta-lake great-expectations sqlfluff

Updated Mar 19, 2026
Python

emudamah0906 / polaris-claims-lakehouse

P&C insurance claims lakehouse: Azure ADLS + Databricks (PySpark/Delta) + Snowflake + dbt, real-time FNOL fraud signals via Kafka, Airflow-orchestrated, Terraform-provisioned, OIDC-secured, with data contracts, lineage, and ADRs throughout.

Updated May 19, 2026
Makefile

shivaranjanka / snowflake-healthcare-pipeline

Advanced Healthcare Claims Pipeline using Snowflake, Snowpipe, Streams, Tasks, SCD Type 2, and AWS S3. Automates ingestion, CDC, dimensional modeling, and data quality checks for healthcare patient and claims data.

aws cloud sql analytics tasks snowflake streams data-engineering healthcare cdc data-pipeline scd2 snowpipe

Updated Nov 10, 2025

Mohameddfxxcxx / global-horizon-bank-dwh-project

Fortune-500-grade banking analytics platform: OLTP -> medallion lakehouse -> Kimball star schema -> semantic layer -> 9-tab executive dashboard + 5 ML models (churn, fraud, segmentation, forecasting). Production-ready, governed, fully tested.

Updated Apr 30, 2026
Python

ZuhairBhati / travel_bookings_pipeline

This is a data engineering pipeline built on Databricks + Delta Lake + PySpark that ingests travel booking and customer master data, applies SCD Type 2 logic, and delivers analytics-ready tables. It includes data quality enforcement, dimension versioning, fact aggregation, and performance tuning.

python analytics travel pyspark data-engineering hospitality notebooks databricks bookings etl-pipeline scd2

Updated Oct 8, 2025
Jupyter Notebook

sushmakl95 / dbt-bigquery-analytics-platform

Modern data stack reference: dbt + BigQuery + Airflow (Cloud Composer) with medallion layering, SCD2 snapshots, exposures, freshness SLAs, and 45× cost reduction via partition + cluster + incremental tuning.

Updated Apr 23, 2026
Python

Aayushi-Anand / SCD2_Implementation

Implementation of SCD2 for employee relocation data

etl-pipeline scd2

Updated Feb 28, 2022

ViinayKumaarMamidi / Databricks_Travel_Booking_SCD2_Project

This repo contains details about a travel booking project executed on Databricks.

databricks-notebooks scd2 pyspark-python databricks-workspace dataqualitycheck databricks-workflows pydeequ medallion-architecture

Updated May 9, 2026
Python

shukla2015 / Travel_Booking_SCD2_Project

Production-grade parameterized ETL pipeline implementing SCD Type 2 for travel booking data using Databricks, Delta Lake, and ADLS — includes data quality checks, incremental fact table build, Z-Order optimization, and SQL reporting.

etl pyspark databricks scd2 delta-lake azure-data-engineering pydeequ

Updated Apr 6, 2026
Jupyter Notebook

sushmakl95 / aws-glue-cdc-framework

Production-grade CDC pipeline: MySQL → Debezium → Kinesis → S3 → AWS Glue (PySpark) → Redshift + Postgres + OpenSearch. Multi-sink fanout with SCD2, idempotency tracking, and 13 modular Terraform modules.

Updated Apr 23, 2026
Python

OsamaMustafa32 / Enterprise_Retail_Data_Lakehouse

Batch retail data lakehouse on Databricks: Delta Live Tables (bronze → silver → gold), Unity Catalog, synthetic data generator, and an executive analytics dashboard.

python sql pyspark databricks data-quality-checks etl-pipeline scd2 delta-lake data-lakehouse delta-live-tables unity-catalog medallion-architecture

Updated Apr 2, 2026
Python

DustinPineau / cms_portfolio

End-to-end Medicare data engineering pipeline: API ingestion, PostgreSQL 17, dbt, dimensional modeling (Kimball/SCD2), Apache Airflow orchestration, and Evidence.dev dashboard. Built on a QEMU/KVM Rocky Linux VM.

python cms portfolio sql etl postgresql data-engineering dbt data-pipeline medicare evidence apache-airflow kimball scd2 dimensional-modeling

Updated Apr 28, 2026
PLpgSQL

Cindy-txr / Employee-data-platform

Production-style Data Warehouse project using Airflow + PostgreSQL with CDC event layer, SCD2 modeling, checkpoint-based incremental loading, and idempotent pipelines.

python docker postgres airflow sql kafka analytics data-warehouse data-engineering cdc tel scd2

Updated May 21, 2026
Python

dhruvi-a / dbt-analytics-engineering-case-study

Production-style dbt case study with SCD-style modeling, point-in-time joins, incremental marts, tests, and analyst-facing SQL.

sql etl data-warehouse dbt data-modeling jinja data-quality scd2 analytics-engineering duckdb incremental-models

Updated May 26, 2026

szkad / piquillo-bi-platform-peru

End-to-end BI platform for a fictional Peruvian piquillo pepper agro-exporter. SQL Server DW with SCD2, ETL with stored procedures, Power BI dashboard with RLS.

python sql-server etl power-bi data-warehouse data-engineering business-intelligence dax peru kimball star-schema agroindustria scd2

Updated May 24, 2026
TSQL

Here are 31 public repositories matching this topic...

KaterynaD / dbt_scd2_plus

spatil6 / ETL-SCD2

ai-tech-karthik / banking-data-pipeline

akshayush / SCD2-Implementation--using-pyspark

abdullah-mahmoud-de / Automated-Data-Pipelines-Spark-dbt-Airflow
