class MLEngineer:
    def __init__(self):
        self.name = "Bhargav Kumar Nath"
        self.role = "ML Engineer & Systems Researcher"
        self.education = {
            "masters": "MSc Data Science & Analytics @ University of Leeds (2024–2025)",
            "bachelors": "BTech Computer Science @ Assam Don Bosco University (2020–2024)",
        }
        self.location = "Leeds, UK 🇬🇧"
        self.projects_shipped = 11
        self.seeking = "Full-time ML / Data Science roles"

    def current_focus(self):
        return {
            "systems_engineering": [
                "LLM Inference Optimization: PagedAttention in Rust + CUDA",
                "Agentic RAG Pipelines: LangGraph, Qdrant, RAGAS",
                "High-Throughput Signal Intelligence for Quant Finance",
                "100M+ Event Pipelines with DuckDB + Polars",
            ],
            "research": [
                "Hardware-Aware Neural Architecture Search (NAS)",
                "Mixed-Precision LLM Quantization & Model Compression",
                "Causal ML & Heterogeneous Treatment Effects",
                "Graph Neural Networks for Molecular Property Prediction",
            ],
        }

    def philosophy(self) -> str:
        return (
            "A model is a mathematical fantasy, but an ML system is a living entity.\n"
            "I design for the shifting reality of the human world,\n"
            "not the static perfection of a laboratory."
        )

My Journey in ML
My path started with hands-on data engineering work and grew into a deep obsession with the boundary between research and production. I ship things that work in real data centres, on commodity hardware, under real latency constraints.
What drives me:
- Systems Performance: Pushing hardware to its limits with PagedAttention in Rust, CUDA kernels, and KV-cache optimization, achieving 8–32× throughput gains
- Agentic AI: Building reliable LLM pipelines that reason, retrieve, and act with verifiable faithfulness scores (0.91 on RAGAS)
- Causal Intelligence: Moving beyond A/B testing to true treatment effect estimation, identifying micro-segments driving 70% of total uplift
- Scientific ML: Applying GNNs and hybrid architectures to accelerate materials science and drug discovery
- Quantitative Finance: Designing signal intelligence platforms for real-time algorithmic trading decisions
Three Core Principles:
- Escape the State-of-the-Art Trap: Leaderboard victories rarely survive reality. Establish honest baselines first.
- Data Over Algorithms: Architectures come and go; long-term success depends on data quality and distribution understanding.
- Deployment as the Starting Line: A shipped model needs continuous monitoring to stay reliable. Production is where the real work begins.
Currently completing my MSc at the University of Leeds, specializing in advanced ML, big data architecture, and MLOps. Actively seeking full-time opportunities to build impactful ML systems.
Systems Achievement: A memory management system for LLM inference that implements PagedAttention, achieving an 8–32× throughput improvement (424 sequences/GB vs. 53) by eliminating up to 90% of the VRAM wasted by pre-allocated KV caches.
Impact: Enables serving larger batch sizes on constrained hardware, bridging research-grade LLMs and edge deployment.
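As a rough illustration of where that saving comes from, the sketch below compares a pre-allocated KV cache, where every sequence reserves the maximum context length, with block-based allocation in the spirit of PagedAttention, where each sequence holds only the fixed-size blocks it has filled. The block size, maximum length, sequence lengths, and function names are hypothetical values chosen for the example, not taken from the project.

```python
import math


def preallocated_slots(seq_lens, max_len):
    """Token slots reserved when every sequence pre-allocates max_len."""
    return len(seq_lens) * max_len


def paged_slots(seq_lens, block_size):
    """Token slots reserved when each sequence holds only whole blocks."""
    return sum(math.ceil(n / block_size) * block_size for n in seq_lens)


def waste_fraction(reserved, seq_lens):
    """Share of reserved slots that hold no token."""
    used = sum(seq_lens)
    return (reserved - used) / reserved


# Hypothetical workload: short sequences against a 2048-token maximum.
seq_lens = [120, 300, 75, 512, 40, 260]
MAX_LEN = 2048
BLOCK_SIZE = 16

pre = preallocated_slots(seq_lens, MAX_LEN)
paged = paged_slots(seq_lens, BLOCK_SIZE)

print(f"pre-allocated waste: {waste_fraction(pre, seq_lens):.1%}")
print(f"paged waste:         {waste_fraction(paged, seq_lens):.1%}")
print(f"memory ratio (pre-allocated / paged): {pre / paged:.1f}x")
```

With paging, the only unused slots are in the last partially filled block of each sequence, so the freed memory can hold more concurrent sequences.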
Engineering Achievement: A high-throughput signal intelligence platform for quantitative hedge funds, enabling sub-second signal extraction from fragmented, high-velocity alternative data streams for real-time algorithmic trading decisions.
Impact: Gives quant analysts a single pane of glass for real-time market intelligence.
Research Contribution: A hardware-aware NAS framework that reduces LLM VRAM use by 40% with a 20% throughput gain, compressing evolutionary search time from days to minutes on TinyLlama-1.1B.
Impact: Enables edge deployment of large models on resource-constrained devices.
Engineering Achievement: A production agentic RAG pipeline for financial document analysis, achieving a 0.91 faithfulness score on RAGAS and a +56% F1 improvement over naive retrieval.
Impact: Reduces analyst time on document review with verifiably accurate, grounded answers.
Engineering Achievement: An analytics system processing 109.9M event logs on commodity hardware, achieving a 97% memory reduction (14.7 GB → 1.9 GB) and a 4.5× conversion lift via propensity-modeled targeting.
Impact: Enterprise-scale behavioral analytics on a laptop, no cluster required.
Research Contribution: A unified causal experimentation engine that estimates Heterogeneous Treatment Effects (HTE), achieving <1ms inference latency and identifying micro-segments driving 70% of total uplift (+$0.14/user).
Impact: Reduces wasted spend by targeting users with the highest causal lift.
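To make the idea of heterogeneous treatment effects concrete, here is a minimal sketch that estimates uplift per segment as the difference in mean outcome between treated and control users. This is a plain difference-in-means on randomized data, not the engine's own estimator, which is not described here; the segment names and records are made up for illustration.

```python
from collections import defaultdict


def segment_uplift(records):
    """Return {segment: mean(treated outcome) - mean(control outcome)}.

    Each record is (segment, treated, outcome). Assumes treatment was
    randomly assigned within every segment.
    """
    # Per segment, per arm: [sum of outcomes, number of users].
    sums = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for segment, treated, outcome in records:
        cell = sums[segment][treated]
        cell[0] += outcome
        cell[1] += 1

    uplift = {}
    for segment, arms in sums.items():
        (t_sum, t_n), (c_sum, c_n) = arms[True], arms[False]
        if t_n == 0 or c_n == 0:
            continue  # cannot compare without both arms
        uplift[segment] = t_sum / t_n - c_sum / c_n
    return uplift


# Hypothetical records: (segment, treated?, revenue per user)
records = [
    ("new_users", True, 1.0), ("new_users", True, 3.0),
    ("new_users", False, 1.0), ("new_users", False, 1.0),
    ("loyal_users", True, 2.0), ("loyal_users", True, 2.0),
    ("loyal_users", False, 2.0), ("loyal_users", False, 2.0),
]

uplift = segment_uplift(records)
for segment, effect in sorted(uplift.items(), key=lambda kv: -kv[1]):
    print(f"{segment}: {effect:+.2f}")
```

Ranking segments by estimated effect is what lets spend be directed at the users who respond to treatment rather than at those who would convert anyway.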
More Projects
PricePoint Dynamics: UK Supermarket Intelligence
Competitive intelligence system analyzing 9.5M+ daily prices across 67,000+ products. MAE £0.139 (R²=0.98), identifying Aldi as the market price leader with a 4–7 day lead time.
MALLORN: Rare Transient Detection in Astronomy
Multi-channel RNN pipeline detecting rare Tidal Disruption Events at 4.86% class prevalence. +197% F1 improvement over a GRU baseline (0.53 F1 score).
Synthetic Intelligence: Privacy-Preserving Data Generation
Generative tabular data framework with +5.1% AUPRC over SMOTE and linear O(N) complexity via model-driven rejection sampling with manifold alignment guarantees.
Melting Point Prediction: Hybrid GNN Architecture
GNN + XGBoost fusion for thermodynamic property prediction. 20% MAE reduction vs. pure deep learning, <50ms latency (24.59 K MAE).
Fitness Tracker: Production Spark ML Pipeline
Enterprise ML system processing 358K+ records from 1.9K+ users. 98% classification accuracy with 198 FFT-derived temporal features and 98% data compression.
Neural Architecture Search: Genetic Algorithms
Evolutionary CNN optimization achieving 97.15% accuracy on medical imaging via custom genetic operators (selection, crossover, and mutation) with fault-tolerant checkpointing.
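The three genetic operators named above (selection, crossover, mutation) can be sketched on a toy problem. The example below evolves bit strings toward all ones rather than CNN architectures; the fitness function, population size, and rates are placeholder assumptions, not the project's settings, and checkpointing is left out.

```python
import random


def fitness(genome):
    """Toy objective: count of ones (a stand-in for validation accuracy)."""
    return sum(genome)


def tournament_select(population, rng, k=3):
    """Selection: best of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a, b, rng):
    """Single-point crossover of two parents of equal length."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(genome, rng, rate=0.02):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def evolve(genome_len=32, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=fitness)  # carry the best over unchanged
        children = [elite]
        while len(children) < pop_size:
            parent_a = tournament_select(population, rng)
            parent_b = tournament_select(population, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    return max(population, key=fitness)


best = evolve()
print(f"best fitness: {fitness(best)} / {len(best)}")
```

Keeping one elite individual per generation means the best score never regresses, which matters when each fitness evaluation is as expensive as training a network.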
Deep Learning Lab: Interactive TypeScript Engine
Dependency-free mathematical neural network engine built from scratch in TypeScript for hands-on hyperparameter experimentation with live training noise injection.
UK Supermarket Competitive Intelligence: Extended Analysis
Deep-dive into pricing strategy dynamics across major UK supermarket chains with causal analysis of competitor response patterns and Granger causality testing.
Languages:
  Systems: "Rust · C · Bash/Shell"
  Data Science: "Python · R · SQL"
  Frontend: "TypeScript · JavaScript"
Machine Learning:
  Deep Learning: "PyTorch · Keras | CNN · RNN · Transformers · GNN"
  Classical ML: "Scikit-Learn · XGBoost · LightGBM | Ensemble Methods"
  Specialized: "CausalML · Uplift Modeling · NAS · Model Compression · Agentic AI"
High Performance Computing:
  GPU: "CUDA · CuPy · TensorRT · PyO3 (Rust-Python bindings)"
  Inference: "PagedAttention · KV-Cache Optimization · Mixed-Precision (FP16/INT8/INT4)"
LLM & Agentic AI:
  Frameworks: "LangGraph · LangChain · Hugging Face Transformers"
  Vector DBs: "Qdrant · FAISS | Hybrid Retrieval"
  Evaluation: "RAGAS · Sentence-BERT | Faithfulness · Relevance · Groundedness"
Data Engineering:
  Big Data: "Apache Spark (PySpark) · Hadoop · Apache Kafka · Airflow"
  In-Process: "DuckDB · Polars · Pandas · NumPy"
  Databases: "PostgreSQL · Redis · MySQL"
  Formats: "Parquet · Arrow · JSON"
MLOps & Cloud:
  Containerization: "Docker · Kubernetes"
  Tracking: "MLflow · Weights & Biases"
  Cloud: "AWS · GCP"
  Serving: "FastAPI · Streamlit · Next.js · React"
  CI/CD: "GitHub Actions"
Specialized:
  Cheminformatics: "RDKit · PyTorch Geometric · OpenCV"
  Optimization: "Optuna · Ray Tune · Genetic Algorithms · Hessian Analysis"
  Statistics: "Statsmodels · SciPy · Bayesian Inference · Hypothesis Testing"
M/S Sanjog Trading
Impact: Built first production data pipelines, translating raw business data into actionable pricing intelligence.
IIT Guwahati
Impact: Reduced scheduling conflicts and improved operational efficiency for academic planning systems.
Airports Authority of India
Impact: Predictive maintenance insights enabling proactive asset lifecycle management.
%%{init: {
  "theme": "base",
  "themeVariables": {
    "primaryColor": "#7aa2f7",
    "primaryTextColor": "#ffffff",
    "primaryBorderColor": "#7dcfff",
    "lineColor": "#e0af68",
    "secondaryColor": "#9ece6a",
    "tertiaryColor": "#bb9af7",
    "fontSize": "18px"
  }
}}%%
mindmap
  root((ML Research))
    LLM Systems
      PagedAttention & CUDA
      Inference Optimization
      KV-Cache Management
    Agentic AI
      LangGraph Orchestration
      Hybrid RAG Pipelines
      RAGAS Evaluation
    Neural Architecture Search
      Hardware-Aware NAS
      Mixed-Precision Quantization
      Multi-Objective Optimization
    Causal ML
      Uplift Modeling
      Treatment Effect Estimation
      Counterfactual Reasoning
    Scientific ML
      Graph Neural Networks
      Molecular Property Prediction
      Drug Discovery
    Quantitative Finance
      Signal Intelligence
      Alternative Data Processing
      Real-time Analytics
Current Focus:
Status: 🟢 Active Engineering
Current Focus:
Status: 🟢 Active Engineering
Current Focus:
Status: 🟡 Exploration Phase
A comprehensive journey through AI's transformation from rule-based expert systems to modern neural architectures. Traces the paradigm shifts that enabled today's breakthroughs.
Comparing gradient-based methods with evolutionary strategies for escaping local minima. Practical insights from neural architecture search.
Examining the intersection of AI advancement and environmental, social, and governance accountability in an era of accelerating compute demands.
- Throughput Gain: LLM Inference (PageForge)
- Events Processed: Customer Intelligence Platform
- Conversion Uplift: Propensity Modeling
- Memory Reduction: DuckDB + Polars Pipeline
- Faithfulness Score: Agentic RAG (FinSight)
- VRAM Reduction: LLM Quantization (EMPAS)
- Inference Latency: Experimentation Engine
- F1 Improvement: Rare Transient Detection
- Projects Shipped: End-to-end ML Systems
- Price Records Analyzed: Market Intelligence System
- F1 vs. Naive RAG: FinSight-Alpha
- CNN Accuracy: Genetic Algorithm NAS