⚠️ ALPHA RELEASE - This is an early alpha version under active development. We are actively testing on different machines and adding production deployment capabilities. Use at your own risk and expect breaking changes.
Area Code ODW is a starter repository that provides the building blocks for operational data warehouses: real-time data processing, analytics, and visualization.
Get up and running in minutes with our automated setup:
```shell
# 1. Ensure Docker Desktop is running

# 2. Install dependencies & start the development environment
pnpm odw:dev

# 3. Seed the databases with sample data
pnpm odw:dev:seed

# 4. Open the data warehouse frontend
open http://localhost:8501/
```

This will:
- Install all Python dependencies and create a virtual environment
- Start the data warehouse service (Moose app) on port 4200
- Start the Streamlit frontend on port 8501
- Launch Kafdrop UI for message queue monitoring on port 9999
- Open the dashboard in your browser automatically
```shell
# Development
pnpm odw:dev        # Start all services
pnpm odw:dev:clean  # Clean all services

# Individual services
pnpm --filter dw-frontend dev     # Frontend only
pnpm --filter data-warehouse dev  # Data Warehouse only
pnpm --filter kafdrop dev         # Kafdrop only
```

| Category | Technologies |
|---|---|
| Frontend | Streamlit, Python 3.12+ |
| Backend | Moose, Python, FastAPI |
| Database | ClickHouse (analytical), PostgreSQL (temporal) |
| Message Queue | Redpanda (Kafka-compatible) |
| Workflow Engine | Temporal |
| Caching | Redis |
| Data Processing | Kafka Python, ClickHouse Connect |
| Infrastructure | Docker, Docker Compose |
| Build Tool | Python setuptools, pip |
```
area-code/
├── odw/                          # Operational Data Warehouse
│   ├── services/                 # Backend services
│   │   ├── data-warehouse/       # Main Moose data warehouse service
│   │   │   ├── app/
│   │   │   │   ├── apis/         # REST API endpoints for data consumption
│   │   │   │   ├── blobs/        # Blob data extraction workflows
│   │   │   │   ├── events/       # Events data extraction workflows
│   │   │   │   ├── ingest/       # Data models and stream transformations
│   │   │   │   ├── logs/         # Log data extraction workflows
│   │   │   │   └── views/        # Materialized views for analytics
│   │   │   ├── docs/             # Documentation and walkthrough guides
│   │   │   ├── setup.sh          # Setup and management script
│   │   │   ├── moose.config.toml # Moose framework configuration
│   │   │   └── requirements.txt  # Python dependencies
│   │   └── connectors/           # External data source connectors
│   │       └── src/
│   │           ├── blob_connector.py    # Blob storage data connector
│   │           ├── events_connector.py  # Events data connector
│   │           ├── logs_connector.py    # Logs data connector
│   │           └── connector_factory.py # Connector factory pattern
│   └── apps/
│       └── dw-frontend/          # Streamlit web application
│           ├── pages/            # Frontend page components
│           ├── utils/            # Frontend utility functions
│           └── main.py           # Streamlit application entry point
```
- Modern Streamlit web application with real-time data visualization
- Interactive dashboards for analytics and monitoring
- Real-time status monitoring and queue management
- Responsive design with custom styling and tooltips
- Moose analytical APIs + ClickHouse with automatic OpenAPI/Swagger docs
- Moose streaming Ingest Pipelines + Redpanda for real-time data processing
- Moose workflows + Temporal for durable long-running data synchronization
- REST API endpoints for data consumption and analytics
- Materialized views for optimized query performance
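As a sketch of consuming one of these REST endpoints from Python — the endpoint name and query parameter here are illustrative, not part of the actual API; check the generated OpenAPI/Swagger docs for the real routes:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:4200"  # Moose data warehouse service (see Quick Start)


def build_consumption_url(endpoint: str, **params: str) -> str:
    """Build a URL for a consumption API endpoint with optional query params."""
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}/consumption/{endpoint}" + (f"?{query}" if query else "")


def fetch_json(url: str) -> dict:
    """GET a consumption endpoint and decode the JSON response."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "event_counts" is a hypothetical endpoint name for illustration only.
    url = build_consumption_url("event_counts", interval="1h")
    print(fetch_json(url))
```

The helper only constructs URLs; the actual request at the bottom requires the stack from the Quick Start section to be running.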
- Modular data connector system for external data sources
- Factory pattern for easy connector integration
- Support for blob storage, events, and logs data sources
- Extensible architecture for custom data connectors
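The factory pattern mentioned above can be sketched as a registry that maps source names to connector classes. Class and method names here are illustrative; see `connectors/src/connector_factory.py` for the real implementation:

```python
from abc import ABC, abstractmethod
from typing import Type


class Connector(ABC):
    """Base interface every data source connector implements."""

    @abstractmethod
    def extract(self) -> list[dict]:
        """Pull a batch of records from the external source."""


class ConnectorFactory:
    """Registry-based factory: connector classes register under a source name."""

    _registry: dict[str, Type[Connector]] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(connector_cls: Type[Connector]) -> Type[Connector]:
            cls._registry[name] = connector_cls
            return connector_cls
        return decorator

    @classmethod
    def create(cls, name: str) -> Connector:
        try:
            return cls._registry[name]()
        except KeyError:
            raise ValueError(f"Unknown connector: {name!r}") from None


@ConnectorFactory.register("logs")
class LogsConnector(Connector):
    def extract(self) -> list[dict]:
        # Placeholder: a real connector would read from the logs source.
        return [{"level": "INFO", "message": "hello"}]
```

Callers then stay decoupled from concrete classes: `ConnectorFactory.create("logs").extract()` works for any registered source name, and adding a new connector is just another `@ConnectorFactory.register(...)` class.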
- Redpanda: Kafka-compatible message queue for real-time data streaming
- ClickHouse: High-performance analytical database
- Temporal: Workflow engine for reliable data processing
- Redis: Caching and session management
- Kafdrop: Web UI for monitoring message queues
The application uses a modern data architecture:
- ClickHouse (Analytical Database)
- Redpanda (Message Queue)
- Temporal (Workflow Engine)
- Redis (Caching)
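To see the streaming side of this architecture in action, you can publish a test event to Redpanda with kafka-python (listed in the tech stack above). The envelope shape and the `events` topic name are illustrative assumptions — check the models in `app/ingest/` for the topics the service actually consumes:

```python
import json
import time
import uuid

BROKER = "localhost:19092"  # Redpanda's Kafka-compatible port (see Troubleshooting)


def make_event(event_type: str, payload: dict) -> bytes:
    """Serialize an event envelope into the bytes a Kafka producer sends."""
    envelope = {
        "id": str(uuid.uuid4()),
        "type": event_type,
        "ts": time.time(),
        "payload": payload,
    }
    return json.dumps(envelope).encode("utf-8")


if __name__ == "__main__":
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(bootstrap_servers=BROKER)
    # "events" is an assumed topic name for illustration only.
    producer.send("events", make_event("page_view", {"path": "/"}))
    producer.flush()
```

Once sent, the message should be visible in the Kafdrop UI on port 9999.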
🚧 UNDER DEVELOPMENT - Production deployment capabilities are actively being developed and tested. The current focus is on stability and compatibility across different machine configurations.
We're working on a production deployment strategy that will be available soon.
🔬 TESTING STATUS - We are actively testing on various machine configurations. Currently tested on Mac M3 Pro (18GB RAM), M4 Pro, and M4 Max. We're expanding testing to more configurations.
- Python Version: Ensure you're using Python 3.12+ (required for Moose)
- Docker: Make sure Docker Desktop is running before setup
- Memory Issues: ClickHouse and Redpanda require significant memory (4GB+ recommended)
- Port Conflicts: Ensure ports 4200, 8501, 9999, 18123, 19092 are available
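A quick stdlib-only way to check for port conflicts before starting the stack (this script is a convenience sketch, not part of the repository):

```python
import socket

# Ports the stack needs (see Troubleshooting above).
REQUIRED_PORTS = [4200, 8501, 9999, 18123, 19092]


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in REQUIRED_PORTS:
        status = "IN USE" if port_in_use(port) else "free"
        print(f"port {port}: {status}")
```

Any port reported as in use must be freed (or its service reconfigured) before `pnpm odw:dev` will start cleanly.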
Tested Configurations:
- ✅ Mac M3 Pro (18GB RAM)
- ✅ Mac M4 Pro
- ✅ Mac M4 Max
- 🔄 More configurations being tested
```shell
# Navigate to the data warehouse directory
cd odw/services/data-warehouse

# Clean and restart
./setup.sh reset

# Check status
./setup.sh status
```

- 🚀 Moose Repository - The framework that powers this demo
- 📚 Moose Documentation - Complete guide to building with Moose
- 💬 Moose Community Slack - Join the discussion
📋 ALPHA FEEDBACK - We welcome feedback and bug reports during this alpha phase. Please include your machine configuration when reporting issues.
- 💬 Moose Discussions - Get help with Moose
- 🐛 Moose Issues - Report Moose-related bugs
- 📚 Moose Documentation - Comprehensive guides
For issues and questions about this demo:
- Check the troubleshooting section above
- Open an issue on GitHub with your machine configuration
- Join our development discussions for alpha testing feedback
This repository demonstrates Moose's capabilities for building production-ready operational data warehouses with:
- Real-time analytics with ClickHouse
- Event streaming with Redpanda
- Reliable workflows with Temporal
- Interactive dashboards with Streamlit