ANIS (Autonomous Neural Intelligence Supervisor) is an enterprise-grade Personal AI Factory Controller designed to orchestrate complex data ingestion, transformation, analysis, and reporting workflows through a single, intent-driven interface. The system combines Custom GPT Actions, n8n workflow orchestration, Google Workspace automation, and a serverless OCR microservice to deliver a fully automated, auditable, scalable, and production-ready AI data pipeline.
ANIS is intentionally engineered as a control plane, not a monolithic processor. It delegates execution to specialized agents (Ingest, Clean, Analyze, Report) while enforcing strict contracts, schemas, logging, and observability across the entire data lifecycle.
- Project Overview
- System Philosophy & Design Principles
- Objectives & Goals
- Acceptance Criteria
- Prerequisites
- Installation & Setup
- API Documentation
- Custom GPT Configuration
- UI / Frontend Architecture
- Status Codes
- Features
- Tech Stack & Architecture
- Workflow & Implementation
- Agent Responsibilities
- Data Lake Design
- Testing & Validation
- Validation Summary
- Verification Tools
- Troubleshooting
- Security & Secrets
- Deployment
- Quick-Start Cheat Sheet
- Usage Notes
- Performance & Optimization
- Enhancements
- Maintenance & Future Work
- Key Achievements
- High-Level Architecture
- Folder Structure
- How to Demonstrate Live
- Summary, Closure & Compliance
ANIS provides a unified command interface that allows users (human or system) to trigger complex automation pipelines using a single structured JSON command. The platform abstracts away workflow complexity while preserving transparency, traceability, and governance.
Core capabilities include:
- Automated Gmail attachment ingestion
- RAW → CLEAN → GOLD data lake transitions
- OCR-based PDF text extraction
- Structured normalization of CSV, XLS, JSON, TXT
- AI-driven analysis and KPI generation
- Daily scheduled execution via cron
- Single Responsibility Agents: each agent performs exactly one domain function
- Contract-First Design: all interactions are validated via schemas
- Auditability by Default: every action is logged
- Stateless Execution: workflows remain restart-safe
- Enterprise Observability: logs, metrics, and artifacts are persisted
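
As an illustration of the auditability principle, an event-log entry could be assembled as shown below. This is a minimal sketch, not the project's actual log layout: the column names, the `event_row` helper, and the status strings are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import List, Optional

# Assumed column order for an event-log sheet (hypothetical, not taken from the repository).
EVENT_LOG_COLUMNS = ["timestamp", "agent", "action", "status", "detail"]


def event_row(agent: str, action: str, status: str, detail: str = "",
              now: Optional[datetime] = None) -> List[str]:
    """Build one log row with a UTC timestamp, matching EVENT_LOG_COLUMNS."""
    moment = now or datetime.now(timezone.utc)
    return [moment.isoformat(timespec="seconds"), agent, action, status, detail]


row = event_row("ingest", "upload_raw_file", "success", "invoice.pdf")
print(dict(zip(EVENT_LOG_COLUMNS, row)))
```

Appending the row to Google Sheets would be handled by the workflow itself; the sketch only shows the shape of a record that carries a timestamp, the agent identity, and a status.
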
- Establish a centralized AI automation control plane driven by structured intent
- Enable deterministic, schema-driven execution across ingestion, cleaning, analysis, and reporting
- Decouple AI reasoning (GPT) from execution logic (n8n workflows)
- Provide audit-ready data pipelines with full traceability
- Support both interactive (on-demand) and scheduled automation
| Area | Acceptance Requirement |
|---|---|
| API | All requests validated via OpenAPI and JSON schemas |
| Agents | Each agent executes independently with clear responsibility |
| Data | RAW → CLEAN → GOLD data lifecycle enforced |
| Logging | Every execution logged with timestamp and status |
| Security | No secrets committed to repository |
| Scheduling | Cron workflows execute without manual intervention |
- Node.js ≥ 18
- n8n ≥ 1.x (self-hosted or cloud)
- Google Workspace (Gmail, Drive, Sheets)
- OpenAI API access
- Vercel account for OCR microservice
- Clone the repository
- Create environment variables from `.env.example`
- Install serverless dependencies
- Import n8n workflows (agents, interactive, scheduled)
- Configure Google OAuth credentials
- Deploy OCR service on Vercel
Endpoint:
POST /webhook/anis
Core Request Fields:
| Field | Description |
|---|---|
| agent | Target agent: ingest, clean, analyze, or report |
| source | Optional data source parameters |
| options | Execution controls |
| return | Expected response format |
All requests and responses are validated against versioned schemas to ensure backward compatibility and contract safety.
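
The contract check described above can be approximated in a few lines. The sketch below is not the project's schema: the field names come from the table, but the type rules, the assumption that only `agent` is required, and the `validate_command` helper are illustrative.

```python
from typing import Any, Dict, List

ALLOWED_AGENTS = {"ingest", "clean", "analyze", "report"}
# Field names follow the request table; the expected types are assumptions.
FIELD_TYPES: Dict[str, type] = {"agent": str, "source": dict, "options": dict, "return": str}


def validate_command(payload: Any) -> List[str]:
    """Return a list of validation errors; an empty list means the payload passes."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors: List[str] = []
    for name in payload:
        if name not in FIELD_TYPES:
            errors.append(f"unknown field: {name}")
    if "agent" not in payload:
        errors.append("missing required field: agent")
    for name, expected in FIELD_TYPES.items():
        if name in payload and not isinstance(payload[name], expected):
            errors.append(f"{name} must be of type {expected.__name__}")
    agent = payload.get("agent")
    if isinstance(agent, str) and agent not in ALLOWED_AGENTS:
        errors.append(f"unsupported agent: {agent}")
    return errors


print(validate_command({"agent": "clean", "return": "log"}))
print(validate_command({"agent": "purge"}))
```

Rejecting unknown fields keeps the contract closed, so a malformed or unexpected command fails before any workflow runs.
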
| Component | Purpose |
|---|---|
| action-schema.yaml | Defines allowed commands and payload structure |
| instructions.md | Constrains GPT behavior and output format |
| description.md | System-level role definition |
| conversation-starters.md | Guided user interaction examples |
GPT operates strictly as an intent interpreter. It does not execute logic directly and cannot bypass schemas or workflows.
This project intentionally avoids a traditional UI layer. Instead, it uses:
- Custom GPT as the conversational interface
- n8n as the visual execution canvas
- Google Sheets as operational dashboards
State Flow:
User Intent → GPT → Webhook → Workflow State → Logs / Files
Styling, visualization, and reporting are delegated to Google Workspace and GPT responses.
| Code | Meaning |
|---|---|
| 200 | Success |
| 400 | Invalid payload |
| 401 | Unauthorized |
| 500 | Execution failure |
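
One way to keep these codes consistent is to derive them from the kind of failure in a single place. The following is a hedged sketch: the exception classes and the `status_for` helper are hypothetical names, and only the four codes in the table are taken from the document.

```python
from http import HTTPStatus
from typing import Optional


class InvalidPayload(Exception):
    """Raised when a request violates the schema (hypothetical class)."""


class Unauthorized(Exception):
    """Raised when the caller is not allowed to use the webhook (hypothetical class)."""


def status_for(error: Optional[Exception]) -> int:
    """Map an execution outcome to one of the four documented status codes."""
    if error is None:
        return int(HTTPStatus.OK)
    if isinstance(error, InvalidPayload):
        return int(HTTPStatus.BAD_REQUEST)
    if isinstance(error, Unauthorized):
        return int(HTTPStatus.UNAUTHORIZED)
    # Anything else is treated as an execution failure.
    return int(HTTPStatus.INTERNAL_SERVER_ERROR)


print(status_for(None), status_for(InvalidPayload("agent missing")))
```
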
ANIS (Autonomous Neural Intelligence Supervisor) is a production-grade AI Factory Control Plane that unifies LLM intent, workflow orchestration, and enterprise data engineering into a single deterministic, auditable, and scalable platform. Unlike typical AI automations, ANIS enforces strict governance, contract-first execution, and end-to-end data lineage.
| Domain | Capability | Enterprise-Grade Implementation |
|---|---|---|
| AI Governance | Schema-Locked GPT Control | GPT is sandboxed by OpenAPI + JSON Schema. It cannot generate arbitrary commands or bypass workflows. |
| Orchestration | Agent-Based Execution Fabric | Each business function is isolated into independently deployable Ingest, Clean, Analyze, and Report agents. |
| Data Engineering | RAW → CLEAN → GOLD Data Lake | Immutable RAW inputs, reproducible CLEAN data, and versioned GOLD analytics. |
| Observability | Enterprise Event Ledger | Every API call, transformation, KPI, and file write is logged into Google Sheets with timestamps. |
| Unstructured Data | OCR & Document Intelligence | Serverless OCR extracts text from PDFs and images and feeds it into the CLEAN pipeline. |
| Automation | Cron-Driven Autonomy | Fully automated daily execution via scheduled workflows. |
```text
User / System
      ↓
Custom GPT (Intent → Structured JSON)
      ↓
OpenAPI Schema Validation
      ↓
n8n Control Plane
      ↓
Ingest → Clean → Analyze → Report
      ↓
Enterprise Data Lake + KPI Ledger
```
| Layer | Technology | Role |
|---|---|---|
| AI Interface | Custom GPT + OpenAPI | Intent parsing, schema-validated command generation |
| Orchestration | n8n | Workflow execution engine and control plane |
| Data Lake | Google Drive | RAW / CLEAN / GOLD storage |
| Metadata & Logs | Google Sheets | Catalogs, KPIs, audit trails |
| OCR | Vercel Serverless | PDF & image text extraction |
| Contracts | JSON Schema + YAML | Validation and deterministic execution |
```text
┌──────────────┐
│  User / API  │
└──────┬───────┘
       │
┌──────────────────────┐
│      Custom GPT      │
│ (Intent Interpreter) │
└──────┬───────────────┘
       │
┌──────────────────────┐
│ OpenAPI + JSON Schema│
│   (Contract Layer)   │
└──────┬───────────────┘
       │
┌──────────────────────┐
│  n8n Orchestration   │
│  (Execution Fabric)  │
└──────┬───────────────┘
       │
┌───────────────────────────────┐
│ Ingest | Clean | Analyze |    │
│ Report (Stateless Agents)     │
└──────┬────────────────────────┘
       │
┌───────────────────────────────┐
│ Google Drive (RAW/CLEAN/GOLD) │
│ Google Sheets (Logs/KPIs)     │
└───────────────────────────────┘
```
User Prompt → GPT → Intent → JSON Command → OpenAPI Schema Validation → ANIS Webhook → n8n Control Plane → Agent Pipelines → Data Lake + KPI Ledger
```text
┌───────────┐
│  Ingest   │  ← Gmail, APIs, Drive
└─────┬─────┘
      │
┌───────────┐
│   Clean   │  ← Normalize, OCR, validate
└─────┬─────┘
      │
┌───────────┐
│  Analyze  │  ← KPIs, metrics, insights
└─────┬─────┘
      │
┌───────────┐
│  Report   │  ← Summaries, links
└───────────┘
```
- Stateless workflows allow safe retries.
- Schema validation prevents malformed executions.
- All data transformations are reproducible.
- Failures are isolated to individual agents.
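
The retry and isolation properties listed above can be pictured with a small runner. This is only a sketch under stated assumptions: `run_agent` stands in for whatever actually triggers an n8n agent workflow, the retry count is arbitrary, and stopping downstream agents after a persistent failure is a design choice made for the example.

```python
from typing import Callable, Dict, List

PIPELINE = ["ingest", "clean", "analyze", "report"]


def run_pipeline(run_agent: Callable[[str], dict], max_attempts: int = 3) -> List[Dict]:
    """Run agents in order, retrying each one; stop at the first agent that keeps failing."""
    results: List[Dict] = []
    for agent in PIPELINE:
        last_error = ""
        for attempt in range(1, max_attempts + 1):
            try:
                output = run_agent(agent)  # stateless call, so repeating it is safe
                results.append({"agent": agent, "status": "ok",
                                "attempts": attempt, "output": output})
                break
            except Exception as exc:
                last_error = str(exc)
        else:
            # All attempts failed: record the failure for this agent only.
            results.append({"agent": agent, "status": "failed",
                            "attempts": max_attempts, "error": last_error})
            break  # later agents depend on this agent's output
    return results


def demo_agent(agent: str) -> dict:
    """Stand-in for a real agent invocation (hypothetical)."""
    return {"agent": agent, "files": 0}


for entry in run_pipeline(demo_agent):
    print(entry["agent"], entry["status"], entry["attempts"])
```

Because each call carries no state from a previous attempt, a retry repeats the same work rather than resuming a half-finished step.
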
| Agent | Primary Responsibility | Key Outputs |
|---|---|---|
| Ingest Agent | Acquire raw data from external sources (Gmail, Drive, APIs) | RAW files, metadata entries |
| Clean Agent | Normalize, validate, and convert raw data into structured formats | CLEAN datasets (CSV / JSON) |
| Analyze Agent | Compute KPIs, metrics, and analytical insights | GOLD datasets, KPI tables |
| Report Agent | Generate summaries, reports, and shareable outputs | Reports, Drive links |
Each agent is independently deployable, restart-safe, and stateless, ensuring fault isolation and operational resilience.
ANIS enforces a strict, enterprise-grade data lake lifecycle to guarantee traceability, reproducibility, and governance.
| Zone | Description | Mutability |
|---|---|---|
| RAW | Original ingested data (unchanged, immutable) | Read-only |
| CLEAN | Normalized, schema-aligned datasets | Rebuildable |
| GOLD | Analytics-ready, business-consumable outputs | Versioned |
RAW → CLEAN → GOLD
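
The one-way lifecycle can be expressed as path derivation: a file is never edited in place, and a new artifact is written to the next zone instead. The sketch below assumes a hypothetical `ZONE/<date>/<file>` folder layout and a `.vN` naming scheme for GOLD versions; neither is confirmed by the repository.

```python
from pathlib import PurePosixPath

ZONES = ("RAW", "CLEAN", "GOLD")


def next_zone(zone: str) -> str:
    """Return the zone that follows `zone`; the lifecycle only moves forward."""
    index = ZONES.index(zone)  # raises ValueError for an unknown zone
    if index == len(ZONES) - 1:
        raise ValueError("GOLD is the final zone")
    return ZONES[index + 1]


def promote(path: str, new_suffix: str) -> str:
    """Derive the target path in the next zone; the source file itself is left untouched."""
    source = PurePosixPath(path)
    target = PurePosixPath(next_zone(source.parts[0]), *source.parts[1:])
    return str(target.with_suffix(new_suffix))


def versioned(path: str, version: int) -> str:
    """Insert a version marker before the extension (assumed naming for GOLD outputs)."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.v{version}{p.suffix}"))


print(promote("RAW/2024-01-05/invoice.pdf", ".txt"))
print(versioned("GOLD/2024-01-05/kpis.csv", 3))
```
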
| ID | Area | Command | Expected Output | Explanation |
|---|---|---|---|---|
| T01 | Ingest | POST /webhook/anis | RAW files created | Gmail ingestion |
| T02 | Clean | Agent clean | CLEAN files created | Normalization |
- All inbound requests validated via OpenAPI schemas
- All transformations validated against structural schemas
- All outputs verified before persistence
- No silent failures or implicit transformations
Validation is enforced at every boundary to ensure deterministic behavior across environments.
| Tool | Purpose |
|---|---|
| Postman | Manual API verification |
| n8n UI | Workflow execution tracing |
| Google Sheets | Log and KPI verification |
| Drive Audit Logs | Artifact validation |
| Issue | Likely Cause | Resolution |
|---|---|---|
| Webhook returns 400 | Schema violation | Validate request payload |
| No files generated | OAuth permission issue | Reauthorize Google credentials |
| Scheduled job not running | Cron workflow disabled | Enable workflow in n8n |
- Secrets stored in `.env`
- OAuth credentials isolated
- Webhook endpoints protected
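
A fail-fast check at startup helps ensure that missing secrets surface immediately rather than midway through a run. The sketch below assumes the values from `.env` have already been exported into the process environment; the variable names and the `load_secrets` helper are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; the real ones are whatever .env.example defines.
REQUIRED = ("ANIS_WEBHOOK_URL", "OPENAI_API_KEY")


def load_secrets(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read required secrets from the environment and fail fast if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED if not source.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED}
```

Passing a mapping explicitly, as in `load_secrets({"ANIS_WEBHOOK_URL": "...", "OPENAI_API_KEY": "..."})`, makes the check easy to exercise without touching the real environment.
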
- Serverless OCR deployment
- Environment isolation
- Stateless execution model
```bash
git clone repo
cp .env.example .env
npm install
n8n start
```
- Designed for non-technical operators
- All execution controlled via structured intent
- No manual data manipulation required
- Safe for repeated execution
- Parallel agent execution where applicable
- Stateless workflows reduce memory overhead
- Incremental processing minimizes rework
- Serverless OCR scales automatically
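
Incremental processing can be achieved in several ways; one common approach, assumed here purely for illustration, is to fingerprint file contents and skip anything already recorded. The `fingerprint` and `select_new` helpers are hypothetical, and the set of processed hashes stands in for whatever catalog the workflows maintain.

```python
import hashlib
from typing import Dict, List, Set


def fingerprint(content: bytes) -> str:
    """Content hash used to recognise a file that has already been processed."""
    return hashlib.sha256(content).hexdigest()


def select_new(files: Dict[str, bytes], processed: Set[str]) -> List[str]:
    """Return the names of files whose content is not yet recorded, and record them."""
    new_files: List[str] = []
    for name in sorted(files):
        digest = fingerprint(files[name])
        if digest not in processed:
            processed.add(digest)
            new_files.append(name)
    return new_files


seen: Set[str] = set()
batch = {"a.csv": b"id,total\n1,10\n", "b.csv": b"id,total\n2,20\n"}
print(select_new(batch, seen))  # first run: both files are new
print(select_new(batch, seen))  # second run: nothing left to do
```
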
- Multi-tenant support
- Role-based access control
- Advanced KPI dashboards
- Pluggable data sources
- Schema versioning strategy
- Automated regression validation
- Agent marketplace expansion
- Enterprise monitoring integration
- Production-grade AI control plane
- Full auditability and governance
- Zero hardcoded logic
- Enterprise-ready automation framework
```text
┌──────────────────┐
│  Human / System  │
└───────┬──────────┘
        │
┌────────────────────────┐
│   Custom GPT Control   │
│    (Intent → JSON)     │
└───────┬────────────────┘
        │
┌────────────────────────┐
│ OpenAPI + JSON Schema  │
│ (Contract Enforcement) │
└───────┬────────────────┘
        │
┌────────────────────────┐
│   n8n Control Plane    │
│  (Workflow Execution)  │
└───────┬────────────────┘
        │
┌────────────────────────────────┐
│ Ingest → Clean → Analyze →     │
│ Report (Stateless AI Agents)   │
└───────┬────────────────────────┘
        │
┌────────────────────────────────┐
│ Google Drive (RAW/CLEAN/GOLD)  │
│ Google Sheets (Logs & KPIs)    │
└────────────────────────────────┘
```
This architecture guarantees governed, deterministic, and auditable AI execution, making ANIS suitable for enterprise analytics, compliance-driven workflows, and production-grade AI operations.
```text
ANIS-PERSONAL-AI-FACTORY-CONTROLLER/
│
├── diagrams/
│   ├── high-level-architecture.png
│   ├── gpt-execution-flow.png
│   ├── scheduled-execution-flow.png
│   └── data-lake-layout.png
│
├── gpt/
│   ├── action-schema.yaml
│   ├── instructions.md
│   ├── description.md
│   ├── conversation-starters.md
│   └── name.md
│
├── schemas/
│   ├── webhook-request.schema.json
│   ├── webhook-response.schema.json
│   └── control-sheet.schema.md
│
├── screenshots/
│   ├── google-drive/
│   ├── google-sheets/
│   │   ├── data-catalog/
│   │   ├── event-log/
│   │   └── tasks-inbox/
│   ├── gpt-controller/
│   └── workflows/
│       ├── interactive/
│       └── scheduled/
│
├── serverless/
│   └── ocr-pdf-text-extraction-service/
│
├── workflows/
│   ├── agents/
│   │   ├── ingest_agent.json
│   │   ├── clean_agent.json
│   │   ├── analyze_agent.json
│   │   └── report_agent.json
│   │
│   ├── interactive/
│   │   └── ANIS_HUB_gpt_webhook.json
│   │
│   └── scheduled/
│       ├── ANIS_DAILY_CRON.json
│       ├── analyze_agent_sub_workflow.json
│       └── report_agent_sub_workflow.json
│
├── .env.example
├── .gitignore
└── README.md
```
This section provides a fully explicit, end-to-end live demonstration guide for the ANIS Personal AI Factory Controller. It is intentionally verbose and operationally precise to enable live demos, technical interviews, architecture walkthroughs, and stakeholder reviews without ambiguity.
- Primary (Recommended): Custom GPT → OpenAPI Action → n8n Webhook
- Secondary: Direct API/Webhook invocation (Postman / curl)
- Automated: Scheduled execution via cron workflows
Example GPT Prompt:
Ingest Gmail attachments from the last 7 days, clean and normalize the data, analyze the results, and generate a report.
Internal Execution Flow:
- Custom GPT interprets user intent
- GPT generates a single JSON command from that intent
- The command is validated against the OpenAPI action schema
- Command is dispatched to the ANIS webhook
- n8n orchestrates agent-based workflows
- Outputs are written to Google Drive and Google Sheets
- Structured results are returned to GPT
```http
POST /webhook/anis
Content-Type: application/json

{
  "agent": "ingest",
  "source": {
    "gmailQuery": "has:attachment",
    "days": 7
  },
  "options": {
    "attachmentsOnly": true,
    "fileTypes": ["pdf", "csv", "xlsx"]
  },
  "return": "summary"
}
```
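
For a scripted demo, the same request can be built with the Python standard library. This is a sketch under assumptions: the base URL is a placeholder, the `build_request` helper is hypothetical, and any authentication the webhook requires is not shown because the document does not specify it.

```python
import json
import urllib.request


def build_request(base_url: str, command: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ANIS webhook."""
    body = json.dumps(command).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/webhook/anis",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


command = {
    "agent": "ingest",
    "source": {"gmailQuery": "has:attachment", "days": 7},
    "options": {"attachmentsOnly": True, "fileTypes": ["pdf", "csv", "xlsx"]},
    "return": "summary",
}

# Placeholder host; replace it with the address of your own n8n instance.
request = build_request("https://n8n.example.invalid", command)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status, response.read().decode("utf-8"))
```
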
Expected Results:
- Attachments fetched from Gmail
- Files uploaded to Google Drive (RAW zone)
- Metadata recorded in DATA_CATALOG
- Execution logged in EVENT_LOG
```json
{
  "agent": "clean",
  "return": "log"
}
```
Expected Results:
- RAW files normalized and converted
- CLEAN datasets generated (CSV / JSON / TXT)
- Schema-aligned data structures enforced
- Transformation events logged
```json
{
  "agent": "analyze",
  "return": "kpis"
}
```
Expected Results:
- CLEAN datasets analyzed
- KPIs computed and validated
- GOLD datasets produced
- Analysis outputs appended to DATA_CATALOG
```json
{
  "agent": "report",
  "return": "files"
}
```
Expected Results:
- Final reports generated
- Summaries and KPIs consolidated
- Reports uploaded to Google Drive
- Shareable links returned in response
Enable the ANIS_DAILY_CRON workflow in n8n to demonstrate:
- Autonomous ingestion
- Automatic cleaning and normalization
- Scheduled analysis and reporting
- Zero manual intervention
| Component | Location |
|---|---|
| RAW Files | Google Drive → RAW |
| CLEAN Data | Google Drive → CLEAN |
| GOLD Outputs | Google Drive → GOLD |
| Event Logs | Google Sheets → EVENT_LOG |
| KPIs | Google Sheets → DATA_CATALOG |
| Reports | Google Drive → REPORT |
ANIS represents a mature, enterprise-grade AI automation control plane designed with explicit emphasis on governance, determinism, auditability, and production readiness.
- Agent-based workflows enforce strict separation of concerns
- Each agent operates with a single, clearly defined responsibility
- Schema-driven execution eliminates ambiguity and non-determinism
- Stateless orchestration enables safe retries and fault tolerance
- All inputs validated against OpenAPI and JSON schemas
- Controlled data normalization and conversion pipelines
- Predictable outputs across environments
- No implicit or hidden execution paths
- Every action logged with timestamps and agent identity
- RAW → CLEAN → GOLD data lineage enforced
- Event logs provide full execution history
- Outputs are reproducible and reviewable
- No credentials committed to source control
- Environment-based secret injection
- OAuth scopes isolated per service
- Webhook contracts enforced via schemas
- Supports both interactive and scheduled execution
- Designed for non-technical operators
- Failure isolation at agent level
- Production-safe by default
ANIS is not a prototype or experimental build. It is a well-engineered, enterprise-ready automation system that demonstrates how AI-driven intent, workflow orchestration, and governed data pipelines can be unified into a single, compliant, extensible platform.
The project stands as a reference implementation for modern, schema-driven, agent-based automation systems suitable for real-world production environments.