Project Proposal: Codebase & Docs Q&A Assistant (RAG)

Overview

A locally-runnable RAG (Retrieval-Augmented Generation) chatbot that lets you ask natural language questions about any codebase or documentation folder. You point it at a repo or docs directory, it indexes everything, and you can ask things like:

"How does authentication work in this project?"
"Where is the database connection initialized?"
"What does the processPayment function do and where is it called?"

This mirrors exactly what NPX does with their proposal generation tool — ingest a corpus of documents, embed them into a vector store, and use an LLM to answer questions grounded in that content.

Tech Stack

Layer	Tool	Why
Embeddings + LLM	Ollama (local)	Free, private, matches NPX's stack exactly
Vector Database	Weaviate (Docker)	NPX's actual stack
Ingestion + orchestration	Python (LangChain or manual)	Simple, you already know it
Frontend (optional)	React + Next.js	NPX's stack, makes it demo-able
File parsing	Python (ast, pathlib, tiktoken)	Parse code + docs into chunks

You can swap Ollama for OpenAI API if you want faster/better responses during dev — just use the same interface.

Architecture

[ Codebase / Docs Folder ]
        |
        v
[ Ingestion Pipeline ]  <-- Python script
  - Walk directory tree
  - Parse .py, .ts, .md, .txt, .json files
  - Chunk by file / function / heading
  - Generate embeddings (Ollama: nomic-embed-text)
        |
        v
[ Weaviate Vector Store ]  <-- Docker container
  - Store chunks + metadata (filename, line range, language)
        |
        v
[ Query Pipeline ]  <-- Python / API
  - Take user question
  - Embed the question
  - Retrieve top-k relevant chunks from Weaviate
  - Build prompt: "Given this context: {chunks} — answer: {question}"
  - Send to LLM (Ollama: llama3 or mistral)
        |
        v
[ Response ]  <-- streamed answer with source file references

Project Structure

rag-codebase-assistant/
├── ingestion/
│   ├── walker.py          # Recursively walk and filter files
│   ├── chunker.py         # Split files into meaningful chunks
│   ├── embedder.py        # Generate embeddings via Ollama
│   └── indexer.py         # Push chunks + embeddings into Weaviate
├── retrieval/
│   ├── query.py           # Embed question, query Weaviate, return top-k
│   └── prompt.py          # Build prompt with retrieved context
├── llm/
│   └── ollama_client.py   # Wrapper for Ollama chat completions
├── api/
│   └── main.py            # FastAPI server exposing /ask endpoint
├── frontend/              # Optional React/Next.js chat UI
│   └── ...
├── docker-compose.yml     # Weaviate + optional Ollama container
├── ingest.py              # CLI entrypoint: python ingest.py ./my-repo
├── chat.py                # CLI entrypoint: python chat.py
└── README.md

Implementation Plan

Phase 1 — Ingestion Pipeline

Set up Weaviate locally via Docker (docker-compose up)
Write walker.py — recursively collect files, filter by extension (.py, .ts, .md, .txt, ignore node_modules, .git, build dirs)
Write chunker.py — split files into chunks:
- For code: chunk by function/class using AST parsing (Python) or regex (TS)
- For markdown/docs: chunk by heading sections
- Max chunk size: ~500 tokens with ~50 token overlap
Write embedder.py — call Ollama's embedding endpoint (nomic-embed-text model)
Write indexer.py — create Weaviate schema and upsert chunks with metadata
Wire together in ingest.py CLI

Phase 2 — Query + Answer Pipeline

Write query.py — embed incoming question, query Weaviate for top 5 chunks by cosine similarity

Write prompt.py — build a prompt like:

You are a helpful assistant for a software codebase.
Use only the following context to answer the question.
If the answer isn't in the context, say so.

Context:
{retrieved_chunks}

Question: {user_question}
Answer:

Write ollama_client.py — call Ollama chat endpoint, stream response
Wire together in chat.py CLI with a simple input loop

Phase 3 — API + Frontend (makes it demo-able)

Wrap query pipeline in a FastAPI /ask endpoint
Build a minimal React chat UI (Next.js):
- Text input for question
- Streamed response display
- Source file references shown under each answer
Connect frontend to FastAPI backend

Phase 4 — Polish for Portfolio

Add a README with setup instructions and a demo GIF
Test it against a real open source repo (e.g. your FindIT project)
Add a --repo flag that auto-clones a GitHub URL and ingests it
Deploy Weaviate + API to Azure (matches NPX's cloud stack)

Weaviate Schema

schema = {
    "class": "CodeChunk",
    "properties": [
        {"name": "content", "dataType": ["text"]},       # the actual code/text
        {"name": "filepath", "dataType": ["text"]},      # relative file path
        {"name": "language", "dataType": ["text"]},      # python, typescript, markdown
        {"name": "chunkType", "dataType": ["text"]},     # function, class, section, file
        {"name": "startLine", "dataType": ["int"]},      # line number start
        {"name": "endLine", "dataType": ["int"]},        # line number end
    ],
    "vectorizer": "none"  # we supply our own embeddings
}

Docker Compose

version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
    volumes:
      - weaviate_data:/var/lib/weaviate

  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  weaviate_data:
  ollama_data:

Key Prompts for Codex

Use these when working through implementation:

"Implement walker.py — recursively walk a directory, return all files with extensions in a given allowlist, skip common ignore patterns like node_modules, .git, pycache, dist"
"Implement chunker.py — for Python files use the ast module to split by function and class definitions. For markdown split by ## headings. For other text files split by character count with overlap. Return a list of dicts with keys: content, start_line, end_line, chunk_type"
"Implement embedder.py — call Ollama's POST /api/embeddings endpoint with model nomic-embed-text and return the embedding vector"
"Implement indexer.py — connect to Weaviate at localhost:8080, create the CodeChunk schema if it doesn't exist, batch upsert a list of chunk dicts with their embedding vectors"
"Implement query.py — embed a question string using Ollama, query Weaviate for the top 5 nearest CodeChunk objects by vector similarity, return their content and filepath"
"Implement a FastAPI app in api/main.py with a POST /ask endpoint that accepts a JSON body with a 'question' field and returns a streamed response"

What to Say About This Project

In an interview at NPX:

"I built a RAG pipeline that lets you query any codebase or documentation in natural language. It uses Weaviate for vector storage and Ollama for local LLM inference — which I chose specifically because they're in your stack. The core idea is the same as your proposal generation tool: ingest a document corpus, embed it, and use retrieval to ground the LLM's answers in real content rather than hallucinations."

That's a sentence that will land.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Project Proposal: Codebase & Docs Q&A Assistant (RAG)

Overview

Tech Stack

Architecture

Project Structure

Implementation Plan

Phase 1 — Ingestion Pipeline

Phase 2 — Query + Answer Pipeline

Phase 3 — API + Frontend (makes it demo-able)

Phase 4 — Polish for Portfolio

Weaviate Schema

Docker Compose

Key Prompts for Codex

What to Say About This Project

FilesExpand file tree

rag_codebase_assistant_proposal.md

Latest commit

History

rag_codebase_assistant_proposal.md

File metadata and controls

Project Proposal: Codebase & Docs Q&A Assistant (RAG)

Overview

Tech Stack

Architecture

Project Structure

Implementation Plan

Phase 1 — Ingestion Pipeline

Phase 2 — Query + Answer Pipeline

Phase 3 — API + Frontend (makes it demo-able)

Phase 4 — Polish for Portfolio

Weaviate Schema

Docker Compose

Key Prompts for Codex

What to Say About This Project