Why we built /rag from scratch instead of self-hosting RAGFlow

When we decided to add document Q&A to the platform, the obvious first move was to look at what already exists. RAGFlow is the most popular open-source RAG platform on GitHub — 80,000+ stars, full document parsing, OCR, an agent framework, a built-in admin UI. It looked like exactly what we needed. It wasn't.

RAGFlow is a product, not a library

RAGFlow ships as a Docker Compose stack: its own MySQL, its own Elasticsearch (or Infinity) for vector search, Redis, MinIO for object storage, and a Python backend tying it together. The documented minimum is 16GB RAM and 50GB disk. That's not a library you import into an existing app; it's a separate, always-on service you'd need to deploy, monitor, and pay for, completely independent of the Vercel + Supabase stack everything else on this platform already runs on.

For one namespace among 34, adopting an entirely separate piece of infrastructure with its own database and its own ops burden would have been a bad trade — the maintenance cost of running RAGFlow would have dwarfed the value of the feature it provides.

RAG is a technique, not a platform

The actual mechanics of retrieval-augmented generation are: split a document into chunks, turn each chunk into a vector embedding, store the vectors, find the closest ones to a query, and optionally feed the matches to an LLM for a generated answer. Every one of those steps is an ordinary function call or API call. None of it requires a dedicated platform.

Chunking — pure code. We split on paragraph and sentence boundaries with a fixed overlap, no dependency.
Embeddings — one API call to OpenAI's text-embedding-3-small.
Vector storage and search — Supabase already runs Postgres, and Postgres has pgvector. No second database.
Generation — one API call to an LLM with the matched chunks in the prompt.
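
As a rough illustration of the chunking step, here is a minimal TypeScript sketch of sentence-aware chunking with overlap. It is not the code behind /rag: chunkText, splitSentences, maxChars, and overlapSentences are hypothetical names, the sentence splitter is deliberately naive, and measuring the overlap in sentences is an assumption, since the post only says the overlap is fixed.

```typescript
// Hypothetical options; the real chunk size and overlap are not stated in the post.
interface ChunkOptions {
  maxChars: number;         // soft upper bound on chunk length, in characters
  overlapSentences: number; // trailing sentences repeated at the start of the next chunk
}

// Naive splitter: blank lines end a paragraph, and ., ! or ? ends a sentence.
// It will mis-split abbreviations and decimals such as "3.14".
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  for (const paragraph of text.split(/\n\s*\n/)) {
    const parts = paragraph.match(/[^.!?]+[.!?]*/g) ?? [];
    for (const part of parts) {
      const trimmed = part.trim();
      if (trimmed.length > 0) sentences.push(trimmed);
    }
  }
  return sentences;
}

// Greedily pack sentences into chunks. When the next sentence would push the
// chunk past maxChars, emit the chunk and seed the next one with the overlap.
function chunkText(text: string, opts: ChunkOptions): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let fresh = 0; // sentences in `current` that no emitted chunk contains yet

  for (const sentence of splitSentences(text)) {
    const candidate = [...current, sentence].join(" ");
    if (fresh > 0 && candidate.length > opts.maxChars) {
      chunks.push(current.join(" "));
      // slice(-0) would return the whole array, so handle zero overlap explicitly.
      current = opts.overlapSentences > 0 ? current.slice(-opts.overlapSentences) : [];
      fresh = 0;
    }
    current.push(sentence);
    fresh++;
  }
  // Emit the tail only if it holds something beyond the carried-over overlap.
  if (fresh > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Because maxChars is a soft bound, a chunk can exceed it when the carried-over overlap plus a single new sentence is already too long. That keeps every sentence intact at the cost of slightly uneven chunk sizes.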

What we actually built

Four groups of Next.js route handlers: POST /rag/documents (upload, chunk, embed, store); GET /rag/documents and GET/DELETE /rag/documents/{id} for management; POST /rag/query for pure semantic search; and POST /rag/chat for retrieval plus a generated, cited answer. A migration adds two tables, lapi_rag_documents and lapi_rag_chunks, the latter with an HNSW index on a vector(1536) column (1536 is the default output dimension of text-embedding-3-small), and one Postgres function for cosine-similarity search scoped to a user and, optionally, a single document.
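
To make the search and chat endpoints more concrete, here is a minimal in-memory TypeScript sketch of what a scoped cosine-similarity search computes and how the matches might be turned into a prompt that asks for citations. It is an illustration only: in the real system the ranking runs inside Postgres with pgvector, not as a linear scan in application code, and StoredChunk, searchChunks, buildCitedPrompt, and the prompt wording are all hypothetical.

```typescript
// Hypothetical in-memory shape; the real lapi_rag_chunks columns are not shown in the post.
interface StoredChunk {
  id: string;
  userId: string;
  documentId: string;
  content: string;
  embedding: number[];
}

interface Match {
  chunk: StoredChunk;
  similarity: number;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Keep only the caller's chunks (optionally one document), then rank by similarity.
function searchChunks(
  chunks: StoredChunk[],
  queryEmbedding: number[],
  userId: string,
  topK: number,
  documentId?: string,
): Match[] {
  return chunks
    .filter((c) => c.userId === userId && (documentId === undefined || c.documentId === documentId))
    .map((c) => ({ chunk: c, similarity: cosineSimilarity(c.embedding, queryEmbedding) }))
    .sort((x, y) => y.similarity - x.similarity) // highest similarity first
    .slice(0, topK);
}

// Number the matched chunks so the model can cite them as [1], [2], ...
function buildCitedPrompt(question: string, matches: Match[]): string {
  const sources = matches
    .map((m, i) => `[${i + 1}] (document ${m.chunk.documentId}) ${m.chunk.content}`)
    .join("\n");
  return [
    "Answer the question using only the numbered sources below.",
    "Cite sources inline as [n].",
    "",
    "Sources:",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Filtering by user before ranking is the important part: it mirrors the scoping the Postgres function applies, so one user's query can never surface another user's chunks no matter how similar the vectors are.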

Same Vercel deployment, same Supabase project, same auth and billing middleware every other namespace already uses. No new infrastructure to operate.

What we gave up by not using RAGFlow

This is a real trade-off, not a free lunch. RAGFlow has document parsing for PDFs, DOCX, and scanned images with OCR; we launched text-only. It has hybrid search (dense vector + sparse keyword) and a reranking step; we have plain vector similarity. It has a visual agent-building interface; we have an API. For our v1 — upload text, ask questions, get cited answers, gated to paid plans — none of that complexity was load-bearing yet. PDF support is the planned next step, most likely via a dedicated parsing service called from the existing upload endpoint, not by adopting RAGFlow's full stack.