
RAG Ingestion

Python pipeline for ingesting documents into pgvector.

About this template

A Python script that chunks documents (PDF, TXT, MD), generates embeddings via the OpenAI API, and stores the vectors in Supabase pgvector. It reads its configuration from a .env file and accepts either a directory or a single file as input. Hash-based deduplication and idempotent upserts mean documents can be reprocessed without creating duplicates. The stored vectors can be used in RAG pipelines built with frameworks such as LangChain, LlamaIndex, or n8n AI nodes.
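To make the deduplication idea concrete, here is a minimal sketch, not the template's actual code. It assumes fixed-size character windows for chunking and a SHA-256 digest of each chunk as the deduplication key. The function names (chunk_text, content_hash, upsert_chunks) are hypothetical, and a plain dict stands in for the embeddings table so the example runs without OpenAI or Supabase.

```python
import hashlib


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap (assumed strategy)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def content_hash(chunk: str) -> str:
    """Stable SHA-256 hex digest used as the deduplication key (assumed choice)."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def upsert_chunks(store: dict[str, str], chunks: list[str]) -> int:
    """Write chunks keyed by their hash; return how many keys were new.

    The dict is a stand-in for a table with a unique constraint on the hash.
    """
    added = 0
    for chunk in chunks:
        key = content_hash(chunk)
        if key not in store:
            added += 1
        store[key] = chunk  # same key overwrites, so reruns add no rows
    return added


if __name__ == "__main__":
    table: dict[str, str] = {}
    parts = chunk_text(" ".join(f"word{i}" for i in range(100)))
    print(upsert_chunks(table, parts))  # new rows on the first run
    print(upsert_chunks(table, parts))  # no new rows on the second run
```

Because the key is derived from the chunk's content, running the same document through the pipeline a second time writes to the same keys instead of adding rows, which is what makes the upsert idempotent.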

What you get

ingest.py — main ingestion script
requirements.txt with pinned dependencies
SQL schema for embeddings table
README in PT-BR and EN with usage examples
ARCHITECTURE.md describing the pipeline
.env.example with all variables
Commercial LICENSE

Prerequisites

Python 3.10+ with pip
Supabase project with pgvector extension enabled
OpenAI API key for embedding generation
Dependencies listed in requirements.txt (included)

Built on

Python, OpenAI, Supabase, pgvector

Don't want to set it up yourself?

I offer done-for-you implementation on your own infrastructure, starting at about $300 USD. The template price ($19.99) is credited in full toward any tier.
