Self-hosted Instagram archive
Zanzo watches the reels you save or DM and turns each one into structured, searchable data — the recipe with its quantities, the event with its date, the place with its name. Not bookmarks. An archive that answers questions.
FastAPI · Postgres + pgvector · Gemini · Whisper / Deepgram · Next.js
how it works
A poller watches a dedicated bot account's saved collection and DMs. Forward any reel from your own account — no URL pasting, no browser extension, no app switching.
Each reel is transcribed (multilingual — English, Hindi, Telugu and more), classified into a category, and run through a schema-specific extractor. Silent reels get read visually, frame by frame.
Hybrid semantic search means “that tokyo ramen place” finds the reel even if nobody ever said ramen. Events export to your calendar as .ics in one click.
the pipeline
01 · fetch · media pulled & stored in S3/MinIO
02 · transcribe · Deepgram nova-2, Whisper fallback
03 · classify · six categories, multimodal
04 · extract · per-category JSON schemas
05 · embed · pgvector, 1536 dims
idempotent stages · resumable · non-destructive on quota errors
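As a rough illustration of what "idempotent, resumable, non-destructive on quota errors" can mean in practice, here is a minimal sketch. The names `Item`, `QuotaError` and `run_pipeline` are hypothetical and not taken from the Zanzo codebase; the real pipeline presumably persists stage results in Postgres rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Callable


class QuotaError(Exception):
    """Hypothetical error for an exhausted upstream API quota."""


@dataclass
class Item:
    reel_id: str
    # stage name -> stored output; stands in for persisted stage results
    done: dict[str, object] = field(default_factory=dict)


Stage = tuple[str, Callable[[Item], object]]


def run_pipeline(item: Item, stages: list[Stage]) -> bool:
    """Run stages in order. Returns True once every stage has a result."""
    for name, fn in stages:
        if name in item.done:
            # Idempotent: a stage that already finished is never re-run.
            continue
        try:
            item.done[name] = fn(item)
        except QuotaError:
            # Non-destructive: earlier results stay untouched, and the
            # next call resumes from this stage.
            return False
    return True
```

Calling `run_pipeline` again on the same item after a quota failure picks up at the stage that failed, without repeating the fetch or the transcription.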
what comes out
Every category has its own extraction schema. An event reel yields a start time, a venue and a ticket link — so “add to calendar” is one click, not a re-watch.
Events get starts_at, venue, ticket_url. Recipes get ingredients with quantities and ordered steps. Tech reels get the actual commands shown on screen.
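To make the event schema and the calendar export concrete, here is a hedged sketch. The fields `starts_at`, `venue` and `ticket_url` come from the description above; the `title` field, the two-hour default duration, the `PRODID` value and the function names are assumptions for illustration. Long lines are not folded at 75 octets as RFC 5545 recommends.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class EventItem:
    title: str            # assumed field, not listed in the schema above
    starts_at: datetime   # expected to be timezone-aware
    venue: str
    ticket_url: str


def escape(text: str) -> str:
    """Escape a TEXT value per RFC 5545 (backslash, ';', ',', newline)."""
    return (text.replace("\\", "\\\\").replace(";", "\\;")
                .replace(",", "\\,").replace("\n", "\\n"))


def to_ics(event: EventItem, uid: str,
           duration: timedelta = timedelta(hours=2)) -> str:
    """Render one event as a minimal iCalendar document in UTC."""
    fmt = "%Y%m%dT%H%M%SZ"
    start = event.starts_at.astimezone(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//reel-archive//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{datetime.now(timezone.utc).strftime(fmt)}",
        f"DTSTART:{start.strftime(fmt)}",
        f"DTEND:{(start + duration).strftime(fmt)}",
        f"SUMMARY:{escape(event.title)}",
        f"LOCATION:{escape(event.venue)}",
        f"URL:{event.ticket_url}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are CRLF-terminated.
    return "\r\n".join(lines) + "\r\n"
```

Because the extractor already produced a structured start time and venue, the export is plain string formatting with no second pass over the video.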
Vector similarity over pgvector fused with text matching. Search what you remember, not what was said.
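The text above does not say how the two result lists are fused. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than a description of Zanzo's actual scoring. The function name and the constant `k = 60` are illustrative; the inputs are item IDs already ranked by the vector query and by the text query.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; items ranked high in any list score well."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = ["a", "b", "c"]   # e.g. nearest neighbours from pgvector
text_hits = ["b", "d"]          # e.g. keyword matches on transcript text
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

An item that appears in both lists, like `"b"` here, rises above items that appear in only one, which is the behaviour a "search what you remember" query relies on.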
Deepgram nova-2 with per-reel language detection — English, Hindi, Telugu and more. Falls back to local Whisper at zero cost.
Reels with no useful audio — or that point at on-screen content — are read visually by Gemini, frames and captions together.
“Comment GUIDE for the link”? The bot comments, watches its DMs, and files the link into the item. Capped, delayed, opt-in.
JWT auth, per-user libraries, Instagram identity verified by DM code and bound to the stable account ID — handle renames can't break it.
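The DM-code verification described above might look roughly like the following. This is an in-memory sketch under stated assumptions: the class and method names, the code format and the ten-minute expiry are all hypothetical, and a real deployment would presumably persist the binding. The key idea from the text is that the binding stores the sender's stable account ID, not the handle.

```python
import secrets
import time
from typing import Optional


class IdentityVerifier:
    def __init__(self, ttl_seconds: int = 600):
        self.ttl = ttl_seconds
        self.pending: dict[str, tuple[str, float]] = {}  # code -> (user_id, expires_at)
        self.bound: dict[str, str] = {}                  # user_id -> Instagram account ID

    def issue_code(self, user_id: str, now: Optional[float] = None) -> str:
        """Create a one-time code for the user to DM to the bot account."""
        now = time.time() if now is None else now
        code = secrets.token_hex(4).upper()
        self.pending[code] = (user_id, now + self.ttl)
        return code

    def handle_dm(self, sender_account_id: str, text: str,
                  now: Optional[float] = None) -> bool:
        """Bind the DM sender's stable account ID if the text is a live code."""
        now = time.time() if now is None else now
        # pop() makes each code single-use, whether or not it has expired.
        entry = self.pending.pop(text.strip().upper(), None)
        if entry is None:
            return False
        user_id, expires_at = entry
        if now > expires_at:
            return False
        self.bound[user_id] = sender_account_id
        return True
```

Since nothing in the binding refers to a username, renaming the Instagram handle afterwards leaves the link between the dashboard user and the account intact.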
open source
The whole stack is on GitHub, end to end. Run it on a laptop with Docker, or on AWS for the price of a coffee or two a month. No subscription. No third-party cloud holding your data.
Backend — FastAPI, the ingestion poller, the AI pipeline, auth, and the engagement reconciler.
Python · FastAPI · SQLAlchemy · Redis
Dashboard — this interface. Feed, category views, search, settings and the admin panel.
Next.js 16 · React 19 · Tailwind v4
Run it honest. Automating an Instagram account is against Instagram's terms and can get that account banned. Zanzo is built for a dedicated bot account, never your main, with conservative pacing, daily caps, and a kill switch in the dashboard. You bring your own account, your own keys, and your own risk.