Self-hosted Instagram archive

Saved it. Never lost it.

Zanzo watches the reels you save or DM and turns each one into structured, searchable data — the recipe with its quantities, the event with its date, the place with its name. Not bookmarks. An archive that answers questions.

FastAPI · Postgres + pgvector · Gemini · Whisper / Deepgram · Next.js

how it works

DM a reel to your bot account.
That's the whole workflow.

01

Save or DM

A poller watches a dedicated bot account's saved collection and DMs. Forward any reel from your own account — no URL pasting, no browser extension, no app switching.

02

The pipeline reads it

Each reel is transcribed (multilingual — English, Hindi, Telugu and more), classified into a category, and run through a schema-specific extractor. Silent reels get read visually, frame by frame.

03

Search it. Act on it.

Hybrid semantic search means “that tokyo ramen place” finds the reel even if nobody ever said ramen. Events export to your calendar as .ics in one click.
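
For the curious, here is a minimal sketch of what a one-click .ics export could look like, using only the Python standard library. The function name, the PRODID string and the UID format are illustrative assumptions, not Zanzo's actual code; the field names starts_at, venue and ticket_url follow the event schema described on this page. Line folding for very long values is left out for brevity.

```python
from datetime import datetime, timezone
from typing import Optional


def _escape(text: str) -> str:
    """Escape the characters RFC 5545 reserves inside TEXT values."""
    return (
        text.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def _utc_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an iCalendar UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def event_to_ics(
    uid: str,
    title: str,
    starts_at: datetime,
    venue: str,
    ticket_url: Optional[str] = None,
    now: Optional[datetime] = None,
) -> str:
    """Build a single-event calendar file (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//reel-archive//EN",  # placeholder product id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{_utc_stamp(now)}",
        f"DTSTART:{_utc_stamp(starts_at)}",
        f"SUMMARY:{_escape(title)}",
        f"LOCATION:{_escape(venue)}",
    ]
    if ticket_url:
        lines.append(f"URL:{ticket_url}")
    lines += ["END:VEVENT", "END:VCALENDAR"]
    # iCalendar requires CRLF line endings.
    return "\r\n".join(lines) + "\r\n"
```

Because the extractor has already produced a start time and a venue, the export is pure formatting: nothing has to be re-read from the reel.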

the pipeline

01

fetch

media pulled & stored in S3/MinIO

02

transcribe

Deepgram nova-2, Whisper fallback

03

classify

six categories, multimodal

04

extract

per-category JSON schemas

05

embed

pgvector, 1536 dims

idempotent stages · resumable · non-destructive on quota errors
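
One way to read "idempotent, resumable, non-destructive on quota errors" is sketched below. This is an assumption about the shape of such a runner, not the project's implementation: the Item class, the QuotaError exception and the handlers mapping are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Stage names taken from the pipeline list above.
STAGES = ["fetch", "transcribe", "classify", "extract", "embed"]


class QuotaError(Exception):
    """Hypothetical error a stage raises when an upstream API is out of quota."""


@dataclass
class Item:
    reel_id: str
    # Maps a finished stage name to its stored output.
    results: Dict[str, object] = field(default_factory=dict)


def run_pipeline(
    item: Item, handlers: Dict[str, Callable[[Item], object]]
) -> Optional[str]:
    """Run the remaining stages in order.

    Returns None when every stage has finished, or the name of the stage
    that hit a quota limit so a later run can resume from there.
    """
    for stage in STAGES:
        if stage in item.results:
            # Idempotent: finished stages are never re-run.
            continue
        try:
            output = handlers[stage](item)
        except QuotaError:
            # Non-destructive: earlier results stay exactly as they were.
            return stage
        item.results[stage] = output
    return None
```

Calling run_pipeline again on the same item picks up at the first stage without a stored result, so a quota reset is all that is needed to finish the job.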

what comes out

Not tags. Fields.

Every category has its own extraction schema. An event reel yields a start time, a venue and a ticket link — so “add to calendar” is one click, not a re-watch.

Category-aware extraction

Events get starts_at, venue, ticket_url. Recipes get ingredients with quantities and ordered steps. Tech reels get the actual commands shown on screen.
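
As an illustration only, per-category schemas might be modelled like this. The field names starts_at, venue, ticket_url, ingredients and steps come from the description above; the class names, the commands field, the category keys and the parse_extraction helper are assumptions made for the sketch, and only the three categories named on this page are shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional


@dataclass
class EventFields:
    starts_at: datetime
    venue: str
    ticket_url: Optional[str] = None


@dataclass
class Ingredient:
    name: str
    quantity: str


@dataclass
class RecipeFields:
    ingredients: List[Ingredient]
    steps: List[str]  # kept in the order they should be followed


@dataclass
class TechFields:
    commands: List[str]  # commands shown on screen


def _parse_event(raw: dict) -> EventFields:
    return EventFields(
        starts_at=datetime.fromisoformat(raw["starts_at"]),
        venue=raw["venue"],
        ticket_url=raw.get("ticket_url"),
    )


def _parse_recipe(raw: dict) -> RecipeFields:
    return RecipeFields(
        ingredients=[Ingredient(**entry) for entry in raw["ingredients"]],
        steps=list(raw["steps"]),
    )


def _parse_tech(raw: dict) -> TechFields:
    return TechFields(commands=list(raw["commands"]))


# Hypothetical registry: one parser per category.
PARSERS: Dict[str, Callable[[dict], object]] = {
    "event": _parse_event,
    "recipe": _parse_recipe,
    "tech": _parse_tech,
}


def parse_extraction(category: str, raw: dict) -> object:
    """Turn an extractor's JSON payload into the typed fields for its category."""
    if category not in PARSERS:
        raise ValueError(f"no schema registered for category {category!r}")
    return PARSERS[category](raw)
```

Typed fields are what make the downstream features cheap: a calendar export only needs EventFields, and a shopping list only needs RecipeFields.ingredients.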

Hybrid semantic search

Vector similarity over pgvector fused with text matching. Search what you remember, not what was said.
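
This page does not say how the two result lists are fused. A common choice is reciprocal rank fusion, shown here purely as an assumed stand-in; the function name and the k=60 constant are conventional defaults, not values taken from Zanzo.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Merge several ranked lists of item ids into one ranking.

    Each list contributes 1 / (k + rank) per item, so an item that
    ranks well in both vector and text search rises to the top.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Example: ids returned by a vector query and by a text query.
vector_hits = ["a", "b", "c"]
text_hits = ["b", "d"]
fused = reciprocal_rank_fusion([vector_hits, text_hits])
# fused == ["b", "a", "d", "c"]
```

Rank-based fusion avoids having to put cosine distances and text-match scores on the same scale, which is why it is a frequent default for hybrid search.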

Multilingual transcription

Deepgram nova-2 with per-reel language detection — English, Hindi, Telugu and more. Falls back to local Whisper at zero cost.
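
The fallback behaviour can be sketched without touching either vendor's SDK. Below, the primary and fallback engines are passed in as plain callables; the Transcript class, the function name and the decision to treat an empty transcript as a failure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transcript:
    text: str
    language: str  # detected language code, e.g. "en", "hi", "te"
    engine: str    # which engine produced the text


# An engine takes a path to an audio file and returns a Transcript.
Engine = Callable[[str], Transcript]


def transcribe_with_fallback(
    audio_path: str, primary: Engine, fallback: Engine
) -> Transcript:
    """Try the hosted engine first; use the local one if it fails or returns nothing."""
    try:
        result = primary(audio_path)
        if result.text.strip():
            return result
    except Exception:
        # Broad on purpose for this sketch: network errors, quota errors
        # and malformed responses all lead to the same fallback path.
        pass
    return fallback(audio_path)
```

In a real deployment the primary callable would wrap the Deepgram request and the fallback would wrap a local Whisper model; keeping them behind the same signature is what makes the swap invisible to the rest of the pipeline.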

Visual extraction

Reels with no useful audio — or that point at on-screen content — are read visually by Gemini, frames and captions together.

Resource harvesting

“Comment GUIDE for the link”? The bot comments, watches its DMs, and files the link into the item. Capped, delayed, opt-in.
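
"Capped, delayed, opt-in" maps naturally onto a small guard object. The sketch below is an assumption about how such a guard could work; the class name, the default cap of 5 and the 10-minute delay are illustrative values, not Zanzo's settings.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class HarvestPolicy:
    enabled: bool = False                       # opt-in: off unless switched on
    daily_cap: int = 5                          # illustrative default
    min_delay: timedelta = timedelta(minutes=10)  # illustrative default
    history: List[datetime] = field(default_factory=list)

    def may_comment(self, now: datetime) -> bool:
        """Return True only if commenting now respects every limit."""
        if not self.enabled:
            return False
        sent_today = [t for t in self.history if t.date() == now.date()]
        if len(sent_today) >= self.daily_cap:
            return False
        if self.history and now - self.history[-1] < self.min_delay:
            return False
        return True

    def record(self, now: datetime) -> None:
        """Remember that a comment was posted at this time."""
        self.history.append(now)
```

The bot would call may_comment before every harvest attempt and record only after a comment actually went out, so failed attempts do not eat into the daily cap.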

Multi-user & admin

JWT auth, per-user libraries, Instagram identity verified by DM code and bound to the stable account ID — handle renames can't break it.
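
A rough sketch of DM-code verification follows, kept in memory for brevity. Everything here is hypothetical: the class and method names, the code length, and the idea that the DM payload exposes a stable sender account ID as a string. The point it illustrates is that the binding is stored against that ID, never against the handle.

```python
import hmac
import secrets
from typing import Dict, Optional


class IdentityVerifier:
    def __init__(self) -> None:
        self._pending: Dict[int, str] = {}  # app user id -> one-time code
        self._bound: Dict[int, str] = {}    # app user id -> Instagram account id

    def issue_code(self, user_id: int) -> str:
        """Create a one-time code for the user to DM to the bot."""
        code = secrets.token_hex(4)
        self._pending[user_id] = code
        return code

    def confirm(self, user_id: int, dm_text: str, sender_account_id: str) -> bool:
        """Bind the sender's account id to the user if the DM carries the right code."""
        expected = self._pending.get(user_id)
        if expected is None:
            return False
        # Constant-time comparison on bytes, so odd characters cannot raise.
        if not hmac.compare_digest(expected.encode(), dm_text.strip().encode()):
            return False
        del self._pending[user_id]  # codes are single use
        self._bound[user_id] = sender_account_id
        return True

    def owner_of(self, sender_account_id: str) -> Optional[int]:
        """Look up which user a later DM belongs to, by account id only."""
        for user_id, account_id in self._bound.items():
            if account_id == sender_account_id:
                return user_id
        return None
```

Since lookups go through the account ID, a user who renames their handle still resolves to the same library.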

open source

Your saves live on your hardware.

The whole stack is on GitHub, end to end. Run it on a laptop with Docker, or on AWS for the price of a coffee or two a month. No subscription. No third-party cloud holding your data.

Run it honest. Automating an Instagram account is against Instagram's terms and can get that account banned. Zanzo is built for a dedicated bot account, never your main, with conservative pacing, daily caps, and a kill switch in the dashboard. You bring your own account, your own keys, and your own risk.