AI-powered invoice workflow that ingests PDFs from Gmail, extracts fields, runs fraud checks, and generates approval rep...
Draw a professional architecture diagram for an AI-powered invoice processing system called "PapyR — Invoice Agent". Use a dark-themed, modern style with color-coded swimlanes. The diagram must show 5 horizontal swimlanes, top to bottom:

Swimlane 1: User Layer (blue)
  Gmail Inbox
  Gmail Add-on Sidebar (Google Apps Script), triggered by user opening an email
  Arrow: user clicks "Process Invoice" or "Scan Inbox"

Swimlane 2: Edge / Tunnel (grey)
  ngrok tunnel (dev) / real domain (prod)
  nginx reverse proxy (TLS termination, mkcert wildcard cert for *.invoice.localtest.me)
  Routes to: FastAPI on port 8000, Keycloak on port 8080, MinIO on port 9000

Swimlane 3: Application Services (purple)
Four boxes side by side:
  FastAPI (port 8000): REST API Gateway, async, uvicorn
  Keycloak 26 (port 8080): Identity Provider, Google OAuth broker, stores Gmail tokens per user
  Flower (port 5555): Celery task monitoring UI
  Redis (port 6379): Celery broker + result backend + RedBeat scheduler (email polling cron)

Swimlane 4: Agent Pipeline (orange)
Show as a left-to-right sequential flow with arrows:
  Agent 1: Ingestion
    Receives PDF from Gmail Add-on via POST /invoices/upload-raw
    Validates PDF (magic bytes check)
    Deduplicates by gmail_message_id (409 if already exists)
    Stores raw PDF in MinIO bucket: raw/<user_id>/<invoice_id>.pdf
    Creates Invoice record in PostgreSQL (status: ingested)
    Enqueues Celery task → Redis queue: extraction
  → Agent 2: Extraction (Claude claude-opus-4-6 / Anthropic)
    Downloads PDF from MinIO
    Sends PDF as base64 to Claude claude-opus-4-6 Vision API
    Extracts structured JSON: vendor, invoice #, dates, amounts, currency, PO number, IBAN, line items, confidence score
    Updates Invoice in PostgreSQL (status: extracted)
    Enqueues → Redis queue: validation
  → Agent 3: Validation / Fraud Detection
    Rule-based fraud checks: LOW_CONFIDENCE, HIGH_AMOUNT (>€50k), ROUND_AMOUNT, MISSING_PO, MISSING_IBAN
    Computes fraud score (0–1). Score ≥ 0.5 → status: flagged → Telegram alert
    Updates Invoice in PostgreSQL (status: validated or flagged)
    Enqueues → Redis queue: report_gen
  → Agent 4: Report Generation (WeasyPrint)
    Renders HTML → PDF approval report (fraud score badge, extracted fields, fraud flags)
    Renders CSV summary (QuickBooks-compatible)
    Uploads both to MinIO bucket: reports/<user_id>/<invoice_id>.pdf/.csv
    Updates Invoice in PostgreSQL (status: done)

Swimlane 5: Storage Layer (green)
Three boxes side by side:
  PostgreSQL 16 (port 5432): invoice_db (users, invoices, processing_jobs), keycloak_db
  MinIO (ports 9000/9001): S3-compatible object storage, buckets: invoices/ (raw PDFs), reports/ (PDFs + CSVs)
  Telegram: receives fraud alert notifications when fraud_score ≥ 0.5

Cross-cutting arrows to show:
  Gmail Add-on → nginx → FastAPI (HTTPS via ngrok tunnel)
  FastAPI ↔ Keycloak (token validation)
  FastAPI ↔ Redis (task queue)
  FastAPI ↔ PostgreSQL (read/write)
  All agents ↔ MinIO (PDF upload/download)
  All agents ↔ PostgreSQL (status updates)
  Agent 3 → Telegram (fraud alert)
  Gmail Add-on → Google Drive REST API (CSV export)

Add a legend box with:
  Docker Compose (local dev) / k3d k3s cluster (demo day)
  Secrets managed by Doppler (project: papyr)
  AI: Claude claude-opus-4-6 (Anthropic)
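
As an illustration of the Agent 3 step described in the prompt, the rule-based checks and the 0.5 cut-off might be sketched in Python as follows. The prompt only names the five flags, the €50k limit, the 0–1 score range, and the flagging threshold. Everything else here is an assumption made for the sketch: the per-flag weights, the 0.7 confidence floor, the "multiple of 1000" definition of a round amount, the `ExtractedInvoice` field names, and the idea that the score is a capped sum of weights. Currency conversion is ignored and the total is treated as EUR.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional

# Values taken from the prompt.
HIGH_AMOUNT_EUR = Decimal("50000")   # HIGH_AMOUNT fires above this total
FLAG_THRESHOLD = 0.5                 # score >= 0.5 means status "flagged"

# Hypothetical values: the prompt does not specify any of these.
LOW_CONFIDENCE_BELOW = 0.7
ROUND_STEP = Decimal("1000")
WEIGHTS = {
    "LOW_CONFIDENCE": 0.30,
    "HIGH_AMOUNT": 0.30,
    "ROUND_AMOUNT": 0.10,
    "MISSING_PO": 0.15,
    "MISSING_IBAN": 0.15,
}


@dataclass
class ExtractedInvoice:
    """Subset of the extracted fields needed by the checks (hypothetical names)."""
    total: Decimal
    confidence: float
    po_number: Optional[str] = None
    iban: Optional[str] = None


def fraud_flags(inv: ExtractedInvoice) -> List[str]:
    """Return the names of all rules that fire for this invoice."""
    flags = []
    if inv.confidence < LOW_CONFIDENCE_BELOW:
        flags.append("LOW_CONFIDENCE")
    if inv.total > HIGH_AMOUNT_EUR:
        flags.append("HIGH_AMOUNT")
    if inv.total > 0 and inv.total % ROUND_STEP == 0:
        flags.append("ROUND_AMOUNT")
    if not inv.po_number:
        flags.append("MISSING_PO")
    if not inv.iban:
        flags.append("MISSING_IBAN")
    return flags


def fraud_score(flags: List[str]) -> float:
    """Sum the weights of the raised flags, capped at 1.0."""
    return round(min(1.0, sum(WEIGHTS[f] for f in flags)), 2)


def status_for(score: float) -> str:
    """Map a score to the invoice status named in the prompt."""
    return "flagged" if score >= FLAG_THRESHOLD else "validated"


if __name__ == "__main__":
    inv = ExtractedInvoice(total=Decimal("60000"), confidence=0.5)
    flags = fraud_flags(inv)
    score = fraud_score(flags)
    print(flags, score, status_for(score))
```

With these assumed weights, a single high or round amount alone does not reach the threshold, and an invoice is only flagged when several rules fire together. A real deployment would tune the weights against labelled invoices and send the Telegram alert from the Celery task that calls `status_for`.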
about 1 month ago
I think the biggest production risk is identity and token handling: Keycloak is shown as brokering Google OAuth and storing Gmail tokens per user, while the Gmail Add-on triggers processing from inside the mailbox. That makes Keycloak both your auth boundary and a high-value secret store, but I don’t see clear separation of user session auth vs delegated Gmail access, token encryption/rotation, or how Celery workers get least-privilege access to user mail context.
Maxwell Famoriyo
@maxwellfamoriyo
Estimated monthly cost
$113.54/month
26 cloud services in this architecture