PapyR AI Invoice Processing Pipeline

AI-powered invoice workflow that ingests PDFs from Gmail, extracts fields, runs fraud checks, and generates approval reports. Uses API, queue, storage, and database services.

advanced

Control Plane: Agent Lane

Invoice Ingestion Agent (API)
- 5 tools: pdf_validate, deduplicate, +3
- Orchestrates 4 services: api papyr, s3 invoices, rds postgres, redis papyr

Claude Extraction Agent (claude-opus-4-6, EVENT)
- 4 tools: download_pdf, vision_extract, +2
- Orchestrates 3 services: s3 invoices, rds postgres, redis papyr

Fraud Validation Agent (EVENT)
- 4 tools: rule_engine, score_fraud, +2
- Orchestrates 3 services: rds postgres, redis papyr, sns alerts

Report Generation Agent (EVENT)
- 4 tools: render_html, generate_pdf, +2
- Orchestrates 3 services: rds postgres, s3 reports, redis papyr

AWS

Tags

#web-app
#three-tier
#aws
#security
#high-availability

Architecture Description

Draw a professional architecture diagram for an AI-powered invoice processing system called "PapyR — Invoice Agent". Use a dark-themed, modern style with color-coded swimlanes. The diagram must show 5 horizontal swimlanes, top to bottom.

Swimlane 1: User Layer (blue)
- Gmail Inbox
- Gmail Add-on Sidebar (Google Apps Script), triggered by user opening an email
- Arrow: user clicks "Process Invoice" or "Scan Inbox"

Swimlane 2: Edge / Tunnel (grey)
- ngrok tunnel (dev) / real domain (prod)
- nginx reverse proxy (TLS termination, mkcert wildcard cert for *.invoice.localtest.me)
- Routes to: FastAPI on port 8000, Keycloak on port 8080, MinIO on port 9000

Swimlane 3: Application Services (purple)
Four boxes side by side:
- FastAPI (port 8000): REST API Gateway, async, uvicorn
- Keycloak 26 (port 8080): Identity Provider, Google OAuth broker, stores Gmail tokens per user
- Flower (port 5555): Celery task monitoring UI
- Redis (port 6379): Celery broker + result backend + RedBeat scheduler (email polling cron)

Swimlane 4: Agent Pipeline (orange)
Show as a left-to-right sequential flow with arrows.

Agent 1: Ingestion
- Receives PDF from Gmail Add-on via POST /invoices/upload-raw
- Validates PDF (magic bytes check)
- Deduplicates by gmail_message_id (409 if already exists)
- Stores raw PDF in MinIO bucket: raw/<user_id>/<invoice_id>.pdf
- Creates Invoice record in PostgreSQL (status: ingested)
- Enqueues Celery task → Redis queue: extraction

→ Agent 2: Extraction (Claude claude-opus-4-6 / Anthropic)
- Downloads PDF from MinIO
- Sends PDF as base64 to Claude claude-opus-4-6 Vision API
- Extracts structured JSON: vendor, invoice #, dates, amounts, currency, PO number, IBAN, line items, confidence score
- Updates Invoice in PostgreSQL (status: extracted)
- Enqueues → Redis queue: validation

→ Agent 3: Validation / Fraud Detection
- Rule-based fraud checks: LOW_CONFIDENCE, HIGH_AMOUNT (>€50k), ROUND_AMOUNT, MISSING_PO, MISSING_IBAN
- Computes fraud score (0–1). Score ≥ 0.5 → status: flagged → Telegram alert
- Updates Invoice in PostgreSQL (status: validated or flagged)
- Enqueues → Redis queue: report_gen

→ Agent 4: Report Generation (WeasyPrint)
- Renders HTML → PDF approval report (fraud score badge, extracted fields, fraud flags)
- Renders CSV summary (QuickBooks-compatible)
- Uploads both to MinIO bucket: reports/<user_id>/<invoice_id>.pdf/.csv
- Updates Invoice in PostgreSQL (status: done)

Swimlane 5: Storage Layer (green)
Three boxes side by side:
- PostgreSQL 16 (port 5432): invoice_db (users, invoices, processing_jobs), keycloak_db
- MinIO (ports 9000/9001): S3-compatible object storage, buckets: invoices/ (raw PDFs), reports/ (PDFs + CSVs)
- Telegram: receives fraud alert notifications when fraud_score ≥ 0.5

Cross-cutting arrows to show:
- Gmail Add-on → nginx → FastAPI (HTTPS via ngrok tunnel)
- FastAPI ↔ Keycloak (token validation)
- FastAPI ↔ Redis (task queue)
- FastAPI ↔ PostgreSQL (read/write)
- All agents ↔ MinIO (PDF upload/download)
- All agents ↔ PostgreSQL (status updates)
- Agent 3 → Telegram (fraud alert)
- Gmail Add-on → Google Drive REST API (CSV export)

Add a legend box with:
- Docker Compose (local dev) / k3d k3s cluster (demo day)
- Secrets managed by Doppler (project: papyr)
- AI: Claude claude-opus-4-6 (Anthropic)
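
The agent pipeline described above is a fixed chain of stages, each writing a status and handing off to a named Redis queue. A minimal sketch of that hand-off as plain data follows. The queue names and statuses come from the description; the Stage type, the agent keys, and the next_queue helper are hypothetical names introduced only for illustration, and the sketch assumes flagged invoices still proceed to report_gen, which the description implies but does not state outright.

```python
from typing import List, NamedTuple, Optional


class Stage(NamedTuple):
    agent: str                  # hypothetical key for the agent
    status_after: str           # invoice status written to PostgreSQL
    next_queue: Optional[str]   # Redis queue the next task is sent to


# Order and names follow the architecture description.
PIPELINE: List[Stage] = [
    Stage("ingestion", "ingested", "extraction"),
    Stage("extraction", "extracted", "validation"),
    # Validation writes "validated" or "flagged"; only the default is listed.
    Stage("validation", "validated", "report_gen"),
    Stage("report_generation", "done", None),
]


def next_queue(agent: str) -> Optional[str]:
    """Return the queue an agent enqueues to, or None for the last stage."""
    for stage in PIPELINE:
        if stage.agent == agent:
            return stage.next_queue
    raise KeyError(agent)
```

In a Celery setup, the returned name would typically be passed as the queue argument when dispatching the next task.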
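
The Agent 1 checks (magic bytes, deduplication by gmail_message_id, object key layout) can be sketched without any framework. This is an illustration under stated assumptions, not the project's code: the InMemoryIngestion class stands in for PostgreSQL and MinIO, and the 400 and 201 status codes are assumptions. Only the 409 on duplicates, the "ingested" status, and the raw/<user_id>/<invoice_id>.pdf key come from the description.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Tuple

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this header


def looks_like_pdf(data: bytes) -> bool:
    """Magic bytes check: compare the leading bytes, ignore the file name."""
    return data.startswith(PDF_MAGIC)


def raw_object_key(user_id: str, invoice_id: str) -> str:
    """Object key layout taken from the description."""
    return f"raw/{user_id}/{invoice_id}.pdf"


@dataclass
class InMemoryIngestion:
    """Hypothetical stand-in for the PostgreSQL and MinIO calls."""
    invoice_by_message: Dict[str, str] = field(default_factory=dict)
    objects: Dict[str, bytes] = field(default_factory=dict)

    def upload_raw(
        self, user_id: str, gmail_message_id: str, data: bytes
    ) -> Tuple[int, Dict[str, str]]:
        if not looks_like_pdf(data):
            return 400, {"detail": "not a PDF"}  # 400 is an assumption
        if gmail_message_id in self.invoice_by_message:
            # 409 on duplicates, as stated in the description
            return 409, {"invoice_id": self.invoice_by_message[gmail_message_id]}
        invoice_id = str(uuid.uuid4())
        key = raw_object_key(user_id, invoice_id)
        self.objects[key] = data
        self.invoice_by_message[gmail_message_id] = invoice_id
        # 201 is an assumption; the description only names the status value
        return 201, {"invoice_id": invoice_id, "status": "ingested", "object_key": key}
```

In a real deployment the duplicate check would more likely rest on a unique constraint on gmail_message_id in the database, since an in-process lookup does not protect against concurrent uploads.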
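
The Agent 3 rules can be illustrated with a small weighted-rule scorer. The rule names, the €50k limit, the 0–1 range, and the 0.5 flagging threshold come from the description. Everything else is an assumption made for this sketch: the per-rule weights, the confidence cutoff of 0.8, the definition of a "round" amount as a multiple of 1000, and the ExtractedInvoice type.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional

# Assumed weights: the description names the rules but not how they are weighted.
RULE_WEIGHTS = {
    "LOW_CONFIDENCE": 0.3,
    "HIGH_AMOUNT": 0.3,
    "ROUND_AMOUNT": 0.1,
    "MISSING_PO": 0.15,
    "MISSING_IBAN": 0.15,
}
HIGH_AMOUNT_EUR = Decimal("50000")  # from the description: > EUR 50k
FLAG_THRESHOLD = 0.5                # from the description: score >= 0.5 is flagged
MIN_CONFIDENCE = 0.8                # assumed cutoff for LOW_CONFIDENCE
ROUND_STEP = Decimal("1000")        # assumed meaning of a "round" amount


@dataclass
class ExtractedInvoice:
    """Hypothetical subset of the extracted fields."""
    total: Decimal
    confidence: float
    po_number: Optional[str] = None
    iban: Optional[str] = None


def fraud_flags(inv: ExtractedInvoice) -> List[str]:
    """Return the names of all rules the invoice trips, in a fixed order."""
    flags: List[str] = []
    if inv.confidence < MIN_CONFIDENCE:
        flags.append("LOW_CONFIDENCE")
    if inv.total > HIGH_AMOUNT_EUR:
        flags.append("HIGH_AMOUNT")
    if inv.total > 0 and inv.total % ROUND_STEP == 0:
        flags.append("ROUND_AMOUNT")
    if not inv.po_number:
        flags.append("MISSING_PO")
    if not inv.iban:
        flags.append("MISSING_IBAN")
    return flags


def fraud_score(flags: List[str]) -> float:
    """Sum the weights of the tripped rules, capped at 1.0."""
    total = sum((RULE_WEIGHTS[name] for name in flags), 0.0)
    return round(min(1.0, total), 4)


def status_for(score: float) -> str:
    """Map a score to the invoice status named in the description."""
    return "flagged" if score >= FLAG_THRESHOLD else "validated"
```

With these assumed weights, a large round amount alone does not flag an invoice, while low extraction confidence combined with missing PO and IBAN does. The weights would need tuning against real invoices.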

AI Analysis (1)

AI Review
Tradeoff

about 1 month ago

I think the biggest production risk is identity and token handling: Keycloak is shown as brokering Google OAuth and storing Gmail tokens per user, while the Gmail Add-on triggers processing from inside the mailbox. That makes Keycloak both your auth boundary and a high-value secret store, but I don’t see clear separation of user session auth vs delegated Gmail access, token encryption/rotation, or how Celery workers get least-privilege access to user mail context.

Maxwell Famoriyo

@maxwellfamoriyo

Statistics

Views: 48
Clones: 1
Likes: 0
Readiness: 81/100

Details

Category: web app
Estimated monthly cost: $113.54/month
Published: 4/3/2026

Services Used

26 cloud services in this architecture

RouteTable
SecurityGroup
ALB
TargetGroup
EC2Instance
RDSInstance
ElastiCache
S3Bucket
SecretsManager
KMS
CloudWatch
CloudWatchAlarm
CloudTrail
GuardDuty
SecurityHub
APIGateway
SNS
Aws-subnet-public-1a
Aws-subnet-public-1b
Aws-subnet-private-app-1a
Aws-subnet-private-app-1b
Aws-subnet-private-db-1a
Aws-subnet-private-db-1b
InternetGateway
WAF
NATGateway
