Invoice/Receipt Intake using AWS (S3 + Textract + DynamoDB) + n8n + Slack.
By Praful Patel, Cloud Engineer.
Option A — Fastest + clean (recommended for Week 1)
Flow
1. User uploads invoice to S3 bucket ai-intake-docs
2. S3 event triggers an EventBridge rule
3. EventBridge sends the event to an API Gateway HTTP API
4. API Gateway calls the n8n webhook /s3-intake
5. n8n pulls the file from S3 (GetObject)
6. n8n calls Textract AnalyzeExpense
7. n8n maps fields → {vendor, total, date, line_items}
8. n8n stores the result in DynamoDB table ai_results
9. n8n posts a summary to Slack
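The steps above can be sketched end to end in plain JavaScript. Every deps.* helper here is a hypothetical stand-in for one n8n node (none of these are real SDK calls):

```javascript
// End-to-end shape of the Option A pipeline. Each deps.* function is a
// hypothetical stub standing in for the corresponding n8n node.
async function handleS3Event(event, deps) {
  const bucket = event.detail.bucket.name;           // e.g. ai-intake-docs
  const key = event.detail.object.key;               // e.g. invoices/2026/01/x.pdf
  const file = await deps.s3GetObject(bucket, key);  // S3 GetObject node
  const expense = await deps.analyzeExpense(file);   // Textract AnalyzeExpense node
  const fields = deps.mapFields(expense);            // Code node → {vendor, total, date, line_items}
  await deps.putItem('ai_results', fields);          // DynamoDB PutItem node
  await deps.notify(`Invoice: ${fields.vendor} ${fields.total}`); // Slack node
  return fields;
}
```

The real workflow JSON further down implements the same chain; this sketch is only the mental model.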
Why this is good
Near real-time
Easy to demo
No queue complexity
Works well for weekly posting
Option B — Enterprise buffered (upgrade)
Flow
S3 upload → EventBridge → SQS → n8n (poll SQS) → S3 GetObject → Textract → DynamoDB → Slack
Why it’s better
Handles bursts (hundreds of uploads)
Retry/replay is easier
Keeps n8n from getting hammered
2) AWS resources you need (both options)
S3
Bucket: ai-intake-docs
Folder convention: invoices/YYYY/MM/...
Encryption: SSE-S3 (default)
Block public access: ON
DynamoDB
Table: ai_results
Partition key: job_id (String)
Optional attributes: s3_bucket, s3_key, vendor, total, invoice_date, created_at, raw_textract
Textract
- Use API: AnalyzeExpense (best for invoices/receipts)
IAM
n8n needs permissions:
s3:GetObject on bucket objects
textract:AnalyzeExpense
dynamodb:PutItem on table
3) Step-by-step implementation (Option A)
Step A1 — Create S3 bucket
AWS Console → S3 → Create bucket → ai-intake-docs
Block public access: enabled
Default encryption: enabled
(Optional) create folder invoices/
Step A2 — Create DynamoDB table
DynamoDB → Create table:
Table name: ai_results
Partition key: job_id (String)
Leave defaults (on-demand is fine)
Step A3 — Create API Gateway HTTP API (Webhook gateway)
API Gateway → Create API → HTTP API
Add integration:
- URL = your n8n webhook endpoint
Example: https://n8n.yourdomain.com/webhook/s3-intake
Add route:
POST /s3-events
Deploy stage:
$default
Security (good enough for portfolio)
Add an API key or a shared secret header (recommended)
In n8n, check a header like x-shared-secret
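That header check can be sketched as a small function for an n8n Code node (this assumes the default Webhook node output, where request headers arrive under json.headers; the expected value is a placeholder you would store in an environment variable):

```javascript
// Reject requests that don't carry the expected shared secret.
// In n8n this runs in a Code node right after the Webhook node.
function verifySharedSecret(headers, expected) {
  const got = (headers || {})['x-shared-secret'];
  if (!got || got !== expected) {
    // Throwing inside an n8n Code node fails the execution,
    // so the unauthenticated request is dropped.
    throw new Error('Invalid or missing x-shared-secret header');
  }
  return true;
}
```

Inside the Code node this would be called as verifySharedSecret($input.first().json.headers, 'YOUR_SECRET'), with the secret kept out of the workflow JSON.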
Step A4 — Create EventBridge rule for S3 uploads
EventBridge → Rules → Create rule
Event source: AWS events
Pattern:
{
"source": ["aws.s3"],
"detail-type": ["Object Created"],
"detail": {
"bucket": { "name": ["ai-intake-docs"] }
}
}
Target: API Gateway
Choose your HTTP API
Route:
POST /s3-events
Now: every new upload triggers your API Gateway → n8n.
Step A5 — Create IAM credentials for n8n
If n8n runs on EC2: an instance role is best.
If n8n runs locally: use an IAM user access key.
Minimal IAM policy (Week 1)
{
"Version": "2012-10-17",
"Statement": [
{ "Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::ai-intake-docs/*" },
{ "Effect": "Allow", "Action": ["textract:AnalyzeExpense"], "Resource": "*" },
{ "Effect": "Allow", "Action": ["dynamodb:PutItem"], "Resource": "arn:aws:dynamodb:*:*:table/ai_results" }
]
}
Attach it to:
EC2 instance role used by n8n, OR
IAM user whose access keys are configured in n8n's AWS credentials
4) n8n workflow (import-ready) + mapping code
What the webhook payload looks like
EventBridge sends something like:
{
"detail": {
"bucket": { "name": "ai-intake-docs" },
"object": { "key": "invoices/2026/01/invoice1.pdf" }
}
}
Import-ready n8n workflow JSON
Workflow name: Week1 - S3→Textract→DynamoDB→Slack
Webhook path: /s3-intake
After import, set credentials:
AWS credential in S3/Textract/DynamoDB nodes
Slack credential in Slack node
Optionally add shared secret check
{
"name": "Week1 - S3→Textract→DynamoDB→Slack",
"nodes": [
{
"parameters": {
"path": "s3-intake",
"httpMethod": "POST",
"responseMode": "lastNode"
},
"name": "Webhook (/s3-intake)",
"type": "n8n-nodes-base.webhook",
"typeVersion": 2,
"position": [200, 300]
},
{
"parameters": {
"jsCode": "const detail = $json.detail || {};\nconst bucket = detail.bucket?.name;\nconst key = detail.object?.key;\n\nif (!bucket || !key) {\n throw new Error('Missing bucket/key in event payload');\n}\n\nreturn [{ bucket, key, received_at: new Date().toISOString() }];"
},
"name": "Extract S3 Bucket+Key",
"type": "n8n-nodes-base.code",
"typeVersion": 2,
"position": [430, 300]
},
{
"parameters": {
"operation": "getObject",
"bucketName": "={{$json.bucket}}",
"objectKey": "={{$json.key}}"
},
"name": "S3 GetObject",
"type": "n8n-nodes-base.awsS3",
"typeVersion": 1,
"position": [670, 300],
"credentials": {
"aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
}
},
{
"parameters": {
"operation": "analyzeExpense",
"binaryPropertyName": "data"
},
"name": "Textract AnalyzeExpense",
"type": "n8n-nodes-base.awsTextract",
"typeVersion": 1,
"position": [920, 300],
"credentials": {
"aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
}
},
{
"parameters": {
"jsCode": "function pick(fields, label) {\n const f = fields.find(x => (x.Type?.Text || '').toLowerCase() === label.toLowerCase());\n const v = f?.ValueDetection?.Text || null;\n return v;\n}\n\nconst tex = $json;\nconst doc = tex.ExpenseDocuments?.[0];\nif (!doc) throw new Error('No ExpenseDocuments returned by Textract');\n\nconst summaryFields = doc.SummaryFields || [];\nconst lineItems = [];\n\nconst groups = doc.LineItemGroups || [];\nfor (const g of groups) {\n for (const li of (g.LineItems || [])) {\n const lf = li.LineItemExpenseFields || [];\n const desc = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'item')?.ValueDetection?.Text\n || lf.find(x => (x.Type?.Text || '').toLowerCase() === 'description')?.ValueDetection?.Text\n || null;\n const qty = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'quantity')?.ValueDetection?.Text || null;\n const price = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'price')?.ValueDetection?.Text\n || lf.find(x => (x.Type?.Text || '').toLowerCase() === 'unit_price')?.ValueDetection?.Text\n || null;\n const amount = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'amount')?.ValueDetection?.Text || null;\n\n if (desc || amount || price) lineItems.push({ desc, qty, price, amount });\n }\n}\n\nconst vendor = pick(summaryFields, 'VENDOR_NAME') || pick(summaryFields, 'SUPPLIER_NAME');\nconst total = pick(summaryFields, 'TOTAL') || pick(summaryFields, 'AMOUNT_DUE');\nconst date = pick(summaryFields, 'INVOICE_RECEIPT_DATE') || pick(summaryFields, 'DATE');\nconst invoiceId = pick(summaryFields, 'INVOICE_RECEIPT_ID') || pick(summaryFields, 'INVOICE_ID');\n\nconst job_id = `${Date.now()}-${Math.random().toString(16).slice(2)}`;\n\nreturn [{\n job_id,\n vendor,\n total,\n invoice_date: date,\n invoice_id: invoiceId,\n line_items: lineItems,\n raw_textract: tex\n}];"
},
"name": "Map Textract → Fields",
"type": "n8n-nodes-base.code",
"typeVersion": 2,
"position": [1170, 300]
},
{
"parameters": {
"operation": "put",
"tableName": "ai_results",
"simple": true,
"item": {
"job_id": "={{$json.job_id}}",
"use_case": "invoice_intake",
"vendor": "={{$json.vendor || ''}}",
"total": "={{$json.total || ''}}",
"invoice_date": "={{$json.invoice_date || ''}}",
"invoice_id": "={{$json.invoice_id || ''}}",
"created_at": "={{new Date().toISOString()}}",
"result_json": "={{JSON.stringify({vendor:$json.vendor,total:$json.total,invoice_date:$json.invoice_date,invoice_id:$json.invoice_id,line_items:$json.line_items})}}"
}
},
"name": "DynamoDB PutItem",
"type": "n8n-nodes-base.awsDynamoDb",
"typeVersion": 1,
"position": [1420, 300],
"credentials": {
"aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
}
},
{
"parameters": {
"authentication": "predefinedCredentialType",
"resource": "message",
"operation": "post",
"channel": "finance-alerts",
"text": "={{`🧾 Invoice Processed\\nVendor: ${$json.vendor || 'Unknown'}\\nTotal: ${$json.total || 'Unknown'}\\nDate: ${$json.invoice_date || 'Unknown'}\\nItems: ${(JSON.parse($json.result_json).line_items || []).length}\\nJob: ${$json.job_id}`}}"
},
"name": "Slack Alert",
"type": "n8n-nodes-base.slack",
"typeVersion": 2,
"position": [1670, 300],
"credentials": {
"slackApi": { "id": "YOUR_SLACK_CRED", "name": "Slack account" }
}
}
],
"connections": {
"Webhook (/s3-intake)": { "main": [[{ "node": "Extract S3 Bucket+Key", "type": "main", "index": 0 }]] },
"Extract S3 Bucket+Key": { "main": [[{ "node": "S3 GetObject", "type": "main", "index": 0 }]] },
"S3 GetObject": { "main": [[{ "node": "Textract AnalyzeExpense", "type": "main", "index": 0 }]] },
"Textract AnalyzeExpense": { "main": [[{ "node": "Map Textract → Fields", "type": "main", "index": 0 }]] },
"Map Textract → Fields": { "main": [[{ "node": "DynamoDB PutItem", "type": "main", "index": 0 }]] },
"DynamoDB PutItem": { "main": [[{ "node": "Slack Alert", "type": "main", "index": 0 }]] }
},
"active": false
}
Notes about that workflow
The Textract node name/type may differ slightly depending on your n8n version and installed AWS nodes.
If your n8n build doesn't have the awsTextract or awsDynamoDb nodes, you can replace them with HTTP Request nodes that call the AWS APIs directly (this still works).
5) Local demo without EventBridge/API Gateway (fast test)
Before wiring AWS events, you can trigger n8n manually with:
curl -X POST https://n8n.yourdomain.com/webhook/s3-intake \
-H "Content-Type: application/json" \
-d '{
"detail": {
"bucket": {"name":"ai-intake-docs"},
"object": {"key":"invoices/test-invoice.jpg"}
}
}'
Upload test-invoice.jpg to S3 first, then run the curl. You’ll see Slack alert + DynamoDB record.
6) Hardening checklist (so it looks enterprise)
Security
Add a shared secret header in API Gateway → n8n:
- Header: x-shared-secret: <random>
In n8n, add a Code node at start to verify header.
Use instance role (no static keys) if n8n runs on EC2.
Reliability
Add a “try/catch” style branch:
- On failure → log to DynamoDB with use_case=invoice_intake_error and alert a Slack failure channel
Add idempotency:
- job_id = hash(bucket+key+etag) so re-uploads don’t duplicate
7) Optional upgrade (Option B with SQS)
When you’re ready to level up:
EventBridge target: SQS queue
n8n: SQS Trigger (poll) → process messages
This gives you buffering + retry + DLQ
If you want, I’ll provide:
SQS + DLQ setup
EventBridge rule target SQS
n8n SQS-trigger workflow JSON