
Invoice/Receipt Intake using AWS (S3 + Textract + DynamoDB) + n8n + Slack.


PRAFUL PATEL ☁️🚀, Highly skilled and motivated Cloud Engineer with a proven track record of designing, implementing, and managing robust cloud infrastructure solutions. With years of hands-on experience, I am deeply passionate about creating scalable and resilient cloud architectures that drive innovation and deliver optimal business outcomes. 🛠 Key Competencies:

  • Cloud Platforms: AWS, Azure, GCP, OCI
  • Infrastructure as Code: Terraform, Ansible
  • Containers & Orchestration: Docker, Kubernetes
  • Scripting: Python, Bash/Shell
  • CI/CD & Version Control: GitHub, Jenkins, CircleCI
  • Monitoring & Analytics: Grafana, Prometheus, Datadog, New Relic
  • Backup & Recovery: Veeam
  • Operating Systems: Linux, Windows
  • DevOps Tools: AWS CodeBuild, CodePipeline, Azure DevOps

📚 Continuous Learning: Staying ahead in the rapidly evolving cloud landscape is my priority. I am committed to expanding my skill set and embracing emerging cloud technologies to drive efficiency and innovation. Passionate Cloud/DevOps enthusiast dedicated to designing, building, and deploying cutting-edge technology solutions. As a devoted YouTuber, I love sharing insights through informative videos and crafting technical blogs that delve into areas like ☁️ Cloud, 🛠️ DevOps, 🐧 Linux, and 📦 Containers. 💻 Open Source Advocate: Contributing to open-source projects is a vital part of my journey. I actively engage in projects centered around Cloud, DevOps, Linux, and Containers, fostering collaboration and innovation within the community. 💌 Let's Connect: I am enthusiastic about virtual collaborations and meeting fellow professionals. Let's explore how I can contribute to your organization's cloud goals. Feel free to connect or DM me.

🌐 Portfolio: Check out my portfolio 🔗 LinkedIn: Connect on LinkedIn 🛠️ GitHub: Explore my projects 🎥 YouTube: Watch my videos 📝 Medium: Read my articles 🌐 Dev.to: Check out my posts

1) Architecture options

Option A — Direct real-time (simple)

Flow

  1. User uploads invoice to S3 bucket ai-intake-docs

  2. S3 event triggers EventBridge rule

  3. EventBridge sends event to API Gateway HTTP API

  4. API Gateway calls n8n Webhook /s3-intake

  5. n8n pulls the file from S3 (GetObject)

  6. n8n calls Textract AnalyzeExpense

  7. n8n maps fields → {vendor,total,date,line_items}

  8. n8n stores result in DynamoDB table ai_results

  9. n8n posts summary to Slack
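The nine steps above can be sketched as a single handler. This is a hypothetical illustration (not the n8n implementation): the AWS and Slack calls are passed in as plain callables so the control flow is visible and testable without AWS access.

```python
def process_invoice_event(event, get_object, analyze_expense, put_item, post_slack):
    """Steps 4-9 of the flow; all external effects are injected callables."""
    detail = event.get("detail", {})
    bucket = detail.get("bucket", {}).get("name")
    key = detail.get("object", {}).get("key")
    if not bucket or not key:
        raise ValueError("event is missing bucket/key")

    document = get_object(bucket, key)             # step 5: S3 GetObject
    fields = analyze_expense(document)             # step 6: Textract AnalyzeExpense
    record = {"s3_bucket": bucket, "s3_key": key, **fields}  # step 7: mapped fields
    put_item("ai_results", record)                 # step 8: DynamoDB PutItem
    post_slack(f"Invoice from {fields.get('vendor', 'Unknown')}, "
               f"total {fields.get('total', '?')}")          # step 9: Slack summary
    return record
```

In the actual build, each injected callable corresponds to one n8n node, so the sketch doubles as a map of the workflow.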

Why this is good

  • Near real-time

  • Easy to demo

  • No queue complexity

  • Works well for weekly posting


Option B — Enterprise buffered (upgrade)

Flow
S3 upload → EventBridge → SQS → n8n (poll SQS) → S3 GetObject → Textract → DynamoDB → Slack

Why it’s better

  • Handles bursts (100s uploads)

  • Retry/replay is easier

  • Keeps n8n from getting hammered


2) AWS resources you need (both options)

S3

  • Bucket: ai-intake-docs

  • Folder convention: invoices/YYYY/MM/...

  • Encryption: SSE-S3 (default)

  • Block public access: ON

DynamoDB

  • Table: ai_results

  • Partition key: job_id (String)

  • Optional attributes:

    • s3_bucket, s3_key, vendor, total, invoice_date, created_at, raw_textract
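A stored item might look like this (shown as a Python dict; all values are illustrative):

```python
# Hypothetical ai_results item after one invoice is processed
item = {
    "job_id": "1767225600000-a3f9c2",            # partition key
    "s3_bucket": "ai-intake-docs",
    "s3_key": "invoices/2026/01/invoice1.pdf",
    "vendor": "Acme Supplies",
    "total": "1,249.00",
    "invoice_date": "2026-01-05",
    "created_at": "2026-01-05T14:02:11Z",
}
```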

Textract

  • Use API: AnalyzeExpense (best for invoices/receipts)
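AnalyzeExpense returns ExpenseDocuments containing SummaryFields, each with a Type label and a ValueDetection value. As a rough sketch of how vendor/total fall out of that shape (the sample response is trimmed down and hypothetical):

```python
def pick_summary_field(summary_fields, label):
    """Return the detected value for a given Textract summary field type."""
    for f in summary_fields:
        if f.get("Type", {}).get("Text", "").upper() == label.upper():
            return f.get("ValueDetection", {}).get("Text")
    return None

# Trimmed-down sample of an AnalyzeExpense response:
sample = {"ExpenseDocuments": [{"SummaryFields": [
    {"Type": {"Text": "VENDOR_NAME"}, "ValueDetection": {"Text": "Acme Supplies"}},
    {"Type": {"Text": "TOTAL"}, "ValueDetection": {"Text": "1,249.00"}},
]}]}

fields = sample["ExpenseDocuments"][0]["SummaryFields"]
vendor = pick_summary_field(fields, "VENDOR_NAME")  # "Acme Supplies"
total = pick_summary_field(fields, "TOTAL")         # "1,249.00"
```

This is the same extraction the n8n Code node below performs in JavaScript.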

IAM

  • n8n needs permissions:

    • s3:GetObject on bucket objects

    • textract:AnalyzeExpense

    • dynamodb:PutItem on table


3) Step-by-step implementation (Option A)

Step A1 — Create S3 bucket

  1. AWS Console → S3 → Create bucket → ai-intake-docs

  2. Block public access: enabled

  3. Default encryption: enabled

  4. (Optional) create folder invoices/


Step A2 — Create DynamoDB table

  1. DynamoDB → Create table:

    • Table name: ai_results

    • Partition key: job_id (String)

  2. Leave defaults (on-demand is fine)


Step A3 — Create API Gateway HTTP API (Webhook gateway)

  1. API Gateway → Create API → HTTP API

  2. Add integration: HTTP endpoint pointing at your n8n webhook URL (e.g. https://n8n.yourdomain.com/webhook/s3-intake)

  3. Add route:

    • POST /s3-events

  4. Deploy stage: $default

Security (good enough for portfolio)

  • Add an API key or a shared secret header (recommended)

  • In n8n, check header like x-shared-secret
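A minimal verification sketch (shown in Python for illustration; in n8n you would do the equivalent inside a Code node). The secret value here is a placeholder — store the real one as an environment variable or credential:

```python
import hmac

EXPECTED_SECRET = "replace-with-random-value"  # hypothetical placeholder

def verify_shared_secret(headers):
    """Reject requests whose x-shared-secret header doesn't match."""
    supplied = headers.get("x-shared-secret", "")
    # compare_digest avoids leaking the secret through timing differences
    return hmac.compare_digest(supplied, EXPECTED_SECRET)
```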


Step A4 — Create EventBridge rule for S3 uploads

Prerequisite: enable Amazon EventBridge notifications on the bucket (S3 → ai-intake-docs → Properties → Event notifications → Amazon EventBridge → On); without this, S3 emits no "Object Created" events.

  1. EventBridge → Rules → Create rule

  2. Event source: AWS events

  3. Pattern:

{
  "source": ["aws.s3"],
  "detail-type": ["Object Created"],
  "detail": {
    "bucket": { "name": ["ai-intake-docs"] }
  }
}

  4. Target: API Gateway

    • Choose your HTTP API

    • Route: POST /s3-events

Now: every new upload triggers your API Gateway → n8n.
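To sanity-check what the rule will and won't match, here is a hypothetical matcher implementing just the exact-value semantics this pattern uses (real EventBridge patterns support many more operators):

```python
def matches_rule(event):
    """True if the event would match the 'Object Created' pattern above."""
    return (
        event.get("source") == "aws.s3"
        and event.get("detail-type") == "Object Created"
        and event.get("detail", {}).get("bucket", {}).get("name") == "ai-intake-docs"
    )
```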


Step A5 — Create IAM credentials for n8n

If n8n runs on EC2, the best option is an instance role (no static keys).
If it runs locally, use an IAM user access key.

Minimal IAM policy (Week 1)

{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::ai-intake-docs/*" },
    { "Effect": "Allow", "Action": ["textract:AnalyzeExpense"], "Resource": "*" },
    { "Effect": "Allow", "Action": ["dynamodb:PutItem"], "Resource": "arn:aws:dynamodb:*:*:table/ai_results" }
  ]
}

Attach it to:

  • EC2 instance role used by n8n OR

  • IAM user used by n8n AWS credentials


4) n8n workflow (import-ready) + mapping code

What the webhook payload looks like

EventBridge sends something like:

{
  "detail": {
    "bucket": { "name": "ai-intake-docs" },
    "object": { "key": "invoices/2026/01/invoice1.pdf" }
  }
}

Import-ready n8n workflow JSON

Workflow name: Week1 - S3→Textract→DynamoDB→Slack
Webhook path: /s3-intake

After import, set credentials:

  • AWS credential in S3/Textract/DynamoDB nodes

  • Slack credential in Slack node

  • Optionally add shared secret check

{
  "name": "Week1 - S3→Textract→DynamoDB→Slack",
  "nodes": [
    {
      "parameters": {
        "path": "s3-intake",
        "httpMethod": "POST",
        "responseMode": "lastNode"
      },
      "name": "Webhook (/s3-intake)",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2,
      "position": [200, 300]
    },
    {
      "parameters": {
        "jsCode": "const detail = $json.detail || {};\nconst bucket = detail.bucket?.name;\nconst key = detail.object?.key;\n\nif (!bucket || !key) {\n  throw new Error('Missing bucket/key in event payload');\n}\n\nreturn [{ bucket, key, received_at: new Date().toISOString() }];"
      },
      "name": "Extract S3 Bucket+Key",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [430, 300]
    },
    {
      "parameters": {
        "operation": "getObject",
        "bucketName": "={{$json.bucket}}",
        "objectKey": "={{$json.key}}"
      },
      "name": "S3 GetObject",
      "type": "n8n-nodes-base.awsS3",
      "typeVersion": 1,
      "position": [670, 300],
      "credentials": {
        "aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
      }
    },
    {
      "parameters": {
        "operation": "analyzeExpense",
        "binaryPropertyName": "data"
      },
      "name": "Textract AnalyzeExpense",
      "type": "n8n-nodes-base.awsTextract",
      "typeVersion": 1,
      "position": [920, 300],
      "credentials": {
        "aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
      }
    },
    {
      "parameters": {
        "jsCode": "function pick(fields, label) {\n  const f = fields.find(x => (x.Type?.Text || '').toLowerCase() === label.toLowerCase());\n  const v = f?.ValueDetection?.Text || null;\n  return v;\n}\n\nconst tex = $json;\nconst doc = tex.ExpenseDocuments?.[0];\nif (!doc) throw new Error('No ExpenseDocuments returned by Textract');\n\nconst summaryFields = doc.SummaryFields || [];\nconst lineItems = [];\n\nconst groups = doc.LineItemGroups || [];\nfor (const g of groups) {\n  for (const li of (g.LineItems || [])) {\n    const lf = li.LineItemExpenseFields || [];\n    const desc = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'item')?.ValueDetection?.Text\n      || lf.find(x => (x.Type?.Text || '').toLowerCase() === 'description')?.ValueDetection?.Text\n      || null;\n    const qty = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'quantity')?.ValueDetection?.Text || null;\n    const price = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'price')?.ValueDetection?.Text\n      || lf.find(x => (x.Type?.Text || '').toLowerCase() === 'unit_price')?.ValueDetection?.Text\n      || null;\n    const amount = lf.find(x => (x.Type?.Text || '').toLowerCase() === 'amount')?.ValueDetection?.Text || null;\n\n    if (desc || amount || price) lineItems.push({ desc, qty, price, amount });\n  }\n}\n\nconst vendor = pick(summaryFields, 'VENDOR_NAME') || pick(summaryFields, 'SUPPLIER_NAME');\nconst total  = pick(summaryFields, 'TOTAL') || pick(summaryFields, 'AMOUNT_DUE');\nconst date   = pick(summaryFields, 'INVOICE_RECEIPT_DATE') || pick(summaryFields, 'DATE');\nconst invoiceId = pick(summaryFields, 'INVOICE_RECEIPT_ID') || pick(summaryFields, 'INVOICE_ID');\n\nconst job_id = `${Date.now()}-${Math.random().toString(16).slice(2)}`;\n\nreturn [{\n  job_id,\n  vendor,\n  total,\n  invoice_date: date,\n  invoice_id: invoiceId,\n  line_items: lineItems,\n  raw_textract: tex\n}];"
      },
      "name": "Map Textract → Fields",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [1170, 300]
    },
    {
      "parameters": {
        "operation": "put",
        "tableName": "ai_results",
        "simple": true,
        "item": {
          "job_id": "={{$json.job_id}}",
          "use_case": "invoice_intake",
          "vendor": "={{$json.vendor || ''}}",
          "total": "={{$json.total || ''}}",
          "invoice_date": "={{$json.invoice_date || ''}}",
          "invoice_id": "={{$json.invoice_id || ''}}",
          "created_at": "={{new Date().toISOString()}}",
          "result_json": "={{JSON.stringify({vendor:$json.vendor,total:$json.total,invoice_date:$json.invoice_date,invoice_id:$json.invoice_id,line_items:$json.line_items})}}"
        }
      },
      "name": "DynamoDB PutItem",
      "type": "n8n-nodes-base.awsDynamoDb",
      "typeVersion": 1,
      "position": [1420, 300],
      "credentials": {
        "aws": { "id": "YOUR_AWS_CRED", "name": "AWS Account" }
      }
    },
    {
      "parameters": {
        "authentication": "predefinedCredentialType",
        "resource": "message",
        "operation": "post",
        "channel": "finance-alerts",
        "text": "={{`🧾 Invoice Processed\\nVendor: ${$json.vendor || 'Unknown'}\\nTotal: ${$json.total || 'Unknown'}\\nDate: ${$json.invoice_date || 'Unknown'}\\nItems: ${(JSON.parse($json.result_json).line_items || []).length}\\nJob: ${$json.job_id}`}}"
      },
      "name": "Slack Alert",
      "type": "n8n-nodes-base.slack",
      "typeVersion": 2,
      "position": [1670, 300],
      "credentials": {
        "slackApi": { "id": "YOUR_SLACK_CRED", "name": "Slack account" }
      }
    }
  ],
  "connections": {
    "Webhook (/s3-intake)": { "main": [[{ "node": "Extract S3 Bucket+Key", "type": "main", "index": 0 }]] },
    "Extract S3 Bucket+Key": { "main": [[{ "node": "S3 GetObject", "type": "main", "index": 0 }]] },
    "S3 GetObject": { "main": [[{ "node": "Textract AnalyzeExpense", "type": "main", "index": 0 }]] },
    "Textract AnalyzeExpense": { "main": [[{ "node": "Map Textract → Fields", "type": "main", "index": 0 }]] },
    "Map Textract → Fields": { "main": [[{ "node": "DynamoDB PutItem", "type": "main", "index": 0 }]] },
    "DynamoDB PutItem": { "main": [[{ "node": "Slack Alert", "type": "main", "index": 0 }]] }
  },
  "active": false
}

Notes about that workflow

  • The Textract node name/type may differ slightly depending on your n8n version and installed AWS nodes.

  • If your n8n build doesn’t include awsTextract or awsDynamoDb, you can swap those nodes for HTTP Request nodes that call the AWS APIs directly (with AWS credentials configured for request signing); the rest of the flow stays the same.


5) Local demo without EventBridge/API Gateway (fast test)

Before wiring AWS events, you can trigger n8n manually with:

curl -X POST https://n8n.yourdomain.com/webhook/s3-intake \
  -H "Content-Type: application/json" \
  -d '{
    "detail": {
      "bucket": {"name":"ai-intake-docs"},
      "object": {"key":"invoices/test-invoice.jpg"}
    }
  }'

Upload test-invoice.jpg to S3 first, then run the curl command. You should see a Slack alert and a new DynamoDB record.


6) Hardening checklist (so it looks enterprise)

Security

  • Add a shared secret header between API Gateway and n8n:

    • Header: x-shared-secret: <random>

  • In n8n, add a Code node at the start of the workflow to verify the header.

  • Use instance role (no static keys) if n8n runs on EC2.

Reliability

  • Add a “try/catch” style branch:

    • On failure → write a DynamoDB record with use_case=invoice_intake_error and post to a Slack failure channel

  • Add idempotency:

    • job_id = hash(bucket+key+etag) so re-uploads don’t duplicate
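A deterministic job_id sketch (hypothetical helper; the etag comes from the S3 object metadata). Because DynamoDB PutItem overwrites an item with the same partition key, a stable job_id makes re-uploads idempotent:

```python
import hashlib

def idempotent_job_id(bucket, key, etag):
    """Same upload (bucket + key + etag) always yields the same job_id."""
    digest = hashlib.sha256(f"{bucket}/{key}/{etag}".encode()).hexdigest()
    return digest[:32]  # truncated for readability; collisions still negligible
```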

7) Optional upgrade (Option B with SQS)

When you’re ready to level up:

  • EventBridge target: SQS queue

  • n8n: SQS Trigger (poll) → process messages

  • This gives you buffering + retry + DLQ

To build it out, you’ll need:

  • An SQS queue plus a DLQ

  • An EventBridge rule targeting the SQS queue

  • An n8n workflow using the SQS trigger
