Architecture Research
Domain: Channel-native AI workforce platform (multi-tenant, messaging-channel-first)
Researched: 2026-03-22
Confidence: HIGH (core patterns verified against official Slack docs, LiteLLM docs, pgvector community resources, and multiple production-pattern sources)
Standard Architecture
System Overview
External Channels (Slack, WhatsApp)
│ HTTPS webhooks / Events API
▼
┌─────────────────────────────────────────────────────────────┐
│ INGRESS LAYER │
│ ┌─────────────────────┐ ┌─────────────────────────────┐ │
│ │ Channel Gateway │ │ Stripe Webhook Endpoint │ │
│ │ (FastAPI service) │ │ (billing events) │ │
│ └──────────┬──────────┘ └─────────────────────────────┘ │
└─────────────│───────────────────────────────────────────────┘
│ Normalized KonstructMessage
▼
┌─────────────────────────────────────────────────────────────┐
│ MESSAGE ROUTING LAYER │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Message Router │ │
│ │ - Tenant resolution (channel org → tenant_id) │ │
│ │ - Per-tenant rate limiting (Redis token bucket) │ │
│ │ - Context loading (tenant config, agent config) │ │
│ │ - Idempotency check (Redis dedup key) │ │
│ └────────────────────────┬─────────────────────────────┘ │
└───────────────────────────│─────────────────────────────────┘
│ Enqueued task (Celery)
▼
┌─────────────────────────────────────────────────────────────┐
│ AGENT ORCHESTRATION LAYER │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Agent Orchestrator (per-tenant Celery worker) │ │
│ │ - Agent context assembly (persona, tools, memory) │ │
│ │ - Conversation history retrieval (Redis + pgvector) │ │
│ │ - LLM call dispatch → LLM Backend Pool │ │
│ │ - Tool execution (registry lookup + run) │ │
│ │ - Response routing back to originating channel │ │
│ └──────────────────────────────────────────────────────┘ │
└───────────────────┬────────────────────────┬────────────────┘
│ │
▼ ▼
┌────────────────────────┐ ┌──────────────────────────────┐
│ LLM BACKEND POOL │ │ TOOL EXECUTOR │
│ │ │ - Registry (tool → handler) │
│ LiteLLM Router │ │ - Execution (async/sync) │
│ ├── Ollama (local) │ │ - Result capture + logging │
│ ├── Anthropic API │ └──────────────────────────────┘
│ └── OpenAI API │
└────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────────┐ │
│ │ PostgreSQL │ │ Redis │ │ MinIO / S3 │ │
│ │ (+ pgvector │ │ (sessions, │ │ (file attach., │ │
│ │ + RLS) │ │ rate limit, │ │ agent artifacts) │ │
│ │ │ │ task queue, │ │ │ │
│ │ - tenants │ │ pub/sub) │ │ │ │
│ │ - agents │ └──────────────┘ └───────────────────┘ │
│ │ - messages │ │
│ │ - tools │ │
│ │ - billing │ │
│ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ ADMIN PORTAL │
│ Next.js 14 App Router (separate deployment) │
│ - Tenant management, agent config, billing, monitoring │
│ - Reads/writes to FastAPI REST API (auth via JWT) │
└─────────────────────────────────────────────────────────────┘
Component Responsibilities
| Component | Responsibility | Communicates With |
|---|---|---|
| Channel Gateway | Receive and verify inbound webhooks from Slack/WhatsApp; normalize to KonstructMessage; acknowledge within 3s | Message Router (HTTP or enqueue), Redis (idempotency) |
| Message Router | Resolve tenant from channel metadata; rate-limit per tenant; load tenant/agent context; enqueue to Celery | PostgreSQL (tenant lookup), Redis (rate limit + dedup), Celery (enqueue) |
| Agent Orchestrator | Assemble agent prompt from persona + memory + conversation history; call LLM; execute tools; emit response back to channel | LLM Backend Pool, Tool Executor, Memory Layer (Redis + pgvector), Channel Gateway (outbound) |
| LLM Backend Pool | Route LLM calls across Ollama/Anthropic/OpenAI with fallback, retry, and cost tracking | Ollama (local HTTP), Anthropic API, OpenAI API |
| Tool Executor | Maintain tool registry; execute tool calls from agent; return results; log every invocation for audit | External APIs (per-tool), PostgreSQL (audit log) |
| Memory Layer | Short-term: Redis sliding window for recent messages; Long-term: pgvector for semantic retrieval of past conversations | Redis, PostgreSQL (pgvector extension) |
| Admin Portal | UI for tenant CRUD, agent configuration, channel setup, billing, and usage monitoring | FastAPI REST API (authenticated) |
| Billing Service | Handle Stripe webhooks; update tenant subscription state; enforce feature limits based on plan | Stripe, PostgreSQL (subscription state) |
Recommended Project Structure
konstruct/
├── packages/
│ ├── gateway/ # Channel Gateway service (FastAPI)
│ │ ├── channels/
│ │ │ ├── slack.py # Slack Events API handler (HTTP mode)
│ │ │ └── whatsapp.py # WhatsApp Cloud API webhook handler
│ │ ├── normalize.py # → KonstructMessage
│ │ ├── verify.py # Signature verification per channel
│ │ └── main.py # FastAPI app, routes
│ │
│ ├── router/ # Message Router service (FastAPI)
│ │ ├── tenant.py # Channel org ID → tenant_id lookup
│ │ ├── ratelimit.py # Redis token bucket per tenant
│ │ ├── idempotency.py # Redis dedup (message_id key, TTL)
│ │ ├── context.py # Load agent config from DB
│ │ └── main.py
│ │
│ ├── orchestrator/ # Agent Orchestrator (Celery workers)
│ │ ├── tasks.py # Celery task: handle_message
│ │ ├── agents/
│ │ │ ├── builder.py # Assemble agent (persona + tools + memory)
│ │ │ └── runner.py # LLM call loop (reason → tool → observe)
│ │ ├── memory/
│ │ │ ├── short_term.py # Redis sliding window (last N messages)
│ │ │ └── long_term.py # pgvector semantic search
│ │ ├── tools/
│ │ │ ├── registry.py # Tool name → handler function mapping
│ │ │ ├── executor.py # Async execution + audit logging
│ │ │ └── builtins/ # Built-in tools (web search, calendar, etc.)
│ │ └── main.py # Worker entry point
│ │
│ ├── llm-pool/ # LLM Backend Pool (LiteLLM wrapper)
│ │ ├── router.py # LiteLLM Router config (model groups)
│ │ ├── providers/
│ │ │ ├── ollama.py
│ │ │ ├── anthropic.py
│ │ │ └── openai.py
│ │ └── main.py # FastAPI app exposing /complete endpoint
│ │
│ ├── portal/ # Next.js 14 Admin Dashboard
│ │ ├── app/
│ │ │ ├── (auth)/ # Login, signup routes
│ │ │ ├── dashboard/ # Post-auth layout
│ │ │ ├── tenants/ # Tenant management
│ │ │ ├── agents/ # Agent config
│ │ │ ├── billing/ # Stripe customer portal
│ │ │ └── api/ # Next.js API routes (thin proxy or auth only)
│ │ ├── components/ # shadcn/ui components
│ │ └── lib/
│ │ ├── api.ts # TanStack Query hooks + API client
│ │ └── auth.ts # NextAuth.js config
│ │
│ └── shared/ # Shared Python library (no service)
│ ├── models/
│ │ ├── message.py # KonstructMessage Pydantic model
│ │ ├── tenant.py # Tenant, Agent SQLAlchemy models
│ │ └── billing.py # Subscription, Plan models
│ ├── db.py # SQLAlchemy async engine + session factory
│ ├── rls.py # SET app.current_tenant helper
│ └── config.py # Pydantic Settings (env vars)
│
├── migrations/ # Alembic (single migration history)
├── tests/
│ ├── unit/
│ ├── integration/
│ └── e2e/
├── docker-compose.yml # All services + infra (Redis, PG, MinIO, Ollama)
└── pyproject.toml # uv workspace config, shared deps
Structure Rationale
- packages/ per service: Each directory is a standalone FastAPI app or Celery worker with its own main.py. The boundary maps to a Docker container. Services communicate over HTTP or Celery/Redis, not in-process imports.
- shared/: Common Pydantic models and SQLAlchemy models live here to prevent duplication and drift. No business logic — only types, DB session factory, and config.
- gateway/channels/: Each channel adapter is a separate file so adding a new channel (e.g., Telegram in v2) is an isolated change with no blast radius.
- orchestrator/memory/: Short-term and long-term memory are separate modules because they have different backends, eviction policies, and query semantics.
- portal/app/: Next.js App Router route grouping with (auth) for pre-auth pages and dashboard/ for post-auth so layout boundaries are explicit.
Architectural Patterns
Pattern 1: Immediate-Acknowledge, Async-Process
What: The Channel Gateway returns HTTP 200 to Slack/WhatsApp within 3 seconds, without performing any LLM work. The actual processing is dispatched to Celery.
When to use: Always. Slack will retry and flag your app as unhealthy if it doesn't receive a 2xx within 3 seconds. WhatsApp Cloud API requires sub-20s acknowledgment.
Trade-offs: Adds Celery + Redis infrastructure requirement. The response to the user is sent as a follow-up message, not as the HTTP response — this is intentional and matches how Slack/WhatsApp users expect bots to behave anyway (typing indicator → message appears).
Example:
# gateway/channels/slack.py
from slack_bolt.async_app import AsyncApp

app = AsyncApp()  # reads SLACK_BOT_TOKEN / SLACK_SIGNING_SECRET from env

@app.event("message")
async def handle_message(event, say, client):
    # 1. Normalize immediately
    msg = normalize_slack(event)
    # 2. Verify idempotency (skip Slack's retry duplicates)
    if await is_duplicate(msg.id):
        return
    # 3. Enqueue for async processing — DO NOT call LLM here
    handle_message_task.delay(msg.model_dump())
    # Bolt acks with HTTP 200 when the handler returns — Slack is satisfied
Pattern 2: Tenant-Scoped RLS via SQLAlchemy Event Hook
What: Set app.current_tenant on the PostgreSQL connection immediately after acquiring it from the pool. RLS policies use this setting to filter every query automatically, so application code never manually adds WHERE tenant_id = ....
When to use: Every DB interaction in the Message Router and Agent Orchestrator.
Trade-offs: Requires careful pool management — connections must be reset before returning to the pool. The sqlalchemy-tenants library or a custom before_cursor_execute event listener handles this.
Example:
# shared/rls.py
from sqlalchemy import event

from shared.db import engine  # async engine; listeners attach to its sync facade

@event.listens_for(engine.sync_engine, "before_cursor_execute")
def set_tenant_context(conn, cursor, statement, parameters, context, executemany):
    tenant_id = get_current_tenant_id()  # from a contextvars-based helper
    if tenant_id:
        # Parameterized set_config(): never interpolate tenant_id into SQL
        cursor.execute(
            "SELECT set_config('app.current_tenant', %s, false)", (tenant_id,)
        )
Pattern 3: Two-Layer Agent Memory
What: Combine Redis (fast, ephemeral) for short-term context and pgvector (persistent, semantic) for long-term recall. The agent always has the last N messages in context (Redis sliding window). For deeper history, the orchestrator optionally queries pgvector for semantically similar past exchanges.
When to use: Every agent invocation. Short-term is mandatory; long-term retrieval is triggered when conversation references past events or when context window pressure requires compressing history.
Trade-offs: Two backends to operate and keep in sync. A background Celery task flushes Redis conversation state to PostgreSQL/pgvector asynchronously — if it fails, recent messages may not be permanently indexed, but conversation continuity is preserved by Redis until flush succeeds.
Example flow:
User message arrives
→ Load last 20 messages from Redis (short-term)
→ Optionally: similarity search pgvector for relevant past conversations
→ Build context window: [system prompt] + [retrieved history] + [recent messages]
→ LLM call
→ Append response to Redis sliding window
→ Background task: embed + store to pgvector
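The short-term side of this flow (short_term.py) reduces to an LPUSH + LTRIM pair, so the list can never grow past the window, plus an LRANGE read. A sketch with an injected client; the key format follows the tenant-namespacing rule from the anti-patterns section, and function names are assumptions:

```python
# orchestrator/memory/short_term.py (sketch; client injected for testability)
import json

WINDOW = 20  # last N messages kept per thread


def _key(tenant_id: str, thread_id: str) -> str:
    return f"{tenant_id}:history:{thread_id}"


async def append_message(redis, tenant_id: str, thread_id: str, message: dict) -> None:
    key = _key(tenant_id, thread_id)
    await redis.lpush(key, json.dumps(message))  # newest at index 0
    await redis.ltrim(key, 0, WINDOW - 1)        # evict everything beyond the window


async def recent_messages(redis, tenant_id: str, thread_id: str) -> list[dict]:
    raw = await redis.lrange(_key(tenant_id, thread_id), 0, WINDOW - 1)
    return [json.loads(m) for m in reversed(raw)]  # oldest-to-newest for the prompt
```

Keeping eviction in LTRIM (rather than application code) means the window holds even if a worker crashes mid-conversation.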
Pattern 4: LiteLLM Router as Internal Singleton
What: The LLM Backend Pool exposes a single internal HTTP endpoint (/complete). All orchestrator workers call this endpoint. The LiteLLM Router behind it handles provider selection, fallback chains, and cost tracking without the orchestrator needing to know which model is used.
When to use: All LLM calls. Never call Anthropic/OpenAI SDKs directly from the orchestrator — always go through the pool.
Trade-offs: Adds one network hop per LLM call. This is acceptable — LiteLLM's own benchmarks show 8ms P95 overhead at 1k RPS.
Configuration example:
# llm-pool/router.py
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "fast", "litellm_params": {"model": "ollama/qwen3:8b", "api_base": "http://ollama:11434"}},
        {"model_name": "quality", "litellm_params": {"model": "anthropic/claude-sonnet-4-20250514"}},
        # Second deployment in the same group: load-balanced with the entry above
        {"model_name": "quality", "litellm_params": {"model": "openai/gpt-4o"}},
    ],
    fallbacks=[{"quality": ["fast"]}],  # if every "quality" deployment fails, degrade to "fast"
    routing_strategy="latency-based-routing",
)
Data Flow
Inbound Message Flow (Happy Path)
User sends message in Slack
│
▼ HTTPS POST (Events API, HTTP mode)
Channel Gateway
│ verify Slack signature (X-Slack-Signature)
│ normalize → KonstructMessage(id, tenant_id=None, channel=slack, ...)
│ check Redis idempotency key (message_id) → not seen
│ set Redis idempotency key (TTL 24h)
│ enqueue Celery task: handle_message(message)
│ return HTTP 200 immediately
▼
Celery Broker (Redis)
│
▼
Agent Orchestrator (Celery worker)
│ resolve tenant_id from channel_metadata.workspace_id → PostgreSQL
│ load agent config (persona, model preference, tools) → PostgreSQL (RLS-scoped)
│ load short-term memory → Redis (last 20 messages for this thread_id)
│ optionally query pgvector for relevant past context
│ assemble prompt: system_prompt + memory + current message
│ POST /complete → LLM Backend Pool
│ LiteLLM Router selects provider, executes, returns response
│ parse response: text reply OR tool_call
│ if tool_call: Tool Executor.run(tool_name, args) → external API → result
│ append result to prompt, re-call LLM if needed
│ write KonstructMessage + response to PostgreSQL (audit)
│ update Redis sliding window with new messages
│ background: embed messages → pgvector
▼
Channel Gateway (outbound)
│ POST message back to Slack via slack-sdk client.chat_postMessage()
▼
User sees response in Slack
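The signature step in this flow (verify.py in the Gateway) is an HMAC-SHA256 over `v0:{timestamp}:{raw_body}` with the app's signing secret, compared against the X-Slack-Signature header. Bolt performs this check automatically; a stdlib sketch of the same check, for a bare FastAPI route or for reference:

```python
# gateway/verify.py (sketch; slack_bolt does this for you when used)
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret: str, timestamp: str, raw_body: bytes,
                           signature: str, tolerance: int = 60 * 5) -> bool:
    """Validate X-Slack-Signature against the raw request body."""
    if abs(time.time() - int(timestamp)) > tolerance:
        return False  # reject stale timestamps (replay protection)
    basestring = b"v0:" + timestamp.encode() + b":" + raw_body
    expected = "v0=" + hmac.new(signing_secret.encode(), basestring, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

Note the comparison uses `hmac.compare_digest`, not `==`, to avoid timing side channels, and the body must be the raw bytes as received, before any JSON parsing.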
Tenant Resolution Flow
KonstructMessage.channel_metadata = {"workspace_id": "T123ABC"}
│
▼ Router: SELECT tenant_id FROM channel_connections
WHERE channel_type = 'slack' AND external_org_id = 'T123ABC'
│
▼ tenant_id resolved → stored in Python contextvar for RLS
│
All subsequent DB queries automatically scoped by RLS policy:
CREATE POLICY tenant_isolation ON agents
USING (tenant_id = current_setting('app.current_tenant')::uuid);
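The Router's per-tenant rate limit (ratelimit.py) is a token bucket keyed by tenant_id. In production the refill-and-take step should be atomic in Redis (typically a small Lua script); the sketch below keeps state in-process with an injectable clock so the core logic is testable. Class and parameter names are assumptions:

```python
# router/ratelimit.py (sketch; production would run the refill-and-take atomically in Redis via Lua)
import time


class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self._state: dict[str, tuple[float, float]] = {}  # tenant_id -> (tokens, last_ts)

    def allow(self, tenant_id: str) -> bool:
        """Take one token for this tenant; False means the message should be throttled."""
        now = self.clock()
        tokens, last = self._state.get(tenant_id, (float(self.capacity), now))
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1.0:
            self._state[tenant_id] = (tokens - 1.0, now)
            return True
        self._state[tenant_id] = (tokens, now)
        return False
```

Each tenant gets an independent bucket, so one bursty workspace cannot starve the others; capacity and refill rate are where plan-tier limits from the billing flow would plug in.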
Admin Portal Data Flow
Browser → Next.js App Router (RSC or client component)
│ TanStack Query useQuery / useMutation
▼
FastAPI REST API (authenticated endpoint)
│ JWT verification (NextAuth.js token)
│ Tenant scope enforced (user.tenant_id from token)
▼
PostgreSQL (RLS active: queries scoped to token's tenant)
Billing Event Flow
Stripe subscription event (checkout.session.completed, etc.)
│ HTTPS POST to /webhooks/stripe
▼
Billing endpoint (FastAPI)
│ verify Stripe webhook signature (stripe-signature header)
│ parse event type
│ update tenants.subscription_status, plan_tier in PostgreSQL
│ if downgrade: update agent count limit, feature flags
▼
Next message processed by Router picks up new plan limits
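The verification step in this flow: Stripe signs the raw payload with HMAC-SHA256 and sends `Stripe-Signature: t=<timestamp>,v1=<hex>`, where the signed string is `{timestamp}.{raw_payload}`. stripe-python's `stripe.Webhook.construct_event` handles all of this (including timestamp tolerance); a simplified stdlib sketch of the same check, for illustration only:

```python
# billing/verify.py (sketch; prefer stripe.Webhook.construct_event in real code)
import hashlib
import hmac


def verify_stripe_signature(payload: bytes, sig_header: str, secret: str) -> bool:
    """Check the v1 signature in a Stripe-Signature header.

    Simplified: real headers may carry several v1 entries during secret
    rotation, and production code should also enforce timestamp tolerance.
    """
    parts = dict(p.split("=", 1) for p in sig_header.split(","))
    signed = parts.get("t", "").encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parts.get("v1", ""))
```

As with Slack, the check must run against the raw request bytes; re-serializing parsed JSON will produce a different byte string and a signature mismatch.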
Integration Points
External Services
| Service | Integration Pattern | Key Requirements | Notes |
|---|---|---|---|
| Slack | HTTP Events API (webhook) — NOT Socket Mode | Public HTTPS URL with valid TLS; respond 200 within 3s | Socket Mode only for local dev. HTTP required for production and any future Marketplace distribution. Slack explicitly recommends HTTP for production reliability. |
| WhatsApp Cloud API | Meta webhook (HTTPS POST) | TLS required (no self-signed); verify token for subscription; 200 response within 20s | Meta has fully deprecated on-premise option. Cloud API is now the only supported path. |
| LiteLLM | In-process Python SDK OR sidecar HTTP proxy | Ollama running as Docker service; Anthropic/OpenAI API keys | Run as a separate service for isolation, or as an embedded router in the orchestrator. Separate service recommended for cost tracking and rate limiting. |
| Stripe | Webhook (HTTPS POST) | Signature verification via stripe.WebhookSignature; idempotent event handlers | Use Stripe's hosted billing portal for self-service plan changes — avoids building custom subscription UI. |
| Ollama | HTTP (Docker network) | GPU passthrough optional; accessible on internal Docker network | http://ollama:11434 on compose network. No auth required on internal network. |
Internal Service Boundaries
| Boundary | Communication | Protocol | Notes |
|---|---|---|---|
| Gateway → Router | Direct HTTP POST (on same Docker network) or shared Celery queue | HTTP or Celery | For v1 simplicity, Gateway can enqueue directly to Celery, bypassing a separate Router HTTP call |
| Router → Orchestrator | Celery task via Redis broker | Celery/Redis | Decouples ingress from processing; enables retries, dead-letter queue, and horizontal scaling of workers |
| Orchestrator → LLM Pool | Internal HTTP POST | HTTP (FastAPI) | Keeps LLM routing concerns isolated; allows pool to be scaled independently |
| Orchestrator → Channel Gateway (outbound) | Direct Slack/WhatsApp SDK calls | HTTPS (external) | Orchestrator holds channel credentials and calls the appropriate SDK directly for responses |
| Portal → API | REST over HTTPS | HTTP (FastAPI) | Portal never accesses DB directly — all reads/writes through authenticated API |
| Any service → PostgreSQL | SQLAlchemy async (asyncpg driver) | TCP | RLS enforced; tenant context set before every query |
| Any service → Redis | aioredis / redis-py async | TCP | Namespaced by tenant_id to prevent accidental cross-tenant access |
Build Order (Dependency Graph)
Building the wrong component first creates integration debt. The correct order:
Phase 1 — Foundation (build this first)
│
├── 1. Shared models + DB schema (Pydantic models, SQLAlchemy models, Alembic migrations)
│ └── Required by: every other service
│
├── 2. PostgreSQL + Redis + Docker Compose dev environment
│ └── Required by: everything
│
├── 3. Channel Gateway — Slack adapter only
│ └── Unblocks: end-to-end message flow testing
│
├── 4. Message Router — tenant resolution + rate limiting
│ └── Unblocks: scoped agent invocation
│
├── 5. LLM Backend Pool — LiteLLM with Ollama + Anthropic
│ └── Unblocks: agent can actually generate responses
│
└── 6. Agent Orchestrator — single agent, no tools, no memory
└── First working end-to-end: Slack message → LLM response → Slack reply
Phase 2 — Feature Completeness
│
├── 7. Memory Layer (Redis short-term + pgvector long-term)
│ └── Depends on: working orchestrator
│
├── 8. Tool Framework (registry + executor + first built-in tools)
│ └── Depends on: working orchestrator
│
├── 9. WhatsApp channel adapter in Gateway
│ └── Mostly isolated: same normalize.py, new channel handler
│
├── 10. Admin Portal (Next.js) — tenant CRUD + agent config
│ └── Depends on: stable DB schema (stabilizes after step 8)
│
└── 11. Billing integration (Stripe webhooks + subscription enforcement)
└── Depends on: tenant model, admin portal
Key dependency insight: Steps 1-6 must be strictly sequential. Steps 7-11 can overlap after step 6 is working, but the portal (10) and billing (11) should not be started until the DB schema is stable, which happens after memory and tools are defined (steps 7-8).
Scaling Considerations
| Scale | Architecture Adjustments |
|---|---|
| 0-100 tenants (beta) | Single Docker Compose host. One Celery worker process. All services on same machine. PostgreSQL RLS sufficient. |
| 100-1k tenants | Scale Celery workers horizontally (multiple replicas). Separate Redis for broker vs. cache. Add connection pooling (PgBouncer). Consider moving Ollama to dedicated GPU host. |
| 1k-10k tenants | Kubernetes (k3s). Multiple Gateway replicas behind load balancer. Celery worker auto-scaling. PostgreSQL read replica for analytics/portal queries. Qdrant for vector search at scale (pgvector starts to slow above ~1M embeddings). |
| 10k+ tenants | Schema-per-tenant for Enterprise tier. Dedicated inference cluster. Multi-region PostgreSQL (Citus or regional replicas). |
Scaling Priorities
- First bottleneck: Celery workers during LLM call bursts. LLM calls are slow (2-30s). Workers pile up. Fix: increase worker count, implement per-tenant concurrency limits, add request coalescing for burst traffic.
- Second bottleneck: PostgreSQL connection exhaustion under concurrent tenant load. Fix: PgBouncer transaction-mode pooling. This is critical early because each Celery worker opens its own SQLAlchemy async session.
- Third bottleneck: pgvector query latency as embedding count grows. Fix: HNSW index tuning, then migrate to Qdrant for the vector tier while keeping PostgreSQL for structured data.
Anti-Patterns
Anti-Pattern 1: Doing LLM Work Inside the Webhook Handler
What people do: Call the LLM synchronously inside the Slack event handler or WhatsApp webhook endpoint and return the AI response as the HTTP reply.
Why it's wrong: Slack requires HTTP 200 within 3 seconds. OpenAI/Anthropic calls routinely take 5-30 seconds. The webhook times out, Slack retries the event (causing duplicate processing), and the app gets flagged as unreliable.
Do this instead: Acknowledge immediately (HTTP 200), enqueue to Celery, and send the AI response as a follow-up message via the channel API.
Anti-Pattern 2: Shared Redis Namespace Across Tenants
What people do: Store conversation history as redis.set("history:{thread_id}", ...) without scoping by tenant.
Why it's wrong: thread_id values can collide between tenants (e.g., two tenants both have Slack thread C123/T456). Tenant A reads Tenant B's conversation history.
Do this instead: Always namespace Redis keys as {tenant_id}:{key_type}:{resource_id}. Example: tenant_abc123:history:slack_C123_T456.
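A tiny key-builder makes the convention hard to skip; routing every Redis access through it, instead of ad-hoc f-strings, is the cheapest way to enforce the namespace. Module and function names are illustrative:

```python
# shared/redis_keys.py (sketch; helper name is an assumption)
def tenant_key(tenant_id: str, key_type: str, resource_id: str) -> str:
    """Build a tenant-namespaced Redis key: {tenant_id}:{key_type}:{resource_id}."""
    if not tenant_id:
        raise ValueError("refusing to build a Redis key without a tenant_id")
    return f"{tenant_id}:{key_type}:{resource_id}"
```

The empty-tenant guard turns a silent cross-tenant read into a loud failure at the call site.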
Anti-Pattern 3: Calling LLM Providers Directly from Orchestrator
What people do: Import the anthropic SDK directly in the orchestrator and call anthropic.messages.create(...).
Why it's wrong: Bypasses the LiteLLM router, losing fallback behavior, cost tracking, rate limit enforcement, and the ability to switch providers without touching orchestrator code.
Do this instead: All LLM calls go through the LLM Backend Pool service (or an embedded LiteLLM Router). The orchestrator sends a generic complete(messages, model_group="quality") call and the pool handles provider selection.
Anti-Pattern 4: Fat Channel Gateway
What people do: Add tenant resolution, rate limiting, and business logic to the gateway service to "simplify" the architecture.
Why it's wrong: The gateway must respond in under 3 seconds and must stay stateless to handle channel-specific webhook verification. Mixing business logic in couples the gateway to your domain model and makes it impossible to scale independently.
Do this instead: Gateway does exactly three things: verify signature, normalize message, enqueue. All business logic lives downstream.
Anti-Pattern 5: Embedding Agent Memory in PostgreSQL Without an Index
What people do: Store conversation embeddings in a vector column in PostgreSQL and run similarity queries without an HNSW or IVFFlat index.
Why it's wrong: pgvector without an index performs sequential scans. With more than ~50k embeddings per tenant, queries slow to seconds.
Do this instead: Create HNSW indexes on vector columns from the start. CREATE INDEX ON conversation_embeddings USING hnsw (embedding vector_cosine_ops);
Sources
- Slack: Comparing HTTP and Socket Mode — MEDIUM confidence (official Slack docs, accessed 2026-03-22)
- Slack: Using Socket Mode — HIGH confidence (official)
- LiteLLM: Router Architecture — HIGH confidence (official LiteLLM docs)
- LiteLLM: Routing and Load Balancing — HIGH confidence (official)
- Redis: AI Agent Memory Architecture — MEDIUM confidence (official Redis blog)
- Redis: AI Agent Architecture 2026 — MEDIUM confidence (official Redis blog)
- Crunchy Data: Row Level Security for Tenants — HIGH confidence (authoritative PostgreSQL resource)
- AWS: Multi-Tenant Data Isolation with PostgreSQL RLS — MEDIUM confidence
- DEV Community: Building WhatsApp Business Bots — LOW confidence (community post)
- ChatArchitect: Scalable Webhook Architecture for WhatsApp — LOW confidence (community)
- PyWa documentation — MEDIUM confidence (library docs)
- fast.io: Multi-Tenant AI Agent Architecture — LOW confidence (vendor blog)
- Microsoft Learn: AI Agent Orchestration Patterns — MEDIUM confidence
- DEV Community: Webhooks at Scale — Idempotency — LOW confidence (community post, pattern well-corroborated)
Architecture research for: Konstruct — channel-native AI workforce platform Researched: 2026-03-22