Stack Research

Domain: Channel-native AI workforce platform (multi-tenant SaaS)
Researched: 2026-03-22
Confidence: HIGH (all versions verified against PyPI and official sources)


Core Backend Technologies

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| Python | 3.12+ | Runtime | Specified in CLAUDE.md. Mature async ecosystem, best ML/AI library support. 3.12 is the stable sweet spot (CPython has no LTS releases): 3.13 is out, but ecosystem support lags. |
| FastAPI | 0.135.1 | API framework | Async-native, automatic OpenAPI docs, built-in dependency injection, and a good fit for a multi-service architecture. The de facto choice for async Python APIs. |
| Pydantic v2 | 2.12.5 | Data validation | Mandatory for FastAPI. v2 is substantially faster than v1 thanks to its Rust core (the Pydantic team cites roughly 5-50x, depending on workload). Strict mode enforces type safety at runtime boundaries. Use for all internal message models. |
| SQLAlchemy | 2.0.48 | ORM / query builder | 2.0 is a major API overhaul with first-class async support (introduced in 1.4). Use AsyncSession + create_async_engine. The 1.x Query API is legacy; do not use it in new code. |
| Alembic | 1.18.4 | Database migrations | Standard companion to SQLAlchemy. Requires an async-aware env.py: the async engine opens the connection and runs the synchronous migration function through connection.run_sync(). Running alembic init -t async generates this template. |
| asyncpg | 0.31.0 | PostgreSQL async driver | The async PostgreSQL driver used with SQLAlchemy's async engine (postgresql+asyncpg://). Significantly faster than psycopg2 for high-concurrency workloads. |
| PostgreSQL | 16 | Primary database | Specified in CLAUDE.md. RLS (Row Level Security) is the v1 multi-tenancy mechanism. The pgvector extension adds vector search without a separate service. |
| Redis | 7.x | Cache, pub/sub, rate limiting | Session state, per-tenant rate limit counters, pub/sub for real-time event routing. Consider Valkey as a drop-in replacement if Redis's license changes concern you. |

LLM Integration

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| LiteLLM | 1.82.5 | LLM gateway / router | Unified API across 100+ providers (Anthropic, OpenAI, Ollama, vLLM). Built-in load balancing, cost tracking, fallback routing, and virtual keys. Routes to local Ollama and to commercial APIs without code changes. Now at GA maturity with production users at scale. |
| Ollama | latest | Local LLM inference | Local inference for the dev environment. Serves models via an OpenAI-compatible API on port 11434; LiteLLM proxies to it transparently. |
| pgvector | 0.4.2 (Python client) | Vector search / agent memory | Co-located with PostgreSQL, so v1 needs no separate vector DB service. Supports HNSW indexing (added in extension 0.5.0), which is expected to keep queries under roughly 10 ms below 1M vectors; benchmark with your own embeddings. Extension version 0.8.2 is production-ready and available on most major hosted PostgreSQL services. |

Messaging Channel SDKs

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| slack-bolt | 1.27.0 | Slack integration | Official Slack SDK. Supports both Events API (webhook) and Socket Mode (WebSocket). Use Events API mode in production (requires a public HTTPS endpoint); Socket Mode is for dev only. |
| WhatsApp Business Cloud API | Meta-hosted | WhatsApp integration | No official Python SDK from Meta. Use httpx (async HTTP) to call the REST API directly. Webhooks arrive as POST requests to your FastAPI endpoint. py-whatsapp-cloudbot provides lightweight FastAPI helpers but is a thin wrapper; direct httpx is preferred for control. |

Task Queue

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| Celery | 5.6.2 | Background job processing | Use for LLM inference calls, tool execution, webhook delivery, and anything that shouldn't block the request/response cycle. Celery 5.x is stable and production-proven at scale. Dramatiq is simpler and more reliable per-message, but Celery's ecosystem (Flower monitoring, beat scheduler, chord/chain primitives) is more complete for the complex workflows you'll need in v2+. |
| Redis (Celery broker) | 7.x | Celery message broker | Use Redis as both broker and result backend. Redis is already in the stack for other purposes, so no additional service is needed. |

Admin Portal (Next.js)

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| Next.js | 16.x (latest stable) | Portal framework | Note: CLAUDE.md specifies 14+, but Next.js 16 is the current stable release as of March 2026. App Router is mature. Use 16 to avoid building on a version that's already behind. Turbopack is now the default bundler, for faster builds. |
| TypeScript | 5.x | Type safety | Strict mode required (matching CLAUDE.md). |
| Tailwind CSS | 4.x | Styling | shadcn/ui requires Tailwind. v4 moves configuration into CSS (the @theme directive) and exposes design tokens as native CSS variables; the JIT engine has been always-on since v3. |
| shadcn/ui | latest | Component library | Copy-to-project component model means no version lock-in. Components are owned code. The standard choice for Next.js admin portals in 2025-2026. Use the CLI to scaffold. |
| TanStack Query | 5.x | Server state management | Handles fetching, caching, and invalidation for API data. Pairs well with App Router; use it for client-side mutations and real-time data. |
| React Hook Form + Zod | latest | Form validation | Standard pairing for shadcn/ui forms. Zod schemas can be kept in step with the backend (TypeScript definitions generated from Pydantic if needed). |

Authentication

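| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| Auth.js (formerly NextAuth.js) | v5 | Portal authentication | v5 is a major rewrite built for the Next.js App Router. Self-hosted, no per-MAU pricing. Supports credential, OAuth, and magic link flows. Database sessions stored in PostgreSQL via adapter. Use over Clerk for cost control and data sovereignty at scale. |
| FastAPI JWT middleware | custom | Backend API auth | Validate portal-issued tokens in FastAPI middleware. With database sessions Auth.js sets an opaque session cookie, and with the JWT strategy it encrypts the token (JWE) by default, so either override its jwt encode/decode options to emit a signed JWT or have the portal mint a separate signed API token. Use PyJWT or python-jose for token verification. |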
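
The backend check described in the table comes down to three steps: pin the algorithm, compare the signature, and check expiry. The sketch below spells those steps out with only the standard library so they are visible. It assumes the portal issues an HS256-signed token carrying an `exp` claim (an assumption; Auth.js does not do this out of the box), and `verify_hs256` is a hypothetical helper name. In the actual middleware, PyJWT or python-jose would do this work.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def verify_hs256(token: str, secret: bytes, now: float | None = None) -> dict:
    """Return the claims of a valid HS256 token; raise ValueError otherwise."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    try:
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:  # covers base64, UTF-8, and JSON errors
        raise ValueError("malformed token") from exc

    # Pin the algorithm: never let the token choose how it is verified.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")

    signed_part = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signed_part, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")

    exp = claims.get("exp") if isinstance(claims, dict) else None
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired or missing exp")
    return claims
```

A FastAPI dependency could call this helper on the bearer token and attach the resulting claims (for example a tenant identifier, if the portal includes one) to the request state.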

Billing

| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
| stripe | 14.4.1 | Subscription billing | Industry standard. Python SDK handles webhook signature verification, subscription lifecycle events, and checkout sessions. Idempotent webhook handlers are required because Stripe resends on failure. |
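
Because Stripe redelivers events, the handler has to tolerate seeing the same event ID more than once. One way to sketch that is to record each event ID under a uniqueness constraint in the same transaction as the side effects. The example uses `sqlite3` purely as a stand-in for PostgreSQL, and the `processed_events` table and `handle_once` helper are hypothetical names. Signature verification (for example with the SDK's `stripe.Webhook.construct_event`) is assumed to have happened before this point.

```python
from __future__ import annotations

import sqlite3
from typing import Callable


def init_store(conn: sqlite3.Connection) -> None:
    # The primary key is what makes a second delivery detectable.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )


def handle_once(
    conn: sqlite3.Connection, event: dict, apply: Callable[[dict], None]
) -> bool:
    """Run apply(event) at most once per event ID. Returns False for duplicates."""
    try:
        with conn:  # one transaction: commit on success, roll back on any error
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event["id"],)
            )
            apply(event)
    except sqlite3.IntegrityError:
        # The event ID was already recorded, so this is a redelivery.
        return False
    return True
```

If `apply` raises, the insert is rolled back with it, so Stripe's next retry is processed normally. Side effects outside the database (emails, third-party calls) are not covered by the transaction and need their own idempotency keys.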

Development Tools

| Tool | Purpose | Notes |
|---|---|---|
| uv | Python package manager and monorepo workspaces | Replaces pip + virtualenv + pip-tools. uv workspaces support the monorepo structure in CLAUDE.md. Single shared lockfile across packages. Significantly faster than pip. |
| ruff | Linting + formatting | Replaces flake8, isort, and black in one tool. Its own benchmarks put it 10-100x faster than the tools it replaces. Configure in pyproject.toml. Use as both linter and formatter. |
| mypy | Static type checking (strict mode) | Run with the --strict flag. Mandatory per CLAUDE.md. Slower than Pyright, but Pydantic ships a dedicated mypy plugin (pydantic.mypy) that improves model type inference. |
| pytest + pytest-asyncio | Testing | Async test support required for FastAPI endpoints. Use httpx.AsyncClient (with ASGITransport) as the test client, not the sync TestClient. |
| Docker Compose | Local dev orchestration | All services (PostgreSQL, Redis, Ollama) in compose. FastAPI services run with uvicorn --reload outside compose for hot reload. |
| slowapi | FastAPI rate limiting | Rate-limiting middleware built on the limits library, with Redis-backed storage. It offers fixed-window and moving-window strategies rather than a token bucket. Integrates directly with FastAPI. Use for per-tenant and per-channel rate limits. |
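
To make the per-tenant and per-channel limit concrete, here is a minimal fixed-window counter keyed by an arbitrary string such as `"tenant-a:slack"`. It is an in-memory illustration only: `FixedWindowLimiter` is a hypothetical class, not part of slowapi, and in the real stack the counter would live in Redis (an increment plus a TTL per window) behind the rate-limiting middleware.

```python
from __future__ import annotations

import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` hits per key in each `window_seconds` window."""

    def __init__(
        self,
        limit: int,
        window_seconds: int,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._counters: dict[tuple[str, int], int] = {}

    def allow(self, key: str) -> bool:
        # All hits in the same window share one counter.
        window = int(self._clock() // self.window_seconds)
        bucket = (key, window)
        count = self._counters.get(bucket, 0) + 1
        self._counters[bucket] = count
        # Old buckets are never pruned here; a Redis TTL would expire them.
        return count <= self.limit
```

A fixed window is the simplest strategy and can admit up to twice the limit across a window boundary; a moving window trades more bookkeeping for a smoother limit.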

Installation

# Initialize Python monorepo with uv
uv init konstruct
cd konstruct

# Add workspace packages (running uv init inside an existing project registers
# the new package under [tool.uv.workspace] members in the root pyproject.toml)
uv init packages/gateway
uv init packages/router
uv init packages/orchestrator
uv init packages/llm-pool
uv init packages/shared

# Core backend dependencies (per package; add --package <name> to target one member)
# Extras are quoted so shells such as zsh do not expand the brackets
uv add "fastapi[standard]" "pydantic[email]" "sqlalchemy[asyncio]" asyncpg alembic
uv add litellm redis "celery[redis]" pgvector stripe
uv add slack-bolt "python-jose[cryptography]" httpx slowapi

# Dev dependencies
uv add --dev ruff mypy pytest pytest-asyncio pytest-httpx

# Portal (Node.js)
mkdir -p packages/portal && cd packages/portal
npx create-next-app@latest . --typescript --tailwind --eslint --app
npx shadcn@latest init
# Auth.js v5 has shipped under the next-auth "beta" dist-tag; confirm the current tag before installing
npm install @tanstack/react-query react-hook-form zod next-auth@beta

Alternatives Considered

| Recommended | Alternative | When to Use Alternative |
|---|---|---|
| Celery | Dramatiq | Dramatiq is the better choice if you want simpler per-message reliability and don't need complex workflow primitives (chords, chains). Switch to Dramatiq if Celery's configuration complexity becomes a team burden in v2. |
| Auth.js v5 | Clerk | Choose Clerk if you need built-in multi-tenant Organizations, passkeys, or faster time-to-market on auth. Tradeoff: per-MAU pricing and vendor lock-in. |
| pgvector | Qdrant | Migrate to Qdrant when vector count exceeds ~1M or when vector search latency under HNSW becomes a bottleneck. CLAUDE.md already anticipates this upgrade path. |
| Redis | Valkey | Valkey is a BSD-licensed Redis fork and a drop-in replacement. Consider it if Redis licensing (RSALv2/SSPLv1 since 7.4, with AGPLv3 added as an option in 8.0) becomes a concern. |
| LiteLLM SDK | Direct Anthropic/OpenAI SDK | Use direct SDKs only if you're locked to a single provider with no fallback needs. LiteLLM adds negligible overhead while enabling provider portability. |
| Next.js 16 | Remix | Remix is excellent for form-heavy apps. Next.js wins for the admin portal pattern (server components, strong Vercel ecosystem, shadcn/ui first-class support). |
| httpx (WhatsApp) | whatsapp-cloud-api libraries | None of the community Python WhatsApp SDKs have significant maintenance or production adoption. The Cloud API is a simple REST API, so raw httpx with your own models is more maintainable. |

What NOT to Use

| Avoid | Why | Use Instead |
|---|---|---|
| LangGraph or CrewAI (v1) | Both frameworks add significant abstraction overhead for a single-agent-per-tenant model. LangGraph's graph primitives shine for complex multi-agent stateful orchestration (the v2 scenario). In v1, they'd constrain the agent model to their abstractions before requirements are clear. | Custom orchestrator with direct LiteLLM calls. Evaluate LangGraph seriously for v2 multi-agent teams. |
| SQLAlchemy 1.x patterns | The 1.x session.query() style is legacy in 2.0, and the synchronous Session blocks the event loop when called from async endpoints. Mixing sync and async patterns causes subtle bugs in FastAPI async endpoints. | SQLAlchemy 2.0 with AsyncSession and the select() query style exclusively. |
| Socket Mode (Slack) in production | Socket Mode uses a persistent outbound WebSocket. No inbound port is needed, but it ties a worker to a long-lived connection, which complicates horizontal scaling. | Events API with a public webhook endpoint. Use Socket Mode only for local dev (it removes the need for ngrok during testing). |
| psycopg2 | Synchronous PostgreSQL driver. Blocks the event loop in async FastAPI handlers, which kills concurrency. | asyncpg (via SQLAlchemy async engine). |
| Flake8 + Black + isort (separately) | Three tools with overlapping responsibilities, separate configs, and order-of-operation conflicts. CLAUDE.md already specifies ruff. | ruff, which replaces all three with a single configuration block in pyproject.toml. |
| Flask | Flask is synchronous by default. Adding async support is possible but bolted on. For a platform that processes LLM calls and webhooks concurrently, you need async-native from the start. | FastAPI. |
| Next.js 14 specifically | CLAUDE.md says "14+" but Next.js 16 is the current stable release (March 2026). Starting on 14 means immediately being two major versions behind. | Next.js 16 (latest stable). |
| Keycloak (v1) | Correct for enterprise SSO/SAML needs but massively over-engineered for a v1 beta with a small number of tenants. Adds significant operational complexity. | Auth.js v5 with PostgreSQL session storage. Add Keycloak in v2+ if enterprise SSO is a customer requirement. |

Stack Patterns by Variant

For Slack Events API webhook handling:

  • Use slack-bolt in async mode with FastAPI as the ASGI host
  • Pair AsyncApp with AsyncSlackRequestHandler (from slack_bolt.adapter.fastapi.async_handler)
  • Expose a POST /slack/events route in your FastAPI router that delegates to handler.handle(request)

For WhatsApp webhook handling:

  • Expose a GET endpoint for Meta's verification handshake (returns hub.challenge)
  • Expose a POST endpoint for incoming messages
  • Verify X-Hub-Signature-256 header with hmac before processing
  • Parse the nested JSON payload manually — no SDK needed
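
The two verification steps above can be sketched as framework-free functions that a FastAPI route would call. `verify_handshake` and `signature_is_valid` are hypothetical helper names; the sketch assumes the GET handshake carries `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters, and that the signature header has the form `sha256=<hex digest>` computed over the raw request body with the app secret.

```python
from __future__ import annotations

import hashlib
import hmac


def verify_handshake(params: dict[str, str], verify_token: str) -> str | None:
    """Return the challenge to echo back, or None if the request is not ours."""
    token_ok = hmac.compare_digest(
        params.get("hub.verify_token", "").encode(), verify_token.encode()
    )
    if params.get("hub.mode") == "subscribe" and token_ok:
        return params.get("hub.challenge")
    return None


def signature_is_valid(
    raw_body: bytes, header_value: str | None, app_secret: str
) -> bool:
    """Check the X-Hub-Signature-256 header against the raw request body."""
    if not header_value or not header_value.startswith("sha256="):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    supplied = header_value.removeprefix("sha256=")
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), supplied.encode())
```

The signature must be computed over the bytes exactly as received, so the route should read the raw body before any JSON parsing.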

For tenant context in SQLAlchemy + RLS:

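  • Set the app.tenant_id setting at the start of each transaction, before any query runs. Use SET LOCAL or set_config('app.tenant_id', ..., true) so the value is scoped to the transaction and cannot leak to the next request through the connection pool
  • Use a SQLAlchemy event listener or middleware injection. The engine "connect" event fires only once per physical connection, so it cannot carry a per-request tenant; a Session "after_begin" listener runs once per transaction. For an async engine, attach engine-level listeners to engine.sync_engine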
  • The sqlalchemy-tenants library provides a clean abstraction if hand-rolling this becomes repetitive
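
A minimal sketch of the tenant-context pattern, written against a duck-typed connection so it runs without a database. The names `current_tenant`, `tenant_scope`, and `apply_tenant` are hypothetical, and the `Executes` protocol stands in for whatever connection object the listener receives (with SQLAlchemy, the statement would be wrapped in `text()`). The sketch assumes the RLS policies read the value back with `current_setting('app.tenant_id')`.

```python
from __future__ import annotations

from contextlib import contextmanager
from contextvars import ContextVar
from typing import Any, Iterator, Protocol

# Holds the tenant for the current request or task; unset by default.
current_tenant: ContextVar[str | None] = ContextVar("current_tenant", default=None)

# The third argument (true) scopes the setting to the current transaction.
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', :tenant_id, true)"


class Executes(Protocol):
    def execute(self, statement: str, params: dict[str, Any]) -> Any: ...


@contextmanager
def tenant_scope(tenant_id: str) -> Iterator[None]:
    """Bind a tenant for the duration of one request or task."""
    token = current_tenant.set(tenant_id)
    try:
        yield
    finally:
        current_tenant.reset(token)


def apply_tenant(conn: Executes) -> None:
    """Call at the start of every transaction, before the first query."""
    tenant_id = current_tenant.get()
    if tenant_id is None:
        # Fail closed: an unscoped query should never reach the database.
        raise RuntimeError("no tenant in context; refusing to run an unscoped query")
    conn.execute(SET_TENANT_SQL, {"tenant_id": tenant_id})
```

Passing the tenant as a bound parameter rather than formatting it into the SQL string keeps the setting safe from injection, and failing closed when no tenant is bound means a missed middleware shows up as an error instead of a cross-tenant read.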

For LLM call patterns:

  • All LLM calls go through LiteLLM proxy — never call provider APIs directly
  • LiteLLM handles retries, fallback, and cost tracking
  • Dispatch via Celery task so the HTTP response returns immediately
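  • Stream tokens back to the user via WebSocket or Server-Sent Events for a real-time feel. Because the tokens are produced in the worker, relay them to the web process first (for example over the Redis pub/sub already in the stack)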
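
For the Server-Sent Events option, the wire format is simple enough to sketch directly: each event is one or more `field: value` lines followed by a blank line. The helper below is a hypothetical `sse_frames` generator; the `token` payload key and the closing `done` event are illustrative choices, not part of any LiteLLM or FastAPI API.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator


def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Turn a stream of LLM tokens into Server-Sent Events frames."""
    for token in tokens:
        # JSON-encode the token so embedded newlines cannot break the framing.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    # A named event tells the client the stream finished cleanly.
    yield "event: done\ndata: {}\n\n"
```

In FastAPI, a generator like this can be returned through `StreamingResponse` with `media_type="text/event-stream"`.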

For Celery + async FastAPI coexistence:

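  • Celery workers run task functions synchronously, so wrap async code with asyncio.run() inside the task
  • Alternatively, use the gevent pool (celery[gevent]) for cooperative multitasking in workers; it helps with synchronous, I/O-bound code but does not run asyncio coroutines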
  • Do not share the SQLAlchemy AsyncEngine between the FastAPI app and Celery workers — create separate engines per process
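
The asyncio.run() bridge can be sketched without Celery itself. Here `call_llm` is a hypothetical stand-in for an awaitable LLM call, and `run_llm_task` is the plain function that would carry the Celery task decorator in a real worker.

```python
import asyncio


async def call_llm(prompt: str) -> str:
    # Stand-in for an awaitable LLM call; yields to the event loop once.
    await asyncio.sleep(0)
    return f"echo: {prompt}"


# In a real worker this function would be registered as a Celery task.
def run_llm_task(prompt: str) -> str:
    # asyncio.run() creates a fresh event loop, runs the coroutine to
    # completion, and closes the loop before returning the result.
    return asyncio.run(call_llm(prompt))
```

Because every call gets a new event loop, async resources bound to a loop (such as pooled asyncpg connections) cannot be carried from one task to the next. One option, stated here as an assumption rather than a tested recommendation, is to create the worker's engine with SQLAlchemy's NullPool or to create and dispose of it inside the coroutine.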

Version Compatibility

| Package | Compatible With | Notes |
|---|---|---|
| FastAPI 0.135.x | Pydantic 2.x | FastAPI has supported Pydantic v2 since 0.100. Build on v2 only; do not rely on v1 compatibility. |
| SQLAlchemy 2.0.x | asyncpg 0.31.x | Both support PostgreSQL 16. Use asyncpg through the postgresql+asyncpg:// dialect. |
| Alembic 1.18.x | SQLAlchemy 2.0.x | Compatible. Modify env.py to use the run_async_migrations() pattern for the async engine. |
| Celery 5.6.x | Redis 7.x | Celery 5.x talks to the broker over the Redis protocol, so it is compatible with Redis 6+ and Valkey. |
| slack-bolt 1.27.x | Python 3.12 | Fully supported. |
| LiteLLM 1.82.x | Python 3.12 | Fully supported. |
| Next.js 16.x | Auth.js v5 | Auth.js v5 was rewritten specifically for Next.js App Router compatibility. |
| pgvector 0.4.2 (Python) | pgvector 0.8.2 (PostgreSQL extension) | Python client 0.4.x works with extension 0.7.x+. HNSW indexes require extension 0.5.0+. |

Sources

  • PyPI (verified March 2026): FastAPI 0.135.1, SQLAlchemy 2.0.48, Pydantic 2.12.5, Alembic 1.18.4, asyncpg 0.31.0, Celery 5.6.2, Dramatiq 2.1.0, stripe 14.4.1, pgvector 0.4.2, LiteLLM 1.82.5, slack-bolt 1.27.0
  • FastAPI official docs — async patterns, dependency injection
  • LiteLLM docs — provider support, routing configuration
  • pgvector GitHub — HNSW indexing, production readiness
  • uv workspace docs — monorepo setup
  • Slack Bolt Python GitHub — Events API vs Socket Mode
  • Auth.js docs — v5 App Router compatibility (MEDIUM confidence — not directly fetched)
  • sqlalchemy-tenants — RLS + SQLAlchemy integration pattern
  • Next.js 16 confirmed as latest stable via npm registry search (March 2026)
  • LangGraph 1.0 GA confirmed via community sources (MEDIUM confidence); the recommendation to avoid agent frameworks for v1 is HIGH confidence regardless

Stack research for: Konstruct — channel-native AI workforce platform Researched: 2026-03-22