Adolfo Delorenzo 376982f16f docs: complete project research

2026-03-22 00:12:58 -06:00

Stack Research

Domain: Channel-native AI workforce platform (multi-tenant SaaS)
Researched: 2026-03-22
Confidence: HIGH (all versions verified against PyPI and official sources)

Recommended Stack

Core Backend Technologies

Technology	Version	Purpose	Why Recommended
Python	3.12+	Runtime	Specified in CLAUDE.md. Mature async ecosystem, best ML/AI library support. 3.12 is the stable sweet spot: 3.13 is out, but ecosystem support lags.
FastAPI	0.135.1	API framework	Async-native, automatic OpenAPI docs, built-in dependency injection, excellent for multi-service microservices. The de facto choice for async Python APIs.
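Pydantic v2	2.12.5	Data validation	Used by FastAPI for request and response models. v2 is much faster than v1 thanks to its Rust validation core (the Pydantic project reports roughly 5-50x, depending on the model). Strict mode enforces type safety at runtime boundaries. Use for all internal message models.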
SQLAlchemy	2.0.48	ORM / query builder	2.0 is a major API overhaul with first-class async support (first introduced in 1.4). Use `AsyncSession` + `create_async_engine`. The 1.x `Query` API is legacy in 2.0: do not use it in new code.
Alembic	1.18.4	Database migrations	Standard companion to SQLAlchemy. Requires `env.py` modification for async engine (synchronous migration runner wraps async calls).
asyncpg	0.31.0	PostgreSQL async driver	Required for SQLAlchemy async support with PostgreSQL. Significantly faster than psycopg2 for high-concurrency workloads.
PostgreSQL	16	Primary database	Specified in CLAUDE.md. RLS (Row Level Security) is the v1 multi-tenancy mechanism. pgvector extension adds vector search without a separate service.
Redis	7.x	Cache, pub/sub, rate limiting	Session state, per-tenant rate limit counters, pub/sub for real-time event routing. Consider Valkey as a drop-in replacement if Redis license changes concern you.

LLM Integration

Technology	Version	Purpose	Why Recommended
LiteLLM	1.82.5	LLM gateway / router	Unified API across 100+ providers (Anthropic, OpenAI, Ollama, vLLM). Built-in load balancing, cost tracking, fallback routing, and virtual keys. Routes to Ollama locally and commercial APIs without code changes. Now at GA maturity with production users at scale.
Ollama	latest	Local LLM inference	Dev environment local inference. Serves models via OpenAI-compatible API on port 11434 — LiteLLM proxies to it transparently.
pgvector	0.4.2 (Python client)	Vector search / agent memory	Co-located with PostgreSQL, so no separate vector DB service is needed for v1. Supports HNSW indexing (added in extension 0.5.0) for sub-10ms queries at <1M vectors. Extension version 0.8.2 is production-ready and included on all major hosted PostgreSQL services.

Messaging Channel SDKs

Technology	Version	Purpose	Why Recommended
slack-bolt	1.27.0	Slack integration	Official Slack SDK. Supports both Events API (webhook) and Socket Mode (WebSocket). Use Events API mode in production (requires public HTTPS endpoint) — Socket Mode is for dev only.
WhatsApp Business Cloud API	Meta-hosted	WhatsApp integration	No official Python SDK from Meta. Use `httpx` (async HTTP) to call the REST API directly. Webhooks arrive as POST to your FastAPI endpoint. `py-whatsapp-cloudbot` provides lightweight FastAPI helpers but is a thin wrapper — direct httpx is preferred for control.

Task Queue

Technology	Version	Purpose	Why Recommended
Celery	5.6.2	Background job processing	Use for LLM inference calls, tool execution, webhook delivery, and anything that shouldn't block the request/response cycle. Celery 5.x is stable and production-proven at scale. Dramatiq is simpler and more reliable per-message, but Celery's ecosystem (Flower monitoring, beat scheduler, chord/chain primitives) is more complete for complex workflows you'll need in v2+.
Redis (Celery broker)	7.x	Celery message broker	Use Redis as both broker and result backend. Redis is already in the stack for other purposes — no additional service needed.

Admin Portal (Next.js)

Technology	Version	Purpose	Why Recommended
Next.js	16.x (latest stable)	Portal framework	Note: CLAUDE.md specifies 14+, but Next.js 16 is the current stable release as of March 2026. App Router is mature. Use 16 to avoid building on a version that's already behind. Turbopack is now default for faster builds.
TypeScript	5.x	Type safety	Strict mode required (matching CLAUDE.md).
Tailwind CSS	4.x	Styling	shadcn/ui requires Tailwind. v4 moves configuration into CSS (CSS-first config) and exposes theme values as native CSS variables. The JIT engine has been always-on since v3.
shadcn/ui	latest	Component library	Copy-to-project component model means no version lock-in. Components are owned code. The standard choice for Next.js admin portals in 2025-2026. Use the CLI to scaffold.
TanStack Query	5.x	Server state management	Handles fetching, caching, and invalidation for API data. Pairs well with App Router — use for client-side mutations and real-time data.
React Hook Form + Zod	latest	Form validation	Standard pairing for shadcn/ui forms. Zod schemas can be shared with backend (TypeScript definitions generated from Pydantic if needed).

Authentication

Technology	Version	Purpose	Why Recommended
Auth.js (formerly NextAuth.js)	v5	Portal authentication	v5 is a complete rewrite compatible with Next.js App Router. Self-hosted, no per-MAU pricing. Supports credential, OAuth, and magic link flows. Database sessions stored in PostgreSQL via adapter. Use over Clerk for cost control and data sovereignty at scale.
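FastAPI JWT middleware	custom	Backend API auth	Validate tokens issued by the portal in FastAPI middleware. Use `python-jose` or `PyJWT` for token verification. Auth.js session JWTs are encrypted (JWE) by default, and database sessions produce no JWT at all, so the portal likely needs to mint a separately signed token for backend calls.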

Billing

Technology	Version	Purpose	Why Recommended
stripe	14.4.1	Subscription billing	Industry standard. Python SDK handles webhook signature verification, subscription lifecycle events, and checkout sessions. Idempotent webhook handlers are required — Stripe resends on failure.
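
The idempotency requirement in the Billing row can be sketched as follows. This is a minimal, standard-library illustration rather than the project's handler: the `IdempotentDispatcher` class and its in-memory set are hypothetical, and it assumes the event has already passed signature verification (for example via `stripe.Webhook.construct_event`) and been converted to a plain dict with the `id` and `type` fields Stripe events carry.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class IdempotentDispatcher:
    """Run each webhook event at most once, keyed by its event ID (hypothetical helper)."""

    def __init__(self, handlers: dict[str, Handler]) -> None:
        self._handlers = handlers
        # In-memory stand-in (assumption). Across several workers this record
        # would need shared storage, e.g. a unique-constrained PostgreSQL table.
        self._seen: set[str] = set()

    def dispatch(self, event: dict[str, Any]) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        event_id = event["id"]
        if event_id in self._seen:
            return False
        handler = self._handlers.get(event["type"])
        if handler is not None:
            handler(event)
        # Record the ID only after the handler returns, so a handler that
        # raises leaves the event eligible for Stripe's next retry.
        self._seen.add(event_id)
        return True
```

A duplicate delivery then returns `False` without re-running the handler, and the endpoint can still answer with a 2xx so Stripe stops retrying.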

Development Tools

Tool	Purpose	Notes
uv	Python package manager and monorepo workspaces	Replaces pip + virtualenv + pip-tools. uv workspaces (declared under `[tool.uv.workspace]` in the root `pyproject.toml`) support the monorepo structure in CLAUDE.md. Single shared lockfile across packages. Significantly faster than pip.
ruff	Linting + formatting	Replaces flake8, isort, and black in one tool. The Ruff project reports 10-100x speedups over the tools it replaces. Configure in `pyproject.toml`. Use as both linter and formatter.
mypy	Static type checking (strict mode)	Run with `--strict` flag. Mandatory per CLAUDE.md. Slower than Pyright but more accurate for SQLAlchemy and Pydantic type inference.
pytest + pytest-asyncio	Testing	Async test support required for FastAPI endpoints. Use `httpx.AsyncClient` as the test client (not the sync TestClient).
Docker Compose	Local dev orchestration	All services (PostgreSQL, Redis, Ollama) in compose. FastAPI services run with `uvicorn --reload` outside compose for hot reload.
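slowapi	FastAPI rate limiting	Rate limiting for FastAPI and Starlette, built on the `limits` library with optional Redis storage. Uses fixed-window or moving-window strategies rather than a token bucket. Use for per-tenant and per-channel rate limits, with a custom key function that returns the tenant and channel.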
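
To make the per-tenant and per-channel rate limits concrete, here is a minimal fixed-window counter using only the standard library. It is a sketch of the counting logic, not slowapi's implementation: the `FixedWindowLimiter` class is hypothetical, and the in-memory dict stands in for Redis, where the same key would typically be incremented with `INCR` and given an expiry.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` calls per (tenant, channel) in each time window."""

    def __init__(
        self,
        limit: int,
        window_seconds: int,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        # Key: (tenant_id, channel, window index). Old windows are never
        # evicted here; Redis key expiry would handle that in practice.
        self._counts: dict[tuple[str, str, int], int] = {}

    def allow(self, tenant_id: str, channel: str) -> bool:
        window_index = int(self._clock() // self.window)
        key = (tenant_id, channel, window_index)
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False
        self._counts[key] = count + 1
        return True
```

Keying on both tenant and channel means a noisy Slack workspace cannot exhaust the same tenant's WhatsApp budget, or another tenant's.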

Installation

# Initialize Python monorepo with uv
uv init konstruct
cd konstruct

# Add workspace packages (running uv init inside an existing project registers
# the new package as a workspace member in the root pyproject.toml)
uv init packages/gateway
uv init packages/router
uv init packages/orchestrator
uv init packages/llm-pool
uv init packages/shared

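# Core backend dependencies (run inside each package directory, or pass --package <name>)
# Extras are quoted so shells such as zsh do not treat the brackets as globs
uv add "fastapi[standard]" "pydantic[email]" "sqlalchemy[asyncio]" asyncpg alembic
uv add litellm redis "celery[redis]" pgvector stripe
uv add slack-bolt "python-jose[cryptography]" httpx slowapi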

# Dev dependencies
uv add --dev ruff mypy pytest pytest-asyncio pytest-httpx

# Portal (Node.js)
cd packages/portal
npx create-next-app@latest . --typescript --tailwind --eslint --app
npx shadcn@latest init
npm install @tanstack/react-query react-hook-form zod next-auth

Alternatives Considered

Recommended	Alternative	When to Use Alternative
Celery	Dramatiq	Dramatiq is the better choice if you want simpler per-message reliability and don't need complex workflow primitives (chords, chains). Switch to Dramatiq if Celery's configuration complexity becomes a team burden in v2.
Auth.js v5	Clerk	Choose Clerk if you need built-in multi-tenant Organizations, passkeys, or faster time-to-market on auth. Tradeoff: per-MAU pricing and vendor lock-in.
pgvector	Qdrant	Migrate to Qdrant when vector count exceeds ~1M or when vector search latency under HNSW becomes a bottleneck. The CLAUDE.md already anticipates this upgrade path.
Redis	Valkey	Valkey is a BSD-licensed fork of Redis. Drop-in replacement. Consider if Redis licensing (RSALv2/SSPLv1 since 7.4, with AGPLv3 added as an option in 8.0) becomes a concern.
LiteLLM SDK	Direct Anthropic/OpenAI SDK	Use direct SDKs only if you're locked to a single provider with no fallback needs. LiteLLM adds negligible overhead while enabling provider portability.
Next.js 16	Remix	Remix is excellent for form-heavy apps. Next.js wins for the admin portal pattern (server components, strong Vercel ecosystem, shadcn/ui first-class support).
httpx (WhatsApp)	whatsapp-cloud-api libraries	None of the community Python WhatsApp SDKs have significant maintenance or production adoption. The Cloud API is a simple REST API — raw httpx with your own models is more maintainable.

What NOT to Use

Avoid	Why	Use Instead
LangGraph or CrewAI (v1)	Both frameworks add significant abstraction overhead for a single-agent-per-tenant model. LangGraph's graph primitives shine for complex multi-agent stateful orchestration (v2 scenario). In v1, they'd constrain the agent model to their abstractions before requirements are clear.	Custom orchestrator with direct LiteLLM calls. Evaluate LangGraph seriously for v2 multi-agent teams.
SQLAlchemy 1.x patterns	The 1.x `session.query()` style is legacy in 2.0, and the synchronous `Session` blocks the event loop when used inside async endpoints. Mixing sync and async patterns causes subtle bugs in FastAPI async endpoints.	SQLAlchemy 2.0 with `AsyncSession` and `select()` query style exclusively.
Socket Mode (Slack) in production	Socket Mode uses a persistent outbound WebSocket. No inbound port is needed, but each worker is tied to a long-lived connection and Slack caps the number of concurrent connections per app, which complicates horizontal scaling.	Events API with a public webhook endpoint. Use Socket Mode only for local dev (bypasses ngrok need during testing).
psycopg2	Synchronous PostgreSQL driver. Blocks the event loop in async FastAPI handlers — kills concurrency.	asyncpg (via SQLAlchemy async engine).
Flake8 + Black + isort (separately)	Three tools with overlapping responsibilities, separate configs, and order-of-operation conflicts. The CLAUDE.md already specifies ruff.	ruff, which replaces all three with a single configuration block in pyproject.toml.
Flask	Flask is synchronous by default. Adding async support is possible but bolted on. For a platform that processes LLM calls and webhooks concurrently, you need async-native from the start.	FastAPI.
Next.js 14 specifically	CLAUDE.md says "14+" but Next.js 16 is the current stable release (March 2026). Starting on 14 means immediately being two major versions behind.	Next.js 16 (latest stable).
Keycloak (v1)	Correct for enterprise SSO/SAML needs but massively over-engineered for a v1 beta with a small number of tenants. Adds significant operational complexity.	Auth.js v5 with PostgreSQL session storage. Add Keycloak in v2+ if enterprise SSO is a customer requirement.

Stack Patterns by Variant

For Slack Events API webhook handling:

Use slack-bolt in async mode with FastAPI as the ASGI host
AsyncApp + AsyncSlackRequestHandler (from slack_bolt.adapter.fastapi.async_handler)
Add a POST /slack/events route to your FastAPI router that delegates to the handler's handle() method

For WhatsApp webhook handling:

Expose a GET endpoint for Meta's verification handshake (returns hub.challenge)
Expose a POST endpoint for incoming messages
Verify X-Hub-Signature-256 header with hmac before processing
Parse the nested JSON payload manually — no SDK needed
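
The handshake and signature steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the function names are hypothetical, `verify_token` is whatever string was entered in the Meta app dashboard, and `app_secret` is the Meta app secret. The signature must be computed over the raw request bytes, before any JSON parsing.

```python
import hashlib
import hmac
from typing import Mapping, Optional


def verify_subscription(params: Mapping[str, str], verify_token: str) -> Optional[str]:
    """Handle the GET handshake: return hub.challenge if the token matches, else None."""
    supplied = params.get("hub.verify_token", "")
    if params.get("hub.mode") == "subscribe" and hmac.compare_digest(
        supplied.encode(), verify_token.encode()
    ):
        return params.get("hub.challenge")
    return None


def is_valid_signature(app_secret: str, raw_body: bytes, header: Optional[str]) -> bool:
    """Check X-Hub-Signature-256 against an HMAC-SHA256 of the raw request body."""
    prefix = "sha256="
    if not header or not header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header[len(prefix):].encode())
```

In a FastAPI endpoint, the GET route would echo the returned challenge as plain text, and the POST route would reject the request (for example with a 403) when `is_valid_signature` returns `False`.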

For tenant context in SQLAlchemy + RLS:

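Set the app.tenant_id setting at the start of each transaction, before any query runs (SET LOCAL, or set_config with is_local = true)
Do this from a session-level hook (for example an after_begin event listener) or from request middleware; an engine "connect" listener fires only once per pooled connection, so it cannot carry a per-request tenant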
The sqlalchemy-tenants library provides a clean abstraction if hand-rolling this becomes repetitive
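
A minimal sketch of the transaction-scoped tenant setting, written against an asyncpg-style connection (one that exposes `transaction()` and `execute(query, *args)`). The `tenant_transaction` helper, the `messages` table, and the policy in the comment are hypothetical. With SQLAlchemy, the same `set_config` statement would be issued through the session inside the request's transaction.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator

# Hypothetical RLS policy this pairs with:
#   CREATE POLICY tenant_isolation ON messages
#     USING (tenant_id = current_setting('app.tenant_id')::uuid);

SET_TENANT_SQL = "SELECT set_config('app.tenant_id', $1, true)"


@asynccontextmanager
async def tenant_transaction(conn: Any, tenant_id: str) -> AsyncIterator[Any]:
    """Open a transaction in which RLS policies see app.tenant_id = tenant_id."""
    async with conn.transaction():
        # The third argument (is_local = true) scopes the setting to this
        # transaction, so a pooled connection cannot carry one tenant's ID
        # into the next request that checks it out.
        await conn.execute(SET_TENANT_SQL, str(tenant_id))
        yield conn
```

Passing the tenant ID as a bind parameter to `set_config` avoids interpolating it into a `SET LOCAL` string, which does not accept parameters.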

For LLM call patterns:

All LLM calls go through LiteLLM proxy — never call provider APIs directly
LiteLLM handles retries, fallback, and cost tracking
Dispatch via Celery task so the HTTP response returns immediately
Stream tokens back to the user via WebSocket or Server-Sent Events for real-time feel
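
For the streaming step, the Server-Sent Events framing itself is small enough to show with the standard library. This is a sketch, not the project's transport: the `sse_events` name, the `{"token": ...}` payload shape, and the `done` event are assumptions. Where the tokens come from (for example a Redis pub/sub channel fed by the Celery task) is left out.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame each token as one Server-Sent Events message."""
    for token in tokens:
        # JSON-encode the token so an embedded newline cannot split the frame.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to close.
    yield "event: done\ndata: {}\n\n"
```

In FastAPI, a generator like this can be passed to `StreamingResponse` with `media_type="text/event-stream"`.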

For Celery + async FastAPI coexistence:

Celery workers are synchronous processes — wrap async code with asyncio.run() inside task functions
Alternatively, use celery[gevent] for cooperative multitasking in workers
Do not share the SQLAlchemy AsyncEngine between the FastAPI app and Celery workers — create separate engines per process
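
The `asyncio.run()` wrapping described above can be sketched without Celery installed. Both function names are hypothetical; `process_message` is the synchronous function that a Celery task decorator would wrap, and `handle_message` stands in for the real async pipeline.

```python
import asyncio
from typing import Any


async def handle_message(payload: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical async pipeline step; a real one would await LLM and DB calls."""
    await asyncio.sleep(0)  # stand-in for awaited I/O
    return {"tenant_id": payload["tenant_id"], "reply": payload["text"].upper()}


def process_message(payload: dict[str, Any]) -> dict[str, Any]:
    """Synchronous entry point for the worker.

    asyncio.run() creates a fresh event loop for each call and closes it
    afterwards, so loop-bound resources such as an async engine must be
    created inside the worker process, not imported from the FastAPI app.
    """
    return asyncio.run(handle_message(payload))
```

Because every call gets a new event loop, this pattern fits tasks that run one coroutine to completion. It does not keep connections warm between tasks.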

Version Compatibility

Package	Compatible With	Notes
FastAPI 0.135.x	Pydantic 2.x	FastAPI added Pydantic v2 support in 0.100.0. Use v2 models exclusively and do not mix in v1 models.
SQLAlchemy 2.0.x	asyncpg 0.31.x	Both support PostgreSQL 16. Use `asyncpg` as the dialect driver.
Alembic 1.18.x	SQLAlchemy 2.0.x	Compatible. Modify `env.py` to use `run_async_migrations()` pattern for async engine.
Celery 5.6.x	Redis 7.x	Celery 5.x uses Redis protocol — compatible with Redis 6+ and Valkey.
slack-bolt 1.27.x	Python 3.12	Fully supported.
LiteLLM 1.82.x	Python 3.12	Fully supported.
Next.js 16.x	Auth.js v5	Auth.js v5 was rewritten specifically for Next.js App Router compatibility.
pgvector 0.4.2 (Python)	pgvector 0.8.2 (PostgreSQL extension)	Python client 0.4.x works with extension 0.7.x+. HNSW index requires extension 0.5.0+.

Sources

PyPI (verified March 2026): FastAPI 0.135.1, SQLAlchemy 2.0.48, Pydantic 2.12.5, Alembic 1.18.4, asyncpg 0.31.0, Celery 5.6.2, Dramatiq 2.1.0, stripe 14.4.1, pgvector 0.4.2, LiteLLM 1.82.5, slack-bolt 1.27.0
FastAPI official docs — async patterns, dependency injection
LiteLLM docs — provider support, routing configuration
pgvector GitHub — HNSW indexing, production readiness
uv workspace docs — monorepo setup
Slack Bolt Python GitHub — Events API vs Socket Mode
Auth.js docs — v5 App Router compatibility (MEDIUM confidence — not directly fetched)
sqlalchemy-tenants — RLS + SQLAlchemy integration pattern
Next.js 16 confirmed as latest stable via npm registry search (March 2026)
LangGraph 1.0 GA confirmed via community sources (MEDIUM confidence — agent framework recommendation is HIGH confidence to avoid it for v1)

Stack research for: Konstruct — channel-native AI workforce platform Researched: 2026-03-22
