

CLAUDE.md — Konstruct

What is Konstruct?

Konstruct is an AI workforce platform where clients subscribe to AI employees, teams, or entire AI-run companies. AI workers communicate through familiar channels — Slack, Microsoft Teams, Mattermost, Rocket.Chat, WhatsApp, Telegram, and Signal — so adoption requires zero behavior change from the customer.

Think of it as "Hire an AI department" — not another chatbot SaaS.


Project Identity

  • Codename: Konstruct
  • Domain: TBD (check konstruct.ai, konstruct.io, konstruct.dev)
  • Tagline ideas: "Build your AI workforce" / "AI teams that just work"
  • Inspired by: paperclip.ing
  • Differentiation: Channel-native AI workers (not a dashboard), tiered multi-tenancy, BYO-model support

Architecture Overview

Core Mental Model

Client (Slack/Teams/etc.)
        │
        ▼
┌─────────────────────┐
│   Channel Gateway    │  ← Unified ingress for all messaging platforms
│   (webhook/WS)      │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│   Message Router     │  ← Tenant resolution, rate limiting, context loading
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│   Agent Orchestrator │  ← Agent selection, tool dispatch, memory, handoffs
│   (per-tenant)       │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│   LLM Backend Pool   │  ← LiteLLM router → Ollama / vLLM / OpenAI / Anthropic / BYO
└─────────────────────┘

Key Architectural Principles

  1. Channel-agnostic core — Business logic never depends on which messaging platform the message came from. The Channel Gateway normalizes everything into a unified internal message format.
  2. Tenant-isolated agent state — Each tenant's agents have isolated memory, tools, and configuration. No cross-tenant data leakage, ever.
  3. LLM backend as a pluggable resource — Clients can use platform-provided models, bring their own API keys, or point to their own self-hosted inference endpoints.
  4. Agents are composable — A single AI employee is an agent. A team is an orchestrated group of agents. A company is a hierarchy of teams with shared context and delegation.

Tech Stack

Backend (Primary: Python)

| Layer | Technology | Rationale |
|---|---|---|
| API Framework | FastAPI | Async-native, OpenAPI docs, dependency injection |
| Task Queue | Celery + Redis or Dramatiq | Background jobs: LLM calls, tool execution, webhooks |
| Database | PostgreSQL 16 | Primary data store, tenant isolation via schemas or RLS |
| Cache / Pub-Sub | Redis / Valkey | Session state, rate limiting, pub/sub for real-time events |
| Vector Store | pgvector (start) → Qdrant (scale) | Agent memory, RAG, conversation search |
| Object Storage | MinIO (self-hosted) / S3 (cloud burst) | File attachments, documents, agent artifacts |
| LLM Gateway | LiteLLM | Unified API across all LLM providers, cost tracking, fallback routing |
| Agent Framework | Custom (evaluate LangGraph, CrewAI, or raw) | Agent orchestration, tool use, multi-agent handoffs |

Messaging Channel SDKs

| Channel | Library / Integration |
|---|---|
| Slack | slack-bolt (Events API + Socket Mode) |
| Microsoft Teams | botbuilder-python (Bot Framework SDK) |
| Mattermost | mattermostdriver + webhooks |
| Rocket.Chat | REST API + Realtime API (WebSocket) |
| WhatsApp | WhatsApp Business API (Cloud API) |
| Telegram | python-telegram-bot (Bot API) |
| Signal | signal-cli or signald (bridge) |

Frontend (Admin Dashboard / Client Portal)

| Layer | Technology |
|---|---|
| Framework | Next.js 14+ (App Router) |
| UI | Tailwind CSS + shadcn/ui |
| State | TanStack Query |
| Auth | NextAuth.js → consider Keycloak for enterprise |

Infrastructure

| Layer | Technology |
|---|---|
| Dev Orchestration | Docker Compose + Portainer |
| Prod Orchestration | Kubernetes (k3s or Talos Linux) |
| Core Hosting | Hetzner Dedicated Servers |
| Cloud Burst | AWS / GCP (auto-scale inference, overflow) |
| Reverse Proxy | NPM Plus (dev) / Traefik (prod K8s ingress) |
| DNS | Technitium (internal) / Cloudflare (external) |
| VPN Mesh | Headscale (self-hosted) + Tailscale clients |
| CI/CD | Gitea Actions / GitHub Actions (if public) |
| Monitoring | Prometheus + Grafana + Loki |
| Security | Wazuh (SIEM), Trivy (container scanning) |

Repo Structure

Monorepo to start, split later when service boundaries stabilize.

konstruct/
├── CLAUDE.md                    # This file
├── docker-compose.yml           # Local dev environment
├── docker-compose.prod.yml      # Production-like local stack
├── k8s/                         # Kubernetes manifests / Helm charts
│   ├── base/
│   └── overlays/
│       ├── staging/
│       └── production/
├── packages/
│   ├── gateway/                 # Channel Gateway service
│   │   ├── channels/            # Per-channel adapters (slack, teams, etc.)
│   │   ├── normalize.py         # Unified message format
│   │   └── main.py
│   ├── router/                  # Message Router service
│   │   ├── tenant.py            # Tenant resolution
│   │   ├── ratelimit.py
│   │   └── main.py
│   ├── orchestrator/            # Agent Orchestrator service
│   │   ├── agents/              # Agent definitions and behaviors
│   │   ├── teams/               # Multi-agent team logic
│   │   ├── tools/               # Tool registry and execution
│   │   ├── memory/              # Conversation and long-term memory
│   │   └── main.py
│   ├── llm-pool/                # LLM Backend Pool service
│   │   ├── providers/           # Provider configs (litellm router)
│   │   ├── byo/                 # BYO key / endpoint management
│   │   └── main.py
│   ├── portal/                  # Next.js admin dashboard
│   │   ├── app/
│   │   ├── components/
│   │   └── lib/
│   └── shared/                  # Shared Python libs
│       ├── models/              # Pydantic models, DB schemas
│       ├── auth/                # Auth utilities
│       ├── messaging/           # Internal message format
│       └── config/              # Shared config / env management
├── migrations/                  # Alembic DB migrations
├── scripts/                     # Dev scripts, seed data, utilities
├── tests/
│   ├── unit/
│   ├── integration/
│   └── e2e/
├── docs/                        # Architecture docs, ADRs, runbooks
├── pyproject.toml               # Python monorepo config (uv / hatch)
└── .env.example

Multi-Tenancy Model

Tiered isolation — the level increases with the subscription plan:

| Tier | Isolation | Target |
|---|---|---|
| Starter | Shared infra, PostgreSQL RLS, logical separation | Solo founders, micro-businesses |
| Team | Dedicated DB schema, isolated Redis namespace, dedicated agent processes | SMBs, small teams |
| Enterprise | Dedicated namespace (K8s), dedicated DB, optional dedicated LLM inference | Larger orgs, compliance needs |
| Self-Hosted | Customer deploys their own Konstruct instance (Helm chart / Docker Compose) | On-prem requirements, data sovereignty |

Tenant Resolution Flow

  1. Inbound message hits Channel Gateway
  2. Gateway extracts workspace/org identifier from the channel metadata (Slack workspace ID, Teams tenant ID, etc.)
  3. Router maps channel org → Konstruct tenant via lookup table
  4. All subsequent processing scoped to that tenant's context, models, tools, and memory
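Step 3 boils down to a lookup from (channel, org identifier) to tenant ID. A minimal sketch, where an in-memory table stands in for the real PostgreSQL lookup; all names here are hypothetical:

```python
# Hypothetical sketch of step 3: map a channel-native org ID to a Konstruct
# tenant. Real code would query a PostgreSQL lookup table; the dict and
# function names here are illustrative only.

TENANT_LOOKUP: dict[tuple[str, str], str] = {
    ("slack", "T0123ABCD"): "tenant-acme",   # keyed by Slack workspace ID
    ("teams", "f8cdef31-a31e"): "tenant-acme",  # keyed by Teams tenant ID
}


class UnknownTenantError(Exception):
    """Raised when a channel org has no Konstruct tenant mapping."""


def resolve_tenant(channel: str, org_id: str) -> str:
    """Return the Konstruct tenant ID for a channel-native org identifier."""
    try:
        return TENANT_LOOKUP[(channel, org_id)]
    except KeyError:
        raise UnknownTenantError(f"no tenant mapped for {channel}:{org_id}") from None
```

Unmapped orgs fail loudly rather than falling through to a default tenant, which keeps the "no cross-tenant leakage" principle enforceable at the router.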

AI Employee Model

Hierarchy

Company (AI-run)
  └── Team
       └── Employee (Agent)
            ├── Role definition (system prompt + persona)
            ├── Skills (tool bindings)
            ├── Memory (vector store + conversation history)
            ├── Channels (which messaging platforms it's active on)
            └── Escalation rules (when to hand off to human or another agent)

Employee Configuration (example)

employee:
  name: "Mara"
  role: "Customer Support Lead"
  persona: |
    Professional, empathetic, solution-oriented.
    Fluent in English, Spanish, Portuguese.
    Escalates billing disputes to human after 2 failed resolutions.
  model:
    primary: "anthropic/claude-sonnet-4-20250514"
    fallback: "openai/gpt-4o"
    local: "ollama/qwen3:32b"
  tools:
    - zendesk_ticket_create
    - zendesk_ticket_lookup
    - knowledge_base_search
    - calendar_book
  channels:
    - slack
    - whatsapp
  memory:
    type: "conversational + rag"
    retention_days: 90
  escalation:
    - condition: "billing_dispute AND attempts > 2"
      action: "handoff_human"
    - condition: "sentiment < -0.7"
      action: "handoff_human"
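Since the stack standardizes on Pydantic v2, a config like the one above maps naturally onto nested models. A minimal sketch; the class and field names mirror the YAML example but are otherwise hypothetical:

```python
# Minimal Pydantic sketch for validating an employee config like the YAML
# above. Class and field names mirror the example but are hypothetical.
from pydantic import BaseModel


class ModelConfig(BaseModel):
    primary: str
    fallback: str
    local: str


class MemoryConfig(BaseModel):
    type: str
    retention_days: int


class EscalationRule(BaseModel):
    condition: str
    action: str


class EmployeeConfig(BaseModel):
    name: str
    role: str
    persona: str
    model: ModelConfig
    tools: list[str]
    channels: list[str]
    memory: MemoryConfig
    escalation: list[EscalationRule]


# Pydantic coerces plain dicts (e.g. parsed YAML) into the nested models.
cfg = EmployeeConfig(
    name="Mara",
    role="Customer Support Lead",
    persona="Professional, empathetic, solution-oriented.",
    model={"primary": "anthropic/claude-sonnet-4-20250514",
           "fallback": "openai/gpt-4o", "local": "ollama/qwen3:32b"},
    tools=["zendesk_ticket_create", "knowledge_base_search"],
    channels=["slack", "whatsapp"],
    memory={"type": "conversational + rag", "retention_days": 90},
    escalation=[{"condition": "sentiment < -0.7", "action": "handoff_human"}],
)
```

Feeding a parsed YAML dict straight into `EmployeeConfig` gives schema validation for free at tenant onboarding time.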

Team Orchestration

Teams use a coordinator pattern:

  1. Coordinator agent receives the inbound message
  2. Coordinator decides which team member(s) should handle it (routing)
  3. Specialist agent(s) execute their part
  4. Coordinator assembles the final response or delegates follow-up
  5. All inter-agent communication logged for audit
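The five steps can be sketched as a plain-Python skeleton. Everything here is illustrative: a real coordinator would make an LLM call to route (step 2) rather than match keywords, and the class names are hypothetical.

```python
# Hypothetical skeleton of the coordinator pattern above. Keyword matching
# stands in for an LLM routing call; class and method names are illustrative.
class Agent:
    def __init__(self, name: str, topics: set[str]):
        self.name = name
        self.topics = topics

    def handle(self, message: str) -> str:
        return f"[{self.name}] handled: {message}"


class Coordinator:
    def __init__(self, team: list[Agent]):
        self.team = team
        self.audit_log: list[str] = []   # step 5: log inter-agent traffic

    def route(self, message: str) -> list[Agent]:
        """Step 2: pick the specialists whose topics match the message."""
        words = set(message.lower().split())
        return [a for a in self.team if a.topics & words] or self.team[:1]

    def dispatch(self, message: str) -> str:
        specialists = self.route(message)                  # step 2
        parts = [a.handle(message) for a in specialists]   # step 3
        self.audit_log += [f"{a.name} <- {message}" for a in specialists]
        return "\n".join(parts)                            # step 4: assemble


billing = Agent("billing", {"invoice", "refund"})
support = Agent("support", {"error", "login"})
team = Coordinator([billing, support])
reply = team.dispatch("customer wants a refund")
```

The fallback to `self.team[:1]` means an unroutable message still lands with the first specialist instead of being dropped; whether that default is right is a routing-policy question for ADR-004.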

LLM Backend Strategy

Provider Hierarchy

┌─────────────────────────────────────────┐
│              LiteLLM Router             │
│  (load balancing, fallback, cost caps)  │
└────┬──────────┬──────────┬─────────┬────┘
     │          │          │         │
  Ollama     vLLM     Anthropic   OpenAI
  (local)   (local)    (API)      (API)
                                    │
                              BYO Endpoint
                            (customer-provided)

Routing Logic

  1. Tenant config specifies preferred provider(s) and fallback chain
  2. Cost caps per tenant (daily/monthly spend limits)
  3. Model routing by task type: simple queries → smaller/local models, complex reasoning → commercial APIs
  4. BYO keys stored encrypted (AES-256), never logged, never used for other tenants
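Points 1 through 3 can be sketched as follows. This is a hypothetical illustration: in the real stack LiteLLM's router performs fallback and cost accounting, and `pick_model`, the config shape, and the cap value are invented for the example.

```python
# Hypothetical sketch of the routing logic above. In the real stack LiteLLM
# handles fallback and cost caps; this only illustrates the decision order.
TENANT_CONFIG = {
    "tenant-acme": {
        "chains": {                                            # point 3
            "simple": ["ollama/qwen3:32b", "openai/gpt-4o"],
            "complex": ["anthropic/claude-sonnet-4-20250514", "openai/gpt-4o"],
        },
        "daily_cap_usd": 50.0,                                 # point 2
    }
}


def pick_model(tenant_id: str, task_type: str, spent_today: float,
               available: set[str]) -> str:
    """Walk the tenant's fallback chain (point 1) and return the first
    available model, refusing outright once the daily cap is exhausted."""
    cfg = TENANT_CONFIG[tenant_id]
    if spent_today >= cfg["daily_cap_usd"]:
        raise RuntimeError(f"daily cost cap reached for {tenant_id}")
    for model in cfg["chains"][task_type]:
        if model in available:
            return model
    raise RuntimeError("no provider available in fallback chain")
```

Checking the cost cap before walking the chain means a tenant over budget never reaches a commercial API, even when local providers are down.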

Messaging Format (Internal)

All channel adapters normalize messages into this format:

from datetime import datetime

from pydantic import BaseModel

# ChannelType, SenderInfo, and MessageContent live in the shared package
# (packages/shared/models); the import path here is illustrative.
from shared.models import ChannelType, MessageContent, SenderInfo


class KonstructMessage(BaseModel):
    id: str                          # UUID
    tenant_id: str                   # Konstruct tenant
    channel: ChannelType             # slack | teams | mattermost | rocketchat | whatsapp | telegram | signal
    channel_metadata: dict           # Channel-specific IDs (workspace, channel, thread)
    sender: SenderInfo               # User ID, display name, role
    content: MessageContent          # Text, attachments, structured data
    timestamp: datetime
    thread_id: str | None            # For threaded conversations
    reply_to: str | None             # Parent message ID
    context: dict                    # Extracted intent, entities, sentiment (populated downstream)
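As an example, a Slack adapter might map an Events API `message` payload onto this shape. A hedged sketch: the payload fields follow Slack's documented event format, but `normalize_slack_event` is hypothetical and returns a plain dict for brevity:

```python
# Hedged sketch: normalize a Slack Events API "message" event into the
# unified internal format. The payload keys (team, channel, user, ts,
# thread_ts) follow Slack's documented event fields; the adapter function
# itself is hypothetical and returns a plain dict for brevity.
import uuid
from datetime import datetime, timezone


def normalize_slack_event(event: dict, tenant_id: str) -> dict:
    """Map Slack-native fields onto the unified message fields."""
    return {
        "id": str(uuid.uuid4()),
        "tenant_id": tenant_id,
        "channel": "slack",
        "channel_metadata": {
            "team": event["team"],        # Slack workspace ID
            "channel": event["channel"],
            "ts": event["ts"],
        },
        "sender": {"user_id": event["user"]},
        "content": {"text": event.get("text", "")},
        "timestamp": datetime.fromtimestamp(float(event["ts"]), tz=timezone.utc),
        "thread_id": event.get("thread_ts"),  # present only in threads
        "reply_to": None,
        "context": {},                        # populated downstream
    }


msg = normalize_slack_event(
    {"team": "T0123", "channel": "C042", "user": "U9",
     "ts": "1700000000.000100", "text": "hi"},
    tenant_id="tenant-acme",
)
```

Each channel adapter owns one such mapping; everything downstream of the gateway sees only the unified shape.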

Security & Compliance

Non-Negotiables

  • Encryption at rest (PostgreSQL TDE, MinIO server-side encryption)
  • Encryption in transit (TLS 1.3 everywhere, mTLS between services)
  • Tenant isolation enforced at every layer (DB, cache, object storage, agent memory)
  • BYO API keys encrypted with per-tenant KEK, HSM-backed in Enterprise tier
  • Audit log for every agent action, tool invocation, and LLM call
  • RBAC per tenant (admin, manager, member, viewer)
  • Rate limiting per tenant, per channel, per agent
  • PII handling — configurable PII detection and redaction per tenant
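The BYO-key requirement amounts to envelope encryption (also flagged in ADR-005). A minimal sketch using Fernet from the `cryptography` package; note that Fernet is AES-128-CBC plus HMAC under the hood, so a build that must guarantee AES-256 would use `AESGCM` from the same library instead. Function names are hypothetical, and a production KEK would live in an HSM or secrets manager, never in process memory:

```python
# Minimal envelope-encryption sketch for BYO API keys, using Fernet from
# the `cryptography` package. Hypothetical names; in production the
# per-tenant KEK sits in an HSM / secrets manager, and AES-256-GCM
# (AESGCM) would replace Fernet where AES-256 is mandated.
from cryptography.fernet import Fernet


def encrypt_byo_key(kek: bytes, api_key: str) -> tuple[bytes, bytes]:
    """Encrypt `api_key` with a fresh data key (DEK), then wrap the DEK
    with the tenant's KEK. Store both ciphertexts; never log plaintext."""
    dek = Fernet.generate_key()
    wrapped_dek = Fernet(kek).encrypt(dek)
    ciphertext = Fernet(dek).encrypt(api_key.encode())
    return wrapped_dek, ciphertext


def decrypt_byo_key(kek: bytes, wrapped_dek: bytes, ciphertext: bytes) -> str:
    dek = Fernet(kek).decrypt(wrapped_dek)
    return Fernet(dek).decrypt(ciphertext).decode()


kek = Fernet.generate_key()          # per-tenant KEK (HSM-backed in practice)
wrapped, blob = encrypt_byo_key(kek, "sk-test-123")
```

Because only the wrapped DEK references the KEK, rotating a tenant's KEK means re-wrapping small DEKs, not re-encrypting every stored secret.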

Future Compliance Targets

  • SOC 2 Type II (when revenue supports it)
  • GDPR data residency (leverage Hetzner EU + customer self-hosted option)
  • HIPAA (Enterprise self-hosted tier only, with BAA)

Development Workflow

Local Dev

# Clone and setup
git clone <repo-url> && cd konstruct
cp .env.example .env

# Start all services
docker compose up -d

# Run gateway in dev mode (hot reload)
cd packages/gateway
uvicorn main:app --reload --port 8001

# Run tests
pytest tests/unit -x
pytest tests/integration -x

Branch Strategy

  • main — production-ready, protected
  • develop — integration branch
  • feat/* — feature branches off develop
  • fix/* — bugfix branches
  • release/* — release candidates

CI Pipeline

  1. Lint (ruff check, ruff format --check)
  2. Type check (mypy --strict)
  3. Unit tests (pytest tests/unit)
  4. Integration tests (pytest tests/integration — spins up Docker Compose)
  5. Container build + scan (trivy image)
  6. Deploy to staging (auto on develop merge)
  7. Deploy to production (manual approval on release/* merge)

Milestones

Phase 1: Foundation (Weeks 1-6)

  • Repo scaffolding, CI/CD, Docker Compose dev environment
  • PostgreSQL schema with RLS multi-tenancy
  • Unified message format and Channel Gateway (start with Slack + Telegram)
  • Basic agent orchestrator (single agent per tenant, no teams yet)
  • LiteLLM integration with Ollama + one commercial API
  • Basic admin portal (tenant CRUD, agent config)

Phase 2: Channel Expansion + Teams (Weeks 7-12)

  • Add channels: Mattermost, WhatsApp, Teams
  • Multi-agent teams with coordinator pattern
  • Conversational memory (vector store + sliding window)
  • Tool framework (registry, execution, sandboxing)
  • BYO API key support
  • Tenant onboarding flow in portal

Phase 3: Polish + Launch (Weeks 13-18)

  • Add channels: Rocket.Chat, Signal
  • AI company hierarchy (teams of teams)
  • Cost tracking and billing integration (Stripe)
  • Agent performance analytics dashboard
  • Self-hosted deployment option (Helm chart + docs)
  • Public launch (Product Hunt, Hacker News, Reddit)

Phase 4: Scale (Post-Launch)

  • Kubernetes migration for production workloads
  • Cloud burst infrastructure (AWS auto-scaling inference)
  • Marketplace for pre-built AI employee templates
  • Enterprise tier with dedicated isolation
  • SOC 2 preparation
  • API for programmatic agent management

Coding Standards

Python

  • Version: 3.12+
  • Package manager: uv
  • Linting: ruff (replaces flake8, isort, black)
  • Type checking: mypy --strict — no Any types in public interfaces
  • Testing: pytest + pytest-asyncio + httpx (for FastAPI test client)
  • Models: Pydantic v2 for all data validation and serialization
  • Async: Prefer async def for all I/O-bound operations
  • DB: SQLAlchemy 2.0 async with Alembic migrations

TypeScript (Portal)

  • Runtime: Node 20+ LTS
  • Framework: Next.js 14+ (App Router)
  • Linting: eslint + prettier
  • Type checking: strict: true in tsconfig

General

  • Every PR requires at least one approval
  • No secrets in code — use .env + secrets manager
  • Write ADRs (Architecture Decision Records) in docs/adr/ for significant decisions
  • Conventional commits (feat:, fix:, chore:, docs:, refactor:)

Key Design Decisions (ADR Stubs)

These need full ADRs written before implementation:

  1. ADR-001: Channel Gateway — webhook-based vs. persistent WebSocket connections per channel
  2. ADR-002: Agent memory — pgvector vs. dedicated vector DB vs. hybrid
  3. ADR-003: Multi-tenancy — RLS vs. schema-per-tenant vs. DB-per-tenant
  4. ADR-004: Agent framework — build custom vs. adopt LangGraph/CrewAI
  5. ADR-005: BYO key encryption — envelope encryption strategy and key rotation
  6. ADR-006: Inter-agent communication — direct function calls vs. message bus vs. shared context
  7. ADR-007: Rate limiting — per-tenant token bucket implementation
  8. ADR-008: Self-hosted distribution — Helm chart vs. Docker Compose vs. Omnibus
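For ADR-007, the per-tenant token bucket can be illustrated in-memory. This is a sketch only: the real limiter would live in Redis (e.g. as a Lua script) so every router replica shares the same bucket per tenant, and the names and rates here are hypothetical.

```python
# In-memory token-bucket sketch for ADR-007. Illustrative only: the real
# limiter would live in Redis so all router replicas share per-tenant state.
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


buckets: dict[str, TokenBucket] = {}   # one bucket per tenant


def check_rate_limit(tenant_id: str, rate: float = 5.0, burst: float = 10.0) -> bool:
    bucket = buckets.setdefault(tenant_id, TokenBucket(rate, burst))
    return bucket.allow()
```

The same shape extends to per-channel and per-agent limits by widening the bucket key, matching the three-level rate limiting listed under Security.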

Open Questions

  • Pricing model: per-agent, per-message, per-seat, or hybrid?
  • Should agents maintain persistent identity across channels (same "Mara" on Slack and WhatsApp)?
  • Voice channel support? (Telephony via Twilio/Vonage — Phase 4+?)
  • Agent-to-agent communication across tenants (marketplace scenario)?
  • White-labeling for agencies reselling Konstruct?

References