diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
index 29c96ff..76e92c6 100644
--- a/.planning/PROJECT.md
+++ b/.planning/PROJECT.md
@@ -2,69 +2,74 @@
 
 ## What This Is
 
-Konstruct is an AI workforce platform where SMBs subscribe to AI employees that communicate through familiar messaging channels — Slack and WhatsApp for v1. Clients get an AI worker that shows up where their team already communicates, requiring zero behavior change. Think "hire an AI department" rather than "subscribe to another SaaS dashboard."
+Konstruct is an AI workforce platform where SMBs subscribe to AI employees that communicate through familiar messaging channels — Slack, WhatsApp, and the built-in web chat. Clients get AI workers that show up where their team already communicates, requiring zero behavior change. Think "hire an AI department" rather than "subscribe to another SaaS dashboard."
 
 ## Core Value
 
-An AI employee that works in the channels your team already uses — no new tools to learn, no dashboards to check, just a capable coworker in Slack or WhatsApp.
+An AI employee that works in the channels your team already uses — no new tools to learn, no dashboards to check, just a capable coworker in Slack, WhatsApp, or the portal chat.
 
-## Requirements
+## Current State (v1.0 — Beta-Ready)
 
-### Validated
+All 10 phases complete. 39 plans executed. 67 requirements satisfied.
 
-(None yet — ship to validate)
+### What's Shipped
 
-### Active
+| Feature | Status |
+|---------|--------|
+| Channel Gateway (Slack + WhatsApp + Web Chat) | ✓ Complete |
+| Multi-tenant isolation (PostgreSQL RLS) | ✓ Complete |
+| LLM Backend (Ollama + Anthropic/OpenAI via LiteLLM) | ✓ Complete |
+| Conversational memory (Redis sliding window + pgvector) | ✓ Complete |
+| Tool framework (web search, KB, HTTP, calendar) | ✓ Complete |
+| Knowledge base (document upload, URL scraping, YouTube transcription) | ✓ Complete |
+| Google Calendar integration (OAuth, CRUD) | ✓ Complete |
+| Human escalation with assistant mode | ✓ Complete |
+| Bidirectional media support (multimodal LLM) | ✓ Complete |
+| Admin portal (Next.js 16, shadcn/ui, DM Sans) | ✓ Complete |
+| Agent Designer + Wizard + 6 pre-built templates | ✓ Complete |
+| Stripe billing (per-agent monthly, 14-day trial) | ✓ Complete |
+| BYO API keys (Fernet encrypted) | ✓ Complete |
+| Cost dashboard with Recharts | ✓ Complete |
+| 3-tier RBAC (platform admin, customer admin, operator) | ✓ Complete |
+| Email invitation flow (SMTP, HMAC tokens) | ✓ Complete |
+| Web Chat with real-time streaming (bypass Celery) | ✓ Complete |
+| Multilanguage (English, Spanish, Portuguese) | ✓ Complete |
+| Mobile layout (bottom tab bar, full-screen chat) | ✓ Complete |
+| PWA (service worker, push notifications, offline queue) | ✓ Complete |
+| E2E tests (Playwright, 7 flows, 3 browsers) | ✓ Complete |
+| CI pipeline (Gitea Actions) | ✓ Complete |
+| Premium UI (indigo brand, dark sidebar, glass-morphism) | ✓ Complete |
 
-- [ ] Channel Gateway that normalizes messages from Slack and WhatsApp into a unified internal format
-- [ ] Single AI employee per tenant with configurable role, persona, and tools
-- [ ] Multi-tenant architecture with proper isolation (PostgreSQL RLS for Starter tier)
-- [ ] LLM backend pool with Ollama (local) + commercial APIs (Anthropic/OpenAI) via LiteLLM
-- [ ] Full admin portal (Next.js) for tenant management, agent configuration, and monitoring
-- [ ] Tenant onboarding flow in the portal
-- [ ] Billing integration (Stripe) for subscription management
-- [ ] Conversational memory (conversation history + vector search)
-- [ ] Tool framework for agent capabilities (registry, execution)
-- [ ] Rate limiting per tenant and per channel
+### v2 Scope (Deferred)
 
-### Out of Scope
-
-- Multi-agent teams and coordinator pattern — v2 (need single agent working first)
-- AI company hierarchy (teams of teams) — v2+
-- Microsoft Teams, Mattermost, Rocket.Chat, Signal, Telegram — v2 channel expansion
-- BYO API key support — moved to v1 Phase 3 (operator requested during scoping)
-- Self-hosted deployment (Helm chart) — v2+ (SaaS-first for beta)
-- Voice/telephony channels — v3+
-- Agent marketplace / pre-built templates — v3+
-- SOC 2 / HIPAA compliance — post-revenue
-- White-labeling for agencies — future consideration
+- Multi-agent teams and coordinator pattern
+- Microsoft Teams, Mattermost, Telegram channels
+- Self-hosted deployment (Helm chart)
+- Schema-per-tenant isolation
+- Agent marketplace
+- Voice/telephony channels
+- SSO/SAML for enterprise
+- Granular operator permissions
 
 ## Context
 
-- **Market gap:** Existing AI tools are dashboards or chatbots, not channel-native workers. No coordinated AI teams. No self-hosted options for enterprises. Konstruct addresses all three.
-- **Target customer:** SMBs that need additional staff capacity but lack resources, are overwhelmed with processes, or want to grow faster but can't find the right balance.
-- **Inspiration:** paperclip.ing — but differentiated by channel-native presence, tiered multi-tenancy, and eventual BYO-model support.
-- **V1 goal:** Beta-ready product that can accept early users. One AI employee per tenant on Slack + WhatsApp, managed through a full admin portal, with multi-tenancy and billing.
-- **Tech foundation:** Python (FastAPI) backend, Next.js portal, PostgreSQL + Redis, Docker Compose for dev, monorepo structure.
-
-## Constraints
-
-- **Tech stack:** Python 3.12+ (FastAPI, SQLAlchemy 2.0, Pydantic v2), Next.js 14+ (App Router, shadcn/ui), PostgreSQL 16, Redis — as specified in CLAUDE.md
-- **V1 channels:** Slack (slack-bolt) + WhatsApp (Business Cloud API) only
-- **LLM providers:** Ollama (local) + Anthropic/OpenAI (commercial) via LiteLLM — no BYO in v1
-- **Multi-tenancy:** PostgreSQL RLS for v1 (Starter tier), schema isolation deferred to v2
-- **Deployment:** Docker Compose for dev, single-server deployment for beta — Kubernetes deferred
+- **Market gap:** Existing AI tools are dashboards or chatbots, not channel-native workers. No coordinated AI teams. No self-hosted options for enterprises.
+- **Target customer:** SMBs that need additional staff capacity but lack resources, are overwhelmed with processes, or want to grow faster.
+- **Tech foundation:** Python 3.12+ (FastAPI, SQLAlchemy 2.0, Celery), Next.js 16 (App Router, shadcn/ui, next-intl, Serwist), PostgreSQL 16 + pgvector, Redis, Ollama, Docker Compose.
 
 ## Key Decisions
 
 | Decision | Rationale | Outcome |
 |----------|-----------|---------|
-| Slack + WhatsApp for v1 channels | Slack = where SMB teams work, WhatsApp = massive business communication reach | — Pending |
-| Single agent per tenant for v1 | Prove the channel-native thesis before adding team complexity | — Pending |
-| Full portal from day one | Beta users need a proper UI, not config files — lowers barrier to adoption | — Pending |
-| Local + commercial LLMs | Ollama for dev/cheap tasks, commercial APIs for quality — balances cost and capability | — Pending |
-| PostgreSQL RLS multi-tenancy | Simplest to start, sufficient for Starter tier, upgrade path to schema isolation exists | — Pending |
-| Beta-ready as v1 target | Multi-tenancy + billing = can accept real users, not just demos | — Pending |
+| Slack + WhatsApp + Web Chat channels | Covers office (Slack), customers (WhatsApp), and portal users (Web Chat) | ✓ Shipped |
+| Single agent per tenant for v1 | Prove channel-native thesis before team complexity | ✓ Shipped |
+| Full portal from day one | Beta users need UI, not config files | ✓ Shipped |
+| Local + commercial LLMs | Ollama for dev/cost, commercial for quality | ✓ Shipped |
+| PostgreSQL RLS multi-tenancy | Simplest, sufficient for Starter tier | ✓ Shipped |
+| Web chat bypasses Celery | Direct LLM streaming from WebSocket for speed | ✓ Shipped |
+| Per-agent monthly pricing | Matches "hire an employee" metaphor | ✓ Shipped |
+| 3-tier RBAC with invite flow | Self-service for customers, control for operators | ✓ Shipped |
+| DM Sans + indigo brand | Premium SaaS aesthetic for SMB market | ✓ Shipped |
 
 ---
-*Last updated: 2026-03-22 after initialization*
+*Last updated: 2026-03-26 after Phase 10 completion*
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..e1dd41f
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,103 @@
+# Changelog
+
+All notable changes to Konstruct are documented in this file.
+
+## [1.0.0] - 2026-03-26
+
+### Phase 10: Agent Capabilities
+- Knowledge base ingestion pipeline — upload PDF, DOCX, PPTX, XLSX, CSV, TXT, Markdown; add URLs (Firecrawl scraping); add YouTube videos (transcript extraction)
+- Async document processing via Celery — chunk, embed (all-MiniLM-L6-v2), store in pgvector
+- KB management portal page with drag-and-drop upload, live status polling, delete, reindex
+- Google Calendar OAuth per tenant — list events, check availability, create events
+- Token auto-refresh with encrypted DB write-back
+- Web search connected to Brave Search API (platform-wide key)
+- Tool executor injects tenant_id/agent_id into all tool handlers
+- System prompt includes tool result formatting instruction (no raw JSON)
+
+### Phase 9: Testing & QA
+- Playwright E2E test suite — 29 tests across 7 critical flows (login, tenants, agent deploy, chat, RBAC, i18n, mobile)
+- Cross-browser testing — Chromium, Firefox, WebKit
+- Visual regression snapshots at 3 viewports (desktop, tablet, mobile)
+- axe-core accessibility scans on all pages
+- Lighthouse CI score gating (>= 80 hard floor)
+- Gitea Actions CI pipeline — backend lint + pytest → portal build + E2E + Lighthouse
+
+### Phase 8: Mobile + PWA
+- Responsive mobile layout with bottom tab bar (Dashboard, Employees, Chat, Usage, More)
+- Full-screen WhatsApp-style mobile chat with back arrow + agent name header
+- Visual Viewport API keyboard handling for iOS
+- PWA manifest with K monogram icons
+- Service worker (Serwist) with app shell + runtime caching
+- Web Push notifications (VAPID) with push subscription management
+- IndexedDB offline message queue with drain-on-reconnect
+- Smart install prompt on second visit
+- iOS safe-area support
+
+### Phase 7: Multilanguage
+- Full portal UI localization — English, Spanish, Portuguese
+- next-intl v4 (cookie-based locale, no URL routing)
+- Language switcher in sidebar (post-auth) and login page (pre-auth)
+- Browser locale auto-detection on first visit
+- Language preference saved to DB, synced to JWT
+- Agent templates translated in all 3 languages (JSONB translations column)
+- System prompt language instruction — agents auto-detect and respond in user's language
+- Localized invitation emails
+
+### Phase 6: Web Chat
+- Real-time WebSocket chat in the portal
+- Direct LLM streaming from WebSocket handler (bypasses Celery for speed)
+- Token-by-token streaming via NDJSON → Redis pub-sub → WebSocket
+- Conversation persistence (web_conversations + web_conversation_messages tables)
+- Agent picker dialog for new conversations
+- Markdown rendering (react-markdown + remark-gfm)
+- Typing indicator during LLM generation
+- All roles can chat (operators included)
+
+### Phase 5: Employee Design
+- Three-path AI employee creation: Templates / Guided Setup / Advanced
+- 6 pre-built agent templates (Customer Support Rep, Sales Assistant, Marketing Manager, Office Manager, Project Coordinator, Finance & Accounting Manager)
+- 5-step wizard (Role → Persona → Tools → Channels → Escalation)
+- System prompt auto-generation from wizard inputs
+- Templates stored as DB seed data with one-click deploy
+- Agent Designer as "Advanced" mode
+
+### Phase 4: RBAC
+- Three-tier roles: platform_admin, customer_admin, customer_operator
+- FastAPI RBAC guard dependencies (require_platform_admin, require_tenant_admin, require_tenant_member)
+- Email invitation flow with HMAC tokens (48-hour expiry, resend capability)
+- SMTP email sending via Python stdlib
+- Portal navigation and API endpoints enforce role-based access
+- Impersonation for platform admins with audit trail
+- Global user management page
+
+### Phase 3: Operator Experience
+- Slack OAuth "Add to Slack" flow with HMAC state protection
+- WhatsApp guided manual setup
+- 3-step onboarding wizard (Connect → Configure → Test)
+- Stripe subscription management (per-agent $49/month, 14-day trial)
+- BYO API key management with Fernet encryption + MultiFernet key rotation
+- Cost dashboard with Recharts (token usage, provider costs, message volume, budget alerts)
+- Agent-level cost tracking and budget limits
+
+### Phase 2: Agent Features
+- Two-layer conversational memory (Redis sliding window + pgvector HNSW)
+- Cross-conversation memory keyed per-user per-agent
+- Tool framework with 4 built-in tools (web search, KB search, HTTP request, calendar)
+- Schema-validated tool execution with confirmation flow for side-effecting actions
+- Immutable audit logging (REVOKE UPDATE/DELETE at DB level)
+- WhatsApp Business Cloud API adapter with Meta 2026 policy compliance
+- Two-tier business-function scoping (keyword allowlist + role-based LLM)
+- Human escalation with DM delivery, full transcript, and assistant mode
+- Cross-channel bidirectional media support with multimodal LLM interpretation
+
+### Phase 1: Foundation
+- Monorepo with uv workspaces
+- Docker Compose dev environment (PostgreSQL 16 + pgvector, Redis, Ollama)
+- PostgreSQL Row Level Security with FORCE ROW LEVEL SECURITY
+- Shared Pydantic models (KonstructMessage) and SQLAlchemy 2.0 async ORM
+- LiteLLM Router with Ollama + Anthropic/OpenAI and fallback routing
+- Celery orchestrator with sync-def pattern (asyncio.run)
+- Slack adapter (Events API) with typing indicator
+- Message Router with tenant resolution, rate limiting, idempotency
+- Next.js 16 admin portal with Auth.js v5, tenant CRUD, Agent Designer
+- Premium UI design system (indigo brand, dark sidebar, glass-morphism, DM Sans)
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..dfe4d08
--- /dev/null
+++ b/README.md
@@ -0,0 +1,167 @@
+# Konstruct
+
+**Build your AI workforce.** Deploy AI employees that work in the channels your team already uses — Slack, WhatsApp, and the built-in web chat. Zero behavior change required.
+
+---
+
+## What is Konstruct?
+
+Konstruct is an AI workforce platform where SMBs subscribe to AI employees. Each AI employee has a name, role, persona, and tools — and communicates through familiar messaging channels. Think of it as "hire an AI department" rather than "subscribe to another SaaS dashboard."
+
+### Key Features
+
+- **Channel-native AI employees** — Agents respond in Slack, WhatsApp, and the portal web chat
+- **Knowledge base** — Upload documents (PDF, DOCX, PPTX, Excel, CSV, TXT, Markdown), URLs, and YouTube videos. Agents search them automatically.
+- **Google Calendar** — Agents check availability, list events, and book meetings via OAuth
+- **Web search** — Agents search the web via Brave Search API
+- **Real-time streaming** — Web chat streams LLM responses word-by-word
+- **6 pre-built templates** — Customer Support Rep, Sales Assistant, Marketing Manager, Office Manager, Project Coordinator, Finance & Accounting Manager
+- **Employee wizard** — 5-step guided setup or one-click template deployment
+- **3-tier RBAC** — Platform admin, customer admin, customer operator with email invitation flow
+- **Multilanguage** — English, Spanish, Portuguese (portal UI + agent responses)
+- **Mobile + PWA** — Bottom tab bar, full-screen chat, push notifications, offline support
+- **Stripe billing** — Per-agent monthly pricing with 14-day free trial
+- **BYO API keys** — Tenants can bring their own LLM provider keys (Fernet encrypted)
+
+---
+
+## Quick Start
+
+### Prerequisites
+
+- Docker + Docker Compose
+- Ollama running on the host (port 11434)
+- Node.js 22+ (for portal development)
+- Python 3.12+ with `uv` (for backend development)
+
+### Setup
+
+```bash
+# Clone
+git clone https://git.oe74.net/adelorenzo/konstruct.git
+cd konstruct
+
+# Configure
+cp .env.example .env
+# Edit .env — set OLLAMA_MODEL, API keys, SMTP, etc.
+
+# Start all services
+docker compose up -d
+
+# Create admin user
+curl -X POST http://localhost:8001/api/portal/auth/register \
+  -H "Content-Type: application/json" \
+  -d '{"email": "admin@example.com", "password": "YourPassword123", "name": "Admin"}'
+
+# Set as platform admin
+docker exec konstruct-postgres psql -U postgres -d konstruct \
+  -c "UPDATE portal_users SET role = 'platform_admin' WHERE email = 'admin@example.com';"
+```
+
+Open `http://localhost:3000` and sign in.
+
+### Services
+
+| Service | Port | Description |
+|---------|------|-------------|
+| Portal | 3000 | Next.js admin dashboard |
+| Gateway | 8001 | FastAPI API + WebSocket |
+| LLM Pool | internal | LiteLLM router (Ollama + commercial) |
+| Celery Worker | internal | Background task processing |
+| PostgreSQL | internal | Primary database with RLS + pgvector |
+| Redis | internal | Cache, sessions, pub-sub, task queue |
+
+---
+
+## Architecture
+
+```
+Client (Slack / WhatsApp / Web Chat)
+         │
+         ▼
+┌─────────────────────┐
+│  Channel Gateway    │  Unified ingress, normalizes to KonstructMessage
+│  (FastAPI :8001)    │
+└────────┬────────────┘
+         │
+         ▼
+┌─────────────────────┐
+│  Agent Orchestrator │  Memory, tools, escalation, audit
+│  (Celery / Direct)  │  Web chat streams directly (no Celery)
+└────────┬────────────┘
+         │
+         ▼
+┌─────────────────────┐
+│  LLM Backend Pool   │  LiteLLM → Ollama / Anthropic / OpenAI
+└─────────────────────┘
+```
+
+---
+
+## Tech Stack
+
+### Backend
+- **Python 3.12+** — FastAPI, SQLAlchemy 2.0, Pydantic v2, Celery
+- **PostgreSQL 16** — RLS multi-tenancy, pgvector for embeddings
+- **Redis** — Cache, pub-sub, task queue, sliding window memory
+- **LiteLLM** — Unified LLM provider routing with fallback
+
+### Frontend
+- **Next.js 16** — App Router, standalone output
+- **Tailwind CSS v4** — Utility-first styling
+- **shadcn/ui** — Component library (base-nova style)
+- **next-intl** — Internationalization (en/es/pt)
+- **Serwist** — Service worker for PWA
+- **DM Sans** — Primary font
+
+### Infrastructure
+- **Docker Compose** — Development and deployment
+- **Alembic** — Database migrations (14 migrations)
+- **Playwright** — E2E testing (7 flows, 3 browsers)
+- **Gitea Actions** — CI/CD pipeline
+
+---
+
+## Configuration
+
+All configuration is via environment variables in `.env`:
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `OLLAMA_MODEL` | Ollama model for local inference | `qwen3:32b` |
+| `OLLAMA_BASE_URL` | Ollama server URL | `http://host.docker.internal:11434` |
+| `ANTHROPIC_API_KEY` | Anthropic API key (optional) | — |
+| `OPENAI_API_KEY` | OpenAI API key (optional) | — |
+| `BRAVE_API_KEY` | Brave Search API key | — |
+| `FIRECRAWL_API_KEY` | Firecrawl API key for URL scraping | — |
+| `STRIPE_SECRET_KEY` | Stripe billing key | — |
+| `AUTH_SECRET` | JWT signing secret | — |
+| `PLATFORM_ENCRYPTION_KEY` | Fernet key for BYO API key encryption | — |
+
+See `.env.example` for the complete list.
+
+---
+
+## Project Structure
+
+```
+konstruct/
+├── packages/
+│   ├── gateway/          # Channel Gateway (FastAPI)
+│   ├── orchestrator/     # Agent Orchestrator (Celery tasks)
+│   ├── llm-pool/         # LLM Backend Pool (LiteLLM)
+│   ├── router/           # Message Router (tenant resolution, rate limiting)
+│   ├── shared/           # Shared models, config, API routers
+│   └── portal/           # Admin Portal (Next.js 16)
+├── migrations/           # Alembic DB migrations
+├── tests/                # Backend test suite
+├── docker-compose.yml    # Service definitions
+├── .planning/            # GSD planning artifacts
+└── .env                  # Environment configuration
+```
+
+---
+
+## License
+
+Proprietary. All rights reserved.
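
The Phase 4 changelog entry describes an email invitation flow with HMAC tokens and a 48-hour expiry, but the diff does not include the token code itself. The following is a minimal sketch of how such a token could be issued and checked using only the Python standard library. The function names, the `email:issued_at` payload layout, and the token format are all assumptions made for illustration and may differ from what Konstruct actually does.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical constant: 48-hour expiry, the figure stated in the changelog.
INVITE_TTL_SECONDS = 48 * 60 * 60


def _sign(secret: bytes, payload: bytes) -> str:
    """Return an unpadded URL-safe HMAC-SHA256 signature of the payload."""
    digest = hmac.new(secret, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def make_invite_token(secret: bytes, email: str, issued_at: Optional[int] = None) -> str:
    """Build a token shaped as base64(email:issued_at).signature (assumed layout)."""
    issued_at = int(time.time()) if issued_at is None else issued_at
    payload = f"{email}:{issued_at}".encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii").rstrip("=")
    return f"{body}.{_sign(secret, payload)}"


def verify_invite_token(secret: bytes, token: str, now: Optional[int] = None) -> Optional[str]:
    """Return the invited email if the token is authentic and unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        body, signature = token.split(".", 1)
        padded = body + "=" * (-len(body) % 4)  # restore stripped base64 padding
        payload = base64.urlsafe_b64decode(padded.encode("ascii"))
        email, issued_raw = payload.decode("utf-8").rsplit(":", 1)
        issued_at = int(issued_raw)
    except ValueError:
        # Covers malformed structure, bad base64, bad UTF-8, and non-integer timestamps.
        return None
    expected = _sign(secret, payload).encode("ascii")
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature.encode("utf-8"), expected):
        return None
    if now - issued_at > INVITE_TTL_SECONDS:
        return None
    return email
```

In this sketch the resend capability mentioned in the changelog would amount to calling `make_invite_token` again with a fresh timestamp. Because the token carries its own issue time, no server-side state is needed to enforce expiry, although a real system would likely also record whether an invitation has already been accepted.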