Roadmap: Konstruct
Overview
Konstruct ships in coarse phases ordered by dependency. The first three form the core: build the secure multi-tenant pipeline and prove that a Slack message triggers an LLM response (Phase 1 — Foundation); add the agent capabilities that make it a real product: memory, tools, WhatsApp, and escalation (Phase 2 — Agent Features); then complete the operator-facing experience so tenants can self-onboard and pay (Phase 3 — Operator Experience). Phase 3 is gated on DB schema stability, which only exists after Phase 2 defines the memory and tool data models. Later phases (4-10) layer on RBAC, guided agent creation, web chat, localization, mobile/PWA support, QA, and real tool integrations.
Phases
Phase Numbering:
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
Decimal phases appear between their surrounding integers in numeric order.
- Phase 1: Foundation - Secure multi-tenant pipeline with Slack end-to-end and basic agent response (completed 2026-03-23)
- Phase 2: Agent Features - Persistent memory, tool framework, WhatsApp integration, and human escalation (completed 2026-03-24)
- Phase 3: Operator Experience - Admin portal, tenant onboarding, and Stripe billing (completed 2026-03-24)
- Phase 4: RBAC - Three-tier role-based access control with email invitation flow (completed 2026-03-24)
- Phase 5: Employee Design - Guided agent-creation wizard and pre-built template library (completed 2026-03-25)
- Phase 6: Web Chat - Real-time in-portal chat with AI Employees (completed 2026-03-25)
- Phase 7: Multilanguage - Full EN/ES/PT localization of portal UI and agent responses (completed 2026-03-25)
- Phase 8: Mobile + PWA - Responsive portal, installable PWA, and push notifications (completed 2026-03-26)
- Phase 9: Testing & QA - E2E, visual regression, accessibility, and CI pipeline (completed 2026-03-26)
- Phase 10: Agent Capabilities - Real web search, knowledge base, HTTP, and calendar tools (in progress, 2/3 plans complete)
Phase Details
Phase 1: Foundation
Goal: Operators can deploy the platform, a Slack message triggers an LLM response back in-thread, and no tenant can ever see another tenant's data
Depends on: Nothing (first phase)
Requirements: CHAN-01, CHAN-02, CHAN-05, AGNT-01, LLM-01, LLM-02, TNNT-01, TNNT-02, TNNT-03, TNNT-04, PRTA-01, PRTA-02
Success Criteria (what must be TRUE):
- A user can send a Slack @mention or DM to the AI employee and receive a coherent reply in the same thread — end-to-end in under 30 seconds
- Tenant A's messages, agent configuration, and conversation data are completely invisible to Tenant B — verified by integration tests with two-tenant fixtures
- A request that exceeds the per-tenant or per-channel rate limit is rejected with an informative response rather than silently dropped
- The LLM backend pool routes requests through LiteLLM to both Ollama (local) and Anthropic/OpenAI, with automatic fallback when a provider is unavailable
- A new AI employee can be configured with a custom name, role, and persona — and that persona is reflected in responses
- An operator can create tenants and design agents (name, role, persona, system prompt, tools, escalation rules) via the admin portal
Plans (4):
- 01-01: Monorepo scaffolding, Docker Compose dev environment, shared Pydantic models, DB schema with RLS
- 01-02: LiteLLM backend pool service with Ollama + Anthropic/OpenAI providers and Celery async dispatch
- 01-03: Channel Gateway (Slack adapter), Message Router (tenant resolution), basic Agent Orchestrator (single agent, no memory/tools)
- 01-04: Next.js admin portal with Auth.js v5, tenant CRUD, and Agent Designer module
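The backend-pool behaviour in 01-02, routing through a provider list and falling back when one is unavailable, can be sketched roughly as follows. This is an illustrative sketch, not the actual LiteLLM wiring; the stub provider functions and the `ProviderUnavailable` exception are invented for the example.

```python
# Sketch of the Phase 1 backend-pool fallback idea: try providers in priority
# order (local Ollama first, hosted providers after) and return the first
# successful completion. Names here are illustrative, not the real service API.

class ProviderUnavailable(Exception):
    """Raised when a provider cannot serve the request (timeout, 5xx, ...)."""

def complete_with_fallback(prompt, providers):
    """providers: ordered list of (name, callable) pairs."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors[name] = str(exc)  # record the failure, try the next provider
    raise RuntimeError(f"all providers failed: {errors}")

# Stub providers standing in for Ollama (down) and Anthropic (up):
def ollama_stub(prompt):
    raise ProviderUnavailable("connection refused")

def anthropic_stub(prompt):
    return f"reply to: {prompt}"

used, reply = complete_with_fallback(
    "hello", [("ollama", ollama_stub), ("anthropic", anthropic_stub)])
```

The same loop shape also gives the "informative rejection" property for free: the final error carries every provider's failure reason rather than dropping the request silently.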
Phase 2: Agent Features
Goal: The AI employee maintains conversation memory, can execute tools, handles WhatsApp messages, and escalates to humans when rules trigger — making it a capable product rather than a demo
Depends on: Phase 1
Requirements: CHAN-03, CHAN-04, AGNT-02, AGNT-03, AGNT-04, AGNT-05, AGNT-06
Success Criteria (what must be TRUE):
- The AI employee remembers context from earlier in the same conversation and can reference it accurately — tested at 30+ conversation turns without degradation
- A user can send a WhatsApp message to the AI employee and receive a reply — with per-tenant phone number isolation and business-function scoping enforced per Meta 2026 policy
- The agent can invoke a registered tool (e.g., knowledge base search) and incorporate the result into its response
- When a configured escalation rule triggers (e.g., failed resolution attempts), the conversation and full context are handed off to a human with no information lost
- Every LLM call, tool invocation, and handoff event is recorded in an immutable audit trail queryable by tenant
Plans (6):
- 02-01: Conversational memory layer (Redis sliding window + pgvector long-term storage with HNSW index)
- 02-02: Tool framework (registry, schema-validated execution, audit logging) — split into audit+tools+wiring
- 02-03: WhatsApp adapter (Business Cloud API, per-tenant phone numbers, media download, Meta policy compliance)
- 02-04: Human escalation/handoff with full context transfer and audit trail
- 02-05: Cross-channel media support and multimodal LLM interpretation (Slack file_share, image_url content blocks, channel-aware outbound routing)
- 02-06: Gap closure — re-wire escalation handler and WhatsApp outbound routing into pipeline, add tier-2 system prompt scoping
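The short-term half of 02-01 (a Redis sliding window over recent turns) amounts to keeping the last N messages per conversation. A minimal in-process sketch, using a plain deque where the real layer uses Redis and omitting the pgvector long-term half entirely:

```python
from collections import deque

# In-process stand-in for the 02-01 Redis sliding window: keep the most
# recent `max_turns` messages per conversation for prompt assembly. The real
# layer also persists older turns to pgvector for long-term recall.

class SlidingWindowMemory:
    def __init__(self, max_turns=30):
        self.max_turns = max_turns
        self._windows = {}  # conversation_id -> deque of (role, text)

    def append(self, conversation_id, role, text):
        window = self._windows.setdefault(
            conversation_id, deque(maxlen=self.max_turns))
        window.append((role, text))  # deque drops the oldest turn itself

    def context(self, conversation_id):
        """Return recent turns, oldest first, for prompt assembly."""
        return list(self._windows.get(conversation_id, ()))

mem = SlidingWindowMemory(max_turns=3)
for i in range(5):
    mem.append("conv-1", "user", f"msg {i}")
```

Conversations never share a window, which is also where the two-tenant isolation tests bite: a lookup for an unknown conversation returns nothing rather than leaking another window.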
Phase 3: Operator Experience
Goal: An operator can sign up, onboard their tenant through a web UI, connect their messaging channels, configure their AI employee, and manage their subscription — without touching config files or the command line
Depends on: Phase 2
Requirements: AGNT-07, LLM-03, PRTA-03, PRTA-04, PRTA-05, PRTA-06
Success Criteria (what must be TRUE):
- An operator can connect Slack and WhatsApp to their tenant through a guided in-portal wizard without reading documentation
- A new tenant completes the full onboarding sequence (connect channel -> configure agent -> send test message) in under 15 minutes
- An operator can subscribe, upgrade, and cancel their plan through Stripe — and feature limits are enforced automatically based on subscription state
- The portal displays per-tenant agent cost and token usage, giving operators visibility into spending without requiring access to backend logs
Plans (5):
- 03-01-PLAN.md — Backend foundation: DB migrations, billing models, encryption service, channel/billing/usage API endpoints, audit logger token metadata
- 03-02-PLAN.md — Channel connection wizard (Slack OAuth + WhatsApp manual), onboarding flow with 3-step stepper, BYO API key settings page
- 03-03-PLAN.md — Stripe billing page with subscription management, status badges, Checkout and Billing Portal redirects
- 03-04-PLAN.md — Cost tracking dashboard with Recharts charts, budget alert badges, time range filtering
- 03-05-PLAN.md — Gap closure: mount Phase 3 API routers on gateway, fix Slack OAuth and budget alert field name mismatches (completed 2026-03-24)
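"Feature limits are enforced automatically based on subscription state" boils down to a guard consulted before any limited action. A sketch of the shape; the plan names and limit values here are hypothetical, not Konstruct's real tiers:

```python
# Sketch of Phase 3 limit enforcement: subscription state maps to limits,
# and a guard raises before the action rather than after. Plan names and
# numbers are invented for illustration.

PLAN_LIMITS = {
    "free":    {"max_agents": 1, "max_messages_per_day": 100},
    "pro":     {"max_agents": 5, "max_messages_per_day": 5000},
    "expired": {"max_agents": 0, "max_messages_per_day": 0},
}

class LimitExceeded(Exception):
    pass

def enforce_agent_limit(plan, current_agent_count):
    """Raise before creating an agent that would exceed the plan limit."""
    limit = PLAN_LIMITS[plan]["max_agents"]
    if current_agent_count >= limit:
        raise LimitExceeded(
            f"plan '{plan}' allows {limit} agent(s); upgrade to add more")

try:
    enforce_agent_limit("free", current_agent_count=1)
    exceeded = False
except LimitExceeded:
    exceeded = True
```

Keeping the limits table keyed by subscription state (including a zeroed "expired" row) means a Stripe webhook only has to update the plan field; every guard picks up the change on the next request.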
Phase 4: RBAC
Goal: Three-tier role-based access control — platform admins manage the SaaS, customer admins manage their tenant, customer operators get read-only access — with email invitation flow for onboarding tenant users
Depends on: Phase 3
Requirements: RBAC-01, RBAC-02, RBAC-03, RBAC-04, RBAC-05, RBAC-06
Success Criteria (what must be TRUE):
- A platform admin can see all tenants, all agents, and all users across the entire platform
- A customer admin can only see their own tenant's agents, users, billing, and settings — no cross-tenant visibility
- A customer operator can view agents and usage dashboards but cannot create, edit, or delete anything
- A customer admin can invite a new user (admin or operator) by email — the invitee receives a link, clicks to activate, and sets their password
- Portal navigation and API endpoints enforce role-based access — unauthorized actions return 403, not just hidden UI elements
Plans (3):
- 04-01-PLAN.md — Backend RBAC foundation: DB migration (is_admin -> role enum), ORM models (UserTenantRole, PortalInvitation), RBAC guard dependencies, invitation API + SMTP email, unit tests
- 04-02-PLAN.md — Portal RBAC integration: Auth.js JWT role claims, proxy role redirects, role-filtered nav, tenant switcher, impersonation banner, invite acceptance page, user management pages
- 04-03-PLAN.md — Wire RBAC guards to all existing API endpoints, impersonation audit logging, integration tests, human verification checkpoint
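The guard dependencies in 04-01 reduce to ranking the three roles and rejecting anything below an endpoint's declared minimum. A minimal sketch; the `Forbidden` type and rank values are illustrative, though the role names mirror the phase goal:

```python
# Sketch of the three-tier guard: an endpoint declares the minimum role it
# needs, and anything below that gets a 403 rather than a hidden button.

ROLE_RANK = {"customer_operator": 0, "customer_admin": 1, "platform_admin": 2}

class Forbidden(Exception):
    status_code = 403

def require_role(user_role, minimum):
    if ROLE_RANK[user_role] < ROLE_RANK[minimum]:
        raise Forbidden(f"{user_role} may not perform {minimum}-level actions")

# A write endpoint guarded at customer_admin rejects read-only operators:
def delete_agent(user_role):
    require_role(user_role, "customer_admin")  # 403 for customer_operator
    return "deleted"

try:
    delete_agent("customer_operator")
    operator_blocked = False
except Forbidden:
    operator_blocked = True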
Phase 5: Employee Design
Goal: Operators and customer admins can create AI employees through a guided wizard that walks them through role definition, persona setup, tool selection, and channel assignment — or deploy instantly from a library of pre-built agent templates
Depends on: Phase 4
Requirements: EMPL-01, EMPL-02, EMPL-03, EMPL-04, EMPL-05
Success Criteria (what must be TRUE):
- An operator can create a fully configured AI employee by completing a multi-step wizard without needing to understand the underlying system prompt format
- Pre-built agent templates (e.g., Customer Support Rep, Sales Assistant, Office Manager) are available for one-click deployment with sensible defaults
- A template-deployed agent is immediately functional — responds in connected channels with the template's persona, tools, and escalation rules
- The wizard and templates are accessible to both platform admins and customer admins (respecting RBAC)
- Created agents appear in the Agent Designer for further customization after initial setup
Plans (4):
- 05-01-PLAN.md — Backend: AgentTemplate model, migration 007 with 7 seed templates, template list/deploy API, system prompt builder, unit + integration tests
- 05-02-PLAN.md — Frontend: three-option entry screen, template gallery with one-click deploy, 5-step wizard (Role/Persona/Tools/Channels/Escalation), Advanced mode relocation
- 05-03-PLAN.md — Human verification: test all three creation paths, RBAC enforcement, system prompt auto-generation
- 05-04-PLAN.md — Gap closure: add /agents/new to proxy RBAC restrictions, hide New Employee button for operators, fix wizard deploy error handling
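The "system prompt builder" of 05-01 is what lets wizard users avoid the underlying prompt format: the wizard collects plain fields and the builder assembles the prompt. A rough sketch, with section wording invented for illustration:

```python
# Sketch of a system prompt builder: plain wizard fields in, assembled
# prompt out, so operators never hand-edit the prompt format themselves.
# The phrasing of each section below is an assumption, not the shipped text.

def build_system_prompt(name, role, persona, tools=(), escalation_rule=None):
    lines = [
        f"You are {name}, a {role}.",
        f"Persona: {persona}",
    ]
    if tools:
        lines.append("You may use these tools: " + ", ".join(tools) + ".")
    if escalation_rule:
        lines.append(f"Escalate to a human when: {escalation_rule}")
    return "\n".join(lines)

prompt = build_system_prompt(
    name="Ava", role="Customer Support Rep",
    persona="friendly and concise",
    tools=["kb_search"], escalation_rule="the user asks for a refund")
```

Because a template is just a saved set of these same fields, one-click template deployment and the five-step wizard can share the builder, which keeps template-deployed and wizard-built agents behaviourally identical.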
Phase 6: Web Chat
Goal: Users can chat with AI Employees directly in the portal through a real-time web chat interface — no external messaging platform required
Depends on: Phase 5
Requirements: CHAT-01, CHAT-02, CHAT-03, CHAT-04, CHAT-05
Success Criteria (what must be TRUE):
- A user can open a chat window with any AI Employee and have a real-time conversation within the portal
- The chat interface supports the full agent pipeline — memory, tools, escalation, and media (same capabilities as Slack/WhatsApp)
- Conversation history persists and is visible when the user returns to the chat
- The chat respects RBAC — users can only chat with agents belonging to tenants they have access to
- The chat interface feels responsive — typing indicators, message streaming or fast response display
Plans (3):
- 06-01-PLAN.md — Backend: DB migration (web_conversations + web_conversation_messages), ORM models, ChannelType.WEB, Redis pub-sub key, WebSocket endpoint, web channel adapter, chat REST API with RBAC, orchestrator _send_response wiring, unit tests
- 06-02-PLAN.md — Frontend: /chat page with conversation sidebar, message window with markdown rendering, typing indicators, WebSocket hook, agent picker dialog, nav link, react-markdown install
- 06-03-PLAN.md — Human verification: end-to-end chat flow, conversation persistence, RBAC enforcement, markdown rendering, all roles can chat
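Two small pieces of 06-01 can be sketched directly: the per-conversation Redis pub-sub key the WebSocket endpoint subscribes to, and the RBAC check that restricts chats to tenants the user belongs to. The key format and the membership model here are assumptions made for illustration:

```python
# Sketch of 06-01 plumbing. The "webchat:" key prefix and the set-of-tenant-
# ids membership model are illustrative, not the actual schema.

def chat_channel_key(tenant_id, conversation_id):
    """Pub-sub key the WebSocket endpoint would subscribe to for one chat."""
    return f"webchat:{tenant_id}:{conversation_id}"

def can_open_chat(user_tenant_ids, agent_tenant_id, is_platform_admin=False):
    """Platform admins see everything; everyone else needs tenant membership."""
    return is_platform_admin or agent_tenant_id in user_tenant_ids
```

Embedding the tenant id in the channel key means a subscriber authorized for one tenant can never receive frames for another, even if conversation ids were to collide.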
Progress
Execution Order: Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> 10
| Phase | Plans Complete | Status | Completed |
|---|---|---|---|
| 1. Foundation | 4/4 | Complete | 2026-03-23 |
| 2. Agent Features | 6/6 | Complete | 2026-03-24 |
| 3. Operator Experience | 5/5 | Complete | 2026-03-24 |
| 4. RBAC | 3/3 | Complete | 2026-03-24 |
| 5. Employee Design | 4/4 | Complete | 2026-03-25 |
| 6. Web Chat | 3/3 | Complete | 2026-03-25 |
| 7. Multilanguage | 4/4 | Complete | 2026-03-25 |
| 8. Mobile + PWA | 4/4 | Complete | 2026-03-26 |
| 9. Testing & QA | 3/3 | Complete | 2026-03-26 |
| 10. Agent Capabilities | 2/3 | In Progress | |
Coverage Notes
LLM-03 conflict resolved: BYO API keys confirmed in v1 scope per user decision during Phase 3 context gathering. Implemented via Fernet encryption in Phase 3.
Phase 7: Multilanguage
Goal: The entire platform supports English, Spanish, and Portuguese — the portal UI is fully localized with a language switcher, and AI Employees respond in the user's language
Depends on: Phase 6
Requirements: I18N-01, I18N-02, I18N-03, I18N-04, I18N-05, I18N-06
Success Criteria (what must be TRUE):
- The portal UI (all pages, labels, buttons, messages) renders correctly in English, Spanish, and Portuguese
- A user can switch language from anywhere in the portal via a language selector, and the change persists across sessions
- AI Employees detect the user's language and respond in the same language — or use a language configured per agent
- Agent templates, wizard steps, and onboarding flow are all fully translated
- Error messages, validation text, and system notifications are localized
- Adding a new language in the future requires only adding translation files, not code changes
Plans (4):
- 07-01-PLAN.md — Backend i18n: migration 009 (language column + translations JSONB), system prompt language instruction, localized emails, locale-aware templates API
- 07-02-PLAN.md — Frontend i18n infrastructure: next-intl setup, complete en/es/pt message files, language switcher, Auth.js JWT language sync
- 07-03-PLAN.md — Frontend string extraction: replace all hardcoded English strings with useTranslations() calls across all pages and components
- 07-04-PLAN.md — Human verification: multilanguage testing across all pages, language switcher, AI Employee language response
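The I18N-06 criterion ("a new language is just a translation file") follows from resolving every message key through a fallback chain that ends in English. A tiny sketch of that resolution; the message catalogs below are stand-ins for the real en/es/pt next-intl files:

```python
# Sketch of message resolution with an English fallback. Catalogs here are
# toy stand-ins; the real ones are the complete en/es/pt message files.

MESSAGES = {
    "en": {"agents.new": "New Employee", "nav.billing": "Billing"},
    "es": {"agents.new": "Nuevo empleado"},
    "pt": {"agents.new": "Novo funcionário", "nav.billing": "Faturamento"},
}

def translate(key, locale, default_locale="en"):
    """Return the key's message in `locale`, falling back to English."""
    for loc in (locale, default_locale):
        if key in MESSAGES.get(loc, {}):
            return MESSAGES[loc][key]
    return key  # last resort: show the key rather than crash
```

With this shape, shipping a fourth language really is one more catalog entry, and a partially translated catalog degrades to English per key instead of breaking pages.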
Phase 8: Mobile + PWA
Goal: The portal is fully responsive on mobile/tablet devices and installable as a Progressive Web App — operators and customers can manage their AI workforce and chat with employees from any device
Depends on: Phase 7
Requirements: MOB-01, MOB-02, MOB-03, MOB-04, MOB-05, MOB-06
Success Criteria (what must be TRUE):
- All portal pages render correctly and are usable on mobile screens (320px-480px) and tablets (768px-1024px)
- The sidebar collapses to a bottom tab bar on mobile with smooth open/close animation
- The chat interface is fully functional on mobile — send messages, see streaming responses, scroll history
- The portal can be installed as a PWA from Chrome/Safari with app icon, splash screen, and offline shell
- Push notifications work for new messages when the PWA is installed (or at minimum, the service worker caches the app shell for instant load)
- All touch interactions (swipe, tap, long-press) feel native — no hover-dependent UI that breaks on touch
Plans (4):
- 08-01-PLAN.md — PWA infrastructure (manifest, service worker, icons, offline banner) + responsive layout (bottom tab bar, More sheet, layout split)
- 08-02-PLAN.md — Mobile chat (full-screen WhatsApp-style flow, Visual Viewport keyboard handling, touch-safe interactions)
- 08-03-PLAN.md — Push notifications (VAPID, push subscription DB, service worker push handler, offline message queue, install prompt)
- 08-04-PLAN.md — Human verification: mobile responsive layout, PWA install, push notifications, touch interactions
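The offline message queue in 08-03 follows a standard pattern: buffer outgoing messages while the connection is down, then flush them in order on reconnect. A language-neutral sketch (the shipped version lives in the service worker in TypeScript; the transport callable here stands in for the real WebSocket send):

```python
# Pattern sketch for the 08-03 offline queue: buffer while offline, flush
# in original order on reconnect. The `send` callable is a stand-in for
# the real WebSocket/service-worker transport.

class OfflineQueue:
    def __init__(self, send):
        self.send = send          # callable(message) -> None
        self.online = True
        self.pending = []

    def submit(self, message):
        if self.online:
            self.send(message)
        else:
            self.pending.append(message)   # hold until reconnect

    def set_online(self, online):
        self.online = online
        if online:
            while self.pending:            # drain in arrival order
                self.send(self.pending.pop(0))

sent = []
q = OfflineQueue(sent.append)
q.submit("a")
q.set_online(False)
q.submit("b"); q.submit("c")
q.set_online(True)
```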
Phase 9: Testing & QA
Goal: Comprehensive automated testing and quality assurance — E2E tests for critical user flows, Lighthouse audits for performance/accessibility, visual regression testing across viewports, and cross-browser validation — ensuring the platform is beta-ready
Depends on: Phase 8
Requirements: QA-01, QA-02, QA-03, QA-04, QA-05, QA-06, QA-07
Success Criteria (what must be TRUE):
- Playwright E2E tests cover all critical flows: login, tenant CRUD, agent deployment (template + wizard), chat with streaming response, billing, RBAC enforcement
- Lighthouse scores >= 90 for performance, accessibility, best practices, and SEO on key pages
- Visual regression snapshots exist for all key pages at desktop (1280px), tablet (768px), and mobile (375px) viewports
- axe-core accessibility audit passes with zero critical violations across all pages
- All E2E tests pass on Chrome, Firefox, and Safari (WebKit)
- Empty states, error states, and loading states are tested and render correctly
- A CI-ready test suite can run in a GitHub Actions / Gitea Actions pipeline
Plans (3):
- 09-01-PLAN.md — Playwright infrastructure (config, auth fixtures, seed helpers) + all 7 critical flow E2E tests (login, tenant CRUD, agent deploy, chat, RBAC, i18n, mobile)
- 09-02-PLAN.md — Visual regression snapshots at 3 viewports, axe-core accessibility scans, Lighthouse CI score gating
- 09-03-PLAN.md — Gitea Actions CI pipeline (backend lint+pytest, portal build+E2E+Lighthouse) + human verification
Phase 10: Agent Capabilities
Goal: Connect the 4 built-in agent tools to real external services so AI Employees can actually search the web, query a knowledge base of uploaded documents, make HTTP API calls, and check calendar availability — with full CRUD Google Calendar integration and a dedicated KB management portal page
Depends on: Phase 9
Requirements: CAP-01, CAP-02, CAP-03, CAP-04, CAP-05, CAP-06, CAP-07
Success Criteria (what must be TRUE):
- Web search tool returns real search results from a search provider (Brave, SerpAPI, or similar)
- Knowledge base tool can search documents that operators have uploaded (PDF, DOCX, TXT) — documents are chunked, embedded, and stored in pgvector per tenant
- Operators can upload documents to a tenant's knowledge base via the portal
- HTTP request tool can call arbitrary URLs configured by the operator, with response parsing
- Calendar tool can check availability on Google Calendar (read-only for v1)
- Tool results are incorporated naturally into agent responses (no raw JSON dumps)
- All tool invocations are logged in the audit trail with input/output
Plans (3):
- 10-01-PLAN.md — KB ingestion pipeline backend: migration 013, text extractors (PDF/DOCX/PPTX/XLSX/CSV/TXT/MD), chunking + embedding Celery task, KB API router (upload/list/delete/reindex/URL), executor tenant_id injection, web search config
- 10-02-PLAN.md — Google Calendar OAuth per tenant: install/callback endpoints, calendar_lookup replacement with list/create/check_availability, encrypted token storage, router mounting, tool response formatting
- 10-03-PLAN.md — Portal KB management page: document list with status polling, file upload (drag-and-drop), URL/YouTube ingestion, delete/reindex, RBAC, human verification
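The chunking step of 10-01 (split extracted text into overlapping windows before embedding into pgvector) can be sketched as a word-window walk. Window and overlap sizes below are illustrative, not the shipped defaults:

```python
# Sketch of the 10-01 chunking step: overlapping word windows so context at
# a chunk boundary appears in both neighbours. Sizes are illustrative.

def chunk_text(text, chunk_words=200, overlap_words=40):
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap_words     # advance less than a full window
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break                          # last window already covers the tail
    return chunks

chunks = chunk_text("word " * 500, chunk_words=200, overlap_words=40)
```

The overlap is what keeps a sentence that straddles a boundary retrievable: it is embedded in both adjacent chunks, so a similarity search against either side still finds it.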
Roadmap created: 2026-03-23
Coverage: 25/25 v1 requirements + 6 RBAC requirements + 5 Employee Design requirements + 5 Web Chat requirements + 6 Multilanguage requirements + 6 Mobile+PWA requirements + 7 Testing & QA requirements + 7 Agent Capabilities requirements mapped