docs: complete project research

2026-03-22 00:12:58 -06:00
parent 320da9df87
commit 376982f16f
5 changed files with 1655 additions and 0 deletions
--- a/.planning/research/FEATURES.md
+++ b/.planning/research/FEATURES.md
@@ -0,0 +1,270 @@
+# Feature Research
+
+**Domain:** AI workforce platform — channel-native AI employees for SMBs (Slack + WhatsApp)
+**Researched:** 2026-03-22
+**Confidence:** MEDIUM-HIGH (web search findings verified against multiple sources; claims resting on a single source are flagged)
+
+---
+
+## Feature Landscape
+
+### Table Stakes (Users Expect These)
+
+Features users assume exist. Missing these = product feels incomplete or unprofessional.
+
+| Feature | Why Expected | Complexity | Notes |
+|---------|--------------|------------|-------|
+| Natural language conversation in-channel | Core promise: AI employee lives in Slack/WhatsApp. No NL = no product. | MEDIUM | Must handle message threading, @mentions, DMs, and group channels |
+| Persistent conversational memory | Users expect the AI to remember prior context within and across sessions. A "goldfish" agent feels broken. | MEDIUM | Sliding window (short-term) + vector search (long-term) required |
+| Human escalation / handoff | Users must be able to override the agent or transfer to a human. On WhatsApp this is implicitly required by Meta's 2026 policy: unresolvable queries need a path to a human. | MEDIUM | Full chat history must transfer with the handoff; the agent stops replying once the human takes over, so the two never overlap |
+| Role and persona configuration | Customers need to define what the AI employee does, its tone, its name. Without this it's a generic bot, not "their" employee. | LOW | YAML/form-based config: name, role description, system prompt |
+| Tool / integration capability | An AI that only talks but can't DO anything (look up a ticket, book a calendar slot) has minimal value for SMBs. | HIGH | Requires tool registry, sandboxed execution, defined tool schemas |
+| Admin portal for configuration | Operators need a UI to set up and manage agents. CLI-only = early adopter only. | HIGH | Tenant CRUD, agent config, channel connection, basic monitoring |
+| Multi-tenant isolation | Platform SaaS: Tenant A must never see Tenant B's data or conversations. | HIGH | PostgreSQL RLS at minimum; enforced at every layer |
+| Subscription billing | SaaS businesses must accept payment. No billing = no revenue = not a product. | MEDIUM | Stripe integration, plan management, upgrade/downgrade flows |
+| Slack integration (Events API + Socket Mode) | Slack is the primary channel for v1. Must support @mention, DM, channel messages, thread replies. | MEDIUM | slack-bolt handles Events API; Socket Mode for real-time without public webhook |
+| WhatsApp Business API integration | WhatsApp is second channel for v1. 3B+ users globally, dominant for SMB-to-customer and team comms. | MEDIUM | Cloud API (Meta-hosted) preferred over on-prem. Per-message billing since July 2025. |
+| Rate limiting per tenant | Without limits, one misbehaving tenant can degrade service for all others. Platform-level hygiene. | LOW | Token bucket per tenant + per channel; configurable hard limits |
+| Audit log for agent actions | SMBs want to know what the AI did. Required for debugging, trust-building, and future compliance. | MEDIUM | Every LLM call, tool invocation, and handoff should be logged with timestamp + actor |
+| Structured onboarding flow | Operators won't configure agents if setup is painful. Wizard-style onboarding is expected by SMB tools. | MEDIUM | Channel connection wizard, agent role setup, first-message test — all in portal |
+
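The "token bucket per tenant + per channel" note in the rate-limiting row can be sketched as below. This is a minimal in-memory illustration, not the platform's implementation: `TokenBucket`, `TenantRateLimiter`, and the injectable `clock` are hypothetical names, and a multi-process deployment would presumably keep bucket state in a shared store rather than a Python dict.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class TokenBucket:
    """Bucket that refills at `rate` tokens per second, capped at `capacity`."""
    capacity: float
    rate: float
    tokens: float
    updated: float

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, then spend one token.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class TenantRateLimiter:
    """One bucket per (tenant_id, channel) pair; names are hypothetical."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._rate = rate
        self._clock = clock
        self._buckets: dict[tuple[str, str], TokenBucket] = {}

    def allow(self, tenant_id: str, channel: str) -> bool:
        now = self._clock()
        key = (tenant_id, channel)
        bucket = self._buckets.get(key)
        if bucket is None:
            # New buckets start full so a tenant's first burst is not rejected.
            bucket = TokenBucket(self._capacity, self._rate, self._capacity, now)
            self._buckets[key] = bucket
        return bucket.allow(now)
```

Keying buckets by `(tenant_id, channel)` means a burst from one tenant on Slack drains neither that tenant's WhatsApp allowance nor any other tenant's bucket.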
+---
+
+### Differentiators (Competitive Advantage)
+
+Features that set Konstruct apart. Not universally expected but create defensible advantage.
+
+| Feature | Value Proposition | Complexity | Notes |
+|---------|-------------------|------------|-------|
+| True channel-native presence (not a dashboard) | Competitors (Lindy, Sintra, Relevance AI) all require a separate UI. Konstruct's AI lives IN the channel. Zero behavior change for end users. | HIGH | The entire architecture is built for this — gateway normalization, channel adapters, in-thread replies |
+| Single identity across channels (Slack + WhatsApp as same agent) | "Mara" responds on Slack during office hours and WhatsApp during off-hours — same agent, same memory, same persona. Competitors don't offer cross-channel identity. | HIGH | Requires unified memory store keyed to agent ID, not channel session |
+| Tiered multi-tenancy with upgrade path | Starter (RLS) → Team (schema) → Enterprise (dedicated namespace). Competitors are one-size-fits-all. Enables SMB-friendly pricing that scales to enterprise. | HIGH | RLS for v1; schema isolation in v2. Architecture must account for future upgrade path. |
+| LLM provider flexibility (local + commercial) | BYO model or use platform models. Privacy-conscious SMBs can stay on-prem (Ollama). Cost-sensitive ones use smaller models for simple tasks. No competitor offers this at SMB scale. | HIGH | LiteLLM router handles provider abstraction. BYO API keys follow in v1.x. |
+| Agent-level cost tracking and budgets | Paperclip-inspired: per-agent monthly budget with auto-pause at limit. SMB operators want cost predictability — they hired an "employee," not a runaway credit card. | MEDIUM | Track LLM tokens per agent per tenant. Surface in portal dashboard. |
+| Coordinator + specialist team pattern (v2) | One "coordinator" agent routes to specialist agents. Enables AI departments, not just individual employees. TeamDay.ai's market map identifies this as a gap: no platform targets it at SMBs (single-source claim). | VERY HIGH | v2 feature. Requires inter-agent communication, shared context, audit trail for delegation. |
+| Self-hosted deployment option (v2+) | Enterprise and compliance-sensitive customers can run their own Konstruct. No other SMB-focused competitor offers this. Differentiated vs. SaaS-only solutions. | VERY HIGH | Helm chart + Docker Compose package. Deferred to v2+. |
+| Pre-built agent role templates (v3) | "Customer support lead," "sales development rep," "project coordinator" — pre-configured roles reduce time-to-value. Competitors require extensive config (Lindy = "days or weeks" of setup). | MEDIUM | v3 marketplace. Platform must support importable agent configs first. |
+| Sentiment detection and auto-escalation | Agent detects negative sentiment or frustration and proactively escalates before the customer asks. Competitors handle explicit escalation triggers; proactive sentiment escalation is rare. | HIGH | Requires sentiment scoring in message processing pipeline. Configurable thresholds. |
+
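The per-agent monthly budget with auto-pause described in the cost-tracking row could look roughly like this. It is a sketch under stated assumptions: `AgentBudget`, `BudgetLedger`, and the flat `usd_per_1k_tokens` price are hypothetical, and real providers usually price prompt and completion tokens separately.

```python
from dataclasses import dataclass


@dataclass
class AgentBudget:
    """Monthly LLM spend for one agent; all figures are in USD."""
    monthly_limit_usd: float
    spent_usd: float = 0.0
    paused: bool = False


class BudgetLedger:
    """Tracks spend per (tenant_id, agent_id) and pauses agents at their limit."""

    def __init__(self) -> None:
        self._budgets: dict[tuple[str, str], AgentBudget] = {}

    def set_limit(self, tenant_id: str, agent_id: str, monthly_limit_usd: float) -> None:
        self._budgets[(tenant_id, agent_id)] = AgentBudget(monthly_limit_usd)

    def can_run(self, tenant_id: str, agent_id: str) -> bool:
        budget = self._budgets.get((tenant_id, agent_id))
        # Assumption: an agent with no configured budget is allowed to run.
        return budget is None or not budget.paused

    def record_usage(self, tenant_id: str, agent_id: str,
                     tokens: int, usd_per_1k_tokens: float) -> float:
        """Add the cost of one LLM call and pause the agent once the limit is hit."""
        budget = self._budgets[(tenant_id, agent_id)]
        cost = tokens / 1000 * usd_per_1k_tokens
        budget.spent_usd += cost
        if budget.spent_usd >= budget.monthly_limit_usd:
            budget.paused = True
        return cost
```

The orchestrator would call `can_run` before each LLM request and `record_usage` after it, so a paused agent stops spending until an operator raises the limit or the month resets (reset logic is omitted here).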
+---
+
+### Anti-Features (Commonly Requested, Often Problematic)
+
+Features that seem good but create problems when built too early or built wrong.
+
+| Feature | Why Requested | Why Problematic | Alternative |
+|---------|---------------|-----------------|-------------|
+| Open-ended general-purpose chatbot on WhatsApp | "Let users ask anything" seems like maximum flexibility | Meta banned general-purpose bots on WhatsApp Business API (effective Jan 2026). Violates ToS and risks account suspension. | Scope agents to specific business functions (support, sales, ops). Use intent detection to handle off-topic gracefully. |
+| Real-time streaming token output in chat | Feels more responsive and "alive" | Slack and WhatsApp do not support partial message streaming; at best you can update a message after the initial send. A streaming architecture adds complexity for little or no user benefit in these channels. | Send complete responses. Use typing indicators during generation. |
+| Full no-code agent builder for customers | "Let customers build their own agents" reduces support burden | Premature abstraction. If core agent quality isn't proven, giving customers a builder produces bad agents and they blame the platform. Increases surface area dramatically before PMF. | Provide config-based setup (YAML/form) with guardrails. Add builder UX in v2 after workflows are understood. |
+| Autonomous multi-step actions without confirmation | Fully autonomous "just do it" appeals to power users | SMBs have low tolerance for irreversible mistakes. Gartner predicts that over 40% of agentic AI projects will be cancelled by the end of 2027. Trust must be built incrementally. | Support human-in-the-loop confirmation for consequential actions (send email, create ticket, book meeting). Confirmation is on by default; operators can opt out rather than having to opt in. |
+| Cross-tenant agent communication | "Marketplace scenario: agents from different companies collaborating" | Major security and isolation violation. No current compliance framework supports it. Creates massive liability. | Keep agents strictly tenant-scoped. Marketplace is about sharing templates, not live agent-to-agent communication. |
+| Voice/telephony channels (Twilio integration) | Broadens market reach | Completely different technical stack, latency requirements, and regulatory environment (TCPA, call recording laws). Dilutes focus before channel-native messaging is proven. | Defer to v3+. Validate Slack + WhatsApp first. |
+| Dashboard-first UX (separate webapp for users to talk to AI) | Familiar pattern from other SaaS | Defeats the core value proposition. Konstruct's differentiator is zero behavior change — agent lives in existing channels. A separate dashboard makes Konstruct just another chatbot SaaS. | Keep all agent interactions in the messaging channel. Portal is for operators only, never for end-user conversations. |
+| Context dumping (all docs into vector store at once) | "The more context the better" | Research shows context flooding degrades LLM reasoning. Indiscriminate RAG causes hallucinations and irrelevant responses. | Implement selective retrieval with relevance scoring. Start with narrow, high-quality knowledge sources. Add context hygiene controls in admin portal. |
+
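The "selective retrieval with relevance scoring" alternative to context dumping can be illustrated with a similarity threshold plus a top-k cap. This is a sketch only: `select_context`, the `min_score` of 0.75, and the toy two-dimensional vectors are assumptions; in the planned stack the similarity search would presumably run inside pgvector rather than in Python.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_context(query_vec: Sequence[float],
                   chunks: Sequence[tuple[str, Sequence[float]]],
                   min_score: float = 0.75,
                   top_k: int = 3) -> list[str]:
    """Keep only chunks above the relevance threshold, best first, at most top_k."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    relevant = [pair for pair in scored if pair[0] >= min_score]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in relevant[:top_k]]
```

The threshold is what distinguishes this from plain top-k search: when nothing in the knowledge base is relevant, the agent receives no context at all instead of the least-bad chunks.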
+---
+
+## Feature Dependencies
+
+```
+[Slack Integration]
+    └──requires──> [Channel Gateway (normalize messages)]
+                       └──requires──> [Unified Message Format (KonstructMessage)]
+
+[WhatsApp Integration]
+    └──requires──> [Channel Gateway (normalize messages)]
+                       └──requires──> [Unified Message Format (KonstructMessage)]
+
+[Conversational Memory]
+    └──requires──> [Tenant-scoped conversation store (PostgreSQL)]
+    └──requires──> [Vector store for long-term memory (pgvector)]
+
+[Tool / Integration Capability]
+    └──requires──> [Tool Registry]
+    └──requires──> [Sandboxed Execution Environment]
+    └──requires──> [Agent Orchestrator (decides when to call tools)]
+
+[Agent Orchestrator]
+    └──requires──> [LLM Backend Pool (LiteLLM)]
+    └──requires──> [Conversational Memory]
+    └──requires──> [Tool Registry]
+
+[Multi-tenant Isolation]
+    └──requires──> [Tenant Resolution (Router)]
+    └──requires──> [PostgreSQL RLS configuration]
+    └──requires──> [Per-tenant Redis namespace]
+
+[Subscription Billing]
+    └──requires──> [Tenant management (CRUD)]
+    └──requires──> [Stripe integration]
+    └──enhanced-by──> [Agent-level cost tracking]
+
+[Admin Portal]
+    └──requires──> [Tenant management (CRUD)]
+    └──requires──> [Agent configuration storage]
+    └──requires──> [Channel connection management]
+    └──requires──> [Auth (NextAuth.js / Keycloak)]
+
+[Human Escalation / Handoff]
+    └──requires──> [Audit log (context must transfer)]
+    └──requires──> [Configurable escalation rules in agent config]
+
+[Agent-level Cost Tracking] ──enhances──> [Subscription Billing]
+[Audit Log] ──enhances──> [Human Escalation]
+[Audit Log] ──enhances──> [Admin Portal monitoring view]
+
+[Coordinator + Specialist Teams (v2)]
+    └──requires──> [Single-agent orchestrator (v1) proven stable]
+    └──requires──> [Inter-agent communication bus]
+    └──requires──> [Shared team context store]
+
+[Cross-channel Identity (same agent on Slack + WhatsApp)]
+    └──requires──> [Agent memory keyed to agent_id, not channel session_id]
+    └──requires──> [Both channel integrations working]
+
+[Self-hosted Deployment (v2+)]
+    └──requires──> [All v1 services containerized]
+    └──requires──> [Helm chart or Docker Compose packaging]
+    └──requires──> [External secrets management documented]
+```
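One way to turn the `requires` edges above into a build order is a topological sort. The sketch below uses Python's standard-library `graphlib` and encodes only a subset of the edges, with shortened node names chosen for illustration; `enhances` edges are left out because they do not constrain ordering.

```python
from graphlib import TopologicalSorter

# Each key requires every item in its set (a subset of the graph above).
REQUIRES: dict[str, set[str]] = {
    "Slack Integration": {"Channel Gateway"},
    "WhatsApp Integration": {"Channel Gateway"},
    "Channel Gateway": {"Unified Message Format"},
    "Conversational Memory": {"Conversation Store", "Vector Store"},
    "Agent Orchestrator": {"LLM Backend Pool", "Conversational Memory", "Tool Registry"},
    "Tool Capability": {"Tool Registry", "Sandboxed Execution", "Agent Orchestrator"},
    "Cross-channel Identity": {"Slack Integration", "WhatsApp Integration"},
}


def build_order(requires: dict[str, set[str]]) -> list[str]:
    """Return the features ordered so that prerequisites come first."""
    return list(TopologicalSorter(requires).static_order())


if __name__ == "__main__":
    for step, feature in enumerate(build_order(REQUIRES), start=1):
        print(step, feature)
```

If a circular requirement is ever introduced, `static_order()` raises `graphlib.CycleError`, which makes this a cheap consistency check on the dependency list.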
+
+### Dependency Notes
+
+- **Channel integrations require Channel Gateway:** All Slack/WhatsApp adapters must normalize to KonstructMessage before reaching any business logic. This isolation is what enables future channels to be added without touching the orchestrator.
+- **Agent Orchestrator requires LLM Pool:** The orchestrator cannot function without a working LiteLLM router. LLM Pool is a prerequisite, not a parallel track.
+- **Human handoff requires Audit Log:** The full conversation context (including tool calls) must be available at handoff time. Audit Log is not just a compliance feature — it's operationally required.
+- **Coordinator teams (v2) require stable v1 single-agent:** Multi-agent coordination multiplies failure modes. The single-agent path must be reliable and instrumented before introducing delegation.
+- **Cross-channel identity requires memory keyed to agent_id:** If conversation history is stored per-channel-session rather than per-agent, the same agent on two channels will have fragmented memory. This is an architectural decision that must be correct in v1.
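The first and last notes above can be made concrete with a small sketch: both adapters normalize to one message type, and the memory key is derived from the agent rather than the channel session. The document only names `KonstructMessage`; its fields, the adapter function names, and the key format here are assumptions, and the payloads are simplified versions of a Slack message event and a WhatsApp Cloud API text message.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class KonstructMessage:
    """Channel-agnostic message; the field set is hypothetical."""
    tenant_id: str
    agent_id: str
    channel: str  # "slack" or "whatsapp"
    sender_id: str
    text: str
    thread_id: Optional[str] = None


def from_slack_event(tenant_id: str, agent_id: str, event: dict[str, Any]) -> KonstructMessage:
    """Normalize a (simplified) Slack message event."""
    return KonstructMessage(
        tenant_id=tenant_id,
        agent_id=agent_id,
        channel="slack",
        sender_id=event["user"],
        text=event.get("text", ""),
        # Replies carry thread_ts; a top-level message starts a thread at its own ts.
        thread_id=event.get("thread_ts") or event.get("ts"),
    )


def from_whatsapp_message(tenant_id: str, agent_id: str, message: dict[str, Any]) -> KonstructMessage:
    """Normalize a (simplified) WhatsApp Cloud API text message."""
    return KonstructMessage(
        tenant_id=tenant_id,
        agent_id=agent_id,
        channel="whatsapp",
        sender_id=message["from"],
        text=message.get("text", {}).get("body", ""),
    )


def memory_key(msg: KonstructMessage) -> str:
    """Key long-term memory by tenant and agent, never by channel session."""
    return f"{msg.tenant_id}:{msg.agent_id}"
```

Because `memory_key` ignores `channel` and `thread_id`, the same agent reads and writes one memory space whether a message arrived via Slack or WhatsApp, which is the v1 decision the last note calls out.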
+
+---
+
+## MVP Definition
+
+### Launch With (v1 — Beta-Ready)
+
+Minimum viable product to validate the channel-native AI employee thesis with real paying users.
+
+- [ ] Slack integration (Events API + Socket Mode via slack-bolt) — primary channel, where SMB teams work
+- [ ] WhatsApp Business Cloud API integration — secondary channel, massive business communication reach
+- [ ] Channel Gateway with unified KonstructMessage normalization — architectural foundation for future channels
+- [ ] Single AI employee per tenant with configurable role, persona, and tools — prove the core thesis
+- [ ] Conversational memory (sliding window + pgvector long-term) — agents must remember; goldfish agents get churned
+- [ ] Tool framework with at least 2-3 built-in tools (web search, knowledge base search, calendar lookup) — agent must DO things, not just chat
+- [ ] Human escalation / handoff with full context transfer — required for trust, required for WhatsApp ToS compliance
+- [ ] LiteLLM backend pool (Ollama local + Anthropic/OpenAI commercial) — cost/quality flexibility
+- [ ] Multi-tenant PostgreSQL RLS isolation — prerequisite to accepting multiple real customers
+- [ ] Admin portal: tenant onboarding, agent config, channel connection wizard — operators need a UI, not config files
+- [ ] Stripe billing integration (subscription plans) — no billing = no revenue = not a real product
+- [ ] Rate limiting per tenant + per channel — platform protection before accepting real users
+- [ ] Audit log for agent actions — debugging, trust-building, future compliance foundation
+- [ ] Agent-level cost tracking — SMB operators need cost predictability; surfaces in portal dashboard
+
+### Add After Validation (v1.x)
+
+Features to add once core is stable and validated with early users.
+
+- [ ] BYO API key support — validated demand from privacy-conscious or cost-sensitive customers
+- [ ] Additional channels (Mattermost, Telegram, Microsoft Teams) — after Slack + WhatsApp patterns proven
+- [ ] Cross-channel agent identity (same agent memory across Slack + WhatsApp) — architectural upgrade once both channels are stable
+- [ ] Sentiment-based auto-escalation — requires volume of real conversations to tune thresholds
+- [ ] Pre-built tool integrations (Zendesk, HubSpot, Google Calendar) — validated by what tools early users actually request
+- [ ] Agent analytics dashboard in portal — requires baseline data from real usage
+
+### Future Consideration (v2+)
+
+Features to defer until product-market fit is established.
+
+- [ ] Multi-agent coordinator + specialist team pattern — complex orchestration only after single-agent is proven
+- [ ] AI company hierarchy (teams of teams) — organizational complexity requires strong single-agent foundation
+- [ ] Self-hosted deployment (Helm chart) — compliance-driven demand; validate SaaS first
+- [ ] Schema-per-tenant isolation (Team tier) — upgrade from RLS when scale requires it
+- [ ] Agent marketplace / pre-built role templates — requires understanding of what roles customers actually use
+- [ ] White-labeling for agencies — secondary market; validate direct SMB first
+- [ ] Voice/telephony channels — completely different stack; defer until messaging is proven
+
+---
+
+## Feature Prioritization Matrix
+
+| Feature | User Value | Implementation Cost | Priority |
+|---------|------------|---------------------|----------|
+| Slack integration | HIGH | MEDIUM | P1 |
+| WhatsApp integration | HIGH | MEDIUM | P1 |
+| Channel Gateway (normalization) | HIGH (architectural) | MEDIUM | P1 |
+| Conversational memory | HIGH | MEDIUM | P1 |
+| Human escalation / handoff | HIGH | MEDIUM | P1 |
+| Single agent per tenant (config + orchestration) | HIGH | HIGH | P1 |
+| Multi-tenant isolation (RLS) | HIGH (invisible, but critical) | HIGH | P1 |
+| Admin portal (onboarding + agent config) | HIGH | HIGH | P1 |
+| Stripe billing | HIGH | MEDIUM | P1 |
+| LiteLLM backend pool | HIGH (architectural) | MEDIUM | P1 |
+| Tool framework (registry + execution) | HIGH | HIGH | P1 |
+| Rate limiting | MEDIUM | LOW | P1 |
+| Audit logging | MEDIUM | MEDIUM | P1 |
+| Agent cost tracking | MEDIUM | MEDIUM | P1 |
+| BYO API keys | MEDIUM | MEDIUM | P2 |
+| Cross-channel agent identity | MEDIUM | HIGH | P2 |
+| Sentiment-based auto-escalation | MEDIUM | HIGH | P2 |
+| Pre-built tool integrations (Zendesk, HubSpot) | MEDIUM | MEDIUM | P2 |
+| Multi-agent coordinator teams | HIGH (v2) | VERY HIGH | P3 |
+| Self-hosted deployment | MEDIUM (v2+) | HIGH | P3 |
+| Agent marketplace / templates | MEDIUM (v3) | MEDIUM | P3 |
+
+**Priority key:**
+- P1: Must have for v1 beta launch
+- P2: Should have, add after v1 validation
+- P3: Future roadmap, defer until PMF established
+
+---
+
+## Competitor Feature Analysis
+
+| Feature | Lindy / Relevance AI | Sintra | Agentforce (Salesforce) | Paperclip.ing | Our Approach |
+|---------|----------------------|--------|-------------------------|----------------|--------------|
+| Channel-native presence | No — separate dashboard UI | No — separate UI | Partial — Slack only via enterprise plan | No — orchestration layer only (uses OpenClaw as channel layer) | Yes — primary value proposition; agents live IN Slack/WhatsApp |
+| SMB pricing | $49+/month, usage-based | $97/month flat | Enterprise pricing ($150+/user) | Open-source self-hosted | Subscription tiers starting SMB-friendly; transparent per-agent pricing |
+| Setup time | Days to weeks (no-code builder) | Fast but limited | Weeks (Salesforce ecosystem required) | Fast CLI setup; agents via connected frameworks | Target: under 30 minutes via the onboarding wizard in the portal |
+| Multi-agent teams | Yes (workflow chains) | No (siloed assistants) | Yes (enterprise) | Yes (org chart of agents) | v2 — single agent for v1, teams in v2 |
+| Memory / conversation history | Yes (varies by plan) | Limited | Yes (Slack Enterprise Search + CRM) | Yes (persistent agent state) | Yes — sliding window + pgvector long-term; cross-channel memory in v1.x |
+| Tool integrations | 1,600+ (Lindy) | Limited | Salesforce CRM native | Any HTTP webhook / bash | Start with essential SMB tools; expandable registry |
+| BYO LLM models | Partial | No | No (Salesforce models only) | Yes (any agent framework) | Yes — LiteLLM abstracts providers; BYO keys in v1.x |
+| Self-hosted option | No | No | No | Yes (MIT license) | v2+ (Helm chart) |
+| Human escalation | Yes | Limited | Yes | No (out of scope) | Yes — required for WhatsApp ToS and trust |
+| Audit trail | Partial | No | Yes (enterprise) | Yes (ticket system, tool-call tracing) | Yes — every action logged; surfaces in admin portal |
+| Multi-tenancy (SaaS) | Yes | Yes | Yes (enterprise) | No (single-tenant self-hosted) | Yes — PostgreSQL RLS v1, schema isolation v2 |
+| Cost tracking per agent | No | No | Limited | Yes (per-agent budgets) | Yes — adopting Paperclip's budget model; surface in portal |
+
+---
+
+## Critical External Constraint: WhatsApp 2026 Policy
+
+**HIGH confidence** — verified against Meta's official policy rollout (effective January 15, 2026):
+
+Meta banned open-ended general-purpose chatbots on the WhatsApp Business API. Agents must serve **specific business functions** (customer support, order tracking, lead qualification, booking). This constraint shapes how agent roles are defined and marketed:
+
+- Agent personas must be scoped to a business domain (support, sales, HR, ops)
+- "Ask me anything" configurations must be blocked or warned against in the admin portal
+- Escalation to humans is implicitly required for compliance (unresolvable queries must have an out)
+- General-purpose Q&A capabilities (weather, general knowledge) should be disabled in the WhatsApp adapter or gracefully declined
+
+This is not optional — violating it risks WhatsApp Business account suspension.
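The portal-side guardrails listed above could be enforced with a validation step at save time. This is a sketch under stated assumptions: the config keys (`channels`, `business_function`, `escalation_enabled`) and the allowed set are hypothetical, taken from the business domains named in this section rather than from any Meta API.

```python
# Business domains named in this section; the exact set is an assumption.
ALLOWED_WHATSAPP_FUNCTIONS = {"support", "sales", "hr", "ops"}


def validate_agent_config(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config passes."""
    problems: list[str] = []
    if "whatsapp" not in config.get("channels", []):
        # The scoping rules in this section apply to WhatsApp only.
        return problems

    function = str(config.get("business_function", "")).lower()
    if function not in ALLOWED_WHATSAPP_FUNCTIONS:
        problems.append(
            f"business_function {function!r} is not scoped to a business domain; "
            f"expected one of {sorted(ALLOWED_WHATSAPP_FUNCTIONS)}"
        )
    if not config.get("escalation_enabled", False):
        problems.append("human escalation must be enabled for WhatsApp agents")
    return problems
```

Whether a problem blocks the save or only shows a warning is a product decision the section leaves open ("blocked or warned against"), so the function reports problems and leaves enforcement to the caller.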
+
+---
+
+## Sources
+
+- [TeamDay.ai: AI Employees Market Map 2026](https://www.teamday.ai/blog/ai-employees-market-map-2026) — Platform comparison, market gap analysis (MEDIUM confidence — single source, industry blog)
+- [Paperclip.ing](https://paperclip.ing/) — Feature reference for AI workforce orchestration, cost tracking model (HIGH confidence — official source)
+- [OpenClaw: Multi-Channel AI Agent](https://openclaw.ai/) — Channel-native agent reference implementation (MEDIUM confidence — official source)
+- [Respond.io: WhatsApp General Purpose Chatbots Ban](https://respond.io/blog/whatsapp-general-purpose-chatbots-ban) — WhatsApp 2026 AI policy details (HIGH confidence — verified against Meta policy dates)
+- [Composio: Why AI Agent Pilots Fail 2026](https://composio.dev/blog/why-ai-agent-pilots-fail-2026-integration-roadmap) — Anti-patterns, failure modes (MEDIUM confidence — industry report)
+- [Kore.ai: Navigating Pitfalls of AI Agent Development](https://www.kore.ai/blog/navigating-the-pitfalls-of-ai-agent-development) — Agent development pitfalls (MEDIUM confidence)
+- [Stripe: Framework for Pricing AI Products](https://stripe.com/blog/a-framework-for-pricing-ai-products) — Billing model guidance (HIGH confidence — Stripe official)
+- [Slack: AI Agent Solutions](https://slack.com/ai-agents) — Slack AI agent capabilities reference (HIGH confidence — official Slack product page)
+- [Vendasta: AI Employees](https://www.vendasta.com/blog/ai-employee/) — SMB AI workforce patterns (MEDIUM confidence — industry blog)
+- [HBR: Why Agentic AI Projects Fail](https://hbr.org/2025/10/why-agentic-ai-projects-fail-and-how-to-set-yours-up-for-success) — Anti-pattern validation (HIGH confidence — editorially reviewed business publication)
+
+---
+*Feature research for: AI workforce platform — channel-native AI employees for SMBs*
+*Researched: 2026-03-22*