docs(02): capture phase context
This commit is contained in:
122
.planning/phases/02-agent-features/02-CONTEXT.md
Normal file
122
.planning/phases/02-agent-features/02-CONTEXT.md
Normal file
@@ -0,0 +1,122 @@
|
|||||||
|
# Phase 2: Agent Features - Context
|
||||||
|
|
||||||
|
**Gathered:** 2026-03-23
|
||||||
|
**Status:** Ready for planning
|
||||||
|
|
||||||
|
<domain>
|
||||||
|
## Phase Boundary
|
||||||
|
|
||||||
|
The AI employee maintains conversation memory, can execute tools, handles WhatsApp messages, and escalates to humans when rules trigger. Includes conversational memory (sliding window + pgvector long-term), tool framework with 4 built-in tools, WhatsApp Business Cloud API integration with Meta 2026 policy compliance, human escalation/handoff, audit logging, and bidirectional media support across all channels.
|
||||||
|
|
||||||
|
</domain>
|
||||||
|
|
||||||
|
<decisions>
|
||||||
|
## Implementation Decisions
|
||||||
|
|
||||||
|
### Conversational Memory
|
||||||
|
- Full conversation history stored in pgvector — no messages are dropped
|
||||||
|
- Vector retrieval surfaces relevant past context (not full history dump) when assembling LLM prompt
|
||||||
|
- Cross-conversation memory — agent remembers user preferences and context across separate conversations
|
||||||
|
- Memory keyed per-user per-agent — memory follows the user across channels (same agent remembers you in Slack and WhatsApp)
|
||||||
|
- Indefinite retention — memory never expires. Operator can purge manually if needed.
|
||||||
|
- Sliding window for immediate context (last N messages verbatim in prompt), vector search for older/cross-conversation context
|
||||||
|
|
||||||
|
### Tool Framework
|
||||||
|
- 4 built-in tools for v1: web search, knowledge base search, HTTP request, calendar lookup
|
||||||
|
- Knowledge base content populated via both file upload (PDFs, docs) and URL ingestion (crawl, chunk, embed)
|
||||||
|
- Seamless tool usage — agent incorporates tool results naturally into responses, does NOT announce "let me look that up"
|
||||||
|
- Always confirm before consequential actions — agent asks user before booking calendar slots, sending HTTP requests, or any action with side effects. Read-only tools (search, KB lookup) execute without confirmation.
|
||||||
|
- Tool invocations are schema-validated before execution — prevents prompt injection into tool arguments
|
||||||
|
- Every tool invocation logged in audit trail
|
||||||
|
|
||||||
|
### Human Escalation
|
||||||
|
- Handoff destination: DM to an assigned human (configured per tenant in Agent Designer)
|
||||||
|
- Full conversation transcript included in the DM — human sees complete context
|
||||||
|
- Agent stays in the thread as assistant after escalation — can provide context to the human if asked, but defers to the human for responses to the end user
|
||||||
|
- Natural language escalation ("can I talk to a human?") is configurable per tenant — operator enables/disables in Agent Designer
|
||||||
|
- Configurable rule-based escalation triggers (from Agent Designer: e.g., failed resolution attempts, billing disputes)
|
||||||
|
|
||||||
|
### WhatsApp Interaction Model
|
||||||
|
- Same persona across Slack and WhatsApp — consistent employee identity, same name, tone, and behavior
|
||||||
|
- Business-function scoping: layered enforcement — explicit allowlist as first gate (canned rejection for clearly off-topic, no LLM call), then role-based LLM handling for edge cases
|
||||||
|
- Operator defines allowed business functions in Agent Designer (e.g., "order tracking", "returns", "support")
|
||||||
|
- Off-topic messages on WhatsApp get a polite redirect: "[Name] is here to help with [allowed topics]. How can I assist you with one of those?"
|
||||||
|
|
||||||
|
### Media Support (All Channels)
|
||||||
|
- Bidirectional media support across Slack and WhatsApp
|
||||||
|
- Agent can RECEIVE images and documents (invoices, receipts, screenshots) and interpret them via multimodal LLM
|
||||||
|
- Agent can SEND images and documents back to users (templates, reports, generated files)
|
||||||
|
- Use case examples: user sends an invoice photo, agent extracts line items; agent sends a company expense report template
|
||||||
|
- KonstructMessage format must be extended to handle media attachments (image URLs, file references)
|
||||||
|
- Media stored in MinIO (self-hosted) / S3 with per-tenant isolation
|
||||||
|
|
||||||
|
### Audit Logging
|
||||||
|
- Every LLM call, tool invocation, and handoff event recorded in immutable audit trail
|
||||||
|
- Audit entries include: timestamp, tenant_id, agent_id, user_id, action_type, input/output summary, latency
|
||||||
|
- Queryable by tenant — operators can review agent actions
|
||||||
|
- Audit data feeds into Phase 3 cost tracking dashboard
|
||||||
|
|
||||||
|
### Claude's Discretion
|
||||||
|
- Sliding window size (how many recent messages kept verbatim)
|
||||||
|
- Vector similarity threshold for memory retrieval
|
||||||
|
- KB chunking strategy and embedding model choice
|
||||||
|
- Calendar lookup integration approach (Google Calendar API vs generic iCal)
|
||||||
|
- Web search provider (Brave Search API, SerpAPI, etc.)
|
||||||
|
- HTTP request tool: timeout limits, allowed methods, response size caps
|
||||||
|
- WhatsApp message template format for outbound media
|
||||||
|
- Audit log storage strategy (PostgreSQL table vs append-only log)
|
||||||
|
|
||||||
|
</decisions>
|
||||||
|
|
||||||
|
<specifics>
|
||||||
|
## Specific Ideas
|
||||||
|
|
||||||
|
- Agents should feel like they genuinely remember you — not "according to my records" but natural recall like a real colleague would have
|
||||||
|
- Invoice/document interpretation is a key use case for SMBs — the agent reading a photo of a receipt and extracting amounts is a powerful demo
|
||||||
|
- The "employee" metaphor extends to escalation — it should feel like "let me get my manager" not "transferring to support"
|
||||||
|
- Tool confirmation should feel conversational: "I found a slot at 2pm Thursday — shall I book it?" not "Confirm action: calendar.book()"
|
||||||
|
|
||||||
|
</specifics>
|
||||||
|
|
||||||
|
<code_context>
|
||||||
|
## Existing Code Insights
|
||||||
|
|
||||||
|
### Reusable Assets
|
||||||
|
- `packages/shared/shared/models/tenant.py:Agent` — Agent model with name, role, persona, system_prompt, escalation_rules, model_preference fields
|
||||||
|
- `packages/shared/shared/models/message.py:KonstructMessage` — Unified message format (needs extension for media attachments)
|
||||||
|
- `packages/orchestrator/orchestrator/tasks.py:handle_message` — Celery task entry point (sync def with asyncio.run)
|
||||||
|
- `packages/orchestrator/orchestrator/agents/builder.py:build_system_prompt` — System prompt assembly with AI transparency clause
|
||||||
|
- `packages/orchestrator/orchestrator/agents/runner.py:run_agent` — LLM call via httpx to llm-pool
|
||||||
|
- `packages/gateway/gateway/normalize.py:normalize_slack_event` — Slack normalization (pattern for WhatsApp normalizer)
|
||||||
|
- `packages/gateway/gateway/channels/slack.py` — Slack adapter (pattern for WhatsApp adapter)
|
||||||
|
- `packages/router/router/ratelimit.py` — Token bucket rate limiter (reuse for WhatsApp)
|
||||||
|
- `packages/router/router/idempotency.py` — Idempotency dedup (reuse for WhatsApp webhooks)
|
||||||
|
- `packages/shared/shared/redis_keys.py` — Tenant-namespaced Redis key constructors
|
||||||
|
- `packages/shared/shared/rls.py` — RLS context hook (all new tables must use this)
|
||||||
|
|
||||||
|
### Established Patterns
|
||||||
|
- Celery tasks MUST be sync `def` with `asyncio.run()` — never async def
|
||||||
|
- RLS via `SET LOCAL app.current_tenant` in before_cursor_execute hook
|
||||||
|
- Redis keys always namespaced by tenant_id
|
||||||
|
- Channel adapters normalize to KonstructMessage before any business logic
|
||||||
|
- Placeholder message → async process → chat.update pattern for typing indicator
|
||||||
|
|
||||||
|
### Integration Points
|
||||||
|
- Orchestrator `handle_message` task needs: memory retrieval before LLM call, tool dispatch loop, escalation check, audit logging
|
||||||
|
- Gateway needs new WhatsApp adapter alongside existing Slack adapter
|
||||||
|
- Agent Designer in portal needs: allowed business functions field, escalation assignee field, tool configuration
|
||||||
|
- LLM pool needs multimodal support (image URLs in messages) for document interpretation
|
||||||
|
|
||||||
|
</code_context>
|
||||||
|
|
||||||
|
<deferred>
|
||||||
|
## Deferred Ideas
|
||||||
|
|
||||||
|
None — discussion stayed within phase scope
|
||||||
|
|
||||||
|
</deferred>
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Phase: 02-agent-features*
|
||||||
|
*Context gathered: 2026-03-23*
|
||||||
Reference in New Issue
Block a user