# Phase 10: Agent Capabilities - Context

Gathered: 2026-03-26
Status: Ready for planning
## Phase Boundary

Connect the four built-in agent tools to real external services. The biggest deliverable is the knowledge base document pipeline (upload → chunk → embed → search). The web search and HTTP request tools already have working implementations; web search only needs its API key configured. The calendar tool needs Google Calendar OAuth integration with full CRUD (not just read-only).
## Implementation Decisions

Knowledge Base & Document Upload
- Supported formats:
  - Files: PDF, DOCX/Word, TXT, Markdown, CSV/Excel, PPT/PowerPoint
  - URLs: Web page scraping/crawling via Firecrawl
  - YouTube: Transcriptions (use existing transcripts when available, OpenWhisper for transcription when not)
- KB is per-tenant — all agents in a tenant share the same knowledge base
- Dedicated KB management page in the portal (not inline in Agent Designer)
  - Upload files (drag-and-drop + file picker)
  - Add URLs for scraping
  - Add YouTube URLs for transcription
  - View ingested documents with status (processing, ready, error)
  - Delete documents (removes chunks from pgvector)
  - Re-index option
- Document processing is async/background — upload returns immediately, Celery task handles chunking + embedding
- Processing status visible in portal (progress indicator per document)
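
The chunking step of the pipeline above can be sketched as overlapping character windows. This is a minimal illustration, not the chosen design: the chunking strategy is left to discretion in this phase, and the function name `chunk_text` and the default sizes are assumptions made for the example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    chunk_size and overlap are placeholder values, not decisions from this phase.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so the tail is not emitted twice as a shorter chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice. A token-based or sentence-aware splitter would be a reasonable alternative once the embedding model is fixed.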
Web Search
- Brave Search API (already implemented in `web_search.py`)
- Configuration: Claude's discretion (platform-wide key recommended for simplicity, BYO optional)
- `BRAVE_API_KEY` added to `.env`
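
If the recommended option is taken (platform-wide key, with bring-your-own as an optional override), key resolution could look roughly like the sketch below. The function name `resolve_brave_api_key` and the idea of passing a per-tenant key as an argument are assumptions for illustration; only the `BRAVE_API_KEY` variable comes from the notes above.

```python
import os
from typing import Mapping, Optional


def resolve_brave_api_key(
    tenant_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the tenant's own key if one is set, else the platform-wide key."""
    env = os.environ if env is None else env
    if tenant_key:
        # Hypothetical BYO path: a tenant-supplied key takes precedence.
        return tenant_key
    platform_key = env.get("BRAVE_API_KEY", "")
    if not platform_key:
        raise RuntimeError("BRAVE_API_KEY is not configured")
    return platform_key


if __name__ == "__main__":
    print(resolve_brave_api_key(env={"BRAVE_API_KEY": "platform-key"}))
```

Failing loudly when no key is available makes a missing `.env` entry show up at the first search call rather than as an opaque upstream authentication error.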
HTTP Request Tool
- Already implemented in `http_request.py` with timeout and size limits
- Operator configures allowed URLs in Agent Designer tool_assignments
- No changes needed; the tool is functional
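
For orientation, an allowed-URL check of the kind described above might work along these lines. This is a sketch under assumptions, not the code in `http_request.py`: the helper name `is_url_allowed` is hypothetical, and it assumes each allowed entry is a URL prefix made of scheme, host, optional port, and optional path.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_url_allowed(url: str, allowed_prefixes: Iterable[str]) -> bool:
    """Return True if url shares origin and path prefix with an allowed entry."""
    target = urlsplit(url)
    if target.scheme not in ("http", "https") or not target.hostname:
        return False
    for prefix in allowed_prefixes:
        allowed = urlsplit(prefix)
        same_origin = (
            target.scheme == allowed.scheme
            and target.hostname == allowed.hostname
            and target.port == allowed.port
        )
        # Compare whole path segments so "/v1" does not match "/v10".
        allowed_path = allowed.path.rstrip("/")
        path_ok = (
            target.path == allowed_path
            or target.path.startswith(allowed_path + "/")
        )
        if same_origin and path_ok:
            return True
    return False


if __name__ == "__main__":
    allowed = ["https://api.example.com/v1"]
    print(is_url_allowed("https://api.example.com/v1/users", allowed))
    print(is_url_allowed("https://api.example.com.evil.test/v1/users", allowed))
```

Comparing parsed hostnames rather than raw string prefixes avoids the look-alike host problem shown in the second call.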
Calendar Integration
- Google Calendar OAuth per tenant — tenant admin authorizes in portal
- Full CRUD for v1: check availability, list upcoming events, create events (not read-only)
- OAuth callback handled in portal (similar pattern to Slack OAuth)
- Calendar credentials stored encrypted per tenant (reuse Fernet encryption from Phase 3)
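
The first leg of the per-tenant OAuth flow, sending the tenant admin to Google's consent screen, could be built as in the sketch below. The endpoint, query parameters, and scope are Google's documented OAuth 2.0 values; the function name `build_calendar_auth_url`, the example redirect URI, and the way `state` is generated are assumptions for illustration. How `state` is tied back to a tenant is not decided in these notes.

```python
import secrets
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
# Full read/write calendar scope, since v1 is not read-only.
CALENDAR_SCOPE = "https://www.googleapis.com/auth/calendar"


def build_calendar_auth_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the Google consent-screen URL for the authorization-code flow."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": CALENDAR_SCOPE,
        "access_type": "offline",  # request a refresh token for background use
        "prompt": "consent",       # force consent so a refresh token is issued
        "state": state,            # opaque value echoed back to the callback
    }
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Assumption: the portal stores this state server-side, keyed to the tenant.
    state = secrets.token_urlsafe(32)
    print(build_calendar_auth_url(
        "example-client-id",
        "https://portal.example.com/oauth/google/callback",
        state,
    ))
```

The callback would then exchange the returned `code` for tokens and store them encrypted per tenant, as described in the list above.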
Claude's Discretion
- Web search: platform-wide vs per-tenant API key (recommend platform-wide)
- Chunking strategy (chunk size, overlap)
- Embedding model for KB (reuse all-MiniLM-L6-v2 or upgrade)
- Firecrawl integration approach (self-hosted vs cloud API)
- YouTube transcription: when to use existing captions vs OpenWhisper
- Document size limits
- KB chunk deduplication strategy
- The KB page should show document processing status live — operators need to know when their docs are ready for agents to search
- YouTube transcription is a killer feature for SMBs — they can feed training videos, product demos, and meeting recordings into the agent's knowledge base
- URL scraping via Firecrawl means agents can learn from the company's website, help docs, and blog posts automatically
- Calendar event creation makes the Sales Assistant and Office Manager templates immediately valuable — they can actually book meetings
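
Of the discretionary items listed earlier, KB chunk deduplication is the easiest to illustrate. One possible approach, offered only as a sketch, is exact-match deduplication on a hash of whitespace- and case-normalized text. The names `chunk_fingerprint` and `dedupe_chunks` are hypothetical, and near-duplicate detection is out of scope here.

```python
import hashlib
from typing import Iterable, List


def chunk_fingerprint(text: str) -> str:
    """Hash the chunk after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_chunks(chunks: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each distinct chunk, preserving order."""
    seen = set()
    unique: List[str] = []
    for chunk in chunks:
        fingerprint = chunk_fingerprint(chunk)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(chunk)
    return unique


if __name__ == "__main__":
    print(dedupe_chunks(["Hello  world", "hello world", "Pricing page"]))
```

Storing the fingerprint alongside each chunk would also let a re-index skip chunks whose text has not changed, though that is a design choice beyond what these notes specify.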
<code_context>
Existing Code Insights
Reusable Assets
- `packages/orchestrator/orchestrator/tools/builtins/web_search.py`: Brave Search API integration (working, needs key)
- `packages/orchestrator/orchestrator/tools/builtins/kb_search.py`: pgvector similarity search (needs chunk data)
- `packages/orchestrator/orchestrator/tools/builtins/http_request.py`: HTTP client with limits (working)
- `packages/orchestrator/orchestrator/tools/builtins/calendar_lookup.py`: Placeholder stub (needs Google Calendar)
- `packages/orchestrator/orchestrator/memory/embedder.py`: SentenceTransformer singleton (reuse for KB embedding)
- `packages/shared/shared/models/kb.py`: KbDocument and KbChunk ORM models (created in Phase 2 migration)
- `packages/shared/shared/crypto.py`: Fernet encryption (reuse for Google Calendar tokens)
- `packages/shared/shared/api/channels.py`: OAuth pattern (reuse for Google Calendar OAuth)
Established Patterns
- Celery tasks for background processing (fire-and-forget with `embed_and_store.delay()`)
- pgvector HNSW cosine similarity with tenant_id pre-filter
- MinIO/S3 for file storage (configured but not used for KB yet)
- Fernet encrypted credential storage per tenant
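
To make the "tenant_id pre-filter, then cosine similarity" pattern concrete, here is a small in-memory sketch. It is an exact brute-force ranking in plain Python, not pgvector and not HNSW (which is an approximate index); the `Chunk` tuple layout and the name `search_chunks` are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Assumed layout for the example: (tenant_id, text, embedding)
Chunk = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_chunks(
    chunks: Sequence[Chunk],
    tenant_id: str,
    query_embedding: Sequence[float],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Filter to one tenant first, then rank that tenant's chunks by similarity."""
    candidates = [c for c in chunks if c[0] == tenant_id]
    scored = [
        (cosine_similarity(embedding, query_embedding), text)
        for _, text, embedding in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    data: List[Chunk] = [
        ("tenant-a", "refund policy", [1.0, 0.0]),
        ("tenant-a", "office hours", [0.0, 1.0]),
        ("tenant-b", "refund policy", [1.0, 0.0]),
    ]
    for score, text in search_chunks(data, "tenant-a", [1.0, 0.0], top_k=2):
        print(round(score, 3), text)
```

Filtering by tenant before ranking is what keeps one tenant's documents out of another tenant's results, regardless of how similar the embeddings are.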
Integration Points
- Portal needs new `/knowledge-base` page (similar to `/settings/api-keys`)
- Gateway needs document upload endpoint (multipart file upload)
- Gateway needs Google Calendar OAuth callback route
- Agent Designer needs Google Calendar connection status display
- Nav needs KB link added for customer_admin + platform_admin
</code_context>
## Deferred Ideas

None. Discussion stayed within phase scope.
Phase: 10-agent-capabilities
Context gathered: 2026-03-26