Compare commits
3 Commits
003bebc39f ... eae4b0324d

| Author | SHA1 | Date |
|---|---|---|
| | eae4b0324d | |
| | 95d05f5f88 | |
| | 9f70eede69 | |
```diff
@@ -131,7 +131,7 @@ Plans:
 
 ## Progress
 
 **Execution Order:**
 
-Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9
+Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> 10
 
 | Phase | Plans Complete | Status | Completed |
 |-------|----------------|--------|-----------|
@@ -144,7 +144,7 @@ Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9
 | 7. Multilanguage | 4/4 | Complete | 2026-03-25 |
 | 8. Mobile + PWA | 4/4 | Complete | 2026-03-26 |
 | 9. Testing & QA | 3/3 | Complete | 2026-03-26 |
-| 10. Agent Capabilities | 0/0 | Not started | - |
+| 10. Agent Capabilities | 0/3 | In progress | - |
 
 ---
 
@@ -210,7 +210,7 @@ Plans:
 - [ ] 09-03-PLAN.md — Gitea Actions CI pipeline (backend lint+pytest, portal build+E2E+Lighthouse) + human verification
 
 ### Phase 10: Agent Capabilities
-**Goal**: Connect the 4 built-in agent tools to real external services so AI Employees can actually search the web, query a knowledge base of uploaded documents, make HTTP API calls, and check calendar availability
+**Goal**: Connect the 4 built-in agent tools to real external services so AI Employees can actually search the web, query a knowledge base of uploaded documents, make HTTP API calls, and check calendar availability — with full CRUD Google Calendar integration and a dedicated KB management portal page
 **Depends on**: Phase 9
 **Requirements**: CAP-01, CAP-02, CAP-03, CAP-04, CAP-05, CAP-06, CAP-07
 **Success Criteria** (what must be TRUE):
@@ -221,11 +221,13 @@ Plans:
 5. Calendar tool can check availability on Google Calendar (read-only for v1)
 6. Tool results are incorporated naturally into agent responses (no raw JSON dumps)
 7. All tool invocations are logged in the audit trail with input/output
-**Plans**: 0 plans
+**Plans**: 3 plans
 
 Plans:
-- [ ] TBD (run /gsd:plan-phase 10 to break down)
+- [ ] 10-01-PLAN.md — KB ingestion pipeline backend: migration 013, text extractors (PDF/DOCX/PPTX/XLSX/CSV/TXT/MD), chunking + embedding Celery task, KB API router (upload/list/delete/reindex/URL), executor tenant_id injection, web search config
+- [ ] 10-02-PLAN.md — Google Calendar OAuth per tenant: install/callback endpoints, calendar_lookup replacement with list/create/check_availability, encrypted token storage, router mounting, tool response formatting
+- [ ] 10-03-PLAN.md — Portal KB management page: document list with status polling, file upload (drag-and-drop), URL/YouTube ingestion, delete/reindex, RBAC, human verification
 
 ---
 
 *Roadmap created: 2026-03-23*
-*Coverage: 25/25 v1 requirements + 6 RBAC requirements + 5 Employee Design requirements + 5 Web Chat requirements + 6 Multilanguage requirements + 6 Mobile+PWA requirements + 7 Testing & QA requirements mapped*
+*Coverage: 25/25 v1 requirements + 6 RBAC requirements + 5 Employee Design requirements + 5 Web Chat requirements + 6 Multilanguage requirements + 6 Mobile+PWA requirements + 7 Testing & QA requirements + 7 Agent Capabilities requirements mapped*
```
338
.planning/phases/10-agent-capabilities/10-01-PLAN.md
Normal file
@@ -0,0 +1,338 @@
---
phase: 10-agent-capabilities
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - migrations/versions/013_kb_status_and_calendar.py
  - packages/shared/shared/models/kb.py
  - packages/shared/shared/models/tenant.py
  - packages/shared/shared/config.py
  - packages/shared/shared/api/kb.py
  - packages/orchestrator/orchestrator/tools/ingest.py
  - packages/orchestrator/orchestrator/tools/extractors.py
  - packages/orchestrator/orchestrator/tasks.py
  - packages/orchestrator/orchestrator/tools/executor.py
  - packages/orchestrator/orchestrator/tools/builtins/kb_search.py
  - packages/orchestrator/pyproject.toml
  - .env.example
  - tests/unit/test_extractors.py
  - tests/unit/test_kb_upload.py
autonomous: true
requirements:
  - CAP-01
  - CAP-02
  - CAP-03
  - CAP-04
  - CAP-07

must_haves:
  truths:
    - "Documents uploaded via API are saved to MinIO and a KbDocument row is created with status=processing"
    - "The Celery ingestion task extracts text from PDF, DOCX, PPTX, XLSX, CSV, TXT, and MD files"
    - "Extracted text is chunked (500 chars, 50 overlap) and embedded via all-MiniLM-L6-v2 into kb_chunks with tenant_id"
    - "kb_search tool receives tenant_id injection from executor and returns matching chunks"
    - "BRAVE_API_KEY and FIRECRAWL_API_KEY are platform-wide settings in shared config"
    - "Tool executor injects tenant_id and agent_id into tool handler kwargs for context-aware tools"
  artifacts:
    - path: "migrations/versions/013_kb_status_and_calendar.py"
      provides: "DB migration: kb_documents status/error_message/chunk_count columns, agent_id nullable, channel_type CHECK update for google_calendar"
      contains: "status"
    - path: "packages/orchestrator/orchestrator/tools/extractors.py"
      provides: "Text extraction functions for all supported document formats"
      exports: ["extract_text"]
    - path: "packages/orchestrator/orchestrator/tools/ingest.py"
      provides: "Document chunking and ingestion pipeline logic"
      exports: ["chunk_text", "ingest_document_pipeline"]
    - path: "packages/shared/shared/api/kb.py"
      provides: "KB management API router (upload, list, delete, re-index)"
      exports: ["kb_router"]
    - path: "tests/unit/test_extractors.py"
      provides: "Unit tests for text extraction functions"
  key_links:
    - from: "packages/shared/shared/api/kb.py"
      to: "packages/orchestrator/orchestrator/tasks.py"
      via: "ingest_document.delay(document_id, tenant_id)"
      pattern: "ingest_document\\.delay"
    - from: "packages/orchestrator/orchestrator/tools/executor.py"
      to: "tool.handler"
      via: "tenant_id/agent_id injection into kwargs"
      pattern: "tenant_id.*agent_id.*handler"
---
<objective>
Build the knowledge base document ingestion pipeline backend and activate web search/HTTP tools.

Purpose: This is the core backend for CAP-02/CAP-03 — the document upload, text extraction, chunking, embedding, and storage pipeline that makes the KB search tool functional with real data. Also fixes the tool executor to inject tenant context into tool handlers, activates web search via BRAVE_API_KEY config, and confirms the HTTP request tool needs no changes (CAP-04).

Output: Working KB upload API, Celery ingestion task, text extractors for all formats, migration 013, executor tenant_id injection, updated config with new env vars.
</objective>

<execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/10-agent-capabilities/10-CONTEXT.md
@.planning/phases/10-agent-capabilities/10-RESEARCH.md

<interfaces>
<!-- Key types and contracts the executor needs -->

From packages/shared/shared/models/kb.py:
```python
class KnowledgeBaseDocument(KBBase):
    __tablename__ = "kb_documents"
    id: Mapped[uuid.UUID]
    tenant_id: Mapped[uuid.UUID]
    agent_id: Mapped[uuid.UUID]  # Currently NOT NULL — migration 013 makes nullable
    filename: Mapped[str | None]
    source_url: Mapped[str | None]
    content_type: Mapped[str | None]
    created_at: Mapped[datetime]
    chunks: Mapped[list[KBChunk]]


class KBChunk(KBBase):
    __tablename__ = "kb_chunks"
    id: Mapped[uuid.UUID]
    tenant_id: Mapped[uuid.UUID]
    document_id: Mapped[uuid.UUID]
    content: Mapped[str]
    chunk_index: Mapped[int | None]
    created_at: Mapped[datetime]
```

From packages/orchestrator/orchestrator/tools/executor.py:
```python
async def execute_tool(
    tool_call: dict[str, Any],
    registry: dict[str, "ToolDefinition"],
    tenant_id: uuid.UUID,
    agent_id: uuid.UUID,
    audit_logger: "AuditLogger",
) -> str:
    # Line 126: result = await tool.handler(**args)
    # PROBLEM: only LLM-provided args are passed, tenant_id/agent_id NOT injected
```

From packages/orchestrator/orchestrator/memory/embedder.py:
```python
def embed_text(text: str) -> list[float]:  # Returns 384-dim vector
def embed_texts(texts: list[str]) -> list[list[float]]:  # Batch embedding
```

From packages/shared/shared/config.py:
```python
class Settings(BaseSettings):
    minio_endpoint: str
    minio_access_key: str
    minio_secret_key: str
    minio_media_bucket: str
```

From packages/shared/shared/api/channels.py:
```python
channels_router = APIRouter(prefix="/api/portal/channels", tags=["channels"])
# Uses: require_tenant_admin, get_session, KeyEncryptionService
# OAuth state: generate_oauth_state() / verify_oauth_state() with HMAC-SHA256
```

From packages/shared/shared/api/rbac.py:
```python
class PortalCaller: ...
async def require_tenant_admin(...) -> PortalCaller: ...
async def require_tenant_member(...) -> PortalCaller: ...
```
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: Migration 013, ORM updates, config settings, text extractors, KB API router</name>
<files>
migrations/versions/013_kb_status_and_calendar.py,
packages/shared/shared/models/kb.py,
packages/shared/shared/models/tenant.py,
packages/shared/shared/config.py,
packages/shared/shared/api/kb.py,
packages/orchestrator/orchestrator/tools/extractors.py,
packages/orchestrator/pyproject.toml,
.env.example,
tests/unit/test_extractors.py,
tests/unit/test_kb_upload.py
</files>
<behavior>
- extract_text("hello.pdf", pdf_bytes) returns extracted text from PDF pages
- extract_text("doc.docx", docx_bytes) returns paragraph text from DOCX
- extract_text("slides.pptx", pptx_bytes) returns slide text from PPTX
- extract_text("data.xlsx", xlsx_bytes) returns CSV-formatted cell data
- extract_text("data.csv", csv_bytes) returns decoded UTF-8 text
- extract_text("notes.txt", txt_bytes) returns decoded text
- extract_text("notes.md", md_bytes) returns decoded text
- extract_text("file.exe", bytes) raises ValueError("Unsupported file extension")
- KB upload endpoint returns 201 with document_id for valid file
- KB list endpoint returns documents with status field
- KB delete endpoint removes document and chunks
</behavior>
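The extract_text behavior above amounts to a dispatch on file extension. A minimal sketch, assuming lazy imports for the optional parsing libraries (pypdf, python-docx); the PPTX/XLSX branches and the PDF OCR-length check are omitted here for brevity:

```python
import io
import os


def extract_text(filename: str, file_bytes: bytes) -> str:
    """Dispatch on extension; raise ValueError for unsupported types."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in {".txt", ".md", ".csv"}:
        # Plain-text formats: decode directly, replacing undecodable bytes
        return file_bytes.decode("utf-8", errors="replace")
    if ext == ".pdf":
        from pypdf import PdfReader  # lazy: optional dependency
        reader = PdfReader(io.BytesIO(file_bytes))
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if ext == ".docx":
        from docx import Document  # lazy: python-docx
        doc = Document(io.BytesIO(file_bytes))
        return "\n".join(p.text for p in doc.paragraphs)
    raise ValueError(f"Unsupported file extension: {ext}")
```

Lazy imports keep the plain-text paths usable even when the heavier parsing libraries are not installed.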
<action>
1. **Migration 013** (`migrations/versions/013_kb_status_and_calendar.py`):
   - ALTER TABLE kb_documents ADD COLUMN status TEXT NOT NULL DEFAULT 'processing'
   - ALTER TABLE kb_documents ADD COLUMN error_message TEXT
   - ALTER TABLE kb_documents ADD COLUMN chunk_count INTEGER
   - ALTER TABLE kb_documents ALTER COLUMN agent_id DROP NOT NULL (KB is per-tenant per locked decision)
   - DROP + re-ADD the channel_connections CHECK constraint to include 'google_calendar' (same pattern as migration 008)
   - New channel types tuple: slack, whatsapp, mattermost, rocketchat, teams, telegram, signal, web, google_calendar
   - Add CHECK constraint on kb_documents.status: CHECK (status IN ('processing', 'ready', 'error'))

2. **ORM updates**:
   - `packages/shared/shared/models/kb.py`: Add status (str, server_default='processing'), error_message (str | None), and chunk_count (int | None) mapped columns to KnowledgeBaseDocument. Change agent_id to nullable=True.
   - `packages/shared/shared/models/tenant.py`: Add GOOGLE_CALENDAR = "google_calendar" to ChannelTypeEnum

3. **Config** (`packages/shared/shared/config.py`):
   - Add brave_api_key: str = Field(default="", description="Brave Search API key")
   - Add firecrawl_api_key: str = Field(default="", description="Firecrawl API key for URL scraping")
   - Add google_client_id: str = Field(default="", description="Google OAuth client ID")
   - Add google_client_secret: str = Field(default="", description="Google OAuth client secret")
   - Add minio_kb_bucket: str = Field(default="kb-documents", description="MinIO bucket for KB documents")
   - Update .env.example with all new env vars

4. **Install dependencies** on orchestrator:
   ```bash
   uv add --project packages/orchestrator pypdf python-docx python-pptx openpyxl pandas firecrawl-py youtube-transcript-api google-api-python-client google-auth-oauthlib
   ```

5. **Text extractors** (`packages/orchestrator/orchestrator/tools/extractors.py`):
   - Create an extract_text(filename: str, file_bytes: bytes) -> str function
   - PDF: pypdf PdfReader on BytesIO, join page text with newlines
   - DOCX: python-docx Document on BytesIO, join paragraph text
   - PPTX: python-pptx Presentation on BytesIO, iterate slides/shapes for text
   - XLSX/XLS: pandas read_excel on BytesIO, to_csv(index=False)
   - CSV: decode UTF-8 with errors="replace"
   - TXT/MD: decode UTF-8 with errors="replace"
   - Raise ValueError for unsupported extensions
   - After extraction, if len(text.strip()) < 100 chars for a PDF — return an error message explaining that OCR is not supported

6. **KB API router** (`packages/shared/shared/api/kb.py`):
   - kb_router = APIRouter(prefix="/api/portal/kb", tags=["knowledge-base"])
   - POST /{tenant_id}/documents — multipart file upload (UploadFile + File)
     - Validate the file extension against the supported list
     - Read file bytes, upload to the MinIO kb-documents bucket with key: {tenant_id}/{doc_id}/(unknown)
     - Insert KnowledgeBaseDocument(tenant_id, filename, content_type, status='processing', agent_id=None)
     - Call ingest_document.delay(str(doc.id), str(tenant_id)) — import from orchestrator.tasks
     - Return 201 with {"id": str(doc.id), "filename": filename, "status": "processing"}
     - Guard with require_tenant_admin
   - POST /{tenant_id}/documents/url — JSON body {url: str, source_type: "web" | "youtube"}
     - Insert KnowledgeBaseDocument(tenant_id, source_url=url, status='processing', agent_id=None)
     - Call ingest_document.delay(str(doc.id), str(tenant_id))
     - Return 201
     - Guard with require_tenant_admin
   - GET /{tenant_id}/documents — list the tenant's documents with status, chunk_count, created_at
     - Guard with require_tenant_member (operators can view)
   - DELETE /{tenant_id}/documents/{document_id} — delete document (CASCADE deletes chunks)
     - Also delete the file from MinIO if filename is present
     - Guard with require_tenant_admin
   - POST /{tenant_id}/documents/{document_id}/reindex — delete existing chunks, re-dispatch ingest_document.delay
     - Guard with require_tenant_admin

7. **Tests** (write BEFORE implementation per tdd=true):
   - test_extractors.py: test each format's extraction with minimal valid files (create in-memory test fixtures using the libraries)
   - test_kb_upload.py: test the upload endpoint with mocked MinIO and mocked Celery task dispatch
</action>
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && python -m pytest tests/unit/test_extractors.py tests/unit/test_kb_upload.py -x -q</automated>
</verify>
<done>Migration 013 exists with all schema changes. Text extractors handle all 7 format families. KB API router has upload, list, delete, URL ingest, and reindex endpoints. All unit tests pass.</done>
</task>

<task type="auto" tdd="true">
<name>Task 2: Celery ingestion task, executor tenant_id injection, KB search wiring</name>
<files>
packages/orchestrator/orchestrator/tasks.py,
packages/orchestrator/orchestrator/tools/ingest.py,
packages/orchestrator/orchestrator/tools/executor.py,
packages/orchestrator/orchestrator/tools/builtins/kb_search.py,
packages/orchestrator/orchestrator/tools/builtins/web_search.py,
tests/unit/test_ingestion.py,
tests/unit/test_executor_injection.py
</files>
<behavior>
- chunk_text("hello world " * 100, chunk_size=500, overlap=50) returns overlapping chunks of correct size
- ingest_document_pipeline fetches the file from MinIO, extracts text, chunks, embeds, inserts kb_chunks rows, and updates status to 'ready'
- ingest_document_pipeline sets status='error' with error_message on failure
- execute_tool injects tenant_id and agent_id into handler kwargs before calling the handler
- web_search reads BRAVE_API_KEY from settings (not os.getenv) for consistency
- kb_search receives the injected tenant_id from the executor
</behavior>
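The chunk_text behavior described above is a plain sliding window; one possible sketch (parameter names follow the plan, implementation details are assumptions):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Sliding-window chunker: each window starts chunk_size - overlap after the last."""
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size].strip()
        if chunk:  # drop empty/whitespace-only windows
            chunks.append(chunk)
    return chunks
```

Consecutive chunks share `overlap` characters, so a sentence that straddles a window boundary stays retrievable from at least one chunk.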
<action>
1. **Chunking + ingestion logic** (`packages/orchestrator/orchestrator/tools/ingest.py`):
   - chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]
     - Simple sliding-window chunker; strip empty chunks
   - async ingest_document_pipeline(document_id: str, tenant_id: str) -> None:
     - Load KnowledgeBaseDocument from the DB by ID (use RLS with tenant_id)
     - If filename: download file bytes from MinIO (boto3 client, kb-documents bucket, key: {tenant_id}/{document_id}/(unknown))
     - If source_url contains "youtube.com" or "youtu.be": use youtube_transcript_api to fetch the transcript
     - If source_url and not YouTube: use firecrawl-py to scrape the URL to markdown (graceful error if FIRECRAWL_API_KEY is not set)
     - Call extract_text(filename, file_bytes) for file uploads
     - Call chunk_text(text) on the extracted text
     - Batch embed chunks using embed_texts() from embedder.py
     - INSERT kb_chunks rows with embedding vectors (use the raw SQL text() with CAST(:embedding AS vector) pattern from kb_search.py)
     - UPDATE kb_documents SET status='ready', chunk_count=len(chunks)
     - On any error: UPDATE kb_documents SET status='error', error_message=str(exc)

2. **Celery task** in `packages/orchestrator/orchestrator/tasks.py`:
   - Add an ingest_document Celery task (sync def with asyncio.run per hard architectural constraint)
   - @celery_app.task(bind=True, max_retries=2, ignore_result=True)
   - def ingest_document(self, document_id: str, tenant_id: str) -> None
   - Calls asyncio.run(ingest_document_pipeline(document_id, tenant_id))
   - On exception: asyncio.run to mark the document as error, then self.retry(countdown=60)

3. **Executor tenant_id injection** (`packages/orchestrator/orchestrator/tools/executor.py`):
   - Before calling tool.handler(**args), inject tenant_id and agent_id as string kwargs:
     - args["tenant_id"] = str(tenant_id)
     - args["agent_id"] = str(agent_id)
   - This makes kb_search, calendar_lookup, and future context-aware tools work without the LLM needing to know tenant context
   - Place the injection AFTER schema validation (line ~126) so the injected keys don't fail validation

4. **Update web_search.py**: Change `os.getenv("BRAVE_API_KEY", "")` to import settings from shared.config and use `settings.brave_api_key` for consistency with the platform-wide config pattern.

5. **Tests** (write BEFORE implementation):
   - test_ingestion.py: test chunk_text with various inputs; test ingest_document_pipeline with mocked MinIO/DB/embedder
   - test_executor_injection.py: test that execute_tool injects tenant_id/agent_id into handler kwargs
</action>
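The injection in step 3 is two assignments placed after schema validation and before the handler call. A reduced sketch (the real execute_tool also handles the registry, validation, and audit logging; the handler below is illustrative):

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable


async def execute_tool_sketch(
    args: dict[str, Any],
    handler: Callable[..., Awaitable[str]],
    tenant_id: uuid.UUID,
    agent_id: uuid.UUID,
) -> str:
    # Inject AFTER schema validation so the extra keys are never checked
    # against the LLM-facing parameters schema.
    args["tenant_id"] = str(tenant_id)
    args["agent_id"] = str(agent_id)
    return await handler(**args)


async def kb_search(query: str, tenant_id: str = "", agent_id: str = "", **kwargs) -> str:
    # Context-aware tools accept the injected kwargs; **kwargs keeps tools
    # that ignore the context from raising TypeError on the extra keys.
    return f"searched '{query}' in tenant {tenant_id}"
```

The `**kwargs` catch-all on each handler is the cheap way to make blanket injection safe across tools that do and don't use the context.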
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && python -m pytest tests/unit/test_ingestion.py tests/unit/test_executor_injection.py -x -q</automated>
</verify>
<done>Celery ingest_document task dispatches the async ingestion pipeline. The pipeline downloads files from MinIO, extracts text, chunks, embeds, and stores in kb_chunks. Executor injects tenant_id/agent_id into all tool handlers. web_search uses shared config. All tests pass.</done>
</task>

</tasks>

<verification>
- Migration 013 applies cleanly: `cd /home/adelorenzo/repos/konstruct && alembic upgrade head`
- All unit tests pass: `pytest tests/unit/test_extractors.py tests/unit/test_kb_upload.py tests/unit/test_ingestion.py tests/unit/test_executor_injection.py -x -q`
- KB API router mounts and serves: import kb_router without errors
- Executor properly injects tenant context into tool handlers
</verification>

<success_criteria>
- KnowledgeBaseDocument has status, error_message, chunk_count columns; agent_id is nullable
- channel_connections CHECK constraint includes 'google_calendar'
- Text extraction works for PDF, DOCX, PPTX, XLSX, CSV, TXT, MD
- KB upload endpoint accepts files and dispatches the Celery task
- KB list/delete/reindex endpoints work
- URL and YouTube ingestion endpoints dispatch Celery tasks
- Celery ingestion pipeline: extract -> chunk -> embed -> store
- Tool executor injects tenant_id and agent_id into handler kwargs
- BRAVE_API_KEY and FIRECRAWL_API_KEY in shared config
- All unit tests pass
</success_criteria>

<output>
After completion, create `.planning/phases/10-agent-capabilities/10-01-SUMMARY.md`
</output>
262
.planning/phases/10-agent-capabilities/10-02-PLAN.md
Normal file
@@ -0,0 +1,262 @@
---
phase: 10-agent-capabilities
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - packages/shared/shared/api/calendar_auth.py
  - packages/orchestrator/orchestrator/tools/builtins/calendar_lookup.py
  - packages/orchestrator/orchestrator/tools/registry.py
  - tests/unit/test_calendar_lookup.py
  - tests/unit/test_calendar_auth.py
autonomous: true
requirements:
  - CAP-05
  - CAP-06

user_setup:
  - service: google-cloud
    why: "Google Calendar OAuth for per-tenant calendar access"
    env_vars:
      - name: GOOGLE_CLIENT_ID
        source: "Google Cloud Console -> APIs & Services -> Credentials -> OAuth 2.0 Client ID (Web application)"
      - name: GOOGLE_CLIENT_SECRET
        source: "Google Cloud Console -> APIs & Services -> Credentials -> OAuth 2.0 Client ID secret"
    dashboard_config:
      - task: "Create OAuth 2.0 Client ID (Web application type)"
        location: "Google Cloud Console -> APIs & Services -> Credentials"
      - task: "Add authorized redirect URI: {PORTAL_URL}/api/portal/calendar/callback"
        location: "Google Cloud Console -> Credentials -> OAuth client -> Authorized redirect URIs"
      - task: "Enable Google Calendar API"
        location: "Google Cloud Console -> APIs & Services -> Library -> Google Calendar API"

must_haves:
  truths:
    - "Tenant admin can initiate Google Calendar OAuth from the portal and authorize calendar access"
    - "Calendar OAuth callback exchanges code for tokens and stores them encrypted per tenant"
    - "Calendar tool reads per-tenant OAuth tokens from channel_connections and calls Google Calendar API"
    - "Calendar tool supports list events, check availability, and create event actions"
    - "Token auto-refresh works — expired access tokens are refreshed via stored refresh_token and written back to DB"
    - "Tool results are formatted as natural language (no raw JSON)"
  artifacts:
    - path: "packages/shared/shared/api/calendar_auth.py"
      provides: "Google Calendar OAuth install + callback endpoints"
      exports: ["calendar_auth_router"]
    - path: "packages/orchestrator/orchestrator/tools/builtins/calendar_lookup.py"
      provides: "Per-tenant OAuth calendar tool with list/create/check_availability"
      exports: ["calendar_lookup"]
    - path: "tests/unit/test_calendar_lookup.py"
      provides: "Unit tests for calendar tool with mocked Google API"
  key_links:
    - from: "packages/shared/shared/api/calendar_auth.py"
      to: "channel_connections table"
      via: "Upsert ChannelConnection(channel_type='google_calendar') with encrypted token"
      pattern: "google_calendar.*encrypt"
    - from: "packages/orchestrator/orchestrator/tools/builtins/calendar_lookup.py"
      to: "channel_connections table"
      via: "Load encrypted token, decrypt, build Credentials, call Google API"
      pattern: "Credentials.*refresh_token"
---
<objective>
Build per-tenant Google Calendar OAuth integration and replace the service-account stub with a full CRUD calendar tool.

Purpose: Enables CAP-05 (calendar availability checking + event creation) by replacing the service-account stub in calendar_lookup.py with per-tenant OAuth token lookup. Also addresses CAP-06 (natural-language tool results) by ensuring calendar and all tool outputs are formatted as readable text.

Output: Google Calendar OAuth install/callback endpoints, a fully functional calendar_lookup tool with list/create/check_availability actions, encrypted per-tenant token storage, and token auto-refresh with write-back.
</objective>

<execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/10-agent-capabilities/10-CONTEXT.md
@.planning/phases/10-agent-capabilities/10-RESEARCH.md

<interfaces>
<!-- Existing OAuth pattern from Slack to reuse -->

From packages/shared/shared/api/channels.py:
```python
channels_router = APIRouter(prefix="/api/portal/channels", tags=["channels"])


def _generate_oauth_state(tenant_id: uuid.UUID) -> str:
    """HMAC-SHA256 signed state with embedded tenant_id + nonce."""
    ...


def _verify_oauth_state(state: str) -> uuid.UUID:
    """Verify HMAC signature, return tenant_id. Raises HTTPException on failure."""
    ...
```
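For illustration, the signed-state pattern those two helpers describe can be sketched like this (the secret source and payload layout are assumptions, not the actual channels.py implementation, and the sketch raises ValueError where the real code raises HTTPException):

```python
import base64
import hashlib
import hmac
import secrets
import uuid

STATE_SECRET = b"change-me"  # assumed: derived from app settings in the real code


def generate_oauth_state(tenant_id: uuid.UUID) -> str:
    # Payload = tenant_id + random nonce; signature = HMAC-SHA256 over the payload
    payload = f"{tenant_id}:{secrets.token_hex(8)}".encode()
    sig = hmac.new(STATE_SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def verify_oauth_state(state: str) -> uuid.UUID:
    b64, _, sig = state.rpartition(".")
    payload = base64.urlsafe_b64decode(b64.encode())
    expected = hmac.new(STATE_SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Invalid OAuth state signature")
    return uuid.UUID(payload.decode().split(":")[0])
```

Because the state is self-authenticating, the callback can recover the tenant without a session cookie, which is exactly what an external OAuth redirect needs.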

From packages/shared/shared/crypto.py:
```python
class KeyEncryptionService:
    def encrypt(self, plaintext: str) -> str: ...
    def decrypt(self, ciphertext: str) -> str: ...
```

From packages/shared/shared/models/tenant.py:
```python
class ChannelConnection(Base):
    __tablename__ = "channel_connections"
    id: Mapped[uuid.UUID]
    tenant_id: Mapped[uuid.UUID]
    channel_type: Mapped[ChannelTypeEnum]  # TEXT + CHECK in DB
    workspace_id: Mapped[str]
    config: Mapped[dict]  # JSON — stores encrypted token
    created_at: Mapped[datetime]
```

From packages/shared/shared/config.py (after Plan 01):
```python
class Settings(BaseSettings):
    google_client_id: str = ""
    google_client_secret: str = ""
```
</interfaces>
</context>
<tasks>

<task type="auto" tdd="true">
<name>Task 1: Google Calendar OAuth endpoints and calendar tool replacement</name>
<files>
packages/shared/shared/api/calendar_auth.py,
packages/orchestrator/orchestrator/tools/builtins/calendar_lookup.py,
tests/unit/test_calendar_lookup.py,
tests/unit/test_calendar_auth.py
</files>
<behavior>
- OAuth install endpoint returns a redirect URL with HMAC-signed state containing tenant_id
- OAuth callback verifies the HMAC state, exchanges the code for tokens, then encrypts and stores them in channel_connections as the google_calendar type
- OAuth callback redirects to the portal settings page with a connected=true param
- calendar_lookup(date, action="list", tenant_id=...) loads the encrypted token from the DB, decrypts it, calls the Google Calendar API, and returns a formatted event list
- calendar_lookup(date, action="create", event_summary=..., event_start=..., event_end=..., tenant_id=...) creates a Google Calendar event and returns confirmation
- calendar_lookup(date, action="check_availability", tenant_id=...) returns a free/busy summary
- calendar_lookup returns an informative message when no Google Calendar is connected for the tenant
- Token refresh: if the access_token is expired, google-auth auto-refreshes it and the updated token is written back to the DB
- All results are natural-language strings, not raw JSON
</behavior>
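The install endpoint's URL construction from the behavior above is plain query-string assembly. A sketch with illustrative inputs (`response_type=code` is added per the standard authorization-code flow; the other parameter values follow the plan):

```python
from urllib.parse import urlencode

GOOGLE_AUTH_BASE = "https://accounts.google.com/o/oauth2/v2/auth"


def build_google_oauth_url(client_id: str, portal_url: str, state: str) -> str:
    params = {
        "client_id": client_id,
        "redirect_uri": f"{portal_url}/api/portal/calendar/callback",
        "response_type": "code",
        "scope": "https://www.googleapis.com/auth/calendar",
        "state": state,
        "access_type": "offline",  # required to receive a refresh_token
        "prompt": "consent",       # force consent so a refresh_token is always returned
    }
    return f"{GOOGLE_AUTH_BASE}?{urlencode(params)}"
```

Without `access_type=offline` plus `prompt=consent`, Google only returns a refresh_token on the first authorization, which would break reconnect flows.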
<action>
1. **Calendar OAuth router** (`packages/shared/shared/api/calendar_auth.py`):
   - calendar_auth_router = APIRouter(prefix="/api/portal/calendar", tags=["calendar"])
   - Import and reuse _generate_oauth_state / _verify_oauth_state from channels.py (or extract to a shared utility if private)
     - If they are private (_prefix), create equivalent functions in this module using the same HMAC pattern
   - GET /install?tenant_id={id}:
     - Guard with require_tenant_admin
     - Generate HMAC-signed state with tenant_id
     - Build the Google OAuth URL: https://accounts.google.com/o/oauth2/v2/auth with:
       - client_id from settings
       - redirect_uri = settings.portal_url + "/api/portal/calendar/callback"
       - scope = "https://www.googleapis.com/auth/calendar" (full read+write per locked decision)
       - state = hmac_state
       - access_type = "offline" (to get a refresh_token)
       - prompt = "consent" (force consent to always get a refresh_token)
     - Return {"url": oauth_url}
   - GET /callback?code={code}&state={state}:
     - NO auth guard (external redirect from Google — no session cookie)
     - Verify the HMAC state to recover tenant_id
     - Exchange the code for tokens using google_auth_oauthlib or an httpx POST to https://oauth2.googleapis.com/token
     - Encrypt the token JSON with KeyEncryptionService (Fernet)
     - Upsert ChannelConnection(tenant_id=tenant_id, channel_type="google_calendar", workspace_id=str(tenant_id), config={"token": encrypted_token})
     - Redirect to portal /settings?calendar=connected
   - GET /{tenant_id}/status:
     - Guard with require_tenant_member
     - Check whether a ChannelConnection with channel_type='google_calendar' exists for the tenant
     - Return {"connected": true/false}

2. **Replace calendar_lookup.py** entirely:
   - Remove all service-account code
   - New signature: async def calendar_lookup(date: str, action: str = "list", event_summary: str | None = None, event_start: str | None = None, event_end: str | None = None, calendar_id: str = "primary", tenant_id: str | None = None, **kwargs) -> str
   - If no tenant_id: return "Calendar not available: missing tenant context."
   - Load ChannelConnection(channel_type='google_calendar', tenant_id=tenant_uuid) from the DB
   - If not found: return "Google Calendar is not connected for this tenant. Ask an admin to connect it in Settings."
   - Decrypt the token JSON, build google.oauth2.credentials.Credentials
   - Build the Calendar service: build("calendar", "v3", credentials=creds, cache_discovery=False)
   - Run the API call in a thread executor (same pattern as the original — avoid blocking the event loop)
   - action="list": list events for the date, format as "Calendar events for {date}:\n- {time}: {summary}\n..."
   - action="check_availability": list events, format as "Busy slots on {date}:\n..." or "No events — the entire day is free."
   - action="create": insert an event with summary, start, and end; return "Event created: {summary} from {start} to {end}"
   - After the API call: check whether credentials.token changed (a refresh occurred) — if so, encrypt and UPDATE channel_connections.config with the new token
   - All errors return human-readable messages, never raw exceptions

3. **Update tool registry** if needed — ensure the calendar_lookup parameters schema includes the action, event_summary, event_start, and event_end fields so the LLM knows about the CRUD capabilities. Check packages/orchestrator/orchestrator/tools/registry.py for the calendar_lookup entry and update its parameters JSON schema.

4. **Tests** (write BEFORE implementation):
   - test_calendar_lookup.py: mock the Google Calendar API (googleapiclient.discovery.build), mock the DB session to return an encrypted token, test the list/create/check_availability actions, test the "not connected" path, test token refresh write-back
   - test_calendar_auth.py: mock httpx for the token exchange, test HMAC state generation/verification, test that the callback stores the encrypted token
</action>
|
||||
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && python -m pytest tests/unit/test_calendar_lookup.py tests/unit/test_calendar_auth.py -x -q</automated>
</verify>
<done>Google Calendar OAuth install/callback endpoints work. The calendar tool loads per-tenant tokens, supports list/create/check_availability, and formats results as natural language. Token refresh writes back to the DB. The service-account stub is completely removed. All tests pass.</done>
</task>
<task type="auto">
<name>Task 2: Mount new API routers on gateway and update tool response formatting</name>
<files>
packages/gateway/gateway/main.py,
packages/orchestrator/orchestrator/tools/registry.py,
packages/orchestrator/orchestrator/agents/prompt.py
</files>
<action>
1. **Mount routers on the gateway** (`packages/gateway/gateway/main.py`):
   - Import kb_router from shared.api.kb and include it on the FastAPI app (same pattern as channels_router, billing_router, etc.)
   - Import calendar_auth_router from shared.api.calendar_auth and include it on the app
   - Verify both are accessible via curl or import

2. **Update the tool registry** (`packages/orchestrator/orchestrator/tools/registry.py`):
   - Update the calendar_lookup tool definition's parameters schema to include:
     - action: enum ["list", "check_availability", "create"] (required)
     - event_summary: string (optional, for create)
     - event_start: string (optional, ISO 8601 with timezone, for create)
     - event_end: string (optional, ISO 8601 with timezone, for create)
     - date: string (required, YYYY-MM-DD format)
   - Update the description to mention the CRUD capabilities: "Look up, check availability, or create calendar events"
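As a sketch, the updated parameters schema might look like this as a Python dict; the surrounding registry entry shape in registry.py is an assumption, but the fields mirror the list above:

```python
# Hypothetical calendar_lookup parameters schema; field names follow the plan text.
CALENDAR_LOOKUP_PARAMS = {
    "type": "object",
    "properties": {
        "date": {"type": "string", "description": "Target date in YYYY-MM-DD format"},
        "action": {
            "type": "string",
            "enum": ["list", "check_availability", "create"],
            "description": "Calendar operation to perform",
        },
        "event_summary": {"type": "string", "description": "Event title (create only)"},
        "event_start": {"type": "string", "description": "ISO 8601 start with timezone (create only)"},
        "event_end": {"type": "string", "description": "ISO 8601 end with timezone (create only)"},
    },
    "required": ["date", "action"],
}
```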
3. **Tool result formatting check** (CAP-06):
   - Review the agent runner prompt — the LLM already receives tool results as 'tool' role messages and formulates a response. Verify the system prompt does NOT contain instructions to dump raw JSON.
   - If the system prompt builder (`packages/orchestrator/orchestrator/agents/prompt.py` or similar) has tool-related instructions, ensure it says: "When using tool results, incorporate the information naturally into your response. Never show raw data or JSON to the user."
   - If no such instruction exists, add it as a tool usage instruction appended to the system prompt when tools are assigned.
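Appending the instruction could look like the following (the builder function name is hypothetical; only the instruction text itself comes from the plan):

```python
TOOL_FORMATTING_INSTRUCTION = (
    "When using tool results, incorporate the information naturally into your "
    "response. Never show raw data or JSON to the user."
)


def with_tool_instructions(system_prompt: str, has_tools: bool) -> str:
    """Append the formatting instruction only when the agent has tools assigned."""
    if not has_tools:
        return system_prompt
    return f"{system_prompt}\n\n{TOOL_FORMATTING_INSTRUCTION}"
```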
4. **Verify CAP-04 (HTTP request tool)**: Confirm that http_request.py needs no changes — it already works. Just verify it is in the tool registry and functions correctly.

5. **Verify CAP-07 (audit logging)**: Confirm that executor.py already calls audit_logger.log_tool_call() on every invocation (it does — verified in code review). No changes needed.
</action>
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && python -c "from shared.api.kb import kb_router; from shared.api.calendar_auth import calendar_auth_router; print('Routers import OK')" && python -c "from orchestrator.tools.registry import TOOL_REGISTRY; print(f'Registry has {len(TOOL_REGISTRY)} tools')"</automated>
</verify>
<done>KB and Calendar Auth routers mounted on the gateway. Calendar tool registry updated with CRUD parameters. System prompt includes the tool result formatting instruction. CAP-04 (HTTP) confirmed working. CAP-07 (audit) confirmed working. All routers importable.</done>
</task>
</tasks>

<verification>
- Calendar OAuth endpoints accessible: GET /api/portal/calendar/install, GET /api/portal/calendar/callback
- KB API endpoints accessible: POST/GET/DELETE /api/portal/kb/{tenant_id}/documents
- Calendar tool supports the list, create, and check_availability actions
- All unit tests pass: `pytest tests/unit/test_calendar_lookup.py tests/unit/test_calendar_auth.py -x -q`
- Tool registry has the updated calendar_lookup schema with CRUD params
</verification>

<success_criteria>
- Google Calendar OAuth flow: install -> Google consent -> callback -> encrypted token stored in channel_connections
- Calendar tool reads per-tenant tokens and calls the Google Calendar API for list, create, and availability checks
- Token auto-refresh works with write-back to the DB
- Natural-language formatting on all tool results (no raw JSON)
- All new routers mounted on the gateway
- CAP-04 and CAP-07 confirmed already working
- All unit tests pass
</success_criteria>

<output>
After completion, create `.planning/phases/10-agent-capabilities/10-02-SUMMARY.md`
</output>
.planning/phases/10-agent-capabilities/10-03-PLAN.md (new file, 197 lines)
@@ -0,0 +1,197 @@
---
phase: 10-agent-capabilities
plan: 03
type: execute
wave: 2
depends_on: ["10-01"]
files_modified:
  - packages/portal/app/(dashboard)/knowledge-base/page.tsx
  - packages/portal/components/kb/document-list.tsx
  - packages/portal/components/kb/upload-dialog.tsx
  - packages/portal/components/kb/url-ingest-dialog.tsx
  - packages/portal/components/nav/sidebar.tsx
  - packages/portal/lib/api.ts
autonomous: false
requirements:
  - CAP-03

must_haves:
  truths:
    - "Operators can see a Knowledge Base page in the portal navigation"
    - "Operators can upload files via drag-and-drop or file picker dialog"
    - "Operators can add URLs (web pages) and YouTube URLs for ingestion"
    - "Uploaded documents show processing status (processing, ready, error) with live polling"
    - "Operators can delete documents from the knowledge base"
    - "Operators can re-index a document"
    - "Customer operators can view the KB but not upload or delete (RBAC)"
  artifacts:
    - path: "packages/portal/app/(dashboard)/knowledge-base/page.tsx"
      provides: "KB management page with document list, upload, and URL ingestion"
      min_lines: 50
    - path: "packages/portal/components/kb/document-list.tsx"
      provides: "Document list component with status badges and action buttons"
    - path: "packages/portal/components/kb/upload-dialog.tsx"
      provides: "File upload dialog with drag-and-drop and file picker"
  key_links:
    - from: "packages/portal/app/(dashboard)/knowledge-base/page.tsx"
      to: "/api/portal/kb/{tenant_id}/documents"
      via: "TanStack Query fetch + polling"
      pattern: "useQuery.*kb.*documents"
    - from: "packages/portal/components/kb/upload-dialog.tsx"
      to: "/api/portal/kb/{tenant_id}/documents"
      via: "FormData multipart POST"
      pattern: "FormData.*upload"
---
<objective>
Build the Knowledge Base management page in the portal where operators can upload documents, add URLs, view processing status, and manage their tenant's knowledge base.

Purpose: Completes CAP-03 by providing the user-facing interface for document management. Operators need to see what's in their KB, upload new content, and monitor ingestion status.

Output: A fully functional /knowledge-base portal page with file upload, URL/YouTube ingestion, a document list with status polling, delete, and re-index.
</objective>
<execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/10-agent-capabilities/10-CONTEXT.md
@.planning/phases/10-agent-capabilities/10-01-SUMMARY.md

<interfaces>
<!-- KB API endpoints from Plan 01 -->
POST /api/portal/kb/{tenant_id}/documents — multipart file upload, returns 201 {id, filename, status}
POST /api/portal/kb/{tenant_id}/documents/url — JSON {url, source_type}, returns 201 {id, source_url, status}
GET /api/portal/kb/{tenant_id}/documents — returns [{id, filename, source_url, content_type, status, error_message, chunk_count, created_at}]
DELETE /api/portal/kb/{tenant_id}/documents/{document_id} — returns 204
POST /api/portal/kb/{tenant_id}/documents/{document_id}/reindex — returns 200
<!-- Portal patterns -->
- TanStack Query for data fetching (useQuery, useMutation)
- shadcn/ui components (Button, Dialog, Badge, Table, etc.)
- Tailwind CSS for styling
- next-intl useTranslations() for i18n
- RBAC: session.user.role determines admin vs operator capabilities
</interfaces>
</context>
<tasks>

<task type="auto">
<name>Task 1: Knowledge Base page with document list, upload, and URL ingestion</name>
<files>
packages/portal/app/(dashboard)/knowledge-base/page.tsx,
packages/portal/components/kb/document-list.tsx,
packages/portal/components/kb/upload-dialog.tsx,
packages/portal/components/kb/url-ingest-dialog.tsx,
packages/portal/lib/api.ts,
packages/portal/components/nav/sidebar.tsx
</files>
<action>
1. **Add a KB link to the navigation** (`sidebar.tsx` or the equivalent nav component):
   - Add a "Knowledge Base" link to the sidebar nav, visible for the platform_admin and customer_admin roles
   - customer_operator can view (read-only) — add to the nav but hide the upload/delete buttons
   - Icon: use a document/book icon from lucide-react

2. **KB page** (`packages/portal/app/(dashboard)/knowledge-base/page.tsx`):
   - Server Component wrapper that renders the client KB content
   - Page title: "Knowledge Base" with a subtitle showing tenant context
   - Two action buttons for admins: "Upload Files" (opens the upload dialog), "Add URL" (opens the URL dialog)
   - Document list component below the actions
   - Use tenant_id from session/route context (same pattern as other dashboard pages)

3. **Document list** (`packages/portal/components/kb/document-list.tsx`):
   - Client component using useQuery to fetch GET /api/portal/kb/{tenant_id}/documents
   - Poll every 5 seconds while any document has status='processing' (conditional refetchInterval: 5000)
   - Table with columns: Name (filename or source_url), Type (file/url/youtube), Status (badge), Chunks, Date, Actions
   - Status badges: "Processing" (amber/spinning), "Ready" (green), "Error" (red, with a tooltip showing error_message)
   - Actions per row (admin only): Delete button, Re-index button
   - Empty state: "No documents in knowledge base yet. Upload files or add URLs to get started."
   - Delete: useMutation calling the DELETE endpoint, invalidate the query on success, confirm dialog before delete
   - Re-index: useMutation calling the POST reindex endpoint, invalidate the query on success

4. **Upload dialog** (`packages/portal/components/kb/upload-dialog.tsx`):
   - shadcn/ui Dialog component
   - Drag-and-drop zone (onDragOver, onDrop handlers) with visual feedback
   - File picker button (input type="file" with accept for supported extensions: .pdf,.docx,.pptx,.xlsx,.csv,.txt,.md)
   - Support multiple file selection
   - Show the selected files list before upload
   - Upload button: for each file, POST FormData to /api/portal/kb/{tenant_id}/documents
   - Show upload progress (file-by-file)
   - Close the dialog and invalidate the document list query on success
   - Error handling: show a toast on failure

5. **URL ingest dialog** (`packages/portal/components/kb/url-ingest-dialog.tsx`):
   - shadcn/ui Dialog component
   - Input field for the URL
   - Radio or select for source type: "Web Page" or "YouTube Video"
   - Auto-detect: if the URL contains youtube.com or youtu.be, default to YouTube
   - Submit: POST to /api/portal/kb/{tenant_id}/documents/url
   - Close the dialog and invalidate the document list query on success

6. **API client updates** (`packages/portal/lib/api.ts`):
   - Add KB API functions: fetchKbDocuments, uploadKbDocument, addKbUrl, deleteKbDocument, reindexKbDocument
   - Use the same fetch wrapper pattern as existing API calls

7. **i18n**: Add English, Spanish, and Portuguese translations for the KB page strings (following the existing i18n pattern with next-intl message files). Add keys like kb.title, kb.upload, kb.addUrl, kb.empty, kb.status.processing, kb.status.ready, kb.status.error, kb.delete.confirm, etc.
</action>
<verify>
<automated>cd /home/adelorenzo/repos/konstruct/packages/portal && npx next build 2>&1 | tail -5</automated>
</verify>
<done>Knowledge Base page exists at /knowledge-base with document list, file upload dialog (drag-and-drop + picker), URL/YouTube ingest dialog, status polling, delete, and re-index. Navigation updated. i18n strings added for all three languages. Portal builds successfully.</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<name>Task 2: Human verification of Knowledge Base portal page</name>
<files>packages/portal/app/(dashboard)/knowledge-base/page.tsx</files>
<action>
Verify the Knowledge Base management page in the portal:
- File upload via drag-and-drop and file picker (PDF, DOCX, PPTX, XLSX, CSV, TXT, MD)
- URL ingestion (web pages via Firecrawl, YouTube transcripts)
- Document list with live processing status (processing/ready/error)
- Delete and re-index actions
- RBAC: admins can upload/delete, operators can only view

Steps:
1. Navigate to the portal and confirm "Knowledge Base" appears in the sidebar navigation
2. Click Knowledge Base — verify the page loads with the empty state message
3. Click "Upload Files" — verify the drag-and-drop zone and file picker appear
4. Upload a small PDF or TXT file — verify it appears in the document list with "Processing" status
5. Wait for processing to complete — verify the status changes to "Ready" with a chunk count
6. Click "Add URL" — verify the URL input dialog with the web/YouTube type selector
7. Add a URL — verify it appears in the list and processes
8. Click delete on a document — verify the confirmation dialog, then that the document is removed
9. If logged in as customer_operator — verify upload/delete buttons are hidden but the document list is visible
</action>
<verify>Human verification of KB page functionality and RBAC</verify>
<done>KB page approved by human testing — upload, URL ingest, status polling, delete, re-index, and RBAC all working</done>
</task>
</tasks>

<verification>
- Portal builds: `cd packages/portal && npx next build`
- KB page renders at /knowledge-base
- Document upload triggers backend ingestion
- Status polling shows the processing -> ready transition
- RBAC enforced on upload/delete actions
</verification>

<success_criteria>
- Knowledge Base page accessible in the portal navigation
- File upload works with drag-and-drop and file picker
- URL and YouTube ingestion works
- Document list shows live processing status with polling
- Delete and re-index work
- RBAC enforced (admin: full access, operator: view only)
- All three languages have KB translations
- Human verification approved
</success_criteria>

<output>
After completion, create `.planning/phases/10-agent-capabilities/10-03-SUMMARY.md`
</output>
.planning/phases/10-agent-capabilities/10-RESEARCH.md (new file, 621 lines)
@@ -0,0 +1,621 @@
# Phase 10: Agent Capabilities - Research

**Researched:** 2026-03-26
**Domain:** Document ingestion pipeline, Google Calendar OAuth, web search activation, KB portal UI
**Confidence:** HIGH

---

<user_constraints>
## User Constraints (from CONTEXT.md)

### Locked Decisions

- **KB format support:** PDF, DOCX/Word, TXT, Markdown, CSV/Excel, PPT/PowerPoint, URLs (via Firecrawl), YouTube (transcript API + Whisper fallback)
- **KB scope:** Per-tenant — all agents in a tenant share the same knowledge base
- **KB portal:** Dedicated KB management page (not inline in the Agent Designer)
  - Upload files (drag-and-drop + file picker)
  - Add URLs for scraping
  - Add YouTube URLs for transcription
  - View ingested documents with status (processing, ready, error)
  - Delete documents (removes chunks from pgvector)
  - Re-index option
- **Document processing:** Async/background via Celery — upload returns immediately
- **Processing status:** Visible in the portal (progress indicator per document)
- **Web search:** Brave Search API already implemented in `web_search.py` — just needs `BRAVE_API_KEY` added to `.env`
- **HTTP request tool:** Already implemented — no changes needed
- **Calendar:** Google Calendar OAuth per tenant — a tenant admin authorizes in the portal; full CRUD for v1 (check availability, list upcoming events, create events); OAuth callback in the portal; credentials stored encrypted via Fernet

### Claude's Discretion

- Web search: platform-wide vs per-tenant API key (recommend platform-wide)
- Chunking strategy (chunk size, overlap)
- Embedding model for the KB (reuse all-MiniLM-L6-v2 or upgrade)
- Firecrawl integration approach (self-hosted vs cloud API)
- YouTube transcription: when to use existing captions vs OpenAI Whisper
- Document size limits
- KB chunk deduplication strategy

### Deferred Ideas (OUT OF SCOPE)

None — discussion stayed within phase scope.
</user_constraints>
---

<phase_requirements>
## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| CAP-01 | Web search tool returns real results from Brave Search | Tool already calls the Brave API — just needs the `BRAVE_API_KEY` env var set; `web_search.py` is production-ready |
| CAP-02 | KB tool searches tenant-scoped documents that have been uploaded, chunked, and embedded in pgvector | `kb_search.py` + the `kb_chunks` table + HNSW index all exist; needs real chunk data from the ingestion pipeline |
| CAP-03 | Operators can upload documents (PDF, DOCX, TXT + more formats) via the portal | Needs: a new FastAPI `/api/portal/kb/*` router, a Celery ingestion task, a portal `/knowledge-base` page, per-format text extraction libraries |
| CAP-04 | HTTP request tool can call operator-configured URLs with response parsing and timeout handling | `http_request.py` is fully implemented — no code changes needed, only documentation |
| CAP-05 | Calendar tool can check Google Calendar availability | The stub in `calendar_lookup.py` must be replaced with a per-tenant OAuth token read + Google Calendar API call |
| CAP-06 | Tool results incorporated naturally into agent responses — no raw JSON | The agent runner already formats tool results as text strings; this is an LLM prompt-quality concern, not architecture |
| CAP-07 | All tool invocations logged in the audit trail with input parameters and output summary | `execute_tool()` in executor.py already calls `audit_logger.log_tool_call()` on every invocation — already satisfied |
</phase_requirements>

---
## Summary

Phase 10 has two distinct effort levels. CAP-01, CAP-04, CAP-07, and partially CAP-06 are already architecturally complete — they need configuration, environment variables, or documentation rather than new code. The heavy lifting is CAP-03 (document ingestion pipeline) and CAP-05 (Google Calendar OAuth per tenant).

The document ingestion pipeline is the largest deliverable: a multipart file upload endpoint, text extraction for 7 format families, a chunking + embedding Celery task, MinIO storage for original files, status tracking on `kb_documents`, and a new portal page with drag-and-drop upload and live status polling. The KB table schema and pgvector HNSW index already exist from Phase 2 migration 004.

The Google Calendar integration requires replacing the service-account stub in `calendar_lookup.py` with a per-tenant OAuth token lookup (decrypted from the DB), building a Google OAuth initiation + callback endpoint pair in the gateway, storing encrypted access + refresh tokens per tenant, and expanding the calendar tool to support event creation in addition to reads. This follows the same HMAC-signed state + encrypted token storage pattern already used for Slack OAuth.

**Primary recommendation:** Build the document ingestion pipeline first (CAP-02/CAP-03), then Google Calendar OAuth (CAP-05), then wire CAP-01 via `.env` configuration.

---
## Standard Stack

### Core (Python backend)

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `pypdf` | >=4.0 | PDF text extraction | Pure Python, no C deps, fast, reliable for standard PDFs |
| `python-docx` | >=1.1 | DOCX text extraction | Official-style library, handles paragraphs + tables |
| `python-pptx` | >=1.0 | PPT/PPTX text extraction | Standard library for PowerPoint, iterates slides/shapes |
| `openpyxl` | >=3.1 | XLSX text extraction | Already likely installed; reads cell values with `data_only=True` |
| `pandas` | >=2.0 | CSV + Excel parsing | Handles encodings, type coercion, multi-sheet Excel |
| `firecrawl-py` | >=1.0 | URL scraping to markdown | Returns clean LLM-ready markdown, handles JS rendering |
| `youtube-transcript-api` | >=1.2 | YouTube caption extraction | No API key needed, works with auto-generated captions |
| `google-api-python-client` | >=2.0 | Google Calendar API calls | Official Google client |
| `google-auth-oauthlib` | >=1.0 | Google OAuth 2.0 web flow | Handles code exchange, token refresh |

### Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `aiofiles` | >=23.0 | Async file I/O in the FastAPI upload handler | Prevents blocking the event loop during file writes |
| `python-multipart` | already installed (FastAPI dep) | Multipart form parsing for UploadFile | Required by FastAPI for file upload endpoints |

### Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| `pypdf` | `pymupdf4llm` | pymupdf4llm is faster and higher quality but carries AGPL license restrictions |
| `pypdf` | `pdfplumber` | pdfplumber is better for tables but 4x slower; pypdf is sufficient for KB ingestion |
| `firecrawl-py` (cloud API) | Self-hosted Firecrawl | Self-hosted has full feature parity via Docker but adds infrastructure overhead; the cloud API is simpler for v1 |
| `youtube-transcript-api` | `openai-whisper` | Whisper requires a model download + GPU; use youtube-transcript-api first and fall back to Whisper only when captions are unavailable |
| Simple text chunking | `langchain-text-splitters` | langchain-text-splitters adds a large dependency for what is ~20 lines of custom code; write a simple recursive chunker inline |
**Installation:**

```bash
# Orchestrator: document processing + Google Calendar
uv add --project packages/orchestrator \
  pypdf python-docx python-pptx openpyxl pandas \
  firecrawl-py youtube-transcript-api \
  google-api-python-client google-auth-oauthlib

# Gateway: file upload endpoint (python-multipart already installed via FastAPI)
# No additional deps needed for gateway

# Add status column to kb_documents: handled in new Alembic migration
```

---
## Architecture Patterns

### Recommended Project Structure (new files this phase)

```
packages/
├── orchestrator/orchestrator/
│   ├── tasks.py                    # Add: ingest_document Celery task
│   └── tools/builtins/
│       └── calendar_lookup.py      # Replace stub with OAuth token lookup + full CRUD
├── shared/shared/
│   ├── api/
│   │   ├── kb.py                   # New: KB management router (upload, list, delete)
│   │   └── calendar_auth.py        # New: Google Calendar OAuth initiation + callback
│   └── models/
│       └── kb.py                   # Extend: add status + error_message columns
migrations/versions/
└── 013_kb_document_status.py       # New: add status + error_message to kb_documents
packages/portal/app/(dashboard)/
└── knowledge-base/
    └── page.tsx                    # New: KB management page
```
### Pattern 1: Document Ingestion Pipeline (CAP-02/CAP-03)

**What:** Upload returns immediately (201); a Celery task handles text extraction → chunking → embedding → pgvector insert asynchronously.

**When to use:** All document types (file, URL, YouTube).

```
POST /api/portal/kb/upload (multipart file)
  → Save file to MinIO (kb-documents bucket)
  → Insert KbDocument with status='processing'
  → Return 201 with document ID
  → [async] ingest_document.delay(document_id, tenant_id)
      → Extract text (format-specific extractor)
      → Chunk text (500 chars, 50 char overlap)
      → embed_texts(chunks) in batch
      → INSERT kb_chunks rows
      → UPDATE kb_documents SET status='ready'
      → On error: UPDATE kb_documents SET status='error', error_message=...

GET /api/portal/kb/{tenant_id}/documents
  → List KbDocument rows with the status field for portal polling

DELETE /api/portal/kb/{document_id}
  → DELETE KbDocument (CASCADE deletes kb_chunks via FK)
  → DELETE file from MinIO
```

**Migration 013 needed — add to `kb_documents`:**

```sql
-- status: processing | ready | error
ALTER TABLE kb_documents ADD COLUMN status TEXT NOT NULL DEFAULT 'processing';
ALTER TABLE kb_documents ADD COLUMN error_message TEXT;
ALTER TABLE kb_documents ADD COLUMN chunk_count INTEGER;
```

Note: `kb_documents.agent_id` is `NOT NULL` in the existing schema, but the KB is now tenant-scoped (all agents share it). Resolution: use a sentinel UUID (e.g., the all-zeros UUID) or make `agent_id` nullable in migration 013. Making it nullable is cleaner.
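The task's happy-path and error branches can be reduced to a small pure function. The following sketch uses injected callables standing in for the real extractor, chunker, and embedder; the returned dict mirrors the kb_documents status columns:

```python
from typing import Callable


def run_ingestion(
    file_bytes: bytes,
    filename: str,
    extract: Callable[[bytes, str], str],
    chunk: Callable[[str], list[str]],
    embed: Callable[[list[str]], list[list[float]]],
) -> dict:
    """Return the status update to apply to the kb_documents row."""
    try:
        text = extract(file_bytes, filename)
        chunks = chunk(text)
        vectors = embed(chunks)
        if len(vectors) != len(chunks):
            raise ValueError("embedding count mismatch")
        # real task: INSERT one kb_chunks row per (chunk, vector) pair here
        return {"status": "ready", "chunk_count": len(chunks), "error_message": None}
    except Exception as exc:
        # matches the pipeline's error branch: status='error', error_message=...
        return {"status": "error", "chunk_count": None, "error_message": str(exc)}
```

Because every failure is captured as a status update rather than a raised exception, the portal's polling UI always sees a terminal state.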
### Pattern 2: Text Extraction by Format

```python
# Source: standard library usage — no external doc needed
import io


def extract_text(file_bytes: bytes, filename: str) -> str:
    ext = filename.lower().rsplit(".", 1)[-1]

    if ext == "pdf":
        from pypdf import PdfReader

        reader = PdfReader(io.BytesIO(file_bytes))
        return "\n".join(p.extract_text() or "" for p in reader.pages)

    if ext == "docx":
        from docx import Document

        doc = Document(io.BytesIO(file_bytes))
        return "\n".join(p.text for p in doc.paragraphs)

    if ext == "pptx":
        from pptx import Presentation

        prs = Presentation(io.BytesIO(file_bytes))
        lines = []
        for slide in prs.slides:
            for shape in slide.shapes:
                if hasattr(shape, "text"):
                    lines.append(shape.text)
        return "\n".join(lines)

    if ext in ("xlsx", "xls"):
        import pandas as pd

        df = pd.read_excel(io.BytesIO(file_bytes))
        return df.to_csv(index=False)

    if ext in ("csv", "txt", "md"):
        return file_bytes.decode("utf-8", errors="replace")

    raise ValueError(f"Unsupported file extension: {ext}")
```
### Pattern 3: Chunking Strategy (Claude's Discretion)

**Recommendation:** Simple recursive chunking with `chunk_size=500, overlap=50` (characters, not tokens). This matches the `all-MiniLM-L6-v2` model's effective input length (~256 tokens ≈ ~1000 chars) with room to spare.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks."""
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        start += chunk_size - overlap
    return [c.strip() for c in chunks if c.strip()]
```

No external library needed. `langchain-text-splitters` would add ~50MB of dependencies for this single use case.
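A quick sanity check of the overlap behavior, restating the chunker with toy sizes so the windows are visible:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks (same logic as above)."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return [c.strip() for c in chunks if c.strip()]


# With chunk_size=4 and overlap=2, the window advances 2 chars per step:
print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Each chunk shares its tail with the next chunk's head, so a sentence split at one boundary still appears whole inside an adjacent window.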
### Pattern 4: Google Calendar OAuth per Tenant (CAP-05)

**What:** Each tenant authorizes Konstruct to access their Google Calendar. OAuth tokens (access + refresh) are stored encrypted per tenant — either in a new `calendar_tokens` table, or in `channel_connections` as a `google_calendar` entry (reusing the existing pattern).

**Reuse the `channel_connections` table:** Add a `channel_type = 'google_calendar'` entry per tenant and store the encrypted token JSON in the `config` JSONB column. This avoids a migration for a new table.

```
GET /api/portal/calendar/install?tenant_id={id}
  → Generate HMAC-signed OAuth state (same generate_oauth_state() as Slack)
  → Return Google OAuth URL with state param

GET /api/portal/calendar/callback?code={code}&state={state}
  → Verify HMAC state → extract tenant_id
  → Exchange code for {access_token, refresh_token, expiry}
  → Encrypt token JSON with Fernet
  → Upsert ChannelConnection(channel_type='google_calendar', config={...})
  → Redirect to portal /settings/calendar?connected=true
```

**Google OAuth scopes needed (FULL CRUD per locked decision):**

```python
_GOOGLE_CALENDAR_SCOPES = [
    "https://www.googleapis.com/auth/calendar",  # Full read+write
]
# NOT readonly — creating events requires the full calendar scope
```

**calendar_lookup.py replacement — per-tenant token lookup:**

```python
async def calendar_lookup(
    date: str,
    action: str = "list",  # list | create | check_availability
    event_summary: str | None = None,
    event_start: str | None = None,  # ISO 8601 with timezone
    event_end: str | None = None,
    calendar_id: str = "primary",
    tenant_id: str | None = None,  # Injected by executor
    **kwargs: object,
) -> str:
    # 1. Load encrypted token from channel_connections
    # 2. Decrypt with KeyEncryptionService
    # 3. Build google.oauth2.credentials.Credentials from token dict
    # 4. Auto-refresh if expired (google-auth handles this)
    # 5. Call Calendar API (list or insert)
    # 6. Format result as natural language
    ...
```

**Token refresh:** `google.oauth2.credentials.Credentials` auto-refreshes using the stored `refresh_token` when the `access_token` is expired. After any refresh, write the updated token back to `channel_connections.config`.

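For the `check_availability` branch, the Calendar API's `freebusy().query()` endpoint returns busy intervals; turning those into a natural-language answer (per CAP-06) can live in a small pure function. A sketch — the helper name and wording are illustrative, and the commented API call assumes a `service` built via `googleapiclient.discovery.build("calendar", "v3", credentials=creds)`:

```python
def format_busy_slots(busy: list[dict], date: str) -> str:
    """Render freebusy intervals as natural language for the agent response."""
    # Upstream, busy would come from something like (sketch):
    # resp = service.freebusy().query(body={
    #     "timeMin": f"{date}T00:00:00Z",
    #     "timeMax": f"{date}T23:59:59Z",
    #     "items": [{"id": "primary"}],
    # }).execute()
    # busy = resp["calendars"]["primary"]["busy"]
    if not busy:
        return f"The calendar is completely free on {date}."
    slots = "; ".join(f"{b['start']} to {b['end']}" for b in busy)
    return f"Busy on {date}: {slots}. All other times are free."
```

Keeping the formatting pure makes CAP-06 testable without mocking Google at all.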
### Pattern 5: URL Ingestion via Firecrawl (CAP-03)

```python
import asyncio

from firecrawl import FirecrawlApp


async def scrape_url(url: str) -> str:
    app = FirecrawlApp(api_key=settings.firecrawl_api_key)
    # FirecrawlApp.scrape_url is synchronous — run it off the event loop
    result = await asyncio.to_thread(
        app.scrape_url, url, params={"formats": ["markdown"]}
    )
    return result.get("markdown", "")
```

**Claude's Discretion recommendation:** Use the Firecrawl cloud API for v1. Add `FIRECRAWL_API_KEY` to `.env`. Self-host only when data sovereignty requires it.

### Pattern 6: YouTube Ingestion (CAP-03)

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter


def get_youtube_transcript(video_url: str) -> str:
    # Extract video ID from URL
    video_id = _extract_video_id(video_url)

    # Try to fetch existing captions (no API key needed)
    ytt_api = YouTubeTranscriptApi()
    try:
        transcript = ytt_api.fetch(video_id)
    except Exception as exc:
        # No Whisper fallback in v1 — surface a user-friendly error instead
        raise ValueError("No captions available and Whisper not configured") from exc
    formatter = TextFormatter()
    return formatter.format_transcript(transcript)
```

**Claude's Discretion recommendation:** For v1, skip Whisper entirely — only ingest YouTube videos that have existing captions (auto-generated counts). Add Whisper as a future enhancement and return a user-friendly error when captions are unavailable.

### Anti-Patterns to Avoid

- **Synchronous text extraction in a FastAPI endpoint:** Extracting PDF/DOCX text blocks the event loop. Always delegate to the Celery task.
- **Storing raw file bytes in PostgreSQL:** Use MinIO for file storage; store only the MinIO key in `kb_documents`.
- **Re-embedding on every search:** Embed the search query in `kb_search.py` (already done), not at document query time.
- **Loading SentenceTransformer per Celery task invocation:** Already solved via the lazy singleton in `embedder.py`. Import `embed_texts` from the same module.
- **Using a service account for Google Calendar:** The stub uses `GOOGLE_SERVICE_ACCOUNT_KEY` (wrong for per-tenant user data). Replace with per-tenant OAuth tokens.
- **Storing Google refresh tokens in env vars:** Tokens must be per-tenant in the DB, encrypted with Fernet.
- **Keeping `agent_id NOT NULL` on KB documents:** KB is now tenant-scoped (per locked decision). Migration 013 must make `agent_id` nullable. The `kb_search.py` tool already accepts `agent_id` but does not filter by it.

---

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| PDF text extraction | Custom PDF parser | `pypdf` | PDF binary format is extremely complex; pypdf handles encryption, compressed streams, multi-page |
| DOCX parsing | XML unzipper | `python-docx` | DOCX is a zip of XML schemas; python-docx handles versioning, embedded tables, styles |
| YouTube caption fetching | YouTube Data API scraper | `youtube-transcript-api` | No API key needed, handles 10+ subtitle track formats, works with auto-generated captions |
| OAuth token refresh | Custom token refresh logic | `google.oauth2.credentials.Credentials` | google-auth handles expiry, refresh, and HTTP headers automatically |
| URL → clean text | httpx + BeautifulSoup | `firecrawl-py` | Firecrawl handles JS rendering, anti-bot bypass, returns clean markdown |
| Text chunking | Custom sentence splitter | Simple fixed-size char splitter with overlap (20 lines) | No library needed; langchain-text-splitters adds bloat for a single use case |

**Key insight:** Document parsing libraries handle edge cases that take months to rediscover (corrupted headers, nested tables, character encoding, password-protected files). The only thing worth writing custom is the chunking algorithm, which is genuinely trivial.

---

## Common Pitfalls

### Pitfall 1: `kb_documents.agent_id` is NOT NULL in Migration 004

**What goes wrong:** Inserting a KB document without an `agent_id` fails with a DB constraint error. The locked decision says KB is per-tenant (not per-agent), so there is no `agent_id` context at upload time.

**Why it happens:** The original Phase 2 schema assumed per-agent knowledge bases. The locked decision changed this to per-tenant.

**How to avoid:** Migration 013 must `ALTER TABLE kb_documents ALTER COLUMN agent_id DROP NOT NULL`. Update the ORM model in `shared/models/kb.py` to match.

**Warning signs:** `IntegrityError: null value in column "agent_id"` when uploading a KB document.

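In Alembic terms, the migration 013 change above is a one-liner — a sketch, assuming the project's standard migration scaffold (revision identifiers omitted; on PostgreSQL a nullable-only `alter_column` needs no `existing_type`):

```python
from alembic import op


def upgrade() -> None:
    # Per locked decision: KB is tenant-scoped, so agent_id becomes optional
    op.alter_column("kb_documents", "agent_id", nullable=True)


def downgrade() -> None:
    op.alter_column("kb_documents", "agent_id", nullable=False)
```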
### Pitfall 2: Celery Tasks Are Always `sync def` with `asyncio.run()`

**What goes wrong:** Writing `async def ingest_document(...)` as a Celery task causes `RuntimeError: no running event loop` or a silent task hang.

**Why it happens:** Celery workers are not async-native. This is a hard architectural constraint documented in `tasks.py`.

**How to avoid:** `ingest_document` must be `def ingest_document(...)`, with `asyncio.run()` wrapping any async DB operations.

**Warning signs:** Task appears in the Celery queue but never completes; no exception in the logs.

### Pitfall 3: Google OAuth Callback Must Not Require Auth

**What goes wrong:** If the `/api/portal/calendar/callback` endpoint has `Depends(require_tenant_admin)`, Google's redirect will fail because the callback request carries no session cookie.

**Why it happens:** OAuth callbacks are external redirects — they arrive unauthenticated.

**How to avoid:** The callback endpoint must be unauthenticated (no RBAC dependency). Tenant identity is recovered from the HMAC-signed `state` parameter, same as the Slack callback pattern in `channels.py`.

**Warning signs:** HTTP 401 or a redirect loop on the callback URL.

### Pitfall 4: Google Access Token Expiry + Write-Back

**What goes wrong:** A calendar tool call fails with 401 after the access token (1-hour TTL) expires, even though the refresh token is stored.

**Why it happens:** `google.oauth2.credentials.Credentials` auto-refreshes in memory but does not persist the new token to the database.

**How to avoid:** After every Google API call, check `credentials.token` — if it changed (i.e., a refresh occurred), write the updated token JSON back to `channel_connections.config`. Use an `after_refresh` callback or compare the token before and after the call.

**Warning signs:** The calendar tool works once, then fails an hour later.

### Pitfall 5: pypdf Returns Empty String for Scanned PDFs

**What goes wrong:** `page.extract_text()` returns `""` for image-based scanned PDFs. The document is ingested with zero chunks and returns no results in KB search.

**Why it happens:** pypdf only reads embedded text — it cannot OCR images.

**How to avoid:** After extraction, check whether the text length is under 100 characters. If so, set `status='error'` with `error_message="This PDF contains images only. Text extraction requires OCR, which is not yet supported."`.

**Warning signs:** Document status shows "ready" but KB search returns nothing.

### Pitfall 6: `ChannelTypeEnum` Does Not Include `google_calendar`

**What goes wrong:** Inserting a `ChannelConnection` with `channel_type='google_calendar'` fails if `ChannelTypeEnum` only includes messaging channels.

**Why it happens:** `ChannelTypeEnum` was defined in Phase 1 for messaging channels only.

**How to avoid:** Check `shared/models/tenant.py` — if `ChannelTypeEnum` is a Python `Enum` backed by `sa.Enum`, adding a new value requires a DB migration. Per the Phase 1 ADR, `channel_type` is stored as `TEXT` with a `CHECK` constraint, which makes adding new values trivial.

**Warning signs:** `LookupError` or `IntegrityError` when inserting the Google Calendar connection.

---

## Code Examples

### Upload Endpoint Pattern (FastAPI multipart)

```python
# Source: FastAPI official docs — https://fastapi.tiangolo.com/tutorial/request-files/
import uuid

from fastapi import Depends, File, UploadFile


@kb_router.post("/{tenant_id}/documents", status_code=201)
async def upload_document(
    tenant_id: uuid.UUID,
    file: UploadFile = File(...),
    caller: PortalCaller = Depends(require_tenant_admin),
    session: AsyncSession = Depends(get_session),
) -> dict:
    file_bytes = await file.read()
    # 1. Upload to MinIO
    # 2. Insert KbDocument(status='processing')
    # 3. ingest_document.delay(str(doc.id), str(tenant_id))
    # 4. Return 201 with doc.id
    ...
```

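Before step 1, the endpoint will likely want to reject unsupported types up front rather than failing inside the Celery task. A sketch — the allow-list mirrors the extraction dispatcher's formats, and the helper name is hypothetical:

```python
ALLOWED_EXTENSIONS = {"pdf", "docx", "pptx", "xlsx", "xls", "csv", "txt", "md"}


def validate_upload_filename(filename: str) -> str:
    """Return the normalized extension, or raise ValueError for unsupported types."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: .{ext or '?'}")
    return ext
```

In the endpoint this maps naturally to an HTTP 400 (`raise HTTPException(400, str(exc))`).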
### Google Calendar Token Storage Pattern

```python
# Reuse existing ChannelConnection + HMAC OAuth state from channels.py
# After OAuth callback:
token_data = {
    "token": credentials.token,
    "refresh_token": credentials.refresh_token,
    "token_uri": credentials.token_uri,
    "client_id": settings.google_client_id,
    "client_secret": settings.google_client_secret,
    "scopes": list(credentials.scopes),
    "expiry": credentials.expiry.isoformat() if credentials.expiry else None,
}
enc_svc = _get_encryption_service()
encrypted_token = enc_svc.encrypt(json.dumps(token_data))

conn = ChannelConnection(
    tenant_id=tenant_id,
    channel_type="google_calendar",  # TEXT column — no enum migration needed
    workspace_id=str(tenant_id),  # Sentinel: tenant ID as workspace ID
    config={"token": encrypted_token},
)
```

### Celery Ingestion Task Structure

```python
# Source: tasks.py architectural pattern (always sync def + asyncio.run())
@celery_app.task(bind=True, max_retries=3)
def ingest_document(self, document_id: str, tenant_id: str) -> None:
    """Background document ingestion — extract, chunk, embed, store."""
    try:
        asyncio.run(_ingest_document_async(document_id, tenant_id))
    except Exception as exc:
        asyncio.run(_mark_document_error(document_id, str(exc)))
        raise self.retry(exc=exc, countdown=60)
```

### Google Calendar Event Creation

```python
# Source: https://developers.google.com/workspace/calendar/api/guides/create-events
event_body = {
    "summary": event_summary,
    "start": {"dateTime": event_start, "timeZone": "UTC"},
    "end": {"dateTime": event_end, "timeZone": "UTC"},
}
event = service.events().insert(calendarId="primary", body=event_body).execute()
return f"Event created: {event.get('summary')} at {event.get('start', {}).get('dateTime')}"
```

---

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| `calendar_lookup.py` uses service account (global) | Per-tenant OAuth tokens (per locked decision) | Phase 10 | Agents access each tenant's own calendar, not a shared service account |
| KB is per-agent (`agent_id NOT NULL`) | KB is per-tenant (`agent_id` nullable) | Phase 10 locked decision | All agents in a tenant share one knowledge base |
| `youtube-transcript-api` v0.x synchronous only | v1.2.4 (Jan 2026) uses `YouTubeTranscriptApi()` instance | 2025 | Minor API change — instantiate the class, call `.fetch(video_id)` |

**Deprecated/outdated:**

- `calendar_lookup.py` service account path: to be replaced entirely. The `GOOGLE_SERVICE_ACCOUNT_KEY` env var check should be removed.
- `agent_id NOT NULL` on `kb_documents`: Migration 013 removes this constraint.

---

## Open Questions

1. **Firecrawl API key management**
   - What we know: the `firecrawl-py` SDK connects to the cloud API by default; a self-hosted option is available
   - What's unclear: whether to add `FIRECRAWL_API_KEY` as a platform-wide setting in `shared/config.py` or as a tenant BYO credential
   - Recommendation: add as platform-wide `FIRECRAWL_API_KEY` in `settings` (same pattern as `BRAVE_API_KEY`); make it optional with graceful degradation

2. **`ChannelTypeEnum` compatibility for `google_calendar`**
   - What we know: the Phase 1 ADR chose `TEXT + CHECK` over `sa.Enum` to avoid migration DDL conflicts
   - What's unclear: whether there is a CHECK constraint that needs updating, or if it is open TEXT
   - Recommendation: inspect the `channel_connections` table DDL in migration 001 before writing migration 013

3. **Document re-index flow**
   - What we know: CONTEXT.md mentions a re-index option in the KB portal
   - What's unclear: whether re-index deletes all existing chunks first or appends
   - Recommendation: delete all `kb_chunks` for the document, then re-run `ingest_document.delay()` — simplest and idempotent

4. **Whisper fallback for YouTube**
   - What we know: `openai-whisper` requires a model download (~140MB minimum) and a GPU for reasonable speed
   - What's unclear: whether v1 should include Whisper at all given the infrastructure cost
   - Recommendation: omit Whisper for v1; return an error when captions are unavailable; add to v2 requirements

---

## Validation Architecture

### Test Framework

| Property | Value |
|----------|-------|
| Framework | pytest + pytest-asyncio (existing) |
| Config file | `pytest.ini` or `pyproject.toml [tool.pytest]` at repo root |
| Quick run command | `pytest tests/unit -x -q` |
| Full suite command | `pytest tests/unit tests/integration -x -q` |

### Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|--------------|
| CAP-01 | `web_search()` returns Brave results when key is set; gracefully degrades when key is missing | unit | `pytest tests/unit/test_web_search.py -x` | ❌ Wave 0 |
| CAP-02 | `kb_search()` returns ranked chunks for a query after ingestion | integration | `pytest tests/integration/test_kb_search.py -x` | ❌ Wave 0 |
| CAP-03 | File upload endpoint accepts PDF/DOCX/TXT, creates KbDocument with status=processing, triggers Celery task | unit+integration | `pytest tests/unit/test_kb_upload.py tests/integration/test_kb_ingestion.py -x` | ❌ Wave 0 |
| CAP-04 | `http_request()` returns correct response; rejects invalid methods; handles timeout | unit | `pytest tests/unit/test_http_request.py -x` | ❌ Wave 0 |
| CAP-05 | Calendar tool reads tenant token from DB, calls Google API, returns formatted events | unit (mock Google) | `pytest tests/unit/test_calendar_lookup.py -x` | ❌ Wave 0 |
| CAP-06 | Tool results in agent responses are natural language, not raw JSON | unit (prompt check) | `pytest tests/unit/test_tool_response_format.py -x` | ❌ Wave 0 |
| CAP-07 | Every tool invocation writes an audit_events row with tool name + args summary | integration | Covered by existing `tests/integration/test_audit.py` — extend with tool invocation cases | ✅ (extend) |

### Sampling Rate

- **Per task commit:** `pytest tests/unit -x -q`
- **Per wave merge:** `pytest tests/unit tests/integration -x -q`
- **Phase gate:** Full suite green before `/gsd:verify-work`

### Wave 0 Gaps

- [ ] `tests/unit/test_web_search.py` — covers CAP-01 (mock httpx, test key-missing degradation + success path)
- [ ] `tests/unit/test_kb_upload.py` — covers CAP-03 upload endpoint (mock MinIO, mock Celery task dispatch)
- [ ] `tests/unit/test_kb_ingestion.py` — covers text extraction functions per format (PDF, DOCX, TXT, CSV)
- [ ] `tests/integration/test_kb_search.py` — covers CAP-02 (real pgvector, insert test chunks, verify similarity search)
- [ ] `tests/integration/test_kb_ingestion.py` — covers CAP-03 end-to-end (upload → task → chunks in DB)
- [ ] `tests/unit/test_http_request.py` — covers CAP-04 (mock httpx, test method validation, timeout)
- [ ] `tests/unit/test_calendar_lookup.py` — covers CAP-05 (mock Google API, mock DB token lookup)

---

## Sources

### Primary (HIGH confidence)

- FastAPI official docs (https://fastapi.tiangolo.com/tutorial/request-files/) — UploadFile pattern
- Google Calendar API docs (https://developers.google.com/workspace/calendar/api/guides/create-events) — event creation
- Google OAuth 2.0 web server docs (https://developers.google.com/identity/protocols/oauth2/web-server) — token exchange flow
- Existing codebase: `packages/orchestrator/orchestrator/tools/builtins/` — 4 tool files reviewed
- Existing codebase: `migrations/versions/004_phase2_audit_kb.py` — KB schema confirmed
- Existing codebase: `packages/shared/shared/api/channels.py` — Slack OAuth HMAC pattern to reuse
- Existing codebase: `packages/orchestrator/orchestrator/tools/executor.py` — CAP-07 already implemented

### Secondary (MEDIUM confidence)

- PyPI: `youtube-transcript-api` v1.2.4 (Jan 2026) — version + API confirmed
- PyPI: `firecrawl-py` — cloud + self-hosted documented
- WebSearch 2025: pypdf for PDF extraction — confirmed as a lightweight, no-C-deps option
- WebSearch 2025: Celery sync def constraint — confirmed via tasks.py docstring cross-reference

### Tertiary (LOW confidence)

- Chunking parameters (500 chars, 50 overlap) — from community RAG practice, not benchmarked for this dataset
- Firecrawl cloud vs self-hosted recommendation — based on project stage, not a measured performance comparison

---

## Metadata

**Confidence breakdown:**

- Standard stack: HIGH — all libraries verified via PyPI + official docs
- Architecture: HIGH — pattern directly extends the existing Phase 1-3 Slack OAuth and Celery task patterns in the codebase
- Pitfalls: HIGH — the agent_id NOT NULL issue is verified directly from migration 004 source code; token write-back is documented in google-auth source
- Chunking strategy: MEDIUM — recommended values are community defaults, not project-specific benchmarks

**Research date:** 2026-03-26
**Valid until:** 2026-06-26 (stable domain; the Google OAuth API is very stable)
.planning/phases/10-agent-capabilities/10-VALIDATION.md (new file, 82 lines)
@@ -0,0 +1,82 @@
---
phase: 10
slug: agent-capabilities
status: draft
nyquist_compliant: false
wave_0_complete: false
created: 2026-03-26
---

# Phase 10 — Validation Strategy

> Per-phase validation contract for feedback sampling during execution.

---

## Test Infrastructure

| Property | Value |
|----------|-------|
| **Framework** | pytest 8.x + pytest-asyncio (existing) |
| **Config file** | `pyproject.toml` (existing) |
| **Quick run command** | `pytest tests/unit -x -q` |
| **Full suite command** | `pytest tests/ -x` |
| **Estimated runtime** | ~45 seconds |

---

## Sampling Rate

- **After every task commit:** Run `pytest tests/unit -x -q`
- **After every plan wave:** Run `pytest tests/ -x`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 45 seconds

---

## Per-Task Verification Map

| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 10-xx | 01 | 1 | CAP-01 | unit | `pytest tests/unit/test_web_search.py -x` | ❌ W0 | ⬜ pending |
| 10-xx | 01 | 1 | CAP-02,03 | unit | `pytest tests/unit/test_kb_ingestion.py -x` | ❌ W0 | ⬜ pending |
| 10-xx | 01 | 1 | CAP-04 | unit | `pytest tests/unit/test_http_request.py -x` | ❌ W0 | ⬜ pending |
| 10-xx | 02 | 2 | CAP-05 | unit | `pytest tests/unit/test_calendar.py -x` | ❌ W0 | ⬜ pending |
| 10-xx | 02 | 2 | CAP-06 | unit | `pytest tests/unit/test_tool_output.py -x` | ❌ W0 | ⬜ pending |
| 10-xx | 03 | 2 | CAP-03 | build | `cd packages/portal && npx next build` | ✅ | ⬜ pending |
| 10-xx | 03 | 2 | CAP-07 | integration | `pytest tests/integration/test_audit.py -x` | ✅ extend | ⬜ pending |

---

## Wave 0 Requirements

- [ ] `tests/unit/test_web_search.py` — CAP-01: Brave Search API integration
- [ ] `tests/unit/test_kb_ingestion.py` — CAP-02,03: document chunking, embedding, search
- [ ] `tests/unit/test_http_request.py` — CAP-04: HTTP request tool validation
- [ ] `tests/unit/test_calendar.py` — CAP-05: Google Calendar OAuth + CRUD
- [ ] `tests/unit/test_tool_output.py` — CAP-06: natural language tool result formatting
- [ ] Install: `uv add pypdf python-docx python-pptx openpyxl pandas firecrawl-py youtube-transcript-api google-auth google-auth-oauthlib google-api-python-client`

---

## Manual-Only Verifications

| Behavior | Requirement | Why Manual | Test Instructions |
|----------|-------------|------------|-------------------|
| Web search returns real results | CAP-01 | Requires live Brave API key | Send a message requiring web search, verify results |
| Document upload + search works end-to-end | CAP-02,03 | Requires file upload + LLM | Upload a PDF, ask the agent about its content |
| Calendar books a meeting | CAP-05 | Requires live Google Calendar OAuth | Connect a calendar, ask the agent to book a meeting |
| Agent response reads naturally with tool data | CAP-06 | Qualitative assessment | Chat with an agent using tools, verify natural language |

---

## Validation Sign-Off

- [ ] All tasks have `<automated>` verify or Wave 0 dependencies
- [ ] Sampling continuity: no 3 consecutive tasks without automated verify
- [ ] Wave 0 covers all MISSING references
- [ ] No watch-mode flags
- [ ] Feedback latency < 45s
- [ ] `nyquist_compliant: true` set in frontmatter

**Approval:** pending