From 95d05f5f880f7fcf2db9c45765ab7c3c87ad4e49 Mon Sep 17 00:00:00 2001
From: Adolfo Delorenzo
Date: Wed, 25 Mar 2026 23:24:53 -0600
Subject: [PATCH] docs(10): add research and validation strategy

---
 .../10-agent-capabilities/10-VALIDATION.md | 82 +++++++++++++++++++
 1 file changed, 82 insertions(+)
 create mode 100644 .planning/phases/10-agent-capabilities/10-VALIDATION.md

diff --git a/.planning/phases/10-agent-capabilities/10-VALIDATION.md b/.planning/phases/10-agent-capabilities/10-VALIDATION.md
new file mode 100644
index 0000000..5e9f65c
--- /dev/null
+++ b/.planning/phases/10-agent-capabilities/10-VALIDATION.md
@@ -0,0 +1,82 @@
+---
+phase: 10
+slug: agent-capabilities
+status: draft
+nyquist_compliant: false
+wave_0_complete: false
+created: 2026-03-26
+---
+
+# Phase 10 — Validation Strategy
+
+> Per-phase validation contract for feedback sampling during execution.
+
+---
+
+## Test Infrastructure
+
+| Property | Value |
+|----------|-------|
+| **Framework** | pytest 8.x + pytest-asyncio (existing) |
+| **Config file** | `pyproject.toml` (existing) |
+| **Quick run command** | `pytest tests/unit -x -q` |
+| **Full suite command** | `pytest tests/ -x` |
+| **Estimated runtime** | ~45 seconds |
+
+---
+
+## Sampling Rate
+
+- **After every task commit:** Run `pytest tests/unit -x -q`
+- **After every plan wave:** Run `pytest tests/ -x`
+- **Before `/gsd:verify-work`:** Full suite must be green
+- **Max feedback latency:** 45 seconds
+
+---
+
+## Per-Task Verification Map
+
+| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
+|---------|------|------|-------------|-----------|-------------------|-------------|--------|
+| 10-xx | 01 | 1 | CAP-01 | unit | `pytest tests/unit/test_web_search.py -x` | ❌ W0 | ⬜ pending |
+| 10-xx | 01 | 1 | CAP-02,03 | unit | `pytest tests/unit/test_kb_ingestion.py -x` | ❌ W0 | ⬜ pending |
+| 10-xx | 01 | 1 | CAP-04 | unit | `pytest tests/unit/test_http_request.py -x` | ❌ W0 | ⬜ pending |
+| 10-xx | 02 | 2 | CAP-05 | unit | `pytest tests/unit/test_calendar.py -x` | ❌ W0 | ⬜ pending |
+| 10-xx | 02 | 2 | CAP-06 | unit | `pytest tests/unit/test_tool_output.py -x` | ❌ W0 | ⬜ pending |
+| 10-xx | 03 | 2 | CAP-03 | build | `cd packages/portal && npx next build` | ✅ | ⬜ pending |
+| 10-xx | 03 | 2 | CAP-07 | integration | `pytest tests/integration/test_audit.py -x` | ✅ extend | ⬜ pending |
+
+---
+
+## Wave 0 Requirements
+
+- [ ] `tests/unit/test_web_search.py` — CAP-01: Brave Search API integration
+- [ ] `tests/unit/test_kb_ingestion.py` — CAP-02,03: document chunking, embedding, search
+- [ ] `tests/unit/test_http_request.py` — CAP-04: HTTP request tool validation
+- [ ] `tests/unit/test_calendar.py` — CAP-05: Google Calendar OAuth + CRUD
+- [ ] `tests/unit/test_tool_output.py` — CAP-06: natural language tool result formatting
+- [ ] Install: `uv add pypdf python-docx python-pptx openpyxl pandas firecrawl-py youtube-transcript-api google-auth google-auth-oauthlib google-api-python-client`
+
+---
+
+## Manual-Only Verifications
+
+| Behavior | Requirement | Why Manual | Test Instructions |
+|----------|-------------|------------|-------------------|
+| Web search returns real results | CAP-01 | Requires live Brave API key | Send message requiring web search, verify results |
+| Document upload + search works end-to-end | CAP-02,03 | Requires file upload + LLM | Upload PDF, ask agent about its content |
+| Calendar books a meeting | CAP-05 | Requires live Google Calendar OAuth | Connect calendar, ask agent to book a meeting |
+| Agent response reads naturally with tool data | CAP-06 | Qualitative assessment | Chat with agent using tools, verify natural language |
+
+---
+
+## Validation Sign-Off
+
+- [ ] All tasks have `` verify or Wave 0 dependencies
+- [ ] Sampling continuity: no 3 consecutive tasks without automated verify
+- [ ] Wave 0 covers all MISSING references
+- [ ] No watch-mode flags
+- [ ] Feedback latency < 45s
+- [ ] `nyquist_compliant: true` set in frontmatter
+
+**Approval:** pending