From ac54d819f8add630f406304da70f31996eefec4a Mon Sep 17 00:00:00 2001
From: Adolfo Delorenzo
Date: Mon, 23 Mar 2026 14:16:42 -0600
Subject: [PATCH] docs(02): add research and validation strategy

---
 .../phases/02-agent-features/02-VALIDATION.md | 88 +++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 .planning/phases/02-agent-features/02-VALIDATION.md

diff --git a/.planning/phases/02-agent-features/02-VALIDATION.md b/.planning/phases/02-agent-features/02-VALIDATION.md
new file mode 100644
index 0000000..218bc69
--- /dev/null
+++ b/.planning/phases/02-agent-features/02-VALIDATION.md
@@ -0,0 +1,88 @@
+---
+phase: 2
+slug: agent-features
+status: draft
+nyquist_compliant: false
+wave_0_complete: false
+created: 2026-03-23
+---
+
+# Phase 2 — Validation Strategy
+
+> Per-phase validation contract for feedback sampling during execution.
+
+---
+
+## Test Infrastructure
+
+| Property | Value |
+|----------|-------|
+| **Framework** | pytest 8.x + pytest-asyncio (existing from Phase 1) |
+| **Config file** | `pyproject.toml` (existing `[tool.pytest.ini_options]`) |
+| **Quick run command** | `pytest tests/unit -x -q` |
+| **Full suite command** | `pytest tests/ -x` |
+| **Estimated runtime** | ~45 seconds |
+
+---
+
+## Sampling Rate
+
+- **After every task commit:** Run `pytest tests/unit -x -q`
+- **After every plan wave:** Run `pytest tests/ -x`
+- **Before `/gsd:verify-work`:** Full suite must be green
+- **Max feedback latency:** 45 seconds
+
+---
+
+## Per-Task Verification Map
+
+| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
+|---------|------|------|-------------|-----------|-------------------|-------------|--------|
+| 02-01-xx | 01 | 1 | AGNT-02 | unit | `pytest tests/unit/test_memory_short_term.py -x` | ❌ W0 | ⬜ pending |
+| 02-01-xx | 01 | 1 | AGNT-03 | integration | `pytest tests/integration/test_memory_long_term.py -x` | ❌ W0 | ⬜ pending |
+| 02-02-xx | 02 | 1 | AGNT-04 | unit | `pytest tests/unit/test_tool_registry.py tests/unit/test_tool_executor.py -x` | ❌ W0 | ⬜ pending |
+| 02-02-xx | 02 | 1 | AGNT-06 | integration | `pytest tests/integration/test_audit.py -x` | ❌ W0 | ⬜ pending |
+| 02-03-xx | 03 | 2 | CHAN-03 | unit | `pytest tests/unit/test_whatsapp_verify.py tests/unit/test_whatsapp_normalize.py -x` | ❌ W0 | ⬜ pending |
+| 02-03-xx | 03 | 2 | CHAN-04 | unit | `pytest tests/unit/test_whatsapp_scoping.py -x` | ❌ W0 | ⬜ pending |
+| 02-04-xx | 04 | 2 | AGNT-05 | unit+integ | `pytest tests/unit/test_escalation.py tests/integration/test_escalation.py -x` | ❌ W0 | ⬜ pending |
+
+*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*
+
+---
+
+## Wave 0 Requirements
+
+- [ ] `tests/unit/test_memory_short_term.py` — AGNT-02: Redis sliding window
+- [ ] `tests/integration/test_memory_long_term.py` — AGNT-03: pgvector retrieval + tenant isolation
+- [ ] `tests/unit/test_tool_registry.py` — AGNT-04: tool registry lookup
+- [ ] `tests/unit/test_tool_executor.py` — AGNT-04: schema validation + confirmation
+- [ ] `tests/integration/test_audit.py` — AGNT-06: audit immutability
+- [ ] `tests/unit/test_escalation.py` — AGNT-05: transcript packaging
+- [ ] `tests/integration/test_escalation.py` — AGNT-05: DM delivery
+- [ ] `tests/unit/test_whatsapp_verify.py` — CHAN-03: webhook signature verification
+- [ ] `tests/unit/test_whatsapp_normalize.py` — CHAN-03: message normalization
+- [ ] `tests/unit/test_whatsapp_scoping.py` — CHAN-04: business-function gate
+- [ ] `tests/conftest.py` — extend with pgvector fixtures, mock MinIO (moto)
+- [ ] Install: `uv add --dev moto` (S3/MinIO mocking)
+
+---
+
+## Manual-Only Verifications
+
+| Behavior | Requirement | Why Manual | Test Instructions |
+|----------|-------------|------------|-------------------|
+| WhatsApp message delivery via live Cloud API | CHAN-03 | Requires Meta-verified WABA + phone number | Send test message to configured WhatsApp number, verify reply |
+| Media (image) round-trip via WhatsApp | CHAN-03 | Requires live WhatsApp + multimodal LLM | Send photo to agent, verify interpretation; request doc from agent |
+
+---
+
+## Validation Sign-Off
+
+- [ ] All tasks have `` verify or Wave 0 dependencies
+- [ ] Sampling continuity: no 3 consecutive tasks without automated verify
+- [ ] Wave 0 covers all MISSING references
+- [ ] No watch-mode flags
+- [ ] Feedback latency < 45s
+- [ ] `nyquist_compliant: true` set in frontmatter
+
+**Approval:** pending
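
Reviewer note: the "File Exists" column and the Wave 0 checklist could be kept in sync with a small script rather than by hand. The sketch below is not part of this patch. The helper name `missing_wave_0_files` and the constant `WAVE_0_FILES` are hypothetical, and the paths are copied from the Wave 0 Requirements list in `02-VALIDATION.md`. It assumes it is run against the repository root.

```python
from pathlib import Path
from typing import Sequence

# Paths copied from the "Wave 0 Requirements" list in 02-VALIDATION.md.
WAVE_0_FILES = [
    "tests/unit/test_memory_short_term.py",
    "tests/integration/test_memory_long_term.py",
    "tests/unit/test_tool_registry.py",
    "tests/unit/test_tool_executor.py",
    "tests/integration/test_audit.py",
    "tests/unit/test_escalation.py",
    "tests/integration/test_escalation.py",
    "tests/unit/test_whatsapp_verify.py",
    "tests/unit/test_whatsapp_normalize.py",
    "tests/unit/test_whatsapp_scoping.py",
    "tests/conftest.py",
]


def missing_wave_0_files(
    root: Path, expected: Sequence[str] = WAVE_0_FILES
) -> list[str]:
    """Return the expected files that do not yet exist under root.

    Hypothetical helper: an empty result would correspond to every
    "File Exists" cell in the verification map being satisfied.
    """
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_wave_0_files(Path.cwd())` from the repository root and printing the result would show which stubs still need to be created before `wave_0_complete` can be set to `true`.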