| phase | slug | status | nyquist_compliant | wave_0_complete | created |
|---|---|---|---|---|---|
| 2 | agent-features | draft | false | false | 2026-03-23 |
Phase 2 — Validation Strategy
Per-phase validation contract for feedback sampling during execution.
Test Infrastructure
| Property | Value |
|---|---|
| Framework | pytest 8.x + pytest-asyncio (existing from Phase 1) |
| Config file | pyproject.toml (existing [tool.pytest.ini_options]) |
| Quick run command | pytest tests/unit -x -q |
| Full suite command | pytest tests/ -x |
| Estimated runtime | ~45 seconds |
Sampling Rate
- After every task commit: Run `pytest tests/unit -x -q`
- After every plan wave: Run `pytest tests/ -x`
- Before `/gsd:verify-work`: Full suite must be green
- Max feedback latency: 45 seconds
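The latency bullet above can be checked mechanically. The following is a minimal sketch, not part of the existing tooling: the helper name `run_with_budget` and the constant `MAX_FEEDBACK_SECONDS` are hypothetical, and the 45-second value is taken from the list above.

```python
import subprocess
import sys
import time

# Assumption: budget copied from the "Max feedback latency: 45 seconds" bullet.
MAX_FEEDBACK_SECONDS = 45.0


def run_with_budget(command, budget_s=MAX_FEEDBACK_SECONDS):
    """Run a command and return (exit_code, elapsed_seconds, within_budget).

    The command is given as a list of arguments, for example
    ["pytest", "tests/unit", "-x", "-q"].
    """
    start = time.monotonic()
    completed = subprocess.run(command, check=False)
    elapsed = time.monotonic() - start
    return completed.returncode, elapsed, elapsed <= budget_s


def describe(result):
    """Format a run_with_budget() result as a single status line."""
    code, elapsed, ok = result
    return f"exit={code} elapsed={elapsed:.1f}s within_budget={ok}"
```

A task-commit hook could call `run_with_budget(["pytest", "tests/unit", "-x", "-q"])` and treat either a non-zero exit code or `within_budget == False` as a failed sample.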
Per-Task Verification Map
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---|---|---|---|---|---|---|---|
| 02-01-xx | 01 | 1 | AGNT-02 | unit | pytest tests/unit/test_memory_short_term.py -x | ❌ W0 | ⬜ pending |
| 02-01-xx | 01 | 1 | AGNT-03 | integration | pytest tests/integration/test_memory_long_term.py -x | ❌ W0 | ⬜ pending |
| 02-02-xx | 02 | 1 | AGNT-04 | unit | pytest tests/unit/test_tool_registry.py tests/unit/test_tool_executor.py -x | ❌ W0 | ⬜ pending |
| 02-02-xx | 02 | 1 | AGNT-06 | integration | pytest tests/integration/test_audit.py -x | ❌ W0 | ⬜ pending |
| 02-03-xx | 03 | 2 | CHAN-03 | unit | pytest tests/unit/test_whatsapp_verify.py tests/unit/test_whatsapp_normalize.py -x | ❌ W0 | ⬜ pending |
| 02-03-xx | 03 | 2 | CHAN-04 | unit | pytest tests/unit/test_whatsapp_scoping.py -x | ❌ W0 | ⬜ pending |
| 02-04-xx | 04 | 2 | AGNT-05 | unit+integ | pytest tests/unit/test_escalation.py tests/integration/test_escalation.py -x | ❌ W0 | ⬜ pending |
Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky
Wave 0 Requirements
- `tests/unit/test_memory_short_term.py` - AGNT-02: Redis sliding window
- `tests/integration/test_memory_long_term.py` - AGNT-03: pgvector retrieval + tenant isolation
- `tests/unit/test_tool_registry.py` - AGNT-04: tool registry lookup
- `tests/unit/test_tool_executor.py` - AGNT-04: schema validation + confirmation
- `tests/integration/test_audit.py` - AGNT-06: audit immutability
- `tests/unit/test_escalation.py` - AGNT-05: transcript packaging
- `tests/integration/test_escalation.py` - AGNT-05: DM delivery
- `tests/unit/test_whatsapp_verify.py` - CHAN-03: webhook signature verification
- `tests/unit/test_whatsapp_normalize.py` - CHAN-03: message normalization
- `tests/unit/test_whatsapp_scoping.py` - CHAN-04: business-function gate
- `tests/conftest.py` - extend with pgvector fixtures, mock MinIO (moto)
- Install: `uv add --dev moto` (S3/MinIO mocking)
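Whether Wave 0 is complete can be reduced to a file-existence check over the test modules listed above. The sketch below assumes the paths are relative to the repository root; the names `WAVE_0_FILES` and `missing_wave0_files` are hypothetical and not part of the project.

```python
from pathlib import Path

# Test modules copied from the Wave 0 list; tests/conftest.py is left out
# because it is extended rather than created.
WAVE_0_FILES = [
    "tests/unit/test_memory_short_term.py",
    "tests/integration/test_memory_long_term.py",
    "tests/unit/test_tool_registry.py",
    "tests/unit/test_tool_executor.py",
    "tests/integration/test_audit.py",
    "tests/unit/test_escalation.py",
    "tests/integration/test_escalation.py",
    "tests/unit/test_whatsapp_verify.py",
    "tests/unit/test_whatsapp_normalize.py",
    "tests/unit/test_whatsapp_scoping.py",
]


def missing_wave0_files(repo_root, expected=WAVE_0_FILES):
    """Return the expected test files that do not yet exist under repo_root."""
    root = Path(repo_root)
    return [path for path in expected if not (root / path).is_file()]
```

An empty result from `missing_wave0_files(".")` would be one reasonable precondition for flipping `wave_0_complete` to `true` in the frontmatter, although the check says nothing about whether the tests inside those files are meaningful.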
Manual-Only Verifications
| Behavior | Requirement | Why Manual | Test Instructions |
|---|---|---|---|
| WhatsApp message delivery via live Cloud API | CHAN-03 | Requires Meta-verified WABA + phone number | Send test message to configured WhatsApp number, verify reply |
| Media (image) round-trip via WhatsApp | CHAN-03 | Requires live WhatsApp + multimodal LLM | Send photo to agent, verify interpretation; request doc from agent |
Validation Sign-Off
- All tasks have `<automated>` verify or Wave 0 dependencies
- Sampling continuity: no 3 consecutive tasks without automated verify
- Wave 0 covers all MISSING references
- No watch-mode flags
- Feedback latency < 45s
- `nyquist_compliant: true` set in frontmatter
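The sampling-continuity item in this checklist is a simple run-length condition. As a sketch, assuming each task is represented by a boolean that is true when the task has an automated verify command (the function names are hypothetical):

```python
def longest_unverified_run(has_automated_verify):
    """Return the length of the longest streak of tasks without automated verify."""
    longest = 0
    current = 0
    for verified in has_automated_verify:
        # A verified task resets the streak; an unverified one extends it.
        current = 0 if verified else current + 1
        longest = max(longest, current)
    return longest


def sampling_is_continuous(has_automated_verify, max_gap=3):
    """True when no max_gap consecutive tasks lack automated verification."""
    return longest_unverified_run(has_automated_verify) < max_gap
```

For example, `sampling_is_continuous([True, False, False, True])` passes because the longest unverified streak is two tasks, while three unverified tasks in a row would fail the check.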
Approval: pending