Compare commits
2 Commits
df6bce7289
...
1db2e0c052
| Author | SHA1 | Date | |
|---|---|---|---|
| | 1db2e0c052 | | |
| | 972ef9b1f7 | | |
@@ -3,11 +3,11 @@ gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
 status: completed
-stopped_at: Completed 08-mobile-pwa 08-04-PLAN.md — Phase 08 and v1.0 milestone complete
+stopped_at: Phase 9 context gathered
-last_updated: "2026-03-26T03:38:45.402Z"
+last_updated: "2026-03-26T04:11:53.479Z"
 last_activity: 2026-03-23 — Completed 03-02 onboarding wizard, Slack OAuth, BYO API keys
 progress:
-  total_phases: 8
+  total_phases: 9
   completed_phases: 8
   total_plans: 33
   completed_plans: 33
@@ -212,6 +212,6 @@ None — all phases complete.

 ## Session Continuity

-Last session: 2026-03-26T03:33:24.016Z
+Last session: 2026-03-26T04:11:53.475Z
-Stopped at: Completed 08-mobile-pwa 08-04-PLAN.md — Phase 08 and v1.0 milestone complete
+Stopped at: Phase 9 context gathered
-Resume file: None
+Resume file: .planning/phases/09-testing-qa/09-CONTEXT.md
115
.planning/phases/09-testing-qa/09-CONTEXT.md
Normal file
@@ -0,0 +1,115 @@
# Phase 9: Testing & QA - Context

**Gathered:** 2026-03-26
**Status:** Ready for planning

<domain>
## Phase Boundary

This phase covers automated testing infrastructure and quality audits: Playwright E2E tests for critical user flows, Lighthouse performance/accessibility audits, visual regression snapshots at 3 viewports, axe-core accessibility validation, cross-browser testing (Chrome/Firefox/Safari), and a CI-ready pipeline. Goal: beta-ready confidence that the platform works.

</domain>

<decisions>
## Implementation Decisions

All decisions are at Claude's discretion; the user trusts its judgment.

### E2E Test Scope & Priority
- Playwright for all E2E tests (cross-browser support built in, covered by the official Next.js testing guides)
- Critical flows to test (priority order):
  1. Login → dashboard loads → session persists
  2. Create tenant → tenant appears in list
  3. Deploy template agent → agent appears in employees list
  4. Chat: open conversation → send message → receive streaming response (mock LLM)
  5. RBAC: operator cannot access /agents/new, /billing, /users
  6. Language switcher → UI updates to selected language
  7. Mobile viewport: bottom tab bar renders, sidebar hidden
- LLM responses are mocked in E2E tests (no real Ollama/API calls) to keep runs deterministic, fast, and CI-safe
- Test data: seed a test tenant + test user via API calls in test setup, then clean up afterwards
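
One way to keep the mocked LLM reply deterministic is to replay a canned string in fixed-size chunks. The sketch below assumes a hypothetical helper named `chunkReply`; how the chunks are pushed through the mocked chat connection is not specified in this document and is left out.

```typescript
// Hypothetical helper (not part of the codebase): splits a canned reply into
// fixed-size chunks so a mocked streaming response is identical on every run.
function chunkReply(reply: string, chunkSize: number): string[] {
  if (chunkSize <= 0 || Math.floor(chunkSize) !== chunkSize) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  for (let i = 0; i < reply.length; i += chunkSize) {
    chunks.push(reply.slice(i, i + chunkSize));
  }
  return chunks;
}
```

A test could assert that the chat UI ends up showing `chunkReply(reply, n).join("")`, which always equals the original `reply`.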

### Lighthouse & Performance
- Target scores: >= 90 for Performance, Accessibility, Best Practices, SEO
- Run Lighthouse CI on: login page, dashboard, chat page, agents/new page
- Fail CI if any score drops below 80 (warn below 85, target 90)
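
The score thresholds can be expressed as a small gate function. This is a sketch under one assumption: the warning band covers scores from 80 up to, but not including, 85. The names `gateScore` and `gatePage` are hypothetical and scores are on the 0-100 scale.

```typescript
type GateResult = "fail" | "warn" | "pass";

// Thresholds on the 0-100 scale; the warning band is an assumed reading.
const FAIL_BELOW = 80;
const WARN_BELOW = 85;

function gateScore(score: number): GateResult {
  if (score < FAIL_BELOW) return "fail";
  if (score < WARN_BELOW) return "warn";
  return "pass";
}

// The worst category decides the outcome for the whole page.
function gatePage(scores: { [category: string]: number }): GateResult {
  const results = Object.keys(scores).map((key) => gateScore(scores[key]));
  if (results.indexOf("fail") !== -1) return "fail";
  if (results.indexOf("warn") !== -1) return "warn";
  return "pass";
}
```

Lighthouse's JSON report gives category scores on a 0-1 scale, so they would need multiplying by 100 before being passed to `gatePage`.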

### Visual Regression
- Playwright screenshot comparison at 3 viewports: desktop (1280x800), tablet (768x1024), mobile (375x812)
- Key pages: login, dashboard, agents list, agents/new (3-card entry), chat (empty state), templates gallery
- Baseline snapshots are committed to the repo; CI fails on any unexpected visual diff
- Update snapshots intentionally via `npx playwright test --update-snapshots`
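
The page and viewport lists above imply a fixed matrix of baseline snapshots. A minimal sketch of enumerating it follows; the page slugs and the file-naming scheme are illustrative assumptions, not names taken from the repository.

```typescript
interface SnapshotViewport { label: string; width: number; height: number; }

// Viewport sizes as listed in the text above.
const SNAPSHOT_VIEWPORTS: SnapshotViewport[] = [
  { label: "desktop", width: 1280, height: 800 },
  { label: "tablet", width: 768, height: 1024 },
  { label: "mobile", width: 375, height: 812 },
];

// Hypothetical slugs for the key pages.
const SNAPSHOT_PAGES = [
  "login", "dashboard", "agents-list", "agents-new", "chat-empty", "templates-gallery",
];

interface SnapshotCase { page: string; viewport: SnapshotViewport; fileName: string; }

// One case per (page, viewport) pair, in page-major order.
function snapshotCases(): SnapshotCase[] {
  const cases: SnapshotCase[] = [];
  for (const page of SNAPSHOT_PAGES) {
    for (const viewport of SNAPSHOT_VIEWPORTS) {
      cases.push({ page, viewport, fileName: `${page}-${viewport.label}.png` });
    }
  }
  return cases;
}
```

With 6 pages and 3 viewports this produces 18 cases, which is the number of baseline images a reviewer should expect to see committed.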

### Accessibility
- axe-core integrated via @axe-core/playwright
- Run on every page visited during E2E flows; zero critical violations required
- Violations at "serious" level logged as warnings, not blockers (for beta)
- Keyboard navigation test: Tab through login form, chat input, nav items
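
The critical-versus-serious policy can be sketched as a function that sorts violations by impact. The `ViolationSummary` shape is a reduced, hypothetical view of an axe-core violation, and ignoring "moderate" and "minor" findings is an assumption; the text above only covers the two highest levels.

```typescript
// Impact levels as reported by axe-core.
type AxeImpact = "minor" | "moderate" | "serious" | "critical";

// Reduced shape for illustration; real axe-core results carry more fields.
interface ViolationSummary { id: string; impact: AxeImpact | null; }

interface A11yGate { blockers: string[]; warnings: string[]; }

function gateViolations(violations: ViolationSummary[]): A11yGate {
  const gate: A11yGate = { blockers: [], warnings: [] };
  for (const violation of violations) {
    if (violation.impact === "critical") {
      gate.blockers.push(violation.id);
    } else if (violation.impact === "serious") {
      gate.warnings.push(violation.id);
    }
    // Assumption: "moderate", "minor", and unrated findings are ignored for beta.
  }
  return gate;
}
```

A test would then fail only when `blockers` is non-empty and log `warnings` without failing.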

### Cross-Browser
- Playwright projects: chromium, firefox, webkit (Safari)
- All E2E tests run on all 3 browsers
- Visual regression only on chromium (rendering differences between browsers are expected)
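
The browser rules above amount to a small run matrix. The sketch below only enumerates which suite runs where; the type and function names are hypothetical, and wiring this into Playwright project settings is left to the config.

```typescript
type BrowserProject = "chromium" | "firefox" | "webkit";
type SuiteKind = "e2e" | "visual";

const BROWSER_PROJECTS: BrowserProject[] = ["chromium", "firefox", "webkit"];

interface RunCell { browser: BrowserProject; suite: SuiteKind; }

// E2E runs on every browser; visual regression runs on chromium only.
function runMatrix(): RunCell[] {
  const cells: RunCell[] = [];
  for (const browser of BROWSER_PROJECTS) {
    cells.push({ browser, suite: "e2e" });
    if (browser === "chromium") {
      cells.push({ browser, suite: "visual" });
    }
  }
  return cells;
}
```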

### CI Pipeline
- Gitea Actions (matches existing infrastructure at git.oe74.net)
- Workflow triggers: push to main, pull request to main
- Pipeline stages: lint → type-check → unit tests (pytest) → build portal → E2E tests → Lighthouse
- Docker Compose for CI (postgres + redis + gateway + portal), the same containers as dev
- Test results: JUnit XML for test reports, Playwright HTML report for browsing traces
- Fail-fast: lint/type errors block everything; unit test failures block E2E
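
The fail-fast rule can be modelled as a strictly sequential pipeline in which the first failing stage causes every later stage to be skipped. This is a sketch of one reading; it is stricter than the text requires, since a unit test failure here also skips the portal build. The stage identifiers and `planPipeline` are hypothetical names.

```typescript
// Stage order as listed above; identifiers are illustrative.
const PIPELINE_STAGES = ["lint", "type-check", "unit-tests", "build-portal", "e2e", "lighthouse"];

type StageStatus = "passed" | "failed" | "skipped";

// Walks the stages in order; after the first failure every later stage is skipped.
function planPipeline(failing: string[]): { [stage: string]: StageStatus } {
  const outcome: { [stage: string]: StageStatus } = {};
  let blocked = false;
  for (const stage of PIPELINE_STAGES) {
    if (blocked) {
      outcome[stage] = "skipped";
    } else if (failing.indexOf(stage) !== -1) {
      outcome[stage] = "failed";
      blocked = true;
    } else {
      outcome[stage] = "passed";
    }
  }
  return outcome;
}
```

For example, `planPipeline(["unit-tests"])` marks `lint` and `type-check` as passed, `unit-tests` as failed, and the remaining three stages as skipped.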

### Claude's Discretion
- Playwright config details (timeouts, retries, parallelism)
- Test file organization (by feature vs by page)
- Fixture/helper patterns for auth, tenant setup, API mocking
- Lighthouse CI tool (lighthouse-ci vs @lhci/cli)
- Whether to include a smoke test for the WebSocket chat connection
- Visual regression threshold (pixel diff tolerance)

</decisions>

<specifics>
## Specific Ideas

- E2E tests should be the "would I trust this with a real customer?" gate
- Mock the LLM but test the full WebSocket flow; the streaming UX was the hardest part to get right
- The CI pipeline should be fast enough not to block development: target < 5 minutes total
- Visual regression catches the kind of CSS regressions that unit tests miss entirely

</specifics>

<code_context>
## Existing Code Insights

### Reusable Assets
- `packages/portal/`: Next.js 16 standalone output (Playwright can test against it)
- `docker-compose.yml`: full stack definition (reuse for CI with test DB)
- `tests/` directory: backend pytest suite (316+ tests), already CI-compatible
- `.env.example`: template for CI environment variables
- Playwright MCP plugin already installed (used for manual testing during development)

### Established Patterns
- Backend tests use pytest + pytest-asyncio with integration test fixtures
- Portal builds via `npm run build` (already verified in every phase)
- Auth: email/password via Auth.js v5 JWT (Playwright can automate login)
- API: FastAPI with RBAC headers (E2E tests need to set session cookies)

### Integration Points
- CI needs: PostgreSQL, Redis, gateway, llm-pool (or mock), portal containers
- Playwright tests run against the built portal (localhost:3000)
- Backend tests run against test DB (separate from dev DB)
- Gitea Actions runner on git.oe74.net (needs Docker-in-Docker or host Docker access)

</code_context>

<deferred>
## Deferred Ideas

None; discussion stayed within phase scope

</deferred>

---

*Phase: 09-testing-qa*
*Context gathered: 2026-03-26*