# Phase 9: Testing & QA - Context

Gathered: 2026-03-26

Status: Ready for planning

## Phase Boundary

Automated testing infrastructure and quality audits. Playwright E2E tests for critical user flows, Lighthouse performance/accessibility audits, visual regression snapshots at 3 viewports, axe-core accessibility validation, cross-browser testing (Chrome/Firefox/Safari), and a CI-ready pipeline. Goal: beta-ready confidence that the platform works.

## Implementation Decisions

All decisions at Claude's discretion — user trusts judgment.

### E2E Test Scope & Priority

  • Playwright for all E2E tests (cross-browser support built in, covered by the official Next.js testing guides)
  • Critical flows to test (priority order):
    1. Login → dashboard loads → session persists
    2. Create tenant → tenant appears in list
    3. Deploy template agent → agent appears in employees list
    4. Chat: open conversation → send message → receive streaming response (mock LLM)
    5. RBAC: operator cannot access /agents/new, /billing, /users
    6. Language switcher → UI updates to selected language
    7. Mobile viewport: bottom tab bar renders, sidebar hidden
  • LLM responses mocked in E2E tests (no real Ollama/API calls) — deterministic, fast, CI-safe
  • Test data: seed a test tenant and a test user via API calls in test setup, then delete both in teardown
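
The mocked LLM in flow 4 can be as simple as a deterministic chunker. The following is a minimal sketch, assuming the streaming response reaches the UI as ordered text chunks; `splitIntoChunks`, `assembleStream`, and the canned reply are hypothetical names for illustration, not part of the codebase.

```typescript
// Hypothetical helper: split a canned reply into fixed-size chunks so the
// mocked "stream" is identical on every run (no real Ollama/API call).
function splitIntoChunks(reply: string, chunkSize: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  const chunks: string[] = [];
  for (let i = 0; i < reply.length; i += chunkSize) {
    chunks.push(reply.slice(i, i + chunkSize));
  }
  return chunks;
}

// Hypothetical helper: the text the chat UI should show once every chunk
// has been rendered.
function assembleStream(chunks: string[]): string {
  return chunks.join("");
}

// Assumed canned reply that the mock would send chunk by chunk.
const cannedReply = "Hello from the mock agent.";
const mockChunks = splitIntoChunks(cannedReply, 8);
```

An E2E test could feed `mockChunks` through the mocked transport and assert that the rendered message equals `assembleStream(mockChunks)`, which keeps the assertion independent of chunk timing.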

### Lighthouse & Performance

  • Target scores: >= 90 for Performance, Accessibility, Best Practices, SEO
  • Run Lighthouse CI on: login page, dashboard, chat page, agents/new page
  • Fail CI if any score drops below 80; warn when a score falls below 85 (target: 90)
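
The three thresholds above can be expressed as a small gate function. This is a sketch under stated assumptions: scores are on the 0-100 display scale (Lighthouse's raw JSON reports 0-1, so a real script would multiply by 100 first), "warning at 85" is read as "warn below 85", and all type and function names are hypothetical.

```typescript
type GateResult = "fail" | "warn" | "pass";

// Thresholds from the list above (0-100 scale).
const FAIL_BELOW = 80;
const WARN_BELOW = 85;
const TARGET_SCORE = 90;

// Classify one category score: below 80 fails CI, below 85 warns.
function classifyScore(score: number): GateResult {
  if (score < FAIL_BELOW) return "fail";
  if (score < WARN_BELOW) return "warn";
  return "pass";
}

// True when a score reaches the stated target of 90.
function meetsTarget(score: number): boolean {
  return score >= TARGET_SCORE;
}

// Hypothetical shape for one audited page's category scores.
interface CategoryScore {
  category: string;
  score: number;
}

// The page-level result is the worst result across its categories.
function gatePage(scores: CategoryScore[]): GateResult {
  let worst: GateResult = "pass";
  for (const entry of scores) {
    const result = classifyScore(entry.score);
    if (result === "fail") return "fail";
    if (result === "warn") worst = "warn";
  }
  return worst;
}

// Sample data for illustration only, not measured results.
const sampleDashboardScores: CategoryScore[] = [
  { category: "performance", score: 92 },
  { category: "accessibility", score: 84 },
  { category: "best-practices", score: 95 },
  { category: "seo", score: 90 },
];
```

With the sample data, the accessibility score of 84 puts the page in the warning band without failing the build.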

### Visual Regression

  • Playwright screenshot comparison at 3 viewports: desktop (1280x800), tablet (768x1024), mobile (375x812)
  • Key pages: login, dashboard, agents list, agents/new (3-card entry), chat (empty state), templates gallery
  • Baseline snapshots committed to repo — CI fails on unexpected visual diff
  • Update snapshots intentionally via npx playwright test --update-snapshots
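
The viewport and page lists above imply a fixed set of baseline images. A minimal sketch of that matrix follows; the page slugs, the file-name pattern, and the helper names are assumptions chosen for illustration.

```typescript
interface ViewportSpec {
  label: string;
  width: number;
  height: number;
}

// Viewport sizes from the list above.
const VIEWPORTS: ViewportSpec[] = [
  { label: "desktop", width: 1280, height: 800 },
  { label: "tablet", width: 768, height: 1024 },
  { label: "mobile", width: 375, height: 812 },
];

// Illustrative slugs for the six key pages; real route names may differ.
const KEY_PAGES: string[] = [
  "login",
  "dashboard",
  "agents-list",
  "agents-new",
  "chat-empty",
  "templates-gallery",
];

// Build one snapshot file name per (page, viewport) pair.
function snapshotNames(pages: string[], viewports: ViewportSpec[]): string[] {
  const names: string[] = [];
  for (const pageSlug of pages) {
    for (const vp of viewports) {
      names.push(`${pageSlug}-${vp.label}-${vp.width}x${vp.height}.png`);
    }
  }
  return names;
}

const baselineNames = snapshotNames(KEY_PAGES, VIEWPORTS);
```

In a Playwright spec, each pair would typically map to a `page.setViewportSize({ width, height })` call followed by `expect(page).toHaveScreenshot(name)`. Six pages at three viewports gives 18 baselines to review whenever snapshots are updated.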

### Accessibility

  • axe-core integrated via @axe-core/playwright
  • Run on every page during E2E flows — zero critical violations required
  • Violations at "serious" level logged as warnings, not blockers (for beta)
  • Keyboard navigation test: Tab through login form, chat input, nav items
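
The severity policy above (critical blocks, serious warns) can be sketched as a triage step over axe results. The types below are a local, simplified mirror of the `id` and `impact` fields that axe-core reports for each violation; `triageViolations` and the sample data are hypothetical.

```typescript
// Simplified mirror of the axe-core fields this sketch relies on.
type AxeImpact = "minor" | "moderate" | "serious" | "critical" | null;

interface ViolationSummary {
  id: string;
  impact: AxeImpact;
}

interface TriageResult {
  blockers: string[]; // critical: must be zero for the test to pass
  warnings: string[]; // serious: logged, not blocking during beta
}

// Sort violations into blockers and warnings by impact level.
function triageViolations(violations: ViolationSummary[]): TriageResult {
  const blockers = violations
    .filter((v) => v.impact === "critical")
    .map((v) => v.id);
  const warnings = violations
    .filter((v) => v.impact === "serious")
    .map((v) => v.id);
  return { blockers, warnings };
}

// Sample input for illustration only; the impacts are not real scan output.
const sampleViolations: ViolationSummary[] = [
  { id: "image-alt", impact: "critical" },
  { id: "color-contrast", impact: "serious" },
  { id: "region", impact: "moderate" },
];

const sampleTriage = triageViolations(sampleViolations);
```

A test would then assert that `blockers` is empty and write `warnings` to the test log, so moderate and minor findings stay visible in the raw report without affecting the result.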

### Cross-Browser

  • Playwright projects: chromium, firefox, webkit (Safari)
  • All E2E tests run on all 3 browsers
  • Visual regression only on chromium (rendering differences between browser engines are expected and would produce noisy diffs)

### CI Pipeline

  • Gitea Actions (matches existing infrastructure at git.oe74.net)
  • Workflow triggers: push to main, pull request to main
  • Pipeline stages: lint → type-check → unit tests (pytest) → build portal → E2E tests → Lighthouse
  • Docker Compose for CI (postgres + redis + gateway + portal) — same containers as dev
  • Test results: JUnit XML for CI test reports, Playwright HTML report (with traces) for debugging failures
  • Fail-fast: lint/type errors block everything; unit test failures block E2E
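
The fail-fast rule can be modelled as a walk over the ordered stages. This sketch assumes a strictly sequential pipeline in which the first failure skips every later stage, which covers both stated rules (lint/type errors block everything; unit test failures block E2E). Stage identifiers and function names are hypothetical.

```typescript
type StageStatus = "passed" | "failed" | "skipped";

interface StageOutcome {
  stage: string;
  status: StageStatus;
}

// Stage order from the list above; identifiers are illustrative.
const PIPELINE_STAGES: string[] = [
  "lint",
  "type-check",
  "unit-tests",
  "build-portal",
  "e2e",
  "lighthouse",
];

// Given the stages that would fail if run, compute what actually happens:
// stages run in order until the first failure, and the rest are skipped.
function resolvePipeline(stages: string[], failing: string[]): StageOutcome[] {
  const outcomes: StageOutcome[] = [];
  let blocked = false;
  for (const stage of stages) {
    if (blocked) {
      outcomes.push({ stage, status: "skipped" });
    } else if (failing.indexOf(stage) !== -1) {
      outcomes.push({ stage, status: "failed" });
      blocked = true;
    } else {
      outcomes.push({ stage, status: "passed" });
    }
  }
  return outcomes;
}

// Example: a unit test failure prevents build, E2E, and Lighthouse from running.
const unitFailure = resolvePipeline(PIPELINE_STAGES, ["unit-tests"]);
```

Ordering the cheapest checks first means most broken commits stop before the Docker Compose stack and browsers are started, which helps with the overall time budget.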

### Claude's Discretion

  • Playwright config details (timeouts, retries, parallelism)
  • Test file organization (by feature vs by page)
  • Fixture/helper patterns for auth, tenant setup, API mocking
  • Lighthouse CI tool (lighthouse-ci vs @lhci/cli)
  • Whether to include a smoke test for the WebSocket chat connection
  • Visual regression threshold (pixel diff tolerance)

## Specific Ideas

  • E2E tests should be the "would I trust this with a real customer?" gate
  • Mock the LLM but test the full WebSocket flow — the streaming UX was the hardest part to get right
  • The CI pipeline should be fast enough to not block development — target < 5 minutes total
  • Visual regression catches the kind of CSS regressions that unit tests miss entirely

<code_context>

## Existing Code Insights

### Reusable Assets

  • packages/portal/ — Next.js 16 standalone output (Playwright can test against it)
  • docker-compose.yml — Full stack definition (reuse for CI with test DB)
  • tests/ directory — Backend pytest suite (316+ tests) — already CI-compatible
  • .env.example — Template for CI environment variables
  • Playwright MCP plugin already installed (used for manual testing during development)

### Established Patterns

  • Backend tests use pytest + pytest-asyncio with integration test fixtures
  • Portal builds via npm run build (already verified in every phase)
  • Auth: email/password via Auth.js v5 JWT (Playwright can automate login)
  • API: FastAPI with RBAC headers (E2E tests need to set session cookies)

### Integration Points

  • CI needs: PostgreSQL, Redis, gateway, llm-pool (or mock), portal containers
  • Playwright tests run against the built portal (localhost:3000)
  • Backend tests run against test DB (separate from dev DB)
  • Gitea Actions runner on git.oe74.net (needs Docker-in-Docker or host Docker access)

</code_context>

## Deferred Ideas

None — discussion stayed within phase scope


Phase: 09-testing-qa

Context gathered: 2026-03-26