docs(09-testing-qa): create phase plan

2026-03-25 22:26:03 -06:00
parent a46ff0a970
commit e31690e37a
4 changed files with 607 additions and 5 deletions
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -131,7 +131,7 @@ Plans:
 ## Progress
 **Execution Order:**
-Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8
+Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9
 | Phase | Plans Complete | Status | Completed |
 |-------|----------------|--------|-----------|
@@ -143,7 +143,7 @@ Phases execute in numeric order: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8
 | 6. Web Chat | 3/3 | Complete    | 2026-03-25 |
 | 7. Multilanguage | 4/4 | Complete    | 2026-03-25 |
 | 8. Mobile + PWA | 4/4 | Complete    | 2026-03-26 |
-| 9. Testing & QA | 0/0 | Not started | - |
+| 9. Testing & QA | 0/3 | In progress | - |
 ---
@@ -201,11 +201,13 @@ Plans:
  5. All E2E tests pass on Chrome, Firefox, and Safari (WebKit)
  6. Empty states, error states, and loading states are tested and render correctly
  7. CI-ready test suite that can run in a GitHub Actions / Gitea Actions pipeline
-**Plans**: 0 plans
+**Plans**: 3 plans
 Plans:
- [ ] TBD (run /gsd:plan-phase 9 to break down)
+- [ ] 09-01-PLAN.md — Playwright infrastructure (config, auth fixtures, seed helpers) + all 7 critical flow E2E tests (login, tenant CRUD, agent deploy, chat, RBAC, i18n, mobile)
 - [ ] 09-02-PLAN.md — Visual regression snapshots at 3 viewports, axe-core accessibility scans, Lighthouse CI score gating
 - [ ] 09-03-PLAN.md — Gitea Actions CI pipeline (backend lint+pytest, portal build+E2E+Lighthouse) + human verification
 ---
 *Roadmap created: 2026-03-23*
-*Coverage: 25/25 v1 requirements + 6 RBAC requirements + 5 Employee Design requirements + 5 Web Chat requirements + 6 Multilanguage requirements + 6 Mobile+PWA requirements mapped*
+*Coverage: 25/25 v1 requirements + 6 RBAC requirements + 5 Employee Design requirements + 5 Web Chat requirements + 6 Multilanguage requirements + 6 Mobile+PWA requirements + 7 Testing & QA requirements mapped*
--- a/.planning/phases/09-testing-qa/09-01-PLAN.md
+++ b/.planning/phases/09-testing-qa/09-01-PLAN.md
@@ -0,0 +1,239 @@
 ---
 phase: 09-testing-qa
 plan: 01
 type: execute
 wave: 1
 depends_on: []
 files_modified:
  - packages/portal/playwright.config.ts
  - packages/portal/e2e/auth.setup.ts
  - packages/portal/e2e/fixtures.ts
  - packages/portal/e2e/helpers/seed.ts
  - packages/portal/e2e/flows/login.spec.ts
  - packages/portal/e2e/flows/tenant-crud.spec.ts
  - packages/portal/e2e/flows/agent-deploy.spec.ts
  - packages/portal/e2e/flows/chat.spec.ts
  - packages/portal/e2e/flows/rbac.spec.ts
  - packages/portal/e2e/flows/i18n.spec.ts
  - packages/portal/e2e/flows/mobile.spec.ts
  - packages/portal/package.json
 autonomous: true
 requirements:
  - QA-01
  - QA-05
  - QA-06
 must_haves:
  truths:
    - "Playwright E2E tests cover all 7 critical user flows and pass on chromium"
    - "Tests pass on all 3 browsers (chromium, firefox, webkit)"
    - "Empty states, error states, and loading states are tested within flow specs"
    - "Auth setup saves storageState for 3 roles (platform_admin, customer_admin, customer_operator)"
  artifacts:
    - path: "packages/portal/playwright.config.ts"
      provides: "Playwright configuration with 3 browser projects + setup project"
      contains: "defineConfig"
    - path: "packages/portal/e2e/auth.setup.ts"
      provides: "Auth state generation for 3 roles"
      contains: "storageState"
    - path: "packages/portal/e2e/fixtures.ts"
      provides: "Shared test fixtures with axe builder and role-based auth"
      exports: ["test", "expect"]
    - path: "packages/portal/e2e/helpers/seed.ts"
      provides: "Test data seeding via FastAPI admin API"
      exports: ["seedTestTenant"]
    - path: "packages/portal/e2e/flows/login.spec.ts"
      provides: "Login flow E2E test"
    - path: "packages/portal/e2e/flows/chat.spec.ts"
      provides: "Chat flow E2E test with WebSocket mock"
    - path: "packages/portal/e2e/flows/rbac.spec.ts"
      provides: "RBAC enforcement E2E test"
  key_links:
    - from: "packages/portal/e2e/auth.setup.ts"
      to: "playwright/.auth/*.json"
      via: "storageState save"
      pattern: "storageState.*path"
    - from: "packages/portal/e2e/flows/*.spec.ts"
      to: "packages/portal/e2e/fixtures.ts"
      via: "import { test } from fixtures"
      pattern: "from.*fixtures"
    - from: "packages/portal/playwright.config.ts"
      to: ".next/standalone/server.js"
      via: "webServer command"
      pattern: "node .next/standalone/server.js"
 ---
 <objective>
 Set up Playwright E2E testing infrastructure and implement all 7 critical user flow tests covering login, tenant CRUD, agent deployment, chat with mocked WebSocket, RBAC enforcement, i18n language switching, and mobile viewport behavior.
 Purpose: Establishes the automated E2E test suite that validates all critical user paths work end-to-end across Chrome, Firefox, and Safari -- the primary quality gate for beta readiness.
 Output: Playwright config, auth fixtures for 3 roles, seed helpers, and 7 flow spec files that pass on all 3 browsers.
 </objective>
 <execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
 </execution_context>
 <context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/09-testing-qa/09-CONTEXT.md
@.planning/phases/09-testing-qa/09-RESEARCH.md
 Key codebase references:
@packages/portal/package.json
@packages/portal/next.config.ts
@packages/portal/app/layout.tsx
@packages/portal/app/login/page.tsx
@packages/portal/lib/use-chat-socket.ts
@packages/portal/app/(app)/chat/page.tsx
 </context>
 <tasks>
 <task type="auto">
  <name>Task 1: Install Playwright and create test infrastructure</name>
  <files>
    packages/portal/package.json
    packages/portal/playwright.config.ts
    packages/portal/e2e/auth.setup.ts
    packages/portal/e2e/fixtures.ts
    packages/portal/e2e/helpers/seed.ts
    packages/portal/playwright/.auth/.gitkeep
    packages/portal/.gitignore
  </files>
  <action>
 1. Install test dependencies:
   ```
   cd packages/portal
   npm install --save-dev @playwright/test @axe-core/playwright @lhci/cli
   npx playwright install --with-deps chromium firefox webkit
   ```
 2. Create `packages/portal/playwright.config.ts` following the RESEARCH Pattern 6 exactly:
   - testDir: "./e2e"
   - fullyParallel: false (CI stability with shared DB state)
   - forbidOnly: !!process.env.CI
   - retries: process.env.CI ? 1 : 0
   - workers: process.env.CI ? 1 : undefined
   - timeout: 30_000
   - reporter: html + junit + list
   - use.baseURL from PLAYWRIGHT_BASE_URL env or localhost:3000
   - use.trace: "on-first-retry"
   - use.screenshot: "only-on-failure"
   - use.serviceWorkers: "block" (CRITICAL: prevents Serwist from intercepting test requests)
   - expect.toHaveScreenshot: maxDiffPixelRatio 0.02, threshold 0.2
   - Projects: setup, chromium, firefox, webkit (the three browser projects depend on setup and use testMatch "e2e/flows/**")
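   - Visual projects: visual-desktop (1280x800), visual-tablet (768x1024), visual-mobile (375x812, set as an explicit viewport; Playwright's "iPhone 12" device preset is 390 px wide, so it does not produce this size) -- all chromium only, testMatch "e2e/visual/**"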
   - A11y project: chromium, testMatch "e2e/accessibility/**"
   - webServer: command "node .next/standalone/server.js", url localhost:3000, reuseExistingServer: !process.env.CI
   - webServer env: PORT 3000, API_URL from env or localhost:8001, AUTH_SECRET test-secret, AUTH_URL localhost:3000
   - Default storageState for chromium/firefox/webkit: "playwright/.auth/platform-admin.json"
 3. Create `packages/portal/e2e/auth.setup.ts`:
   - 3 setup blocks: platform admin, customer admin, customer operator
   - Each: goto /login, fill Email + Password from env vars (E2E_ADMIN_EMAIL/E2E_ADMIN_PASSWORD, E2E_CADMIN_EMAIL/E2E_CADMIN_PASSWORD, E2E_OPERATOR_EMAIL/E2E_OPERATOR_PASSWORD), click Sign In button, waitForURL /dashboard, save storageState to playwright/.auth/{role}.json
   - Use path.resolve(__dirname, ...) for auth file paths
 4. Create `packages/portal/e2e/fixtures.ts`:
   - Extend base test with: axe fixture (returns () => AxeBuilder with wcag2a, wcag2aa, wcag21aa tags), auth state paths as constants
   - Export `test` and `expect` from the extended fixture
   - Export AUTH_PATHS object: { platformAdmin, customerAdmin, operator } with resolved paths
 5. Create `packages/portal/e2e/helpers/seed.ts`:
   - seedTestTenant(request: APIRequestContext) -- POST to /api/portal/tenants with X-User-Id, X-User-Role headers, returns { tenantId, tenantSlug }
   - cleanupTenant(request: APIRequestContext, tenantId: string) -- DELETE /api/portal/tenants/{id}
   - Use random suffix for tenant names to avoid collisions
 6. Create `packages/portal/playwright/.auth/.gitkeep` (empty file)
 7. Add to packages/portal/.gitignore: `playwright/.auth/*.json`, `playwright-report/`, `playwright-results.xml`, `.lighthouseci/`
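
As a rough illustration of step 5, the seed helper might look like the sketch below. `RequestLike` and `ResponseLike` are minimal stand-ins for the subset of Playwright's `APIRequestContext` and `APIResponse` that the helper uses, so the block compiles without any dependency. The header values, the payload fields (`name`, `slug`), and the response fields (`id`, `slug`) are assumptions about the tenant API, not confirmed by the plan.

```typescript
// Minimal structural stand-ins for the parts of Playwright's APIRequestContext
// and APIResponse that this helper touches.
interface ResponseLike {
  ok(): boolean;
  status(): number;
  json(): Promise<unknown>;
}
interface RequestLike {
  post(url: string, options: { data: unknown; headers: Record<string, string> }): Promise<ResponseLike>;
  delete(url: string, options: { headers: Record<string, string> }): Promise<ResponseLike>;
}

// ASSUMPTION: placeholder header values; the real user id and role string
// depend on how the gateway authenticates admin calls.
const ADMIN_HEADERS: Record<string, string> = {
  "X-User-Id": "e2e-platform-admin",
  "X-User-Role": "platform_admin",
};

// Random 8-character suffix keeps tenant names unique across repeated runs.
function randomSuffix(): string {
  return Math.random().toString(36).slice(2, 10).padEnd(8, "0");
}

// ASSUMPTION: payload field names are guesses at the tenant API's shape.
function buildTenantPayload(suffix: string): { name: string; slug: string } {
  return { name: `E2E Tenant ${suffix}`, slug: `e2e-tenant-${suffix}` };
}

// Relative URLs resolve against the request context's baseURL.
async function seedTestTenant(
  request: RequestLike,
): Promise<{ tenantId: string; tenantSlug: string }> {
  const res = await request.post("/api/portal/tenants", {
    data: buildTenantPayload(randomSuffix()),
    headers: ADMIN_HEADERS,
  });
  if (!res.ok()) throw new Error(`seedTestTenant failed with HTTP ${res.status()}`);
  // ASSUMPTION: the API answers with { id, slug }.
  const body = (await res.json()) as { id: string; slug: string };
  return { tenantId: body.id, tenantSlug: body.slug };
}

async function cleanupTenant(request: RequestLike, tenantId: string): Promise<void> {
  const res = await request.delete(`/api/portal/tenants/${tenantId}`, { headers: ADMIN_HEADERS });
  // A 404 means the tenant is already gone, which is acceptable for cleanup.
  if (!res.ok() && res.status() !== 404) {
    throw new Error(`cleanupTenant failed with HTTP ${res.status()}`);
  }
}
```

In the real `e2e/helpers/seed.ts` these functions would be exported and the parameter typed as Playwright's `APIRequestContext`. Treating a 404 on cleanup as success is a design choice that keeps teardown idempotent when a test has already deleted its own tenant.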
  </action>
  <verify>
    <automated>cd /home/adelorenzo/repos/konstruct/packages/portal && npx playwright --version && test -f playwright.config.ts && test -f e2e/auth.setup.ts && test -f e2e/fixtures.ts && test -f e2e/helpers/seed.ts && echo "PASS"</automated>
  </verify>
  <done>Playwright installed, config created with 3 browser + 3 visual + 1 a11y projects, auth setup saves storageState for 3 roles, fixtures export axe builder and auth paths, seed helper creates/deletes test tenants via API</done>
 </task>
 <task type="auto">
  <name>Task 2: Implement all 7 critical flow E2E tests</name>
  <files>
    packages/portal/e2e/flows/login.spec.ts
    packages/portal/e2e/flows/tenant-crud.spec.ts
    packages/portal/e2e/flows/agent-deploy.spec.ts
    packages/portal/e2e/flows/chat.spec.ts
    packages/portal/e2e/flows/rbac.spec.ts
    packages/portal/e2e/flows/i18n.spec.ts
    packages/portal/e2e/flows/mobile.spec.ts
  </files>
  <action>
 All specs import `{ test, expect }` from `../fixtures`. Use semantic selectors (getByRole, getByLabel, getByText) -- never CSS IDs or data-testid unless no semantic selector exists.
 1. `login.spec.ts` (Flow 1):
   - Test "login -> dashboard loads -> session persists": goto /login, fill credentials, click Sign In, waitForURL /dashboard, assert dashboard heading visible. Reload page, assert still on /dashboard (session persists).
   - Test "invalid credentials show error": fill wrong password, submit, assert error message visible.
   - Test "empty state: no session redirects to login": use empty storageState ({}), goto /dashboard, assert redirected to /login.
 2. `tenant-crud.spec.ts` (Flow 2):
   - Uses platform_admin storageState
   - Test "create tenant -> appears in list": navigate to tenants page, click create button, fill tenant name + slug (random suffix), submit, assert new tenant appears in list.
   - Test "delete tenant": create tenant, then delete it, assert it disappears from list.
   - Use seed helper for setup where possible.
 3. `agent-deploy.spec.ts` (Flow 3):
   - Uses customer_admin storageState (or platform_admin with tenant context)
   - Test "deploy template agent -> appears in employees": navigate to /agents/new, select template option, pick first available template, click deploy, assert agent appears in agents list.
   - Test "loading state: template gallery shows loading skeleton": mock API to delay, assert skeleton/loading indicator visible.
 4. `chat.spec.ts` (Flow 4):
   - Uses routeWebSocket per RESEARCH Pattern 2
   - Test "send message -> receive streaming response": routeWebSocket matching /\/chat\/ws\//, mock auth acknowledgment and message response with simulated streaming tokens. Open chat page, select an agent/conversation, type message, press Enter, assert response text appears.
   - Test "typing indicator shows during response": assert typing indicator visible between message send and response arrival.
   - Test "empty state: no conversations shows prompt": navigate to /chat without selecting agent, assert empty state message visible.
   - IMPORTANT: Use regex pattern for routeWebSocket: `/\/chat\/ws\//` (not string) -- the portal derives WS URL from NEXT_PUBLIC_API_URL which is absolute.
 5. `rbac.spec.ts` (Flow 5):
   - Uses customer_operator storageState
   - Test "operator cannot access restricted paths": for each of ["/agents/new", "/billing", "/users"], goto path, assert NOT on that URL (proxy.ts redirects to /dashboard).
   - Test "operator can view dashboard and chat": goto /dashboard, assert visible. Goto /chat, assert visible.
   - Uses customer_admin storageState for contrast test: "admin can access /agents/new".
 6. `i18n.spec.ts` (Flow 6):
   - Test "language switcher changes UI to Spanish": find language switcher, select Español, assert key UI elements render in Spanish (check a known label such as "Dashboard" against its Spanish translation, read from the messages/es.json file rather than guessed).
   - Test "language persists across page navigation": switch to Portuguese, navigate to another page, assert Portuguese labels still showing.
 7. `mobile.spec.ts` (Flow 7):
   - Test "mobile viewport: bottom tab bar renders, sidebar hidden": setViewportSize 375x812, goto /dashboard, assert mobile bottom navigation visible, assert desktop sidebar not visible.
   - Test "mobile chat: full screen message view": setViewportSize 375x812, navigate to chat, assert chat interface fills viewport.
   - Test "error state: offline banner" (if applicable): if the PWA has offline detection, test it shows a banner.
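
The mocked WebSocket in the chat flow (item 4 above) could be driven by a handler along these lines. This is only a sketch: `WsRouteLike` is a minimal stand-in for the subset of Playwright's `WebSocketRoute` used here, and the frame shapes (`auth`, `auth_ok`, `message`, `token`, `done`) are hypothetical. The real protocol should be taken from `use-chat-socket.ts`.

```typescript
// Minimal stand-in for the subset of Playwright's WebSocketRoute used below.
interface WsRouteLike {
  onMessage(handler: (message: string) => void): void;
  send(message: string): void;
}

// Split a canned reply into word-sized chunks to imitate token streaming.
// Each chunk keeps its trailing whitespace so joining the chunks restores the reply.
function toTokens(reply: string): string[] {
  return reply.match(/\S+\s*/g) ?? [];
}

// ASSUMPTION: every frame type below is hypothetical; mirror the frames that
// the portal's chat socket hook really sends and expects.
function mockChatSocket(ws: WsRouteLike, reply: string): void {
  ws.onMessage((raw) => {
    const frame = JSON.parse(raw) as { type: string };
    if (frame.type === "auth") {
      // Acknowledge authentication so the client considers the socket ready.
      ws.send(JSON.stringify({ type: "auth_ok" }));
      return;
    }
    if (frame.type === "message") {
      // Stream the canned reply one token at a time, then signal completion.
      for (const token of toTokens(reply)) {
        ws.send(JSON.stringify({ type: "token", text: token }));
      }
      ws.send(JSON.stringify({ type: "done" }));
    }
  });
}
```

Inside `chat.spec.ts` the handler would be registered before navigation, for example `await page.routeWebSocket(/\/chat\/ws\//, (ws) => mockChatSocket(ws, "Hello from the mock agent"))`. Because the handler never connects to the real server, no gateway or LLM call is made. Playwright delivers messages as `string | Buffer`, so the real spec may need to convert `raw` to a string first.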
 For QA-06 coverage, embed empty/error/loading state tests within the relevant flow specs (noted above with specific test cases).
  </action>
  <verify>
    <automated>cd /home/adelorenzo/repos/konstruct/packages/portal && npx playwright test e2e/flows/ --project=chromium --reporter=list 2>&1 | tail -20</automated>
  </verify>
  <done>All 7 flow spec files exist with tests for critical paths plus empty/error/loading states. Tests pass on chromium. Cross-browser pass (firefox, webkit) confirmed by running full project suite.</done>
 </task>
 </tasks>
 <verification>
 1. `cd packages/portal && npx playwright test e2e/flows/ --project=chromium` -- all flow tests pass on chromium
 2. `cd packages/portal && npx playwright test e2e/flows/` -- all flow tests pass on chromium + firefox + webkit
 3. Each flow spec covers at least one happy path and one edge/error case
 4. Auth setup generates 3 storageState files in playwright/.auth/
 </verification>
 <success_criteria>
 - 7 flow spec files exist and pass on chromium
 - Cross-browser (chromium + firefox + webkit) all green
 - Empty/error/loading states tested within flow specs
 - Auth storageState generated for 3 roles without manual intervention
 - No real LLM calls in any test (WebSocket mocked)
 </success_criteria>
 <output>
 After completion, create `.planning/phases/09-testing-qa/09-01-SUMMARY.md`
 </output>
--- a/.planning/phases/09-testing-qa/09-02-PLAN.md
+++ b/.planning/phases/09-testing-qa/09-02-PLAN.md
@@ -0,0 +1,178 @@
 ---
 phase: 09-testing-qa
 plan: 02
 type: execute
 wave: 2
 depends_on: ["09-01"]
 files_modified:
  - packages/portal/e2e/visual/snapshots.spec.ts
  - packages/portal/e2e/accessibility/a11y.spec.ts
  - packages/portal/e2e/lighthouse/lighthouserc.json
 autonomous: true
 requirements:
  - QA-02
  - QA-03
  - QA-04
 must_haves:
  truths:
    - "Visual regression snapshots exist for all key pages at 3 viewports (desktop, tablet, mobile)"
    - "axe-core accessibility scan passes with zero critical violations on all key pages"
    - "Lighthouse scores meet >= 80 hard floor on login page (90 target)"
    - "Serious a11y violations are logged as warnings, not blockers"
  artifacts:
    - path: "packages/portal/e2e/visual/snapshots.spec.ts"
      provides: "Visual regression tests at 3 viewports"
      contains: "toHaveScreenshot"
    - path: "packages/portal/e2e/accessibility/a11y.spec.ts"
      provides: "axe-core accessibility scans on key pages"
      contains: "AxeBuilder"
    - path: "packages/portal/e2e/lighthouse/lighthouserc.json"
      provides: "Lighthouse CI config with score thresholds"
      contains: "minScore"
  key_links:
    - from: "packages/portal/e2e/accessibility/a11y.spec.ts"
      to: "packages/portal/e2e/fixtures.ts"
      via: "import axe fixture"
      pattern: "from.*fixtures"
    - from: "packages/portal/e2e/visual/snapshots.spec.ts"
      to: "packages/portal/playwright.config.ts"
      via: "visual-desktop/tablet/mobile projects"
      pattern: "toHaveScreenshot"
 ---
 <objective>
 Add visual regression testing at 3 viewports, axe-core accessibility scanning on all key pages, and Lighthouse CI performance/accessibility score gating.
 Purpose: Catches CSS regressions that unit tests miss, ensures WCAG 2.1 AA compliance, and validates performance baselines before beta launch.
 Output: Visual snapshot specs, accessibility scan specs, Lighthouse CI config, and baseline screenshots.
 </objective>
 <execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
 </execution_context>
 <context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/09-testing-qa/09-CONTEXT.md
@.planning/phases/09-testing-qa/09-RESEARCH.md
@.planning/phases/09-testing-qa/09-01-SUMMARY.md
 Depends on Plan 01 for: playwright.config.ts (visual projects, a11y project), e2e/fixtures.ts (axe fixture), auth.setup.ts (storageState)
 </context>
 <tasks>
 <task type="auto">
  <name>Task 1: Visual regression snapshots and axe-core accessibility tests</name>
  <files>
    packages/portal/e2e/visual/snapshots.spec.ts
    packages/portal/e2e/accessibility/a11y.spec.ts
  </files>
  <action>
 1. Create `packages/portal/e2e/visual/snapshots.spec.ts`:
   - Import `{ test, expect }` from `../fixtures`
   - Use platform_admin storageState for authenticated pages
   - Key pages to snapshot (each as a separate test):
     a. Login page (no auth needed -- use empty storageState or navigate directly)
     b. Dashboard
     c. Agents list (/agents or /employees)
     d. Agents/new (3-card entry screen)
     e. Chat (empty state -- no conversation selected)
     f. Templates gallery (/agents/new then select templates option, or /templates)
   - Each test: goto page, wait for network idle or key element visible, call `await expect(page).toHaveScreenshot('page-name.png')`
   - The 3 viewport sizes are handled by the playwright.config.ts visual-desktop/visual-tablet/visual-mobile projects -- the spec runs once, projects provide viewport variation
   - For login page: navigate to /login without storageState
   - For authenticated pages: use default storageState (platform_admin)
 2. Create `packages/portal/e2e/accessibility/a11y.spec.ts`:
   - Import `{ test, expect }` from `../fixtures` (gets axe fixture)
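   - Use platform_admin storageState (scan /login with an empty storageState, as in the visual spec, in case an authenticated session is redirected away from it)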
   - Pages to scan: login, dashboard, agents list, agents/new, chat, templates, billing, users
   - For each page, create a test:
     ```
     test('page-name has no critical a11y violations', async ({ page, axe }) => {
       await page.goto('/path');
       await page.waitForLoadState('networkidle');
       const results = await axe().analyze();
       const critical = results.violations.filter(v => v.impact === 'critical');
       const serious = results.violations.filter(v => v.impact === 'serious');
       if (serious.length > 0) {
         console.warn(`Serious a11y violations on /path:`, serious.map(v => v.id));
       }
       expect(critical, `Critical a11y violations on /path`).toHaveLength(0);
     });
     ```
   - Add keyboard navigation test: "Tab through login form fields": goto /login, press Tab repeatedly, assert focus moves through Email -> Password -> Sign In button using `page.locator(':focus')`.
   - Add keyboard nav for chat: Tab to message input, type message, Enter to send.
 3. Generate initial visual regression baselines:
   - Build the portal: `cd packages/portal && npm run build`
   - Copy static assets for standalone: `cp -r .next/static .next/standalone/.next/static && cp -r public .next/standalone/public`
   - Run with --update-snapshots: `npx playwright test e2e/visual/ --update-snapshots`
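   - This creates baseline screenshots in Playwright's default snapshot directory, a `snapshots.spec.ts-snapshots/` folder next to the spec file (unless `snapshotPathTemplate` is configured otherwise)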
   - NOTE: If the full stack (gateway + DB) is not running, authenticated page snapshots may fail. In that case, generate baselines only for login page and document that full baselines require the running stack. The executor should start the stack via docker compose if possible.
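
The critical-versus-serious gating used by the accessibility tests in step 2 could be factored into one small pure function, so every page test applies the same rule. The sketch below assumes only the `id` and `impact` fields of an axe-core violation. `classifyViolations` and `A11yVerdict` are hypothetical names, not part of axe-core or the plan.

```typescript
// Subset of an axe-core violation entry that the gating logic needs.
interface ViolationLike {
  id: string;
  impact?: string | null;
}

interface A11yVerdict {
  blocking: string[]; // rule ids that must fail the test (critical)
  warnings: string[]; // rule ids that are only logged (serious)
}

// Split violations by impact: critical blocks, serious warns,
// and "moderate" / "minor" findings are ignored by this gate.
function classifyViolations(violations: ViolationLike[]): A11yVerdict {
  const blocking: string[] = [];
  const warnings: string[] = [];
  for (const v of violations) {
    if (v.impact === "critical") blocking.push(v.id);
    else if (v.impact === "serious") warnings.push(v.id);
  }
  return { blocking, warnings };
}
```

A page test would then call `classifyViolations(results.violations)`, log `warnings` with `console.warn`, and assert that `blocking` is empty.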
  </action>
  <verify>
    <automated>cd /home/adelorenzo/repos/konstruct/packages/portal && npx playwright test e2e/accessibility/ --project=a11y --reporter=list 2>&1 | tail -20</automated>
  </verify>
  <done>Visual regression spec covers 6 key pages (runs at 3 viewports via projects), baseline screenshots generated. Accessibility spec scans 8+ pages with zero critical violations, serious violations logged as warnings. Keyboard navigation tested on login and chat.</done>
 </task>
 <task type="auto">
  <name>Task 2: Lighthouse CI configuration and score gating</name>
  <files>
    packages/portal/e2e/lighthouse/lighthouserc.json
  </files>
  <action>
 1. Create `packages/portal/e2e/lighthouse/lighthouserc.json`:
   - Based on RESEARCH Pattern 5
   - collect.url: only "/login" page (authenticated pages redirect to login when Lighthouse runs unauthenticated -- see RESEARCH Pitfall 5)
   - collect.numberOfRuns: 1 (speed for CI)
   - collect.settings.preset: "desktop"
   - collect.settings.chromeFlags: "--no-sandbox --disable-dev-shm-usage"
   - assert.assertions:
     - categories:performance: ["error", {"minScore": 0.80}] (hard floor)
     - categories:accessibility: ["error", {"minScore": 0.80}]
     - categories:best-practices: ["error", {"minScore": 0.80}]
     - categories:seo: ["error", {"minScore": 0.80}]
   - upload.target: "filesystem"
   - upload.outputDir: ".lighthouseci"
 2. Verify Lighthouse runs successfully:
   - Ensure portal is built and standalone server can start
   - Run: `cd packages/portal && npx lhci autorun --config=e2e/lighthouse/lighthouserc.json`
   - Verify scores are printed and assertions pass
   - If score is below 80 on any category, investigate and document (do NOT lower thresholds)
 NOTE: Per RESEARCH Pitfall 5, only /login is tested with Lighthouse because authenticated pages redirect. The 90 target is aspirational -- the 80 hard floor is what CI enforces. Dashboard/chat performance should be validated manually or via Web Vitals in production.
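
For reference, the configuration described in step 1 can be written out as a TypeScript object that serializes to the contents of `lighthouserc.json`. This is a sketch under one assumption: the absolute base URL `http://localhost:3000` is taken from the standalone server settings elsewhere in this phase, since the step itself only names the `/login` path.

```typescript
// ASSUMPTION: base URL of the standalone portal server; the plan only names "/login".
const BASE_URL = "http://localhost:3000";
// Hard floor enforced in CI; 0.90 stays an aspirational target and is not asserted.
const FLOOR = 0.8;

const categories = ["performance", "accessibility", "best-practices", "seo"] as const;

// One "error"-level assertion per Lighthouse category, all sharing the same floor.
const assertions: Record<string, [string, { minScore: number }]> = {};
for (const category of categories) {
  assertions[`categories:${category}`] = ["error", { minScore: FLOOR }];
}

const lighthouserc = {
  ci: {
    collect: {
      // Only the unauthenticated login page is collected.
      url: [`${BASE_URL}/login`],
      numberOfRuns: 1,
      settings: {
        preset: "desktop",
        chromeFlags: "--no-sandbox --disable-dev-shm-usage",
      },
    },
    assert: { assertions },
    upload: { target: "filesystem", outputDir: ".lighthouseci" },
  },
};

// Print the JSON that would be saved as e2e/lighthouse/lighthouserc.json.
console.log(JSON.stringify(lighthouserc, null, 2));
```

Generating the four assertions from one list keeps the floor in a single place, so raising it toward the 90 target later is a one-line change.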
  </action>
  <verify>
    <automated>cd /home/adelorenzo/repos/konstruct/packages/portal && test -f e2e/lighthouse/lighthouserc.json && cat e2e/lighthouse/lighthouserc.json | grep -q "minScore" && echo "PASS"</automated>
  </verify>
  <done>lighthouserc.json exists with score thresholds (80 hard floor, 90 aspirational). Lighthouse CI runs against /login and produces scores. All 4 categories (performance, accessibility, best practices, SEO) pass the 80 floor.</done>
 </task>
 </tasks>
 <verification>
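 1. `cd packages/portal && npx playwright test e2e/visual/ --project=visual-desktop` -- visual regression passes against committed baselines (on a first run with no baselines, Playwright writes them and reports the test as failed, so rerun or pass `--update-snapshots`)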
 2. `cd packages/portal && npx playwright test e2e/accessibility/ --project=a11y` -- zero critical violations
 3. `cd packages/portal && npx lhci autorun --config=e2e/lighthouse/lighthouserc.json` -- all scores >= 80
 4. Baseline screenshots committed to repo
 </verification>
 <success_criteria>
 - Visual regression snapshots exist for 6 key pages at 3 viewports
 - axe-core scans all key pages with zero critical a11y violations
 - Serious a11y violations logged but not blocking
 - Lighthouse CI passes with >= 80 on all 4 categories for /login
 - Keyboard navigation tests pass for login form and chat input
 </success_criteria>
 <output>
 After completion, create `.planning/phases/09-testing-qa/09-02-SUMMARY.md`
 </output>
--- a/.planning/phases/09-testing-qa/09-03-PLAN.md
+++ b/.planning/phases/09-testing-qa/09-03-PLAN.md
@@ -0,0 +1,183 @@
 ---
 phase: 09-testing-qa
 plan: 03
 type: execute
 wave: 2
 depends_on: ["09-01"]
 files_modified:
  - .gitea/workflows/ci.yml
 autonomous: false
 requirements:
  - QA-07
 must_haves:
  truths:
    - "CI pipeline YAML exists and is syntactically valid for Gitea Actions"
     - "Pipeline stages enforce fail-fast: lint/format checks block unit tests, unit tests block E2E"
     - "Pipeline includes backend tests (lint, format check, pytest) and portal tests (build, E2E, Lighthouse)"
    - "Test reports (JUnit XML, HTML) are uploaded as artifacts"
  artifacts:
    - path: ".gitea/workflows/ci.yml"
      provides: "Complete CI pipeline for Gitea Actions"
      contains: "playwright test"
  key_links:
    - from: ".gitea/workflows/ci.yml"
      to: "packages/portal/playwright.config.ts"
      via: "npx playwright test command"
      pattern: "playwright test"
    - from: ".gitea/workflows/ci.yml"
      to: "packages/portal/e2e/lighthouse/lighthouserc.json"
      via: "npx lhci autorun --config"
      pattern: "lhci autorun"
 ---
 <objective>
 Create the Gitea Actions CI pipeline that runs the full test suite (backend lint + format check + pytest, portal build + E2E + Lighthouse) on every push and PR to main.
 Purpose: Makes the test suite CI-ready so quality gates are enforced automatically, not just locally. Completes the beta-readiness quality infrastructure.
 Output: .gitea/workflows/ci.yml with fail-fast stages and artifact uploads.
 </objective>
 <execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
 </execution_context>
 <context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/09-testing-qa/09-CONTEXT.md
@.planning/phases/09-testing-qa/09-RESEARCH.md
@.planning/phases/09-testing-qa/09-01-SUMMARY.md
 Depends on Plan 01 for: Playwright config and test files that CI will execute
 </context>
 <tasks>
 <task type="auto">
  <name>Task 1: Create Gitea Actions CI workflow</name>
  <files>
    .gitea/workflows/ci.yml
  </files>
  <action>
 Create `.gitea/workflows/ci.yml` based on RESEARCH Pattern 7 with these specifics:
 1. Triggers: push to main, pull_request to main
 2. Job 1: `backend` (Backend Tests)
   - runs-on: ubuntu-latest
   - Service containers:
     - postgres: pgvector/pgvector:pg16, env POSTGRES_DB/USER/PASSWORD, health-cmd pg_isready
     - redis: redis:7-alpine, health-cmd "redis-cli ping"
   - Env vars: DATABASE_URL (asyncpg to konstruct_app), DATABASE_ADMIN_URL (asyncpg to postgres), REDIS_URL
   - Steps:
     - actions/checkout@v4
     - actions/setup-python@v5 python-version 3.12
     - pip install uv
     - uv sync
     - uv run ruff check packages/ tests/
     - uv run ruff format --check packages/ tests/
     - uv run pytest tests/ -x --tb=short --junitxml=test-results.xml
   - Upload test-results.xml as artifact (if: always())
 3. Job 2: `portal` (Portal E2E) -- needs: backend
   - runs-on: ubuntu-latest
   - Service containers: same postgres + redis
   - Steps:
     - actions/checkout@v4
     - actions/setup-node@v4 node-version 22
     - actions/setup-python@v5 python-version 3.12 (for gateway)
     - Install portal deps: `cd packages/portal && npm ci`
     - Build portal: `cd packages/portal && npm run build` with NEXT_PUBLIC_API_URL env
     - Copy standalone assets: `cd packages/portal && cp -r .next/static .next/standalone/.next/static && cp -r public .next/standalone/public`
     - Install Playwright browsers: `cd packages/portal && npx playwright install --with-deps chromium firefox webkit`
     - Start gateway (background):
       ```
       pip install uv && uv sync
       uv run alembic upgrade head
       uv run python -c "from shared.db import seed_admin; import asyncio; asyncio.run(seed_admin())" || true
       uv run uvicorn gateway.main:app --host 0.0.0.0 --port 8001 &
       ```
       env: DATABASE_URL, DATABASE_ADMIN_URL, REDIS_URL, LLM_POOL_URL (http://localhost:8004)
     - Wait for gateway: `timeout 30 bash -c 'until curl -sf http://localhost:8001/health; do sleep 1; done'`
     - Run E2E tests: `cd packages/portal && npx playwright test e2e/flows/ e2e/accessibility/`
       env: CI=true, PLAYWRIGHT_BASE_URL, API_URL, AUTH_SECRET, E2E_ADMIN_EMAIL, E2E_ADMIN_PASSWORD, E2E_CADMIN_EMAIL, E2E_CADMIN_PASSWORD, E2E_OPERATOR_EMAIL, E2E_OPERATOR_PASSWORD
       (Use secrets for credentials: ${{ secrets.E2E_ADMIN_EMAIL }} etc.)
     - Run Lighthouse CI: `cd packages/portal && npx lhci autorun --config=e2e/lighthouse/lighthouserc.json`
       env: LHCI_BUILD_CONTEXT__CURRENT_HASH: ${{ github.sha }}
     - Upload Playwright report (if: always()): actions/upload-artifact@v4, path packages/portal/playwright-report/
     - Upload Playwright JUnit (if: always()): actions/upload-artifact@v4, path packages/portal/playwright-results.xml
     - Upload Lighthouse report (if: always()): actions/upload-artifact@v4, path packages/portal/.lighthouseci/
 IMPORTANT: Do NOT include mypy --strict step (existing codebase may not be fully strict-typed). Only include ruff check and ruff format --check for linting.
 NOTE: The seed_admin call may not exist -- include `|| true` so it doesn't block. The E2E auth setup only logs in through the login form; it does not create users, so the platform admin, customer admin, and customer operator accounts must already exist in the database before the E2E step runs. If there's a migration seed, it will handle this.
 Pipeline target: < 5 minutes total.
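
Because the E2E step depends on six credential variables supplied through secrets, a small guard can fail the run early with a clear message when one is missing. The sketch below is an optional addition, not something the plan requires. `missingE2EVars` and `assertE2EEnv` are hypothetical helper names. It treats empty strings as missing on the assumption that an unset secret typically expands to an empty value.

```typescript
// Variable names are the ones listed for the E2E step in this plan.
const REQUIRED_E2E_VARS = [
  "E2E_ADMIN_EMAIL",
  "E2E_ADMIN_PASSWORD",
  "E2E_CADMIN_EMAIL",
  "E2E_CADMIN_PASSWORD",
  "E2E_OPERATOR_EMAIL",
  "E2E_OPERATOR_PASSWORD",
] as const;

// Return the names of required variables that are unset or blank.
function missingE2EVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_E2E_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw one descriptive error instead of letting the login form fail later.
function assertE2EEnv(env: Record<string, string | undefined>): void {
  const missing = missingE2EVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing E2E credentials: ${missing.join(", ")}`);
  }
}
```

One possible place to call it is the top of `e2e/auth.setup.ts`, as `assertE2EEnv(process.env)`, so a misconfigured secret surfaces before any browser is launched.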
  </action>
  <verify>
    <automated>test -f /home/adelorenzo/repos/konstruct/.gitea/workflows/ci.yml && python3 -c "import yaml; yaml.safe_load(open('/home/adelorenzo/repos/konstruct/.gitea/workflows/ci.yml'))" && echo "VALID YAML"</automated>
  </verify>
  <done>CI pipeline YAML exists at .gitea/workflows/ci.yml, is valid YAML, has 2 jobs (backend + portal), portal depends on backend (fail-fast), includes lint/format/pytest/E2E/Lighthouse/artifact-upload steps</done>
 </task>
 <task type="checkpoint:human-verify" gate="blocking">
  <name>Task 2: Verify test suite and CI pipeline</name>
  <what-built>
 Complete E2E test suite (7 flow specs + accessibility + visual regression + Lighthouse CI) and Gitea Actions CI pipeline. Tests cover login, tenant CRUD, agent deployment, chat with mocked WebSocket, RBAC enforcement, i18n language switching, mobile viewport behavior, accessibility (axe-core), and visual regression at 3 viewports.
  </what-built>
  <how-to-verify>
 1. Run the full E2E test suite locally:
   ```
   cd packages/portal
   npx playwright test --project=chromium --project=a11y --reporter=list
   ```
   Expected: All flow tests + accessibility tests pass
 2. Run cross-browser:
   ```
   npx playwright test e2e/flows/ --reporter=list
   ```
   Expected: All tests pass on chromium, firefox, webkit
 3. Check the Playwright HTML report:
   ```
   npx playwright show-report
   ```
   Expected: Opens browser with detailed test results
 4. Review the CI pipeline:
   ```
   cat .gitea/workflows/ci.yml
   ```
   Expected: Valid YAML with backend job (lint + pytest) and portal job (build + E2E + Lighthouse), portal depends on backend
 5. (Optional) Push a branch to trigger CI on git.oe74.net and verify pipeline runs
  </how-to-verify>
  <resume-signal>Type "approved" if tests pass and CI pipeline looks correct, or describe issues</resume-signal>
 </task>
 </tasks>
 <verification>
 1. `.gitea/workflows/ci.yml` exists and is valid YAML
 2. Pipeline has 2 jobs: backend (lint + pytest) and portal (build + E2E + Lighthouse)
 3. Portal job depends on backend job (fail-fast enforced)
 4. Secrets referenced for credentials (not hardcoded)
 5. Artifacts uploaded for test reports
 </verification>
 <success_criteria>
 - CI pipeline YAML is syntactically valid
 - Pipeline stages enforce fail-fast ordering
 - Backend job: ruff check + ruff format --check + pytest
 - Portal job: npm build + Playwright E2E + Lighthouse CI
 - Test reports uploaded as artifacts (JUnit XML, HTML, Lighthouse)
 - Human approves test suite and pipeline structure
 </success_criteria>
 <output>
 After completion, create `.planning/phases/09-testing-qa/09-03-SUMMARY.md`
 </output>