docs(04-rbac): create phase plan — 3 plans in 3 waves

2026-03-24 13:37:36 -06:00
parent 4706a87355
commit bf4adf0b21
4 changed files with 1008 additions and 20 deletions


@@ -0,0 +1,324 @@
---
phase: 04-rbac
plan: 03
type: execute
wave: 3
depends_on: ["04-01", "04-02"]
files_modified:
- packages/shared/shared/api/portal.py
- packages/shared/shared/api/billing.py
- packages/shared/shared/api/channels.py
- packages/shared/shared/api/llm_keys.py
- packages/shared/shared/api/usage.py
- packages/shared/shared/api/invitations.py
- tests/integration/test_portal_rbac.py
- tests/integration/test_invite_flow.py
autonomous: false
requirements:
- RBAC-06
- RBAC-01
- RBAC-02
- RBAC-03
- RBAC-04
- RBAC-05
must_haves:
  truths:
    - "Every mutating portal API endpoint (POST/PUT/DELETE) returns 403 for customer_operator"
    - "Every tenant-scoped endpoint returns 403 for customer_admin accessing a different tenant"
    - "Platform admin gets 200 on any tenant's endpoints regardless of membership"
    - "Customer operator gets 200 on read-only endpoints (GET agents, GET usage) for their tenant"
    - "Impersonation actions are logged in audit_events with platform admin user_id"
    - "Full invite flow works end-to-end: create invitation -> accept -> login -> correct role"
  artifacts:
    - path: "tests/integration/test_portal_rbac.py"
      provides: "Integration tests for RBAC enforcement on all portal endpoints"
      min_lines: 80
    - path: "tests/integration/test_invite_flow.py"
      provides: "End-to-end invitation flow integration test"
      min_lines: 40
  key_links:
    - from: "packages/shared/shared/api/portal.py"
      to: "packages/shared/shared/api/rbac.py"
      via: "Depends(require_tenant_admin) on mutating endpoints"
      pattern: "Depends\\(require_tenant_admin\\)|Depends\\(require_platform_admin\\)"
    - from: "packages/shared/shared/api/billing.py"
      to: "packages/shared/shared/api/rbac.py"
      via: "Depends(require_tenant_admin) on billing endpoints"
      pattern: "Depends\\(require_tenant_admin\\)"
    - from: "tests/integration/test_portal_rbac.py"
      to: "packages/shared/shared/api/rbac.py"
      via: "Tests pass role headers and assert 403/200"
      pattern: "X-Portal-User-Role"
---
<objective>
Wire RBAC guards to ALL existing portal API endpoints, add impersonation audit logging, add user listing endpoints, and create comprehensive integration tests proving every endpoint enforces role-based authorization.
Purpose: Defense in depth — the UI hides things, but the API MUST enforce authorization. This plan completes the server-side enforcement layer and validates the entire RBAC system end-to-end.
Output: All portal endpoints guarded, impersonation logged, integration tests for RBAC + invite flow.
</objective>
<execution_context>
@/home/adelorenzo/.claude/get-shit-done/workflows/execute-plan.md
@/home/adelorenzo/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/04-rbac/04-CONTEXT.md
@.planning/phases/04-rbac/04-RESEARCH.md
@.planning/phases/04-rbac/04-01-SUMMARY.md
@.planning/phases/04-rbac/04-02-SUMMARY.md
<interfaces>
<!-- From Plan 01 — RBAC guard dependencies -->
From packages/shared/shared/api/rbac.py:
```python
class PortalCaller:
    user_id: uuid.UUID
    role: str
    tenant_id: uuid.UUID | None

async def get_portal_caller(...) -> PortalCaller: ...
async def require_platform_admin(caller: PortalCaller) -> PortalCaller: ...
async def require_tenant_admin(tenant_id: UUID, caller: PortalCaller, session: AsyncSession) -> PortalCaller: ...
async def require_tenant_member(tenant_id: UUID, caller: PortalCaller, session: AsyncSession) -> PortalCaller: ...
```
From packages/shared/shared/api/portal.py — existing endpoints to guard:
```python
# Tenant CRUD: platform_admin only, except GET of a single tenant, which any member of that tenant may call
GET /api/portal/tenants # platform_admin only (lists ALL tenants)
POST /api/portal/tenants # platform_admin only
GET /api/portal/tenants/{tid} # require_tenant_member (own tenant) or platform_admin
PUT /api/portal/tenants/{tid} # platform_admin only
DELETE /api/portal/tenants/{tid} # platform_admin only
# Agent CRUD — tenant-scoped
GET /api/portal/tenants/{tid}/agents # require_tenant_member
POST /api/portal/tenants/{tid}/agents # require_tenant_admin
GET /api/portal/tenants/{tid}/agents/{aid} # require_tenant_member
PUT /api/portal/tenants/{tid}/agents/{aid} # require_tenant_admin
DELETE /api/portal/tenants/{tid}/agents/{aid} # require_tenant_admin
```
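The guard assignments listed above can be kept as data so that the implementation and the tests share one source of truth. The following is a minimal sketch under that assumption: `GUARDS` and `guard_for` are hypothetical names that do not exist in the codebase, and the path templates simply mirror the listing above.

```python
# Hypothetical lookup table: (HTTP method, path template) -> guard name.
# Guard names match the dependencies exposed by rbac.py in this plan;
# the table itself is an illustration, not existing project code.
GUARDS = {
    ("GET", "/api/portal/tenants"): "require_platform_admin",
    ("POST", "/api/portal/tenants"): "require_platform_admin",
    ("GET", "/api/portal/tenants/{tid}"): "require_tenant_member",
    ("PUT", "/api/portal/tenants/{tid}"): "require_platform_admin",
    ("DELETE", "/api/portal/tenants/{tid}"): "require_platform_admin",
    ("GET", "/api/portal/tenants/{tid}/agents"): "require_tenant_member",
    ("POST", "/api/portal/tenants/{tid}/agents"): "require_tenant_admin",
    ("GET", "/api/portal/tenants/{tid}/agents/{aid}"): "require_tenant_member",
    ("PUT", "/api/portal/tenants/{tid}/agents/{aid}"): "require_tenant_admin",
    ("DELETE", "/api/portal/tenants/{tid}/agents/{aid}"): "require_tenant_admin",
}


def guard_for(method: str, path_template: str) -> str:
    """Return the guard name for an endpoint, failing loudly if unmapped."""
    key = (method.upper(), path_template)
    if key not in GUARDS:
        raise LookupError(f"no RBAC guard mapped for {method} {path_template}")
    return GUARDS[key]
```

Failing on an unmapped endpoint, rather than defaulting to "no guard", keeps a newly added route from silently shipping without authorization.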
From packages/shared/shared/api/billing.py, channels.py, llm_keys.py, usage.py:
```python
# All tenant-scoped endpoints need guards:
# billing.py: subscription management — require_tenant_admin
# channels.py: channel connections — require_tenant_admin (GET: require_tenant_member)
# llm_keys.py: BYO API keys — require_tenant_admin
# usage.py: usage metrics — require_tenant_member (read-only OK for operators)
```
From packages/shared/shared/models/audit.py:
```python
class AuditEvent(Base):
    # Fields: action_type, tenant_id, event_metadata (use for impersonation logging)
    ...
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Wire RBAC guards to all existing API endpoints + impersonation + user listing</name>
<files>
packages/shared/shared/api/portal.py,
packages/shared/shared/api/billing.py,
packages/shared/shared/api/channels.py,
packages/shared/shared/api/llm_keys.py,
packages/shared/shared/api/usage.py,
packages/shared/shared/api/invitations.py
</files>
<action>
Add `Depends()` guards to every endpoint across all portal API routers. The guards read X-Portal-User-Id, X-Portal-User-Role, X-Portal-Tenant-Id headers set by the portal proxy layer.
**packages/shared/shared/api/portal.py:**
- `GET /tenants` — add `Depends(require_platform_admin)`. Only platform admins list ALL tenants.
- `POST /tenants` — add `Depends(require_platform_admin)`. Only platform admins create tenants.
- `GET /tenants/{tenant_id}` — add `Depends(require_tenant_member)`. Any role with membership can view their tenant.
- `PUT /tenants/{tenant_id}` — add `Depends(require_platform_admin)`. Only platform admins edit tenant settings.
- `DELETE /tenants/{tenant_id}` — add `Depends(require_platform_admin)`. Only platform admins delete tenants.
- `GET /tenants/{tenant_id}/agents` — add `Depends(require_tenant_member)`. All roles can list agents.
- `POST /tenants/{tenant_id}/agents` — add `Depends(require_tenant_admin)`. Only admins create agents.
- `GET /tenants/{tenant_id}/agents/{agent_id}` — add `Depends(require_tenant_member)`.
- `PUT /tenants/{tenant_id}/agents/{agent_id}` — add `Depends(require_tenant_admin)`.
- `DELETE /tenants/{tenant_id}/agents/{agent_id}` — add `Depends(require_tenant_admin)`.
- ADD new endpoint: `GET /tenants/{tenant_id}/users`, guarded by `require_tenant_admin`. Queries UserTenantRole JOIN PortalUser WHERE tenant_id matches and returns a list of {id, name, email, role, created_at}. Also queries PortalInvitation WHERE tenant_id matches AND status='pending' so that pending invites are included.
- ADD new endpoint: `GET /admin/users`, guarded by `require_platform_admin`. Queries ALL PortalUser rows with their UserTenantRole associations. Supports optional query params: tenant_id filter, role filter. Returns a list with tenant membership info.
**packages/shared/shared/api/billing.py:**
- All endpoints: add `Depends(require_tenant_admin)` — only admins manage billing.
**packages/shared/shared/api/channels.py:**
- GET endpoints: `Depends(require_tenant_member)` — operators can view channel connections.
- POST/PUT/DELETE endpoints: `Depends(require_tenant_admin)` — only admins modify channels.
**packages/shared/shared/api/llm_keys.py:**
- All endpoints: `Depends(require_tenant_admin)` — only admins manage BYO API keys.
**packages/shared/shared/api/usage.py:**
- All GET endpoints: `Depends(require_tenant_member)` — operators can view usage dashboards (per locked decision: operators can view usage).
**Impersonation endpoint** (add to portal.py or a new admin section):
- `POST /api/portal/admin/impersonate`: guarded by `require_platform_admin`. Accepts `{tenant_id}`. Logs an AuditEvent with action_type="impersonation" and event_metadata containing the platform admin's user_id and the target tenant_id. Returns the tenant details (the portal will use this to trigger a JWT update with impersonating_tenant_id).
- `POST /api/portal/admin/stop-impersonation`: guarded by `require_platform_admin`. Logs the end of impersonation in the audit trail.
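The audit record for an impersonation start or stop could be assembled roughly as follows. This is a sketch only: `build_impersonation_event` is a hypothetical helper, the returned dict mirrors the `action_type`, `tenant_id`, and `event_metadata` fields named for `AuditEvent`, and both the `"impersonation_end"` action type and the metadata key names are assumptions, since the plan only fixes `action_type="impersonation"`.

```python
import uuid
from datetime import datetime, timezone


def build_impersonation_event(
    admin_user_id: uuid.UUID,
    tenant_id: uuid.UUID,
    *,
    started: bool = True,
) -> dict:
    """Build the field values for an impersonation audit row.

    The "impersonation_end" action type is an assumed name for the
    stop-impersonation case; adjust it to whatever the audit schema uses.
    """
    return {
        "action_type": "impersonation" if started else "impersonation_end",
        "tenant_id": str(tenant_id),
        "event_metadata": {
            # Assumed key names: the plan requires the admin's user_id and
            # the target tenant_id to be present, not these exact keys.
            "platform_admin_user_id": str(admin_user_id),
            "target_tenant_id": str(tenant_id),
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        },
    }
```

Keeping the payload construction in a pure function makes it easy to assert on in the integration test before the row is written.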
**packages/shared/shared/api/invitations.py:**
- Ensure all endpoints already have proper guards from Plan 01. Verify and fix if missing.
IMPORTANT: For each endpoint, the guard dependency must be added as a function parameter so FastAPI's DI resolves it. Example:
```python
@portal_router.get("/tenants")
async def list_tenants(
    caller: PortalCaller = Depends(require_platform_admin),
    session: AsyncSession = Depends(get_session),
) -> TenantsListResponse:
    ...  # existing handler body unchanged
```
The `caller` parameter captures the resolved PortalCaller but may not be used in the function body — that's fine, the guard raises 403 before the function executes if unauthorized.
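The decision each guard makes can be modelled without FastAPI. The sketch below assumes the behaviour described in this plan: platform admins bypass membership checks, while customer admins and operators are limited to tenants they belong to. `Forbidden`, `Caller`, and the `memberships` mapping are hypothetical stand-ins for an HTTP 403 response, `PortalCaller`, and the `UserTenantRole` lookup; the real guards in `rbac.py` may be structured differently.

```python
import uuid
from dataclasses import dataclass


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""


@dataclass(frozen=True)
class Caller:
    user_id: uuid.UUID
    role: str  # "platform_admin", "customer_admin" or "customer_operator"


def require_tenant_member(tenant_id, caller, memberships):
    """Allow platform admins and any user with a membership in the tenant.

    `memberships` maps (user_id, tenant_id) to the role held in that tenant.
    """
    if caller.role == "platform_admin":
        return caller
    if (caller.user_id, tenant_id) not in memberships:
        raise Forbidden("caller is not a member of this tenant")
    return caller


def require_tenant_admin(tenant_id, caller, memberships):
    """Allow platform admins and customer admins of this specific tenant."""
    require_tenant_member(tenant_id, caller, memberships)
    if caller.role == "platform_admin":
        return caller
    if memberships[(caller.user_id, tenant_id)] != "customer_admin":
        raise Forbidden("tenant admin role required")
    return caller
```

Layering the admin check on top of the membership check means a customer admin of one tenant is rejected on another tenant for the same reason an outsider would be.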
</action>
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && python -c "
from shared.api.portal import portal_router
from shared.api.billing import billing_router
from shared.api.channels import channels_router
routes = [r.path for r in portal_router.routes]
print(f'Portal routes: {len(routes)}')
# Verify at least one route has dependencies
for r in portal_router.routes:
    if hasattr(r, 'dependant') and r.dependant.dependencies:
        print(f'  {r.path} has {len(r.dependant.dependencies)} dependencies')
        break
else:
    print('WARNING: No routes have dependencies')
"</automated>
</verify>
<done>Every portal API endpoint has an RBAC guard. Mutating endpoints require tenant_admin or platform_admin. Read-only tenant endpoints allow tenant_member. Global endpoints require platform_admin. Impersonation endpoint logs to audit trail. User listing endpoints exist for both per-tenant and global views.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Integration tests for RBAC enforcement and invite flow</name>
<files>
tests/integration/test_portal_rbac.py,
tests/integration/test_invite_flow.py
</files>
<behavior>
- Platform admin with correct headers gets 200 on all endpoints
- Customer admin gets 200 on own-tenant endpoints, 403 on other tenants
- Customer operator gets 200 on GET endpoints, 403 on POST/PUT/DELETE
- Missing role headers return 422 (FastAPI Header() validation)
- Impersonation endpoint logs AuditEvent row
- Full invite flow: admin creates invite -> token generated -> accept with password -> new user can login -> new user has correct role and tenant membership
- Resend invite generates new token and extends expiry
- Expired invite acceptance returns error
</behavior>
<action>
Create `tests/integration/test_portal_rbac.py`:
- Use httpx.AsyncClient with the FastAPI app (established test pattern: `make_app(session)`)
- Set up test fixtures: create test tenants, portal_users with different roles, user_tenant_roles
- Helper function to add role headers: `def headers(user_id, role, tenant_id=None) -> dict`
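One possible shape for the header helper is shown below. It is a sketch: the header names are the ones the guards are described as reading, but omitting `X-Portal-Tenant-Id` when no tenant is given is an assumption about how the guards treat platform admins.

```python
import uuid
from typing import Optional


def headers(user_id: uuid.UUID, role: str, tenant_id: Optional[uuid.UUID] = None) -> dict:
    """Build the role headers that the portal proxy would normally set."""
    result = {
        "X-Portal-User-Id": str(user_id),
        "X-Portal-User-Role": role,
    }
    # Assumption: the tenant header is optional and simply left out
    # when the caller has no active tenant context.
    if tenant_id is not None:
        result["X-Portal-Tenant-Id"] = str(tenant_id)
    return result
```

A test would then pass the result straight to the client, for example `await client.get(url, headers=headers(user.id, "customer_operator", tenant.id))`.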
Test matrix (test each combination):
| Endpoint | platform_admin | customer_admin (own) | customer_admin (other) | customer_operator |
|----------|---------------|---------------------|----------------------|-------------------|
| GET /tenants | 200 | 403 | 403 | 403 |
| POST /tenants | 201 | 403 | 403 | 403 |
| GET /tenants/{tid} | 200 | 200 | 403 | 200 |
| PUT /tenants/{tid} | 200 | 403 | 403 | 403 |
| DELETE /tenants/{tid} | 204 | 403 | 403 | 403 |
| GET /tenants/{tid}/agents | 200 | 200 | 403 | 200 |
| POST /tenants/{tid}/agents | 201 | 201 | 403 | 403 |
| PUT /tenants/{tid}/agents/{aid} | 200 | 200 | 403 | 403 |
| DELETE /tenants/{tid}/agents/{aid} | 204 | 204 | 403 | 403 |
| GET /tenants/{tid}/users | 200 | 200 | 403 | 403 |
| GET /admin/users | 200 | 403 | 403 | 403 |
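The matrix above lends itself to a table-driven test. The sketch below encodes it as data and expands it into `(method, path, persona, expected_status)` cases that could feed `pytest.mark.parametrize`. `PERSONAS`, `MATRIX`, and `expand_cases` are hypothetical names; the status codes are copied from the table.

```python
# Column order of the test matrix.
PERSONAS = (
    "platform_admin",
    "customer_admin_own",
    "customer_admin_other",
    "customer_operator",
)

# (method, path) -> expected status per persona, in PERSONAS order.
MATRIX = {
    ("GET", "/tenants"): (200, 403, 403, 403),
    ("POST", "/tenants"): (201, 403, 403, 403),
    ("GET", "/tenants/{tid}"): (200, 200, 403, 200),
    ("PUT", "/tenants/{tid}"): (200, 403, 403, 403),
    ("DELETE", "/tenants/{tid}"): (204, 403, 403, 403),
    ("GET", "/tenants/{tid}/agents"): (200, 200, 403, 200),
    ("POST", "/tenants/{tid}/agents"): (201, 201, 403, 403),
    ("PUT", "/tenants/{tid}/agents/{aid}"): (200, 200, 403, 403),
    ("DELETE", "/tenants/{tid}/agents/{aid}"): (204, 204, 403, 403),
    ("GET", "/tenants/{tid}/users"): (200, 200, 403, 403),
    ("GET", "/admin/users"): (200, 403, 403, 403),
}


def expand_cases(matrix):
    """Flatten the matrix into one (method, path, persona, status) tuple per cell."""
    return [
        (method, path, persona, status)
        for (method, path), statuses in matrix.items()
        for persona, status in zip(PERSONAS, statuses)
    ]
```

With one case per cell, a failing assertion names the exact endpoint and persona that regressed instead of a whole row.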
Also test:
- Request with NO role headers -> 422 (missing required header)
- Impersonation endpoint creates AuditEvent row
- Billing, channels, llm_keys, usage endpoints follow same pattern (at least one representative test per router)
Create `tests/integration/test_invite_flow.py`:
- Set up: create a tenant, create a customer_admin user with membership
- Test full flow:
  1. Admin POST /invitations with {email, name, role: "customer_operator", tenant_id} -> 201, returns token
  2. Accept POST /invitations/accept with {token, password: "securepass123"} -> 200, returns user
  3. Verify PortalUser created with correct email, role from invitation
  4. Verify UserTenantRole created linking user to tenant
  5. Verify invitation status updated to "accepted"
  6. Verify login works: POST /auth/verify with new credentials -> 200, returns role="customer_operator"
- Test expired invite: create invitation, manually set expires_at to past, attempt accept -> error
- Test resend: create invitation, POST /{id}/resend -> 200, verify new token_hash and extended expires_at
- Test double-accept: accept once, attempt accept again -> error (status no longer 'pending')
Use `pytest.mark.asyncio` and async test functions. Follow existing integration test patterns in `tests/integration/`.
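The invitation states these tests exercise (pending, accepted, expired, resent) can be modelled in a few lines. The sketch below is illustrative only: `Invitation`, `InviteError`, `create_invitation`, `resend`, and `accept` are hypothetical names, the 48-hour lifetime is an assumed value, and SHA-256 is an assumed choice for `token_hash`; the actual model from Plan 01 may differ.

```python
import hashlib
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta

INVITE_TTL = timedelta(hours=48)  # assumed lifetime; the plan does not state one


class InviteError(Exception):
    """Raised when an invitation cannot be accepted."""


@dataclass
class Invitation:
    token_hash: str
    expires_at: datetime
    status: str = "pending"


def _hash(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


def create_invitation(now: datetime):
    """Return (raw_token, invitation); only the hash is stored."""
    token = secrets.token_urlsafe(32)
    return token, Invitation(_hash(token), now + INVITE_TTL)


def resend(invitation: Invitation, now: datetime) -> str:
    """Issue a fresh token and extend the expiry, invalidating the old token."""
    token = secrets.token_urlsafe(32)
    invitation.token_hash = _hash(token)
    invitation.expires_at = now + INVITE_TTL
    return token


def accept(invitation: Invitation, token: str, now: datetime) -> None:
    """Mark the invitation accepted, rejecting reused, expired, or wrong tokens."""
    if invitation.status != "pending":
        raise InviteError("invitation is no longer pending")
    if now >= invitation.expires_at:
        raise InviteError("invitation has expired")
    if not secrets.compare_digest(invitation.token_hash, _hash(token)):
        raise InviteError("invalid token")
    invitation.status = "accepted"
```

Passing `now` explicitly lets the expired-invite test move the clock forward instead of editing `expires_at` by hand.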
</action>
<verify>
<automated>cd /home/adelorenzo/repos/konstruct && pytest tests/integration/test_portal_rbac.py tests/integration/test_invite_flow.py -x -v</automated>
</verify>
<done>All RBAC integration tests pass — every endpoint returns correct status code for each role. Full invite flow works end-to-end. Expired invites are rejected. Resend works. Double-accept prevented.</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<name>Task 3: Verify complete RBAC system end-to-end</name>
<action>
Human verification of the complete RBAC system: three-tier role enforcement (platform admin, customer admin, customer operator) with role-based portal navigation, proxy redirects, API guards, invitation flow, tenant switcher, and impersonation.
Steps to verify:
1. Start the dev environment: `docker compose up -d` and `cd packages/portal && npm run dev`
2. Run the migration: `cd /home/adelorenzo/repos/konstruct && alembic upgrade head`
3. Login as platform admin:
   - Verify: sees all nav items (Dashboard, Tenants, Employees, Usage, Billing, API Keys, Users, Platform)
   - Verify: can access /admin/users (global user management)
   - Verify: can impersonate a tenant (banner appears, can exit)
4. Create a customer_admin invite from the Users page
5. Open the invite link in an incognito window
   - Verify: activation page shows, can set password
   - Verify: after activation, redirected to login
6. Login as the new customer admin:
   - Verify: sees Dashboard, Employees, Usage, Billing, API Keys, Users (no Tenants, no Platform)
   - Verify: cannot access /admin/users (redirected to /dashboard)
7. Create a customer_operator invite from Users page
8. Accept invite and login as operator:
   - Verify: sees only Employees and Usage in nav
   - Verify: navigating to /billing redirects to /agents
   - Verify: cannot see Billing, API Keys, Users in sidebar
9. If user has multiple tenants, verify tenant switcher appears and switches context
10. Run: `pytest tests/ -x` — all tests pass
</action>
<verify>Human confirms all verification steps pass or reports issues</verify>
<done>All three roles behave correctly in portal UI and API. Invitation flow works end-to-end. Full test suite green.</done>
</task>
</tasks>
<verification>
- `pytest tests/integration/test_portal_rbac.py -x -v` — all RBAC endpoint tests pass
- `pytest tests/integration/test_invite_flow.py -x -v` — full invite flow tests pass
- `pytest tests/ -x` — entire test suite green (no regressions)
- Every mutating endpoint returns 403 when the role headers identify a caller whose role is insufficient
- Platform admin bypasses all tenant membership checks
</verification>
<success_criteria>
- All portal API endpoints enforce role-based authorization via Depends() guards
- Customer operators cannot mutate any data via API (403 on POST/PUT/DELETE)
- Customer admins can only access their own tenant's data (403 on other tenants)
- Platform admin has unrestricted access to all endpoints
- Impersonation actions logged in audit_events table
- User listing endpoints exist for per-tenant and global views
- Integration tests comprehensively cover the RBAC matrix
- Full invite flow works end-to-end in integration tests
- Human verification confirms visual role-based behavior in portal
</success_criteria>
<output>
After completion, create `.planning/phases/04-rbac/04-03-SUMMARY.md`
</output>