- Task 3 (human-verify) approved — all 3 tasks complete - SUMMARY.md updated: tasks 3/3, next phase readiness updated - STATE.md stopped_at reflects full completion - ROADMAP.md phase 4 progress confirmed 3/3 summaries complete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
13 KiB
gsd_state_version, milestone, milestone_name, status, stopped_at, last_updated, last_activity, progress
| gsd_state_version | milestone | milestone_name | status | stopped_at | last_updated | last_activity | progress | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.0 | v1.0 | milestone | completed | Completed 04-rbac-03-PLAN.md — all tasks complete, human-verify checkpoint approved | 2026-03-24T23:20:03.259Z | 2026-03-23 — Completed 03-02 onboarding wizard, Slack OAuth, BYO API keys |
|
Project State
Project Reference
See: .planning/PROJECT.md (updated 2026-03-22)
Core value: An AI employee that works in the channels your team already uses — no new tools to learn, no dashboards to check, just a capable coworker in Slack or WhatsApp. Current focus: Phase 3 — Operator Experience (all plans complete)
Current Position
Phase: 3 of 3 (Operator Experience) Plan: 4 of 4 in current phase (all complete) Status: All 3 phases complete — v1.0 milestone achieved Last activity: 2026-03-23 — Completed 03-02 onboarding wizard, Slack OAuth, BYO API keys
Progress: [██████████] 100%
Performance Metrics
Velocity:
- Total plans completed: 0
- Average duration: —
- Total execution time: —
By Phase:
| Phase | Plans | Total | Avg/Plan |
|---|---|---|---|
| - | - | - | - |
Recent Trend:
- Last 5 plans: —
- Trend: —
Updated after each plan completion | Phase 01-foundation P01 | 12 | 2 tasks | 32 files | | Phase 01-foundation P02 | 6 | 2 tasks | 15 files | | Phase 01-foundation P04 | 19 | 2 tasks | 25 files | | Phase 01-foundation P03 | 9 | 2 tasks | 20 files | | Phase 02-agent-features P03 | 7 | 2 tasks | 7 files | | Phase 02-agent-features P02-01 | 9m 22s | 2 tasks | 15 files | | Phase 02-agent-features P04 | 5m | 2 tasks | 7 files | | Phase 02-agent-features P02 | 12m 22s | 3 tasks | 19 files | | Phase 02-agent-features P05 | ~25m | 2 tasks | 6 files | | Phase 02-agent-features P06 | 9m 53s | 2 tasks | 3 files | | Phase 03-operator-experience P01 | 22m | 3 tasks | 20 files | | Phase 03-operator-experience P03 | ~8m | 1 tasks | 6 files | | Phase 03-operator-experience P04 | 10m | 1 tasks | 8 files | | Phase 03-operator-experience P02 | ~35min | 2 tasks | 10 files | | Phase 03-operator-experience P03 | 8min | 2 tasks | 6 files | | Phase 03-operator-experience P04 | 10min | 2 tasks | 8 files | | Phase 03-operator-experience P05 | 2min | 2 tasks | 6 files | | Phase 04-rbac P01 | 8min | 3 tasks | 14 files | | Phase 04-rbac P02 | 5min | 3 tasks | 10 files | | Phase 04-rbac P03 | 8min | 2 tasks | 7 files |
Accumulated Context
Decisions
Decisions are logged in PROJECT.md Key Decisions table. Recent decisions affecting current work:
- [Roadmap]: Coarse 3-phase structure — Foundation → Agent Features → Operator Experience
- [Roadmap]: Phase 3 portal gated on Phase 2 completing (DB schema stability after memory + tool data models)
- [Roadmap]: WhatsApp Business Verification must be initiated during Phase 1 (1-6 week approval, WhatsApp goes live in Phase 2)
- [Phase 01-foundation]: PostgreSQL RLS with FORCE ROW LEVEL SECURITY chosen for tenant isolation; app connects as konstruct_app role (not superuser)
- [Phase 01-foundation]: SET LOCAL app.current_tenant uses UUID-sanitized f-string (not parameterized) — asyncpg does not support prepared statement placeholders for SET LOCAL
- [Phase 01-foundation]: channel_type stored as TEXT with CHECK constraint — native sa.Enum caused duplicate CREATE TYPE DDL in Alembic migrations
- [Phase 01-foundation]: LiteLLM pinned to ==1.82.5, not latest — September 2025 OOM regression in later versions
- [Phase 01-foundation]: Celery tasks are always sync def with asyncio.run() — hard architectural constraint, never async def
- [Phase 01-foundation]: AI transparency clause is unconditional in system prompt — agents must disclose AI identity when directly asked
- [Phase 01-foundation]: llm-pool port 8004 (consistent with shared/config.py llm_pool_url default, not plan-stated 8002)
- [Phase 01-foundation]: proxy.ts used instead of middleware.ts — Next.js 16 renamed middleware to proxy
- [Phase 01-foundation]: standardSchemaResolver used over zodResolver — hookform/resolvers v5 dropped zod subpackage, uses Standard Schema protocol; zod v4 implements Standard Schema
- [Phase 01-foundation]: Auth.js v5 JWT session strategy chosen — no portal_sessions DB table needed for Phase 1, stateless tokens sufficient
- [Phase 01-foundation]: Patch at usage site in tests: mock 'gateway.channels.slack.resolve_tenant' not 'router.tenant.resolve_tenant' — Python name binding at import time
- [Phase 01-foundation]: Celery payload extension: msg.model_dump() | extras dict, pop extras before model_validate in tasks.py to avoid pydantic validation errors on unknown fields
- [Phase 01-foundation]: Bot token for chat.update loaded from channel_connections.config['bot_token'] in orchestrator task — keeps Slack SDK out of orchestrator package
- [Phase 02-agent-features]: HMAC uses hmac.new() with hmac.compare_digest for timing-safe WhatsApp signature verification
- [Phase 02-agent-features]: meta-media://{media_id} placeholder URL at normalization time; actual download in adapter after tenant resolution
- [Phase 02-agent-features]: WhatsApp thread_id = sender wa_id (WhatsApp has no threading; conversation scope is per phone number)
- [Phase 02-agent-features]: Always return HTTP 200 to Meta webhooks regardless of processing errors to prevent retry storms
- [Phase 02-agent-features]: pgvector/pgvector:pg16 Docker image required for pgvector extension — postgres:16-alpine does not include vector extension control file
- [Phase 02-agent-features]: SentenceTransformer loaded as lazy singleton — model loaded once on first use to avoid per-call 2s overhead; 384d all-MiniLM-L6-v2 matches vector(384) column
- [Phase 02-agent-features]: embed_and_store Celery task is fire-and-forget (ignore_result=True) — embedding backfill never blocks LLM response path
- [Phase 02-agent-features]: Keyword-based conversation metadata detection (v1) uses billing keywords + attempt counter from sliding window — simple and sufficient for initial escalation rules
- [Phase 02-agent-features]: Escalation condition parser uses regex not eval — safe, no code injection risk, supports 'keyword AND count > N' format
- [Phase 02-agent-features]: No-op audit logger stub in tasks.py allows escalation to function before Plan 02 audit module ships — one-import swap when ready
- [Phase 02-agent-features]: CAST(:metadata AS jsonb) for asyncpg JSONB params — :: cast syntax fails with named params
- [Phase 02-agent-features]: Migration 004 (not 003) for audit_events — 003_escalation_fields.py claimed revision 003 first
- [Phase 02-agent-features]: AuditLogger uses raw INSERT text() — ORM model would allow accidental SQLAlchemy UPDATE/DELETE on audit rows
- [Phase 02-agent-features]: boto3 added to gateway pyproject.toml explicitly — was used via local import in whatsapp.py but never declared, causing ModuleNotFoundError in tests
- [Phase 02-agent-features]: boto3 patched at import site patch('boto3.client') not patch('module.boto3') — local imports inside async functions require patching the actual module, not the module attribute
- [Phase 02-agent-features]: build_messages_with_media() wraps build_messages_with_memory() — media enrichment is additive, all memory context preserved alongside image_url blocks
- [Phase 02-agent-features]: AUDIO/VIDEO attachments text-referenced only in v1 — OpenAI image_url blocks support images only, not audio/video
- [Phase 02-agent-features]: Module-level imports in tasks.py for testability — patchable at orchestrator.tasks.*
- [Phase 02-agent-features]: Unified extras dict carries channel-specific metadata (Slack + WhatsApp) through entire pipeline
- [Phase 02-agent-features]: wa_id extracted from sender.user_id in handle_message after model_validate and injected into extras
- [Phase 03-operator-experience]: AuditEvent ORM attribute renamed from 'metadata' to 'event_metadata' — SQLAlchemy 2.0 DeclarativeBase reserves 'metadata'; mapped_column('metadata') preserves DB column name
- [Phase 03-operator-experience]: StripeClient(api_key=settings.stripe_secret_key) — new v14+ thread-safe API, not legacy stripe.api_key module-level approach
- [Phase 03-operator-experience]: Stripe webhook idempotency via StripeEvent INSERT + flush + IntegrityError catch — handles Stripe at-least-once delivery
- [Phase 03-operator-experience]: LLM key listing returns key_hint (last 4 chars only) — portal displays ...ABCD without decrypting Fernet ciphertext
- [Phase 03-operator-experience]: window.location.href for Stripe redirects (not router.push) — Stripe Checkout/Portal URLs are external domains
- [Phase 03-operator-experience]: use(searchParams) in client components for Next.js 15 — searchParams is a Promise, must be unwrapped with React.use()
- [Phase 03-operator-experience]: BillingStatus uses inline Tailwind color classes — existing Badge variants lack semantic blue/green/amber/red states needed for subscription status
- [Phase 03-operator-experience]: recharts installed with --force due to npm ENOTEMPTY race bug — was in package.json but not node_modules
- [Phase 03-operator-experience]: Usage nav links to /usage tenant picker (not hardcoded tenantId) — supports multi-tenant operators
- [Phase 03-operator-experience]: BudgetAlertBadge renders neutral 'No limit set' for null budget_limit_usd — prevents false alarms
- [Phase 03-operator-experience]: Agent goes live automatically (is_active true by default) — no separate Go Live button in onboarding wizard (per user decision)
- [Phase 03-operator-experience]: Test message step is REQUIRED in onboarding — no skip button (per user decision)
- [Phase 03-operator-experience]: Onboarding wizard step state in URL searchParams (step=1|2|3) — shareable and browser-refresh safe
- [Phase 03-operator-experience]: Portal git initialized as submodule with own .git repo — enables atomic per-task commits in packages/portal; parent repo tracks gitlink
- [Phase 03-operator-experience]: Agent goes live automatically after test message — is_active is true by default, no separate Go Live button (per user decision)
- [Phase 03-operator-experience]: Test message step is REQUIRED in onboarding — no skip button (per user decision)
- [Phase 03-operator-experience]: Onboarding wizard step state in URL searchParams (step=1|2|3) — shareable and browser-refresh safe
- [Phase 03-operator-experience]: Portal git initialized as submodule with own .git repo — enables atomic per-task commits in packages/portal; parent repo tracks gitlink
- [Phase 03-operator-experience]: window.location.href used for Stripe redirects (not router.push) — Stripe Checkout/Portal URLs are external domains
- [Phase 03-operator-experience]: use(searchParams) in billing page client component — Next.js 15 searchParams is a Promise, must be unwrapped with React.use()
- [Phase 03-operator-experience]: BillingStatus uses inline Tailwind color classes — existing Badge variants lack semantic blue/green/amber/red states needed for subscription status
- [Phase 03-operator-experience]: recharts installed with --force due to npm ENOTEMPTY race bug — was in package.json but not node_modules
- [Phase 03-operator-experience]: Usage nav links to /usage tenant picker (not hardcoded tenantId) — supports multi-tenant operators
- [Phase 03-operator-experience]: BudgetAlertBadge renders neutral 'No limit set' for null budget_limit_usd — prevents false alarms
- [Phase 03-operator-experience]: All Phase 3 portal routers (portal, billing, channels, llm_keys, usage, webhook) mounted directly on gateway FastAPI app
- [Phase 04-rbac]: Role stored as TEXT+CHECK (not sa.Enum) per Phase 1 ADR to avoid Alembic DDL conflicts
- [Phase 04-rbac]: SHA-256 hash of raw invite token stored in DB — token_to_hash enables O(1) lookup without exposing token
- [Phase 04-rbac]: platform_admin bypasses tenant membership check entirely (no DB query) for simpler, faster guard logic
- [Phase 04-rbac]: Celery invite email task dispatched via lazy local import in invitations.py to avoid shared->orchestrator circular dep
- [Phase 04-rbac]: base-ui DialogTrigger uses render prop not asChild — fixes TypeScript error in portal components
- [Phase 04-rbac]: base-ui Select onValueChange typed as (string | null) — filter state setters use ?? '' to coerce null
- [Phase 04-rbac]: Operator test-message endpoint uses require_tenant_member not require_tenant_admin — locked decision: operators can QA agent behavior without CRUD access
- [Phase 04-rbac]: Impersonation logs via raw SQL INSERT into audit_events — consistent with audit table immutability design (UPDATE/DELETE revoked at DB level)
Roadmap Evolution
- Phase 4 added: RBAC — 3-tier role-based access control (platform admin, customer admin, customer operator) with invitation flow
Pending Todos
None yet.
Blockers/Concerns
None — all phases complete.
Session Continuity
Last session: 2026-03-24T23:20:03.256Z Stopped at: Completed 04-rbac-03-PLAN.md — all tasks complete, human-verify checkpoint approved Resume file: None