From e879d27e552a4d95889a5b020933fe226c9a99c2 Mon Sep 17 00:00:00 2001 From: Adolfo Delorenzo Date: Mon, 23 Mar 2026 14:47:06 -0600 Subject: [PATCH] =?UTF-8?q?docs(02-01):=20complete=20two-layer=20memory=20?= =?UTF-8?q?plan=20=E2=80=94=20SUMMARY,=20STATE,=20ROADMAP,=20REQUIREMENTS?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW) - STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged - ROADMAP.md: Phase 2 plan progress updated (2 summaries complete) - REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete --- .planning/REQUIREMENTS.md | 8 +- .planning/ROADMAP.md | 2 +- .planning/STATE.md | 14 +- .../phases/02-agent-features/02-01-SUMMARY.md | 140 ++++++++++++++++++ 4 files changed, 154 insertions(+), 10 deletions(-) create mode 100644 .planning/phases/02-agent-features/02-01-SUMMARY.md diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md index e6119c5..df15599 100644 --- a/.planning/REQUIREMENTS.md +++ b/.planning/REQUIREMENTS.md @@ -18,8 +18,8 @@ Requirements for beta-ready release. Each maps to roadmap phases. ### Agent Core - [x] **AGNT-01**: Tenant can configure a single AI employee with custom name, role, and persona -- [ ] **AGNT-02**: Agent maintains conversational memory within sessions (sliding window) -- [ ] **AGNT-03**: Agent retrieves relevant past context via vector search (pgvector long-term memory) +- [x] **AGNT-02**: Agent maintains conversational memory within sessions (sliding window) +- [x] **AGNT-03**: Agent retrieves relevant past context via vector search (pgvector long-term memory) - [ ] **AGNT-04**: Agent can invoke registered tools to perform actions (tool registry + execution) - [ ] **AGNT-05**: Agent escalates to human when configured rules trigger, transferring full conversation context - [ ] **AGNT-06**: Every agent action (LLM call, tool invocation, handoff) is logged in an audit trail @@ -101,8 +101,8 @@ Which phases cover which requirements. Updated during roadmap creation. | CHAN-04 | Phase 2 | Complete | | CHAN-05 | Phase 1 | Complete | | AGNT-01 | Phase 1 | Complete | -| AGNT-02 | Phase 2 | Pending | -| AGNT-03 | Phase 2 | Pending | +| AGNT-02 | Phase 2 | Complete | +| AGNT-03 | Phase 2 | Complete | | AGNT-04 | Phase 2 | Pending | | AGNT-05 | Phase 2 | Pending | | AGNT-06 | Phase 2 | Pending | diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md index 01efabe..4b6d060 100644 --- a/.planning/ROADMAP.md +++ b/.planning/ROADMAP.md @@ -79,7 +79,7 @@ Phases execute in numeric order: 1 → 2 → 3 | Phase | Plans Complete | Status | Completed | |-------|----------------|--------|-----------| | 1. Foundation | 4/4 | Complete | 2026-03-23 | -| 2. Agent Features | 1/5 | In Progress| | +| 2. Agent Features | 2/5 | In Progress| | | 3. Operator Experience | 0/2 | Not started | - | --- diff --git a/.planning/STATE.md b/.planning/STATE.md index 276c547..3150723 100644 --- a/.planning/STATE.md +++ b/.planning/STATE.md @@ -3,14 +3,14 @@ gsd_state_version: 1.0 milestone: v1.0 milestone_name: milestone status: planning -stopped_at: Completed 02-agent-features/02-03-PLAN.md -last_updated: "2026-03-23T20:44:35.519Z" +stopped_at: Completed 02-agent-features/02-01-PLAN.md +last_updated: "2026-03-23T20:46:53.813Z" last_activity: 2026-03-23 — Roadmap created, ready for Phase 1 planning progress: total_phases: 3 completed_phases: 1 total_plans: 9 - completed_plans: 5 + completed_plans: 6 percent: 0 --- @@ -55,6 +55,7 @@ Progress: [░░░░░░░░░░] 0% | Phase 01-foundation P04 | 19 | 2 tasks | 25 files | | Phase 01-foundation P03 | 9 | 2 tasks | 20 files | | Phase 02-agent-features P03 | 7 | 2 tasks | 7 files | +| Phase 02-agent-features P02-01 | 9m 22s | 2 tasks | 15 files | ## Accumulated Context @@ -83,6 +84,9 @@ Recent decisions affecting current work: - [Phase 02-agent-features]: meta-media://{media_id} placeholder URL at normalization time; actual download in adapter after tenant resolution - [Phase 02-agent-features]: WhatsApp thread_id = sender wa_id (WhatsApp has no threading; conversation scope is per phone number) - [Phase 02-agent-features]: Always return HTTP 200 to Meta webhooks regardless of processing errors to prevent retry storms +- [Phase 02-agent-features]: pgvector/pgvector:pg16 Docker image required for pgvector extension — postgres:16-alpine does not include vector extension control file +- [Phase 02-agent-features]: SentenceTransformer loaded as lazy singleton — model loaded once on first use to avoid per-call 2s overhead; 384d all-MiniLM-L6-v2 matches vector(384) column +- [Phase 02-agent-features]: embed_and_store Celery task is fire-and-forget (ignore_result=True) — embedding backfill never blocks LLM response path ### Pending Todos @@ -94,6 +98,6 @@ None yet. ## Session Continuity -Last session: 2026-03-23T20:44:35.516Z -Stopped at: Completed 02-agent-features/02-03-PLAN.md +Last session: 2026-03-23T20:46:53.810Z +Stopped at: Completed 02-agent-features/02-01-PLAN.md Resume file: None diff --git a/.planning/phases/02-agent-features/02-01-SUMMARY.md b/.planning/phases/02-agent-features/02-01-SUMMARY.md new file mode 100644 index 0000000..7f30049 --- /dev/null +++ b/.planning/phases/02-agent-features/02-01-SUMMARY.md @@ -0,0 +1,140 @@ +--- +phase: 02-agent-features +plan: 01 +subsystem: memory +tags: [redis, pgvector, sentence-transformers, celery, memory, multi-tenancy] +dependency_graph: + requires: [01-foundation (all plans)] + provides: [conversational-memory, semantic-recall, embedding-backfill] + affects: [orchestrator/tasks.py, orchestrator/agents/builder.py, orchestrator/agents/runner.py] +tech_stack: + added: [pgvector>=0.4.2, sentence-transformers>=3.0.0 (all-MiniLM-L6-v2), fakeredis (tests)] + patterns: [Redis RPUSH/LTRIM sliding window, pgvector HNSW cosine search, fire-and-forget Celery task, lazy singleton model loading] +key_files: + created: + - packages/shared/shared/models/memory.py + - packages/orchestrator/orchestrator/memory/__init__.py + - packages/orchestrator/orchestrator/memory/short_term.py + - packages/orchestrator/orchestrator/memory/long_term.py + - packages/orchestrator/orchestrator/memory/embedder.py + - migrations/versions/002_phase2_memory.py + - tests/unit/test_memory_short_term.py + - tests/integration/test_memory_long_term.py + modified: + - packages/shared/shared/redis_keys.py + - packages/shared/pyproject.toml + - packages/orchestrator/pyproject.toml + - packages/orchestrator/orchestrator/agents/builder.py + - packages/orchestrator/orchestrator/agents/runner.py + - packages/orchestrator/orchestrator/tasks.py + - docker-compose.yml +decisions: + - "pgvector/pgvector:pg16 Docker image used instead of postgres:16-alpine — postgres:16-alpine does not include the pgvector extension; switched to official pgvector image" + - "Lazy singleton pattern for SentenceTransformer — model loaded once on first use, not at import time, to avoid 2s load penalty when module imported but not used" + - "embed_and_store is a fire-and-forget Celery task (ignore_result=True) — embedding never blocks the LLM response path" + - "All queries pre-filter by (tenant_id, agent_id, user_id) BEFORE ANN operator in pgvector — defense-in-depth with RLS as secondary backstop" + - "No TTL on Redis memory keys — indefinite retention per plan spec; caller controls window size" + - "pgvector context injected as system message BEFORE sliding window — provides LLM background context without polluting conversation flow" +metrics: + duration: 9m 22s + completed_date: "2026-03-23" + tasks_completed: 2 + files_created: 8 + files_modified: 7 + tests_added: 16 + tests_total: 202 +--- + +# Phase 2 Plan 01: Two-Layer Conversational Memory Summary + +**One-liner:** Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill. + +## What Was Built + +Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives: + +1. **Short-term context**: Last 20 messages from Redis (zero-latency in-session history) +2. **Long-term context**: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall) + +The embedding backfill runs asynchronously — the LLM response is never blocked. + +## Task Execution + +### Task 1: DB models, migration, and memory modules with tests + +**Status:** Complete + +**TDD Cycle:** +- RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure +- GREEN: Implemented all modules → 16/16 tests pass + +**Files created:** +- `packages/shared/shared/models/memory.py` — ConversationEmbedding ORM with Vector(384) column +- `packages/orchestrator/orchestrator/memory/__init__.py` — Package init with architecture doc +- `packages/orchestrator/orchestrator/memory/short_term.py` — RPUSH/LTRIM sliding window +- `packages/orchestrator/orchestrator/memory/long_term.py` — pgvector HNSW cosine search +- `migrations/versions/002_phase2_memory.py` — Alembic migration with HNSW index and RLS + +**Files modified:** +- `packages/shared/shared/redis_keys.py` — Added memory_short_key, escalation_status_key, pending_tool_confirm_key +- `packages/shared/pyproject.toml` — Added pgvector>=0.4.2 +- `packages/orchestrator/pyproject.toml` — Added sentence-transformers>=3.0.0 + +**Commit:** `28a5ee9` + +### Task 2: Wire memory into orchestrator pipeline + +**Status:** Complete + +**Files created:** +- `packages/orchestrator/orchestrator/memory/embedder.py` — Lazy singleton SentenceTransformer with embed_text() / embed_texts() + +**Files modified:** +- `packages/orchestrator/orchestrator/agents/builder.py` — Added build_messages_with_memory() +- `packages/orchestrator/orchestrator/agents/runner.py` — Added optional messages parameter (backward compat) +- `packages/orchestrator/orchestrator/tasks.py` — Wired memory pipeline + new embed_and_store Celery task + +**Commit:** `45b9573` + +## Memory Pipeline Flow + +``` +handle_message (Celery task) + │ + ├── BEFORE LLM CALL: + │ 1. get_recent_messages(redis) → last 20 turns + │ 2. embed_text(user_text) → 384-dim query vector + │ 3. retrieve_relevant(session, ...) → top-3 past exchanges (cosine >= 0.75) + │ 4. build_messages_with_memory(...) → enriched messages array + │ + ├── run_agent(msg, agent, messages=enriched_messages) + │ └── POST /complete to llm-pool + │ + └── AFTER LLM RESPONSE: + 5. append_message(redis, user_turn) → update sliding window + 6. append_message(redis, asst_turn) → update sliding window + 7. embed_and_store.delay(...) → fire-and-forget pgvector backfill +``` + +## Security Properties + +- **Cross-tenant isolation**: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop +- **Cross-agent isolation**: agent_id pre-filter ensures agent A cannot recall agent B's memory +- **Cross-user isolation**: user_id pre-filter ensures user A cannot see user B's memory +- **Redis isolation**: memory_short_key format `{tenant_id}:memory:short:{agent_id}:{user_id}` ensures namespace separation + +## Deviations from Plan + +### Auto-fixed Issues + +**1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension** + +- **Found during:** Task 1 integration test run +- **Issue:** `postgres:16-alpine` does not include the pgvector extension. Migration `002_phase2_memory.py` fails with: `extension "vector" is not available — Could not open extension control file` +- **Fix:** Changed `docker-compose.yml` postgres image from `postgres:16-alpine` to `pgvector/pgvector:pg16`. Restarted container. Migration ran cleanly. +- **Files modified:** `docker-compose.yml` +- **Commit:** `28a5ee9` (included in Task 1 commit) + +## Self-Check: PASSED + +All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.