--- phase: 02-agent-features plan: 01 subsystem: memory tags: [redis, pgvector, sentence-transformers, celery, memory, multi-tenancy] dependency_graph: requires: [01-foundation (all plans)] provides: [conversational-memory, semantic-recall, embedding-backfill] affects: [orchestrator/tasks.py, orchestrator/agents/builder.py, orchestrator/agents/runner.py] tech_stack: added: [pgvector>=0.4.2, sentence-transformers>=3.0.0 (all-MiniLM-L6-v2), fakeredis (tests)] patterns: [Redis RPUSH/LTRIM sliding window, pgvector HNSW cosine search, fire-and-forget Celery task, lazy singleton model loading] key_files: created: - packages/shared/shared/models/memory.py - packages/orchestrator/orchestrator/memory/__init__.py - packages/orchestrator/orchestrator/memory/short_term.py - packages/orchestrator/orchestrator/memory/long_term.py - packages/orchestrator/orchestrator/memory/embedder.py - migrations/versions/002_phase2_memory.py - tests/unit/test_memory_short_term.py - tests/integration/test_memory_long_term.py modified: - packages/shared/shared/redis_keys.py - packages/shared/pyproject.toml - packages/orchestrator/pyproject.toml - packages/orchestrator/orchestrator/agents/builder.py - packages/orchestrator/orchestrator/agents/runner.py - packages/orchestrator/orchestrator/tasks.py - docker-compose.yml decisions: - "pgvector/pgvector:pg16 Docker image used instead of postgres:16-alpine — postgres:16-alpine does not include the pgvector extension; switched to official pgvector image" - "Lazy singleton pattern for SentenceTransformer — model loaded once on first use, not at import time, to avoid 2s load penalty when module imported but not used" - "embed_and_store is a fire-and-forget Celery task (ignore_result=True) — embedding never blocks the LLM response path" - "All queries pre-filter by (tenant_id, agent_id, user_id) BEFORE ANN operator in pgvector — defense-in-depth with RLS as secondary backstop" - "No TTL on Redis memory keys — indefinite retention per plan spec; caller controls window size" - "pgvector context injected as system message BEFORE sliding window — provides LLM background context without polluting conversation flow" metrics: duration: 9m 22s completed_date: "2026-03-23" tasks_completed: 2 files_created: 8 files_modified: 7 tests_added: 16 tests_total: 202 --- # Phase 2 Plan 01: Two-Layer Conversational Memory Summary **One-liner:** Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill. ## What Was Built Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives: 1. **Short-term context**: Last 20 messages from Redis (zero-latency in-session history) 2. **Long-term context**: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall) The embedding backfill runs asynchronously — the LLM response is never blocked. ## Task Execution ### Task 1: DB models, migration, and memory modules with tests **Status:** Complete **TDD Cycle:** - RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure - GREEN: Implemented all modules → 16/16 tests pass **Files created:** - `packages/shared/shared/models/memory.py` — ConversationEmbedding ORM with Vector(384) column - `packages/orchestrator/orchestrator/memory/__init__.py` — Package init with architecture doc - `packages/orchestrator/orchestrator/memory/short_term.py` — RPUSH/LTRIM sliding window - `packages/orchestrator/orchestrator/memory/long_term.py` — pgvector HNSW cosine search - `migrations/versions/002_phase2_memory.py` — Alembic migration with HNSW index and RLS **Files modified:** - `packages/shared/shared/redis_keys.py` — Added memory_short_key, escalation_status_key, pending_tool_confirm_key - `packages/shared/pyproject.toml` — Added pgvector>=0.4.2 - `packages/orchestrator/pyproject.toml` — Added sentence-transformers>=3.0.0 **Commit:** `28a5ee9` ### Task 2: Wire memory into orchestrator pipeline **Status:** Complete **Files created:** - `packages/orchestrator/orchestrator/memory/embedder.py` — Lazy singleton SentenceTransformer with embed_text() / embed_texts() **Files modified:** - `packages/orchestrator/orchestrator/agents/builder.py` — Added build_messages_with_memory() - `packages/orchestrator/orchestrator/agents/runner.py` — Added optional messages parameter (backward compat) - `packages/orchestrator/orchestrator/tasks.py` — Wired memory pipeline + new embed_and_store Celery task **Commit:** `45b9573` ## Memory Pipeline Flow ``` handle_message (Celery task) │ ├── BEFORE LLM CALL: │ 1. get_recent_messages(redis) → last 20 turns │ 2. embed_text(user_text) → 384-dim query vector │ 3. retrieve_relevant(session, ...) → top-3 past exchanges (cosine >= 0.75) │ 4. build_messages_with_memory(...) → enriched messages array │ ├── run_agent(msg, agent, messages=enriched_messages) │ └── POST /complete to llm-pool │ └── AFTER LLM RESPONSE: 5. append_message(redis, user_turn) → update sliding window 6. append_message(redis, asst_turn) → update sliding window 7. embed_and_store.delay(...) → fire-and-forget pgvector backfill ``` ## Security Properties - **Cross-tenant isolation**: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop - **Cross-agent isolation**: agent_id pre-filter ensures agent A cannot recall agent B's memory - **Cross-user isolation**: user_id pre-filter ensures user A cannot see user B's memory - **Redis isolation**: memory_short_key format `{tenant_id}:memory:short:{agent_id}:{user_id}` ensures namespace separation ## Deviations from Plan ### Auto-fixed Issues **1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension** - **Found during:** Task 1 integration test run - **Issue:** `postgres:16-alpine` does not include the pgvector extension. Migration `002_phase2_memory.py` fails with: `extension "vector" is not available — Could not open extension control file` - **Fix:** Changed `docker-compose.yml` postgres image from `postgres:16-alpine` to `pgvector/pgvector:pg16`. Restarted container. Migration ran cleanly. - **Files modified:** `docker-compose.yml` - **Commit:** `28a5ee9` (included in Task 1 commit) ## Self-Check: PASSED All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.