Files
Adolfo Delorenzo e879d27e55 docs(02-01): complete two-layer memory plan — SUMMARY, STATE, ROADMAP, REQUIREMENTS
- SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW)
- STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged
- ROADMAP.md: Phase 2 plan progress updated (2 summaries complete)
- REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete
2026-03-23 14:47:06 -06:00

6.7 KiB

phase, plan, subsystem, tags, dependency_graph, tech_stack, key_files, decisions, metrics
phase plan subsystem tags dependency_graph tech_stack key_files decisions metrics
02-agent-features 01 memory
redis
pgvector
sentence-transformers
celery
memory
multi-tenancy
requires provides affects
01-foundation (all plans)
conversational-memory
semantic-recall
embedding-backfill
orchestrator/tasks.py
orchestrator/agents/builder.py
orchestrator/agents/runner.py
added patterns
pgvector>=0.4.2
sentence-transformers>=3.0.0 (all-MiniLM-L6-v2)
fakeredis (tests)
Redis RPUSH/LTRIM sliding window
pgvector HNSW cosine search
fire-and-forget Celery task
lazy singleton model loading
created modified
packages/shared/shared/models/memory.py
packages/orchestrator/orchestrator/memory/__init__.py
packages/orchestrator/orchestrator/memory/short_term.py
packages/orchestrator/orchestrator/memory/long_term.py
packages/orchestrator/orchestrator/memory/embedder.py
migrations/versions/002_phase2_memory.py
tests/unit/test_memory_short_term.py
tests/integration/test_memory_long_term.py
packages/shared/shared/redis_keys.py
packages/shared/pyproject.toml
packages/orchestrator/pyproject.toml
packages/orchestrator/orchestrator/agents/builder.py
packages/orchestrator/orchestrator/agents/runner.py
packages/orchestrator/orchestrator/tasks.py
docker-compose.yml
pgvector/pgvector:pg16 Docker image used instead of postgres:16-alpine — postgres:16-alpine does not include the pgvector extension; switched to official pgvector image
Lazy singleton pattern for SentenceTransformer — model loaded once on first use, not at import time, to avoid 2s load penalty when module imported but not used
embed_and_store is a fire-and-forget Celery task (ignore_result=True) — embedding never blocks the LLM response path
All queries pre-filter by (tenant_id, agent_id, user_id) BEFORE ANN operator in pgvector — defense-in-depth with RLS as secondary backstop
No TTL on Redis memory keys — indefinite retention per plan spec; caller controls window size
pgvector context injected as system message BEFORE sliding window — provides LLM background context without polluting conversation flow
duration completed_date tasks_completed files_created files_modified tests_added tests_total
9m 22s 2026-03-23 2 8 7 16 202

Phase 2 Plan 01: Two-Layer Conversational Memory Summary

One-liner: Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill.

What Was Built

Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives:

  1. Short-term context: Last 20 messages from Redis (zero-latency in-session history)
  2. Long-term context: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall)

The embedding backfill runs asynchronously — the LLM response is never blocked.

Task Execution

Task 1: DB models, migration, and memory modules with tests

Status: Complete

TDD Cycle:

  • RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure
  • GREEN: Implemented all modules → 16/16 tests pass

Files created:

  • packages/shared/shared/models/memory.py — ConversationEmbedding ORM with Vector(384) column
  • packages/orchestrator/orchestrator/memory/__init__.py — Package init with architecture doc
  • packages/orchestrator/orchestrator/memory/short_term.py — RPUSH/LTRIM sliding window
  • packages/orchestrator/orchestrator/memory/long_term.py — pgvector HNSW cosine search
  • migrations/versions/002_phase2_memory.py — Alembic migration with HNSW index and RLS

Files modified:

  • packages/shared/shared/redis_keys.py — Added memory_short_key, escalation_status_key, pending_tool_confirm_key
  • packages/shared/pyproject.toml — Added pgvector>=0.4.2
  • packages/orchestrator/pyproject.toml — Added sentence-transformers>=3.0.0

Commit: 28a5ee9

Task 2: Wire memory into orchestrator pipeline

Status: Complete

Files created:

  • packages/orchestrator/orchestrator/memory/embedder.py — Lazy singleton SentenceTransformer with embed_text() / embed_texts()

Files modified:

  • packages/orchestrator/orchestrator/agents/builder.py — Added build_messages_with_memory()
  • packages/orchestrator/orchestrator/agents/runner.py — Added optional messages parameter (backward compat)
  • packages/orchestrator/orchestrator/tasks.py — Wired memory pipeline + new embed_and_store Celery task

Commit: 45b9573

Memory Pipeline Flow

handle_message (Celery task)
    │
    ├── BEFORE LLM CALL:
    │     1. get_recent_messages(redis)         → last 20 turns
    │     2. embed_text(user_text)              → 384-dim query vector
    │     3. retrieve_relevant(session, ...)    → top-3 past exchanges (cosine >= 0.75)
    │     4. build_messages_with_memory(...)    → enriched messages array
    │
    ├── run_agent(msg, agent, messages=enriched_messages)
    │     └── POST /complete to llm-pool
    │
    └── AFTER LLM RESPONSE:
          5. append_message(redis, user_turn)   → update sliding window
          6. append_message(redis, asst_turn)   → update sliding window
          7. embed_and_store.delay(...)          → fire-and-forget pgvector backfill

Security Properties

  • Cross-tenant isolation: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop
  • Cross-agent isolation: agent_id pre-filter ensures agent A cannot recall agent B's memory
  • Cross-user isolation: user_id pre-filter ensures user A cannot see user B's memory
  • Redis isolation: memory_short_key format {tenant_id}:memory:short:{agent_id}:{user_id} ensures namespace separation

Deviations from Plan

Auto-fixed Issues

1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension

  • Found during: Task 1 integration test run
  • Issue: postgres:16-alpine does not include the pgvector extension. Migration 002_phase2_memory.py fails with: extension "vector" is not available — Could not open extension control file
  • Fix: Changed docker-compose.yml postgres image from postgres:16-alpine to pgvector/pgvector:pg16. Restarted container. Migration ran cleanly.
  • Files modified: docker-compose.yml
  • Commit: 28a5ee9 (included in Task 1 commit)

Self-Check: PASSED

All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.