- SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW) - STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged - ROADMAP.md: Phase 2 plan progress updated (2 summaries complete) - REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete
6.7 KiB
6.7 KiB
phase, plan, subsystem, tags, dependency_graph, tech_stack, key_files, decisions, metrics
| phase | plan | subsystem | tags | dependency_graph | tech_stack | key_files | decisions | metrics | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 02-agent-features | 01 | memory |
|
|
|
|
|
|
Phase 2 Plan 01: Two-Layer Conversational Memory Summary
One-liner: Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill.
What Was Built
Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives:
- Short-term context: Last 20 messages from Redis (zero-latency in-session history)
- Long-term context: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall)
The embedding backfill runs asynchronously — the LLM response is never blocked.
Task Execution
Task 1: DB models, migration, and memory modules with tests
Status: Complete
TDD Cycle:
- RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure
- GREEN: Implemented all modules → 16/16 tests pass
Files created:
packages/shared/shared/models/memory.py— ConversationEmbedding ORM with Vector(384) columnpackages/orchestrator/orchestrator/memory/__init__.py— Package init with architecture docpackages/orchestrator/orchestrator/memory/short_term.py— RPUSH/LTRIM sliding windowpackages/orchestrator/orchestrator/memory/long_term.py— pgvector HNSW cosine searchmigrations/versions/002_phase2_memory.py— Alembic migration with HNSW index and RLS
Files modified:
packages/shared/shared/redis_keys.py— Added memory_short_key, escalation_status_key, pending_tool_confirm_keypackages/shared/pyproject.toml— Added pgvector>=0.4.2packages/orchestrator/pyproject.toml— Added sentence-transformers>=3.0.0
Commit: 28a5ee9
Task 2: Wire memory into orchestrator pipeline
Status: Complete
Files created:
packages/orchestrator/orchestrator/memory/embedder.py— Lazy singleton SentenceTransformer with embed_text() / embed_texts()
Files modified:
packages/orchestrator/orchestrator/agents/builder.py— Added build_messages_with_memory()packages/orchestrator/orchestrator/agents/runner.py— Added optional messages parameter (backward compat)packages/orchestrator/orchestrator/tasks.py— Wired memory pipeline + new embed_and_store Celery task
Commit: 45b9573
Memory Pipeline Flow
handle_message (Celery task)
│
├── BEFORE LLM CALL:
│ 1. get_recent_messages(redis) → last 20 turns
│ 2. embed_text(user_text) → 384-dim query vector
│ 3. retrieve_relevant(session, ...) → top-3 past exchanges (cosine >= 0.75)
│ 4. build_messages_with_memory(...) → enriched messages array
│
├── run_agent(msg, agent, messages=enriched_messages)
│ └── POST /complete to llm-pool
│
└── AFTER LLM RESPONSE:
5. append_message(redis, user_turn) → update sliding window
6. append_message(redis, asst_turn) → update sliding window
7. embed_and_store.delay(...) → fire-and-forget pgvector backfill
Security Properties
- Cross-tenant isolation: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop
- Cross-agent isolation: agent_id pre-filter ensures agent A cannot recall agent B's memory
- Cross-user isolation: user_id pre-filter ensures user A cannot see user B's memory
- Redis isolation: memory_short_key format
{tenant_id}:memory:short:{agent_id}:{user_id}ensures namespace separation
Deviations from Plan
Auto-fixed Issues
1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension
- Found during: Task 1 integration test run
- Issue:
postgres:16-alpinedoes not include the pgvector extension. Migration002_phase2_memory.pyfails with:extension "vector" is not available — Could not open extension control file - Fix: Changed
docker-compose.ymlpostgres image frompostgres:16-alpinetopgvector/pgvector:pg16. Restarted container. Migration ran cleanly. - Files modified:
docker-compose.yml - Commit:
28a5ee9(included in Task 1 commit)
Self-Check: PASSED
All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.