Files

Adolfo Delorenzo e879d27e55 docs(02-01): complete two-layer memory plan — SUMMARY, STATE, ROADMAP, REQUIREMENTS

- SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW)
- STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged
- ROADMAP.md: Phase 2 plan progress updated (2 summaries complete)
- REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete

2026-03-23 14:47:06 -06:00

6.7 KiB

Raw Permalink Blame History

phase, plan, subsystem, tags, dependency_graph, tech_stack, key_files, decisions, metrics

phase

plan

subsystem

tags

dependency_graph

tech_stack

key_files

decisions

metrics

02-agent-features

memory

redis

pgvector

sentence-transformers

celery

memory

multi-tenancy

requires

provides

affects

01-foundation (all plans)

conversational-memory

semantic-recall

embedding-backfill

orchestrator/tasks.py

orchestrator/agents/builder.py

orchestrator/agents/runner.py

added

patterns

pgvector>=0.4.2

sentence-transformers>=3.0.0 (all-MiniLM-L6-v2)

fakeredis (tests)

Redis RPUSH/LTRIM sliding window

pgvector HNSW cosine search

fire-and-forget Celery task

lazy singleton model loading

created

modified

packages/shared/shared/models/memory.py

packages/orchestrator/orchestrator/memory/__init__.py

packages/orchestrator/orchestrator/memory/short_term.py

packages/orchestrator/orchestrator/memory/long_term.py

packages/orchestrator/orchestrator/memory/embedder.py

migrations/versions/002_phase2_memory.py

tests/unit/test_memory_short_term.py

tests/integration/test_memory_long_term.py

packages/shared/shared/redis_keys.py

packages/shared/pyproject.toml

packages/orchestrator/pyproject.toml

packages/orchestrator/orchestrator/agents/builder.py

packages/orchestrator/orchestrator/agents/runner.py

packages/orchestrator/orchestrator/tasks.py

docker-compose.yml

pgvector/pgvector:pg16 Docker image used instead of postgres:16-alpine — postgres:16-alpine does not include the pgvector extension; switched to official pgvector image

Lazy singleton pattern for SentenceTransformer — model loaded once on first use, not at import time, to avoid 2s load penalty when module imported but not used

embed_and_store is a fire-and-forget Celery task (ignore_result=True) — embedding never blocks the LLM response path

All queries pre-filter by (tenant_id, agent_id, user_id) BEFORE ANN operator in pgvector — defense-in-depth with RLS as secondary backstop

No TTL on Redis memory keys — indefinite retention per plan spec; caller controls window size

pgvector context injected as system message BEFORE sliding window — provides LLM background context without polluting conversation flow

duration	completed_date	tasks_completed	files_created	files_modified	tests_added	tests_total
9m 22s	2026-03-23	2	8	7	16	202

Phase 2 Plan 01: Two-Layer Conversational Memory Summary

One-liner: Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill.

What Was Built

Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives:

Short-term context: Last 20 messages from Redis (zero-latency in-session history)
Long-term context: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall)

The embedding backfill runs asynchronously — the LLM response is never blocked.

Task Execution

Task 1: DB models, migration, and memory modules with tests

Status: Complete

TDD Cycle:

RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure
GREEN: Implemented all modules → 16/16 tests pass

Files created:

packages/shared/shared/models/memory.py — ConversationEmbedding ORM with Vector(384) column
packages/orchestrator/orchestrator/memory/__init__.py — Package init with architecture doc
packages/orchestrator/orchestrator/memory/short_term.py — RPUSH/LTRIM sliding window
packages/orchestrator/orchestrator/memory/long_term.py — pgvector HNSW cosine search
migrations/versions/002_phase2_memory.py — Alembic migration with HNSW index and RLS

Files modified:

packages/shared/shared/redis_keys.py — Added memory_short_key, escalation_status_key, pending_tool_confirm_key
packages/shared/pyproject.toml — Added pgvector>=0.4.2
packages/orchestrator/pyproject.toml — Added sentence-transformers>=3.0.0

Commit: 28a5ee9

Task 2: Wire memory into orchestrator pipeline

Status: Complete

Files created:

packages/orchestrator/orchestrator/memory/embedder.py — Lazy singleton SentenceTransformer with embed_text() / embed_texts()

Files modified:

packages/orchestrator/orchestrator/agents/builder.py — Added build_messages_with_memory()
packages/orchestrator/orchestrator/agents/runner.py — Added optional messages parameter (backward compat)
packages/orchestrator/orchestrator/tasks.py — Wired memory pipeline + new embed_and_store Celery task

Commit: 45b9573

Memory Pipeline Flow

handle_message (Celery task)
    │
    ├── BEFORE LLM CALL:
    │     1. get_recent_messages(redis)         → last 20 turns
    │     2. embed_text(user_text)              → 384-dim query vector
    │     3. retrieve_relevant(session, ...)    → top-3 past exchanges (cosine >= 0.75)
    │     4. build_messages_with_memory(...)    → enriched messages array
    │
    ├── run_agent(msg, agent, messages=enriched_messages)
    │     └── POST /complete to llm-pool
    │
    └── AFTER LLM RESPONSE:
          5. append_message(redis, user_turn)   → update sliding window
          6. append_message(redis, asst_turn)   → update sliding window
          7. embed_and_store.delay(...)          → fire-and-forget pgvector backfill

Security Properties

Cross-tenant isolation: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop
Cross-agent isolation: agent_id pre-filter ensures agent A cannot recall agent B's memory
Cross-user isolation: user_id pre-filter ensures user A cannot see user B's memory
Redis isolation: memory_short_key format {tenant_id}:memory:short:{agent_id}:{user_id} ensures namespace separation

Deviations from Plan

Auto-fixed Issues

1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension

Found during: Task 1 integration test run
Issue: postgres:16-alpine does not include the pgvector extension. Migration 002_phase2_memory.py fails with: extension "vector" is not available — Could not open extension control file
Fix: Changed docker-compose.yml postgres image from postgres:16-alpine to pgvector/pgvector:pg16. Restarted container. Migration ran cleanly.
Files modified: docker-compose.yml
Commit: 28a5ee9 (included in Task 1 commit)

Self-Check: PASSED

All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.

6.7 KiB Raw Permalink Blame History