docs(02-01): complete two-layer memory plan — SUMMARY, STATE, ROADMAP, REQUIREMENTS
- SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW) - STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged - ROADMAP.md: Phase 2 plan progress updated (2 summaries complete) - REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete
This commit is contained in:
@@ -18,8 +18,8 @@ Requirements for beta-ready release. Each maps to roadmap phases.
|
|||||||
### Agent Core
|
### Agent Core
|
||||||
|
|
||||||
- [x] **AGNT-01**: Tenant can configure a single AI employee with custom name, role, and persona
|
- [x] **AGNT-01**: Tenant can configure a single AI employee with custom name, role, and persona
|
||||||
- [ ] **AGNT-02**: Agent maintains conversational memory within sessions (sliding window)
|
- [x] **AGNT-02**: Agent maintains conversational memory within sessions (sliding window)
|
||||||
- [ ] **AGNT-03**: Agent retrieves relevant past context via vector search (pgvector long-term memory)
|
- [x] **AGNT-03**: Agent retrieves relevant past context via vector search (pgvector long-term memory)
|
||||||
- [ ] **AGNT-04**: Agent can invoke registered tools to perform actions (tool registry + execution)
|
- [ ] **AGNT-04**: Agent can invoke registered tools to perform actions (tool registry + execution)
|
||||||
- [ ] **AGNT-05**: Agent escalates to human when configured rules trigger, transferring full conversation context
|
- [ ] **AGNT-05**: Agent escalates to human when configured rules trigger, transferring full conversation context
|
||||||
- [ ] **AGNT-06**: Every agent action (LLM call, tool invocation, handoff) is logged in an audit trail
|
- [ ] **AGNT-06**: Every agent action (LLM call, tool invocation, handoff) is logged in an audit trail
|
||||||
@@ -101,8 +101,8 @@ Which phases cover which requirements. Updated during roadmap creation.
|
|||||||
| CHAN-04 | Phase 2 | Complete |
|
| CHAN-04 | Phase 2 | Complete |
|
||||||
| CHAN-05 | Phase 1 | Complete |
|
| CHAN-05 | Phase 1 | Complete |
|
||||||
| AGNT-01 | Phase 1 | Complete |
|
| AGNT-01 | Phase 1 | Complete |
|
||||||
| AGNT-02 | Phase 2 | Pending |
|
| AGNT-02 | Phase 2 | Complete |
|
||||||
| AGNT-03 | Phase 2 | Pending |
|
| AGNT-03 | Phase 2 | Complete |
|
||||||
| AGNT-04 | Phase 2 | Pending |
|
| AGNT-04 | Phase 2 | Pending |
|
||||||
| AGNT-05 | Phase 2 | Pending |
|
| AGNT-05 | Phase 2 | Pending |
|
||||||
| AGNT-06 | Phase 2 | Pending |
|
| AGNT-06 | Phase 2 | Pending |
|
||||||
|
|||||||
@@ -79,7 +79,7 @@ Phases execute in numeric order: 1 → 2 → 3
|
|||||||
| Phase | Plans Complete | Status | Completed |
|
| Phase | Plans Complete | Status | Completed |
|
||||||
|-------|----------------|--------|-----------|
|
|-------|----------------|--------|-----------|
|
||||||
| 1. Foundation | 4/4 | Complete | 2026-03-23 |
|
| 1. Foundation | 4/4 | Complete | 2026-03-23 |
|
||||||
| 2. Agent Features | 1/5 | In Progress| |
|
| 2. Agent Features | 2/5 | In Progress| |
|
||||||
| 3. Operator Experience | 0/2 | Not started | - |
|
| 3. Operator Experience | 0/2 | Not started | - |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|||||||
@@ -3,14 +3,14 @@ gsd_state_version: 1.0
|
|||||||
milestone: v1.0
|
milestone: v1.0
|
||||||
milestone_name: milestone
|
milestone_name: milestone
|
||||||
status: planning
|
status: planning
|
||||||
stopped_at: Completed 02-agent-features/02-03-PLAN.md
|
stopped_at: Completed 02-agent-features/02-01-PLAN.md
|
||||||
last_updated: "2026-03-23T20:44:35.519Z"
|
last_updated: "2026-03-23T20:46:53.813Z"
|
||||||
last_activity: 2026-03-23 — Roadmap created, ready for Phase 1 planning
|
last_activity: 2026-03-23 — Roadmap created, ready for Phase 1 planning
|
||||||
progress:
|
progress:
|
||||||
total_phases: 3
|
total_phases: 3
|
||||||
completed_phases: 1
|
completed_phases: 1
|
||||||
total_plans: 9
|
total_plans: 9
|
||||||
completed_plans: 5
|
completed_plans: 6
|
||||||
percent: 0
|
percent: 0
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -55,6 +55,7 @@ Progress: [░░░░░░░░░░] 0%
|
|||||||
| Phase 01-foundation P04 | 19 | 2 tasks | 25 files |
|
| Phase 01-foundation P04 | 19 | 2 tasks | 25 files |
|
||||||
| Phase 01-foundation P03 | 9 | 2 tasks | 20 files |
|
| Phase 01-foundation P03 | 9 | 2 tasks | 20 files |
|
||||||
| Phase 02-agent-features P03 | 7 | 2 tasks | 7 files |
|
| Phase 02-agent-features P03 | 7 | 2 tasks | 7 files |
|
||||||
|
| Phase 02-agent-features P02-01 | 9m 22s | 2 tasks | 15 files |
|
||||||
|
|
||||||
## Accumulated Context
|
## Accumulated Context
|
||||||
|
|
||||||
@@ -83,6 +84,9 @@ Recent decisions affecting current work:
|
|||||||
- [Phase 02-agent-features]: meta-media://{media_id} placeholder URL at normalization time; actual download in adapter after tenant resolution
|
- [Phase 02-agent-features]: meta-media://{media_id} placeholder URL at normalization time; actual download in adapter after tenant resolution
|
||||||
- [Phase 02-agent-features]: WhatsApp thread_id = sender wa_id (WhatsApp has no threading; conversation scope is per phone number)
|
- [Phase 02-agent-features]: WhatsApp thread_id = sender wa_id (WhatsApp has no threading; conversation scope is per phone number)
|
||||||
- [Phase 02-agent-features]: Always return HTTP 200 to Meta webhooks regardless of processing errors to prevent retry storms
|
- [Phase 02-agent-features]: Always return HTTP 200 to Meta webhooks regardless of processing errors to prevent retry storms
|
||||||
|
- [Phase 02-agent-features]: pgvector/pgvector:pg16 Docker image required for pgvector extension — postgres:16-alpine does not include vector extension control file
|
||||||
|
- [Phase 02-agent-features]: SentenceTransformer loaded as lazy singleton — model loaded once on first use to avoid per-call 2s overhead; 384d all-MiniLM-L6-v2 matches vector(384) column
|
||||||
|
- [Phase 02-agent-features]: embed_and_store Celery task is fire-and-forget (ignore_result=True) — embedding backfill never blocks LLM response path
|
||||||
|
|
||||||
### Pending Todos
|
### Pending Todos
|
||||||
|
|
||||||
@@ -94,6 +98,6 @@ None yet.
|
|||||||
|
|
||||||
## Session Continuity
|
## Session Continuity
|
||||||
|
|
||||||
Last session: 2026-03-23T20:44:35.516Z
|
Last session: 2026-03-23T20:46:53.810Z
|
||||||
Stopped at: Completed 02-agent-features/02-03-PLAN.md
|
Stopped at: Completed 02-agent-features/02-01-PLAN.md
|
||||||
Resume file: None
|
Resume file: None
|
||||||
|
|||||||
140
.planning/phases/02-agent-features/02-01-SUMMARY.md
Normal file
140
.planning/phases/02-agent-features/02-01-SUMMARY.md
Normal file
@@ -0,0 +1,140 @@
|
|||||||
|
---
|
||||||
|
phase: 02-agent-features
|
||||||
|
plan: 01
|
||||||
|
subsystem: memory
|
||||||
|
tags: [redis, pgvector, sentence-transformers, celery, memory, multi-tenancy]
|
||||||
|
dependency_graph:
|
||||||
|
requires: [01-foundation (all plans)]
|
||||||
|
provides: [conversational-memory, semantic-recall, embedding-backfill]
|
||||||
|
affects: [orchestrator/tasks.py, orchestrator/agents/builder.py, orchestrator/agents/runner.py]
|
||||||
|
tech_stack:
|
||||||
|
added: [pgvector>=0.4.2, sentence-transformers>=3.0.0 (all-MiniLM-L6-v2), fakeredis (tests)]
|
||||||
|
patterns: [Redis RPUSH/LTRIM sliding window, pgvector HNSW cosine search, fire-and-forget Celery task, lazy singleton model loading]
|
||||||
|
key_files:
|
||||||
|
created:
|
||||||
|
- packages/shared/shared/models/memory.py
|
||||||
|
- packages/orchestrator/orchestrator/memory/__init__.py
|
||||||
|
- packages/orchestrator/orchestrator/memory/short_term.py
|
||||||
|
- packages/orchestrator/orchestrator/memory/long_term.py
|
||||||
|
- packages/orchestrator/orchestrator/memory/embedder.py
|
||||||
|
- migrations/versions/002_phase2_memory.py
|
||||||
|
- tests/unit/test_memory_short_term.py
|
||||||
|
- tests/integration/test_memory_long_term.py
|
||||||
|
modified:
|
||||||
|
- packages/shared/shared/redis_keys.py
|
||||||
|
- packages/shared/pyproject.toml
|
||||||
|
- packages/orchestrator/pyproject.toml
|
||||||
|
- packages/orchestrator/orchestrator/agents/builder.py
|
||||||
|
- packages/orchestrator/orchestrator/agents/runner.py
|
||||||
|
- packages/orchestrator/orchestrator/tasks.py
|
||||||
|
- docker-compose.yml
|
||||||
|
decisions:
|
||||||
|
- "pgvector/pgvector:pg16 Docker image used instead of postgres:16-alpine — postgres:16-alpine does not include the pgvector extension; switched to official pgvector image"
|
||||||
|
- "Lazy singleton pattern for SentenceTransformer — model loaded once on first use, not at import time, to avoid 2s load penalty when module imported but not used"
|
||||||
|
- "embed_and_store is a fire-and-forget Celery task (ignore_result=True) — embedding never blocks the LLM response path"
|
||||||
|
- "All queries pre-filter by (tenant_id, agent_id, user_id) BEFORE ANN operator in pgvector — defense-in-depth with RLS as secondary backstop"
|
||||||
|
- "No TTL on Redis memory keys — indefinite retention per plan spec; caller controls window size"
|
||||||
|
- "pgvector context injected as system message BEFORE sliding window — provides LLM background context without polluting conversation flow"
|
||||||
|
metrics:
|
||||||
|
duration: 9m 22s
|
||||||
|
completed_date: "2026-03-23"
|
||||||
|
tasks_completed: 2
|
||||||
|
files_created: 8
|
||||||
|
files_modified: 7
|
||||||
|
tests_added: 16
|
||||||
|
tests_total: 202
|
||||||
|
---
|
||||||
|
|
||||||
|
# Phase 2 Plan 01: Two-Layer Conversational Memory Summary
|
||||||
|
|
||||||
|
**One-liner:** Redis sliding window (last 20 msgs) + pgvector HNSW semantic recall (all-MiniLM-L6-v2, 384d) with per-user per-agent per-tenant isolation and async Celery embedding backfill.
|
||||||
|
|
||||||
|
## What Was Built
|
||||||
|
|
||||||
|
Transforms the stateless Phase 1 agent into one with persistent conversational memory across sessions. Every LLM call now receives:
|
||||||
|
|
||||||
|
1. **Short-term context**: Last 20 messages from Redis (zero-latency in-session history)
|
||||||
|
2. **Long-term context**: Up to 3 semantically relevant past exchanges from pgvector (cross-session recall)
|
||||||
|
|
||||||
|
The embedding backfill runs asynchronously — the LLM response is never blocked.
|
||||||
|
|
||||||
|
## Task Execution
|
||||||
|
|
||||||
|
### Task 1: DB models, migration, and memory modules with tests
|
||||||
|
|
||||||
|
**Status:** Complete
|
||||||
|
|
||||||
|
**TDD Cycle:**
|
||||||
|
- RED: Wrote 10 unit tests (fakeredis, short_term) + 6 integration tests (pgvector, long_term) → ImportError confirmed failure
|
||||||
|
- GREEN: Implemented all modules → 16/16 tests pass
|
||||||
|
|
||||||
|
**Files created:**
|
||||||
|
- `packages/shared/shared/models/memory.py` — ConversationEmbedding ORM with Vector(384) column
|
||||||
|
- `packages/orchestrator/orchestrator/memory/__init__.py` — Package init with architecture doc
|
||||||
|
- `packages/orchestrator/orchestrator/memory/short_term.py` — RPUSH/LTRIM sliding window
|
||||||
|
- `packages/orchestrator/orchestrator/memory/long_term.py` — pgvector HNSW cosine search
|
||||||
|
- `migrations/versions/002_phase2_memory.py` — Alembic migration with HNSW index and RLS
|
||||||
|
|
||||||
|
**Files modified:**
|
||||||
|
- `packages/shared/shared/redis_keys.py` — Added memory_short_key, escalation_status_key, pending_tool_confirm_key
|
||||||
|
- `packages/shared/pyproject.toml` — Added pgvector>=0.4.2
|
||||||
|
- `packages/orchestrator/pyproject.toml` — Added sentence-transformers>=3.0.0
|
||||||
|
|
||||||
|
**Commit:** `28a5ee9`
|
||||||
|
|
||||||
|
### Task 2: Wire memory into orchestrator pipeline
|
||||||
|
|
||||||
|
**Status:** Complete
|
||||||
|
|
||||||
|
**Files created:**
|
||||||
|
- `packages/orchestrator/orchestrator/memory/embedder.py` — Lazy singleton SentenceTransformer with embed_text() / embed_texts()
|
||||||
|
|
||||||
|
**Files modified:**
|
||||||
|
- `packages/orchestrator/orchestrator/agents/builder.py` — Added build_messages_with_memory()
|
||||||
|
- `packages/orchestrator/orchestrator/agents/runner.py` — Added optional messages parameter (backward compat)
|
||||||
|
- `packages/orchestrator/orchestrator/tasks.py` — Wired memory pipeline + new embed_and_store Celery task
|
||||||
|
|
||||||
|
**Commit:** `45b9573`
|
||||||
|
|
||||||
|
## Memory Pipeline Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
handle_message (Celery task)
|
||||||
|
│
|
||||||
|
├── BEFORE LLM CALL:
|
||||||
|
│ 1. get_recent_messages(redis) → last 20 turns
|
||||||
|
│ 2. embed_text(user_text) → 384-dim query vector
|
||||||
|
│ 3. retrieve_relevant(session, ...) → top-3 past exchanges (cosine >= 0.75)
|
||||||
|
│ 4. build_messages_with_memory(...) → enriched messages array
|
||||||
|
│
|
||||||
|
├── run_agent(msg, agent, messages=enriched_messages)
|
||||||
|
│ └── POST /complete to llm-pool
|
||||||
|
│
|
||||||
|
└── AFTER LLM RESPONSE:
|
||||||
|
5. append_message(redis, user_turn) → update sliding window
|
||||||
|
6. append_message(redis, asst_turn) → update sliding window
|
||||||
|
7. embed_and_store.delay(...) → fire-and-forget pgvector backfill
|
||||||
|
```
|
||||||
|
|
||||||
|
## Security Properties
|
||||||
|
|
||||||
|
- **Cross-tenant isolation**: pgvector queries pre-filter by tenant_id BEFORE ANN operator; RLS enforces at DB level as secondary backstop
|
||||||
|
- **Cross-agent isolation**: agent_id pre-filter ensures agent A cannot recall agent B's memory
|
||||||
|
- **Cross-user isolation**: user_id pre-filter ensures user A cannot see user B's memory
|
||||||
|
- **Redis isolation**: memory_short_key format `{tenant_id}:memory:short:{agent_id}:{user_id}` ensures namespace separation
|
||||||
|
|
||||||
|
## Deviations from Plan
|
||||||
|
|
||||||
|
### Auto-fixed Issues
|
||||||
|
|
||||||
|
**1. [Rule 3 - Blocking] Updated Docker postgres image to support pgvector extension**
|
||||||
|
|
||||||
|
- **Found during:** Task 1 integration test run
|
||||||
|
- **Issue:** `postgres:16-alpine` does not include the pgvector extension. Migration `002_phase2_memory.py` fails with: `extension "vector" is not available — Could not open extension control file`
|
||||||
|
- **Fix:** Changed `docker-compose.yml` postgres image from `postgres:16-alpine` to `pgvector/pgvector:pg16`. Restarted container. Migration ran cleanly.
|
||||||
|
- **Files modified:** `docker-compose.yml`
|
||||||
|
- **Commit:** `28a5ee9` (included in Task 1 commit)
|
||||||
|
|
||||||
|
## Self-Check: PASSED
|
||||||
|
|
||||||
|
All 8 created files exist on disk. Both commits (28a5ee9, 45b9573) confirmed present in git log. 202 tests pass.
|
||||||
Reference in New Issue
Block a user