konstruct


Adolfo Delorenzo dd80e2b822 perf: bypass Celery for web chat — stream LLM directly from WebSocket

Eliminates 5-10s of overhead by calling the LLM pool's streaming
endpoint directly from the WebSocket handler instead of going through
Celery queue → worker → asyncio.run() → Redis pub-sub → WebSocket.

New flow: WebSocket → agent lookup → memory → LLM stream → WebSocket
Old flow: WebSocket → Celery → worker → DB → memory → LLM → Redis → WebSocket

Memory still saved (Redis sliding window + fire-and-forget embedding).
Slack/WhatsApp still use Celery (async webhook pattern).
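
A minimal sketch of the new flow described above, not the actual gateway code. All names here (`fake_llm_stream`, `save_memory`, `handle_chat_message`, the `send` callback) are hypothetical stand-ins: the real handler would call the LLM pool's streaming endpoint and write to the Redis sliding window. The sketch only shows the shape of the change: tokens are forwarded to the WebSocket as they arrive, and memory is persisted in a background task so it does not delay the reply.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for the LLM pool's streaming endpoint."""
    for token in ["Hello", ", ", "world"]:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield token


async def save_memory(store: list[str], user_msg: str, reply: str) -> None:
    """Hypothetical stand-in for the Redis sliding-window write."""
    store.append(f"user: {user_msg}")
    store.append(f"assistant: {reply}")


async def handle_chat_message(
    send: Callable[[str], Awaitable[None]],
    user_msg: str,
    memory: list[str],
    stream: Callable[[str], AsyncIterator[str]] = fake_llm_stream,
) -> tuple[str, asyncio.Task]:
    """Stream the LLM reply straight to the client, then save memory."""
    chunks: list[str] = []
    async for token in stream(user_msg):
        await send(token)  # forward each token as soon as it arrives
        chunks.append(token)
    reply = "".join(chunks)
    # Fire-and-forget: schedule the write without awaiting it here.
    # The task is returned so the caller can keep a reference to it.
    task = asyncio.create_task(save_memory(memory, user_msg, reply))
    return reply, task
```

In a real WebSocket handler, `send` would be the connection's text-send coroutine, and the returned task would typically be held in a set until it completes so it is not garbage-collected early.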

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-25 18:32:16 -06:00

gateway         perf: bypass Celery for web chat — stream LLM directly from WebSocket   2026-03-25 18:32:16 -06:00
pyproject.toml  fix: runtime deployment fixes for Docker Compose stack                  2026-03-24 12:26:34 -06:00