dd80e2b8228dacb21770302d22834a798f6f35fb
Eliminates 5-10s of overhead by calling the LLM pool's streaming endpoint directly from the WebSocket handler instead of going through Celery queue → worker → asyncio.run() → Redis pub-sub → WebSocket.

New flow: WebSocket → agent lookup → memory → LLM stream → WebSocket
Old flow: WebSocket → Celery → worker → DB → memory → LLM → Redis → WebSocket

Memory is still saved (Redis sliding window + fire-and-forget embedding). Slack/WhatsApp still use Celery (async webhook pattern).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>