- Service worker push/notificationclick handlers with conversation deep-link
- PushPermission component for opt-in UI in More sheet
- InstallPrompt component (second-visit, Android + iOS)
- IndexedDB message-queue for offline message persistence
- use-chat-socket.ts: drain queue on reconnect, enqueue when offline
SMBs don't need separate Financial Manager, Controller, and Accountant.
Merged into one versatile role that handles invoicing, AP/AR, expenses,
budgets, reporting, and cash flow. Full en/es/pt translations.
Templates: 8 → 6 (Customer Support Rep, Sales Assistant, Marketing
Manager, Office Manager, Project Coordinator, Finance & Accounting Manager)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New AI employee template: Marketing & Growth Manager (category: marketing).
Creative and data-driven, handles campaign strategy, content briefs,
metrics analysis, social media calendars, and lead gen coordination.
Escalates brand-sensitive decisions and high-budget approvals.
Full translations for Spanish and Portuguese with native business
terminology.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Streaming httpx client uses 300s read timeout (cloud LLMs can take
30-60s for first token). Was using 120s general timeout.
- Guard all WebSocket sends with try/except for client disconnect.
Prevents "Cannot send once close message has been sent" crash.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Celery workers use NullPool to avoid "Future attached to a different
loop" errors from stale pooled async connections across asyncio.run()
calls. FastAPI keeps regular pool (single event loop, safe to reuse).
- Skip pgvector similarity search when no conversation history exists
(first message) — saves ~3s embedding + query overhead.
- Wrap pgvector retrieval in try/except to prevent DB errors from
blocking the LLM response.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Pub-sub loop now handles 'chunk' and 'done' message types (not just 'response')
- 'chunk' messages are forwarded immediately via websocket.send_json
- 'done' message breaks the loop and triggers DB persistence of full response
- Sends final 'done' JSON to browser to signal stream completion
- Legacy 'response' type no longer emitted from orchestrator (now unified as 'done')
- _stream_agent_response_to_redis() publishes chunk/done messages to Redis pub-sub
- _process_message() uses streaming path for web channel with no tools registered
- Non-web channels (Slack, WhatsApp) and tool-enabled agents use non-streaming run_agent()
- streaming_delivered flag prevents double-publish when streaming path is active
- _send_response() web branch changed from 'response' to 'done' message type for consistency
- run_agent_streaming() calls POST /complete/stream and yields token strings
- Reads NDJSON lines from the streaming response, yields on 'chunk' events
- On 'error' or connection failure, yields the fallback response string
- Tool calls are not supported in the streaming path
- Existing run_agent() (non-streaming, tool-call loop) is unchanged
- complete_stream() in router.py yields token strings via acompletion(stream=True)
- POST /complete/stream returns NDJSON: chunk lines then a done line
- Streaming path does not support tool calls (plain text only)
- Non-streaming POST /complete endpoint unchanged
LLM responses can take >60s (especially with local models). The
WebSocket listener was timing out before the response arrived,
causing agent replies to appear in logs but not in the chat UI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>