konstruct

Author	SHA1	Message	Date
Adolfo Delorenzo	f3e358b418	feat(streaming): add complete_stream() generator and POST /complete/stream NDJSON endpoint to llm-pool - complete_stream() in router.py yields token strings via acompletion(stream=True) - POST /complete/stream returns NDJSON: chunk lines then a done line - Streaming path does not support tool calls (plain text only) - Non-streaming POST /complete endpoint unchanged	2026-03-25 17:56:56 -06:00
Adolfo Delorenzo	ebf6e76174	feat: make Ollama model configurable via OLLAMA_MODEL env var - Add OLLAMA_MODEL setting to shared config (default: qwen3:32b) - LLM router reads from settings instead of hardcoded model name - Create .env file with all configurable settings documented - docker-compose passes OLLAMA_MODEL to llm-pool container To change the model: edit OLLAMA_MODEL in .env and restart llm-pool. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 13:22:18 -06:00
Adolfo Delorenzo	22c6a44ff6	fix: map all model_preference values to LiteLLM router groups Added balanced/economy/local groups alongside fast/quality so all 5 agent model_preference values resolve to real provider groups. All default to local Ollama qwen3:32b, commercial as fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 13:20:23 -06:00
Adolfo Delorenzo	44fa7e6845	feat(02-02): wire tool-call loop into agent runner and orchestrator pipeline - runner.py: multi-turn tool-call loop (LLM -> tool -> observe -> respond) - runner.py: max 5 iterations guard against runaway tool chains - runner.py: confirmation gate — returns confirmation msg, stops loop - runner.py: audit logging for every LLM call via audit_logger - tasks.py: AuditLogger initialized at task start with session factory - tasks.py: tool registry built from agent.tool_assignments - tasks.py: pending tool confirmation flow via Redis (10 min TTL) - tasks.py: memory persistence skipped for confirmation request responses - llm-pool/router.py: LLMResponse model with content + tool_calls fields - llm-pool/router.py: tools parameter forwarded to litellm.acompletion() - llm-pool/main.py: CompleteRequest accepts optional tools list - llm-pool/main.py: CompleteResponse includes tool_calls field - Migration renamed to 004 (003 was already taken by escalation migration) - [Rule 1 - Bug] Renamed 003_phase2_audit_kb.py -> 004 to fix duplicate revision ID	2026-03-23 15:00:17 -06:00
Adolfo Delorenzo	ee2f88e13b	feat(01-02): LLM Backend Pool — LiteLLM Router with Ollama + Anthropic + OpenAI fallback - Create llm_pool/router.py: LiteLLM Router with fast (Ollama) and quality (Anthropic/OpenAI) model groups - Configure fallback chain: quality providers fail -> fast group - Pin LiteLLM to ==1.82.5 (avoid September 2025 OOM regression in later releases) - Create llm_pool/main.py: FastAPI service on port 8004 with /complete and /health endpoints - Add providers/__init__.py: reserved for future per-provider customization - Update docker-compose.yml: add llm-pool and celery-worker service stubs	2026-03-23 10:03:05 -06:00
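
The tool-call loop described in commit 44fa7e6845 (LLM -> tool -> observe -> respond, capped at 5 iterations, with a confirmation gate) might look roughly like the sketch below. This is not the code in runner.py. The function `run_tool_loop`, the `needs_confirmation` set, the message dictionaries, and the returned strings are all hypothetical. The loop is synchronous for brevity, whereas the real service likely uses async calls. Only the `LLMResponse` fields (`content`, `tool_calls`) and the 5-iteration limit come from the commit message.

```python
from dataclasses import dataclass, field

MAX_ITERATIONS = 5  # iteration guard named in the commit message


@dataclass
class LLMResponse:
    """Mirrors the two fields the commit mentions: content and tool_calls."""
    content: str = ""
    tool_calls: list = field(default_factory=list)


def run_tool_loop(call_llm, tools, messages, needs_confirmation=frozenset()):
    """Call the LLM repeatedly, executing requested tools, until it answers in text.

    call_llm: hypothetical callable taking the message list, returning LLMResponse.
    tools: hypothetical mapping of tool name -> Python callable.
    needs_confirmation: hypothetical set of tool names that require user approval.
    """
    for _ in range(MAX_ITERATIONS):
        response = call_llm(messages)
        if not response.tool_calls:
            # Plain-text answer: the loop is finished.
            return response.content
        for call in response.tool_calls:
            name = call["name"]
            if name in needs_confirmation:
                # Confirmation gate: return a message and stop the loop.
                return f"Please confirm before running '{name}'."
            result = tools[name](**call.get("arguments", {}))
            # Feed the observation back so the next LLM call can use it.
            messages.append({"role": "tool", "name": name, "content": str(result)})
    # Guard against runaway tool chains.
    return "Stopped: tool-call limit reached."
```

Returning early from the gate, rather than raising, matches the commit's description that a confirmation message is returned and the loop stops. How the pending call is stored (the commit mentions Redis with a 10 minute TTL) is left out of this sketch.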

5 Commits
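
Commit f3e358b418 describes POST /complete/stream as returning NDJSON: chunk lines followed by a done line. A minimal sketch of that framing and of a client reassembling the text is shown here. The JSON field names (`type`, `text`) and the helper names `to_ndjson` and `read_ndjson` are assumptions for illustration. The commit does not state the actual line schema.

```python
import json
from typing import Iterable, Iterator


def to_ndjson(tokens: Iterable[str]) -> Iterator[str]:
    """Frame token strings as NDJSON: one chunk line per token, then a done line."""
    for token in tokens:
        # Field names "type" and "text" are hypothetical, not taken from the commit.
        yield json.dumps({"type": "chunk", "text": token}) + "\n"
    yield json.dumps({"type": "done"}) + "\n"


def read_ndjson(lines: Iterable[str]) -> str:
    """Rebuild the plain-text completion from NDJSON lines, stopping at the done line."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank keep-alive lines
        event = json.loads(line)
        if event["type"] == "done":
            break
        parts.append(event["text"])
    return "".join(parts)
```

For example, `read_ndjson(to_ndjson(["Hel", "lo"]))` yields the string `Hello`. Because every line is a complete JSON document, a client can parse each line as it arrives without buffering the whole response.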