Commit Graph

4 Commits

ebf6e76174 feat: make Ollama model configurable via OLLAMA_MODEL env var
- Add OLLAMA_MODEL setting to shared config (default: qwen3:32b)
- LLM router reads from settings instead of hardcoded model name
- Create .env file with all configurable settings documented
- docker-compose passes OLLAMA_MODEL to llm-pool container

To change the model: edit OLLAMA_MODEL in .env and restart llm-pool.
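
A minimal sketch of the settings side, assuming pydantic-settings (the concrete shared-config module is not shown in the commit); by default pydantic-settings maps the ollama_model field to the OLLAMA_MODEL environment variable:

    from pydantic_settings import BaseSettings, SettingsConfigDict

    class Settings(BaseSettings):
        # Reads OLLAMA_MODEL from the environment or .env; the default
        # below matches the commit message.
        model_config = SettingsConfigDict(env_file=".env")
        ollama_model: str = "qwen3:32b"

    settings = Settings()  # the LLM router reads settings.ollama_model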

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:22:18 -06:00
22c6a44ff6 fix: map all model_preference values to LiteLLM router groups
Added balanced/economy/local groups alongside fast/quality so all 5
agent model_preference values resolve to real provider groups.
All five default to the local Ollama qwen3:32b model, with commercial providers as the fallback.
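
A minimal sketch of the mapping, using the LiteLLM Router API; the "commercial" deployment name, its model string, and the Ollama api_base are illustrative:

    from litellm import Router

    PREFERENCES = ("fast", "quality", "balanced", "economy", "local")

    # Every model_preference value becomes a real router group backed by
    # the local Ollama model.
    model_list = [
        {"model_name": name,
         "litellm_params": {"model": "ollama/qwen3:32b",
                            "api_base": "http://ollama:11434"}}
        for name in PREFERENCES
    ]

    # Illustrative commercial deployment used as the shared fallback target.
    model_list.append({
        "model_name": "commercial",
        "litellm_params": {"model": "anthropic/claude-3-5-sonnet-20241022"},
    })

    router = Router(
        model_list=model_list,
        fallbacks=[{name: ["commercial"]} for name in PREFERENCES],
    )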

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:20:23 -06:00
44fa7e6845 feat(02-02): wire tool-call loop into agent runner and orchestrator pipeline
- runner.py: multi-turn tool-call loop (LLM -> tool -> observe -> respond); see the sketch after this list
- runner.py: max 5 iterations guard against runaway tool chains
- runner.py: confirmation gate returns a confirmation message and stops the loop
- runner.py: audit logging for every LLM call via audit_logger
- tasks.py: AuditLogger initialized at task start with session factory
- tasks.py: tool registry built from agent.tool_assignments
- tasks.py: pending tool confirmation flow via Redis (10 min TTL)
- tasks.py: memory persistence skipped for confirmation request responses
- llm-pool/router.py: LLMResponse model with content + tool_calls fields
- llm-pool/router.py: tools parameter forwarded to litellm.acompletion()
- llm-pool/main.py: CompleteRequest accepts optional tools list
- llm-pool/main.py: CompleteResponse includes tool_calls field
- [Rule 1 - Bug] Renamed migration 003_phase2_audit_kb.py -> 004 to fix a duplicate revision ID (003 was already taken by the escalation migration)
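
A minimal sketch of the runner loop described above. The helpers call_llm, execute_tool, needs_confirmation, confirmation_request, and tool_result_message, and the audit logger's log_llm_call method, are hypothetical stand-ins for the real runner.py internals:

    MAX_TOOL_ITERATIONS = 5  # guard against runaway tool chains

    async def run_turn(messages, tools, audit_logger):
        for _ in range(MAX_TOOL_ITERATIONS):
            response = await call_llm(messages, tools)  # one LLM call per turn
            audit_logger.log_llm_call(response)         # audit every LLM call
            if not response.tool_calls:
                return response.content                 # plain answer: done
            for call in response.tool_calls:
                if needs_confirmation(call):
                    # Confirmation gate: return the confirmation request and
                    # stop the loop; tasks.py parks the pending call in Redis
                    # with a 10 min TTL.
                    return confirmation_request(call)
                result = await execute_tool(call)       # run the tool
                messages.append(tool_result_message(call, result))  # observe
        return "Stopped: tool-call iteration limit reached."
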
2026-03-23 15:00:17 -06:00
ee2f88e13b feat(01-02): LLM Backend Pool - LiteLLM Router with Ollama + Anthropic + OpenAI fallback
- Create llm_pool/router.py: LiteLLM Router with fast (Ollama) and quality (Anthropic/OpenAI) model groups
- Configure fallback chain: quality providers fail -> fast group
- Pin LiteLLM to ==1.82.5 (avoid September 2025 OOM regression in later releases)
- Create llm_pool/main.py: FastAPI service on port 8004 with /complete and /health endpoints (see the sketch after this list)
- Add providers/__init__.py: reserved for future per-provider customization
- Update docker-compose.yml: add llm-pool and celery-worker service stubs
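
A minimal sketch of the service, assuming the FastAPI and LiteLLM Router APIs; the request fields beyond those named in the commits, the commercial model strings, and the Ollama api_base are illustrative:

    import uvicorn
    from fastapi import FastAPI
    from litellm import Router
    from pydantic import BaseModel

    # Two model groups per the commit: fast (Ollama) and quality
    # (Anthropic/OpenAI); quality falls back to fast on provider failure.
    router = Router(
        model_list=[
            {"model_name": "fast",
             "litellm_params": {"model": "ollama/qwen3:32b",
                                "api_base": "http://ollama:11434"}},
            {"model_name": "quality",
             "litellm_params": {"model": "anthropic/claude-3-5-sonnet-20241022"}},
            {"model_name": "quality",
             "litellm_params": {"model": "openai/gpt-4o"}},
        ],
        fallbacks=[{"quality": ["fast"]}],
    )

    app = FastAPI()

    class CompleteRequest(BaseModel):
        model: str = "fast"  # router group name
        messages: list[dict]

    @app.get("/health")
    async def health():
        return {"status": "ok"}

    @app.post("/complete")
    async def complete(req: CompleteRequest):
        resp = await router.acompletion(model=req.model, messages=req.messages)
        return {"content": resp.choices[0].message.content}

    if __name__ == "__main__":
        uvicorn.run(app, host="0.0.0.0", port=8004)
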
2026-03-23 10:03:05 -06:00