ebf6e76174
feat: make Ollama model configurable via OLLAMA_MODEL env var
- Add OLLAMA_MODEL setting to shared config (default: qwen3:32b)
- LLM router reads from settings instead of hardcoded model name
- Create .env file with all configurable settings documented
- docker-compose passes OLLAMA_MODEL to llm-pool container
To change the model: edit OLLAMA_MODEL in .env and restart llm-pool.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:22:18 -06:00
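A minimal sketch of the OLLAMA_MODEL setting described in commit ebf6e76174, assuming a plain dataclass stands in for the shared config (the repository uses Pydantic Settings). The `Settings` class and the helper `_ollama_model_from_env` are hypothetical names used only for illustration; only the variable name and its default come from the commit message.

```python
import os
from dataclasses import dataclass, field

DEFAULT_OLLAMA_MODEL = "qwen3:32b"  # default stated in the commit message


def _ollama_model_from_env() -> str:
    # Hypothetical helper: fall back to the default when the variable
    # is unset or blank.
    value = os.environ.get("OLLAMA_MODEL", "").strip()
    return value or DEFAULT_OLLAMA_MODEL


@dataclass(frozen=True)
class Settings:
    # Hypothetical stand-in for the shared config class; the value is
    # read from the environment when the instance is created.
    ollama_model: str = field(default_factory=_ollama_model_from_env)
```

Because the value is read at construction time in this sketch, a running process only sees a changed OLLAMA_MODEL after a restart, which is consistent with the "restart llm-pool" note in the commit.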
22c6a44ff6
fix: map all model_preference values to LiteLLM router groups
Added balanced/economy/local groups alongside fast/quality so all 5
agent model_preference values resolve to real provider groups.
All groups default to local Ollama qwen3:32b, with commercial providers as fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:20:23 -06:00
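The mapping described in commit 22c6a44ff6 can be pictured roughly as follows. This is a sketch under assumptions, not the router's actual configuration: `ROUTER_GROUPS`, `resolve_group`, the `ollama/qwen3:32b` model string, and the `commercial-model` placeholder are all hypothetical. Only the five preference names and the "local first, commercial as fallback" ordering come from the commit message.

```python
# The five model_preference values named in the commit message.
MODEL_PREFERENCES = ("fast", "quality", "balanced", "economy", "local")

LOCAL_MODEL = "ollama/qwen3:32b"          # assumed model string for the local default
COMMERCIAL_FALLBACK = "commercial-model"  # hypothetical placeholder, not a real model name

# Every group tries the local model first, then the commercial fallback.
ROUTER_GROUPS = {
    name: [LOCAL_MODEL, COMMERCIAL_FALLBACK] for name in MODEL_PREFERENCES
}


def resolve_group(model_preference: str) -> list[str]:
    """Return the ordered model list for a preference, or raise on unknown values."""
    try:
        return ROUTER_GROUPS[model_preference]
    except KeyError:
        raise ValueError(f"unknown model_preference: {model_preference!r}") from None
```

Raising on an unknown preference, rather than silently picking a group, is a design choice of this sketch; it surfaces the kind of unmapped value this commit fixes.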
0e0ea5fb66
fix: runtime deployment fixes for Docker Compose stack
- Add .gitignore for __pycache__, node_modules, .playwright-mcp
- Add CLAUDE.md project instructions
- docker-compose: remove host port exposure for internal services,
  remove Ollama container (use host), add CORS origin, bake
  NEXT_PUBLIC_API_URL at build time, run alembic migrations on
  gateway startup, add CPU-only torch pre-install
- gateway: add CORS middleware, graceful Slack degradation without
  bot token, fix None guard on slack_handler
- gateway pyproject: add aiohttp dependency for slack-bolt async
- llm-pool pyproject: install litellm from GitHub (removed from PyPI),
  enable hatch direct references
- portal: enable standalone output in next.config.ts
- Remove orphaned migration 003_phase2_audit_kb.py (renamed to 004)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:26:34 -06:00
44fa7e6845
feat(02-02): wire tool-call loop into agent runner and orchestrator pipeline
- runner.py: multi-turn tool-call loop (LLM -> tool -> observe -> respond)
- runner.py: max 5 iterations guard against runaway tool chains
- runner.py: confirmation gate — returns confirmation msg, stops loop
- runner.py: audit logging for every LLM call via audit_logger
- tasks.py: AuditLogger initialized at task start with session factory
- tasks.py: tool registry built from agent.tool_assignments
- tasks.py: pending tool confirmation flow via Redis (10 min TTL)
- tasks.py: memory persistence skipped for confirmation request responses
- llm-pool/router.py: LLMResponse model with content + tool_calls fields
- llm-pool/router.py: tools parameter forwarded to litellm.acompletion()
- llm-pool/main.py: CompleteRequest accepts optional tools list
- llm-pool/main.py: CompleteResponse includes tool_calls field
- Migration renamed to 004 (003 was already taken by escalation migration)
- [Rule 1 - Bug] Renamed 003_phase2_audit_kb.py -> 004 to fix duplicate revision ID
2026-03-23 15:00:17 -06:00
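The tool-call loop from commit 44fa7e6845 (LLM -> tool -> observe -> respond, capped at 5 iterations, with a confirmation gate) might look roughly like the sketch below. It is synchronous and simplified: audit logging and the Redis confirmation flow are left out, and the real runner may well be async. `run_tool_loop`, the `needs_confirmation` parameter, the message dictionaries, and the returned strings are assumptions for illustration. Only the loop shape, the limit of 5, and the `content` / `tool_calls` fields come from the commit message.

```python
from dataclasses import dataclass, field

MAX_ITERATIONS = 5  # iteration guard stated in the commit message


@dataclass
class LLMResponse:
    # Mirrors the two fields named in the commit: content + tool_calls.
    content: str = ""
    tool_calls: list = field(default_factory=list)


def run_tool_loop(llm, tools, messages, needs_confirmation=frozenset()):
    """Run the LLM until it answers without tool calls or the guard trips.

    llm(messages) returns an LLMResponse; tools maps a tool name to a
    callable taking keyword arguments and returning a string.
    """
    for _ in range(MAX_ITERATIONS):
        response = llm(messages)
        if not response.tool_calls:
            return response.content  # final answer, loop ends
        for call in response.tool_calls:
            name = call["name"]
            args = call.get("arguments", {})
            if name in needs_confirmation:
                # Confirmation gate: stop the loop without running the tool.
                return f"Please confirm before running tool '{name}'."
            result = tools[name](**args)
            # Feed the observation back so the next LLM turn can use it.
            messages.append({"role": "tool", "name": name, "content": result})
    return "Stopped: tool-call limit reached."
```

The guard bounds the number of LLM calls rather than the number of tool invocations, so a single turn that requests several tools still counts as one iteration in this sketch.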
ee2f88e13b
feat(01-02): LLM Backend Pool — LiteLLM Router with Ollama + Anthropic + OpenAI fallback
- Create llm_pool/router.py: LiteLLM Router with fast (Ollama) and quality (Anthropic/OpenAI) model groups
- Configure fallback chain: quality providers fail -> fast group
- Pin LiteLLM to ==1.82.5 (avoid September 2025 OOM regression in later releases)
- Create llm_pool/main.py: FastAPI service on port 8004 with /complete and /health endpoints
- Add providers/__init__.py: reserved for future per-provider customization
- Update docker-compose.yml: add llm-pool and celery-worker service stubs
2026-03-23 10:03:05 -06:00
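The fallback chain in commit ee2f88e13b (quality providers fail -> fast group) is handled by the LiteLLM Router in the actual service. The sketch below shows only the general idea in plain Python and does not use or imitate the LiteLLM API; `complete_with_fallback` and the callables passed to it are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def complete_with_fallback(
    prompt: str,
    groups: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (group_name, call) pair in order; return the first success.

    Returns (group_name, completion_text). Raises RuntimeError if every
    group fails, listing the collected errors.
    """
    errors = []
    for name, call in groups:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider failure triggers the fallback
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all groups failed: " + "; ".join(errors))
```

Passing `[("quality", ...), ("fast", ...)]` reproduces the order named in the commit: the quality group is tried first and the fast (Ollama) group is used only when it fails.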
5714acf741
feat(01-foundation-01): monorepo scaffolding, Docker Compose, and shared data models
- pyproject.toml: uv workspace with 5 member packages (shared, gateway, router, orchestrator, llm-pool)
- docker-compose.yml: PostgreSQL 16 + Redis 7 + Ollama services on konstruct-net
- .env.example: all required env vars documented, konstruct_app role (not superuser)
- scripts/init-db.sh: creates konstruct_app role at DB init time
- packages/shared/shared/config.py: Pydantic Settings loading all env vars
- packages/shared/shared/models/message.py: KonstructMessage, ChannelType, SenderInfo, MessageContent
- packages/shared/shared/models/tenant.py: Tenant, Agent, ChannelConnection SQLAlchemy 2.0 models
- packages/shared/shared/models/auth.py: PortalUser model for admin portal auth
- packages/shared/shared/db.py: async SQLAlchemy engine, session factory, get_session dependency
- packages/shared/shared/rls.py: current_tenant_id ContextVar and configure_rls_hook with parameterized SET LOCAL
- packages/shared/shared/redis_keys.py: tenant-namespaced key constructors (rate_limit, idempotency, session, engaged_thread)
2026-03-23 09:49:28 -06:00
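The tenant-namespaced key constructors listed for redis_keys.py in commit 5714acf741 could be sketched as below. The four constructor kinds (rate_limit, idempotency, session, engaged_thread) come from the commit message; the `tenant:<id>:<kind>:<part>` layout, the function names, and their parameters are assumptions and may differ from the repository.

```python
def _tenant_key(tenant_id: str, kind: str, *parts: str) -> str:
    # Hypothetical layout: every key starts with the tenant namespace,
    # so keys from different tenants can never collide.
    if not tenant_id:
        raise ValueError("tenant_id is required")
    return ":".join(("tenant", tenant_id, kind, *parts))


def rate_limit_key(tenant_id: str, channel: str) -> str:
    return _tenant_key(tenant_id, "rate_limit", channel)


def idempotency_key(tenant_id: str, message_id: str) -> str:
    return _tenant_key(tenant_id, "idempotency", message_id)


def session_key(tenant_id: str, session_id: str) -> str:
    return _tenant_key(tenant_id, "session", session_id)


def engaged_thread_key(tenant_id: str, thread_id: str) -> str:
    return _tenant_key(tenant_id, "engaged_thread", thread_id)
```

Routing every constructor through one private helper keeps the tenant prefix in a single place, so a key cannot be built without a tenant ID in this sketch.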