Commit Graph

7 Commits

Author SHA1 Message Date
f3e358b418 feat(streaming): add complete_stream() generator and POST /complete/stream NDJSON endpoint to llm-pool
- complete_stream() in router.py yields token strings via acompletion(stream=True)
- POST /complete/stream returns NDJSON: chunk lines then a done line
- Streaming path does not support tool calls (plain text only)
- Non-streaming POST /complete endpoint unchanged
2026-03-25 17:56:56 -06:00
ebf6e76174 feat: make Ollama model configurable via OLLAMA_MODEL env var
- Add OLLAMA_MODEL setting to shared config (default: qwen3:32b)
- LLM router reads from settings instead of hardcoded model name
- Create .env file with all configurable settings documented
- docker-compose passes OLLAMA_MODEL to llm-pool container

To change the model: edit OLLAMA_MODEL in .env and restart llm-pool.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:22:18 -06:00
22c6a44ff6 fix: map all model_preference values to LiteLLM router groups
Added balanced/economy/local groups alongside fast/quality so all 5
agent model_preference values resolve to real provider groups.
All default to local Ollama qwen3:32b, commercial as fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:20:23 -06:00
0e0ea5fb66 fix: runtime deployment fixes for Docker Compose stack
- Add .gitignore for __pycache__, node_modules, .playwright-mcp
- Add CLAUDE.md project instructions
- docker-compose: remove host port exposure for internal services,
  remove Ollama container (use host), add CORS origin, bake
  NEXT_PUBLIC_API_URL at build time, run alembic migrations on
  gateway startup, add CPU-only torch pre-install
- gateway: add CORS middleware, graceful Slack degradation without
  bot token, fix None guard on slack_handler
- gateway pyproject: add aiohttp dependency for slack-bolt async
- llm-pool pyproject: install litellm from GitHub (removed from PyPI),
  enable hatch direct references
- portal: enable standalone output in next.config.ts
- Remove orphaned migration 003_phase2_audit_kb.py (renamed to 004)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:26:34 -06:00
44fa7e6845 feat(02-02): wire tool-call loop into agent runner and orchestrator pipeline
- runner.py: multi-turn tool-call loop (LLM -> tool -> observe -> respond)
- runner.py: max 5 iterations guard against runaway tool chains
- runner.py: confirmation gate — returns confirmation msg, stops loop
- runner.py: audit logging for every LLM call via audit_logger
- tasks.py: AuditLogger initialized at task start with session factory
- tasks.py: tool registry built from agent.tool_assignments
- tasks.py: pending tool confirmation flow via Redis (10 min TTL)
- tasks.py: memory persistence skipped for confirmation request responses
- llm-pool/router.py: LLMResponse model with content + tool_calls fields
- llm-pool/router.py: tools parameter forwarded to litellm.acompletion()
- llm-pool/main.py: CompleteRequest accepts optional tools list
- llm-pool/main.py: CompleteResponse includes tool_calls field
- Migration renamed to 004 (003 was already taken by escalation migration)
- [Rule 1 - Bug] Renamed 003_phase2_audit_kb.py -> 004 to fix duplicate revision ID
2026-03-23 15:00:17 -06:00
ee2f88e13b feat(01-02): LLM Backend Pool — LiteLLM Router with Ollama + Anthropic + OpenAI fallback
- Create llm_pool/router.py: LiteLLM Router with fast (Ollama) and quality (Anthropic/OpenAI) model groups
- Configure fallback chain: quality providers fail -> fast group
- Pin LiteLLM to ==1.82.5 (avoid September 2025 OOM regression in later releases)
- Create llm_pool/main.py: FastAPI service on port 8004 with /complete and /health endpoints
- Add providers/__init__.py: reserved for future per-provider customization
- Update docker-compose.yml: add llm-pool and celery-worker service stubs
2026-03-23 10:03:05 -06:00
5714acf741 feat(01-foundation-01): monorepo scaffolding, Docker Compose, and shared data models
- pyproject.toml: uv workspace with 5 member packages (shared, gateway, router, orchestrator, llm-pool)
- docker-compose.yml: PostgreSQL 16 + Redis 7 + Ollama services on konstruct-net
- .env.example: all required env vars documented, konstruct_app role (not superuser)
- scripts/init-db.sh: creates konstruct_app role at DB init time
- packages/shared/shared/config.py: Pydantic Settings loading all env vars
- packages/shared/shared/models/message.py: KonstructMessage, ChannelType, SenderInfo, MessageContent
- packages/shared/shared/models/tenant.py: Tenant, Agent, ChannelConnection SQLAlchemy 2.0 models
- packages/shared/shared/models/auth.py: PortalUser model for admin portal auth
- packages/shared/shared/db.py: async SQLAlchemy engine, session factory, get_session dependency
- packages/shared/shared/rls.py: current_tenant_id ContextVar and configure_rls_hook with parameterized SET LOCAL
- packages/shared/shared/redis_keys.py: tenant-namespaced key constructors (rate_limit, idempotency, session, engaged_thread)
2026-03-23 09:49:28 -06:00