konstruct

Author	SHA1	Message	Date
Adolfo Delorenzo	3c8fc255bc	feat(03-01): LLM key CRUD API endpoints with encryption - Create llm_keys.py: GET list (redacted, key_hint only), POST (encrypt + store), DELETE (204 or 404) - LlmKeyResponse never exposes encrypted_key or raw api_key - 409 returned on duplicate (tenant_id, provider) key - Cross-tenant deletion prevented by tenant_id verification in DELETE query - Update api/__init__.py to export llm_keys_router - All 5 LLM key CRUD tests passing (32 total unit tests green)	2026-03-23 21:36:08 -06:00
Adolfo Delorenzo	4cbf192fa5	feat(03-01): backend API endpoints — channels, billing, usage, and audit logger enhancement - Create channels.py: HMAC-signed OAuth state generation/verification, Slack OAuth install/callback, WhatsApp manual connect, test message endpoint - Create billing.py: Stripe Checkout session, billing portal session, webhook handler with idempotency (StripeEvent table), subscription lifecycle management - Update usage.py: add _aggregate_rows_by_agent and _aggregate_rows_by_provider helpers (unit-testable without DB), complete usage endpoints - Fix audit.py: rename 'metadata' attribute to 'event_metadata' (SQLAlchemy 2.0 DeclarativeBase reserves 'metadata') - Enhance runner.py: audit log now includes prompt_tokens, completion_tokens, total_tokens, cost_usd, provider in LLM call metadata - Update api/__init__.py to export all new routers - All 27 unit tests passing	2026-03-23 21:24:08 -06:00
Adolfo Delorenzo	215e67a7eb	feat(03-01): DB migrations, models, encryption service, and test scaffolds - Add stripe and cryptography to shared pyproject.toml - Add recharts, @stripe/stripe-js, stripe to portal package.json (submodule) - Add billing fields to Tenant model (stripe_customer_id, subscription_status, agent_quota, trial_ends_at) - Add budget_limit_usd to Agent model - Create TenantLlmKey and StripeEvent models in billing.py (AuditBase and Base respectively) - Create KeyEncryptionService (MultiFernet encrypt/decrypt/rotate) in crypto.py - Create compute_budget_status helper in usage.py (threshold logic: ok/warning/exceeded) - Add platform_encryption_key, stripe_, slack_oauth settings to config.py - Create Alembic migration 005 with all schema changes, RLS, grants, and composite index - All 12 tests passing (key encryption roundtrip, rotation, budget thresholds)	2026-03-23 21:19:09 -06:00
Adolfo Delorenzo	ac606cf9ff	fix(03): revise plans based on checker feedback	2026-03-23 21:10:23 -06:00
Adolfo Delorenzo	1ff61d9ba4	docs(03-operator-experience): create phase plan	2026-03-23 21:03:30 -06:00
Adolfo Delorenzo	a42fa5f38a	docs(03): add research and validation strategy	2026-03-23 20:55:12 -06:00
Adolfo Delorenzo	c4ebcf0de4	docs(03): research operator experience phase	2026-03-23 20:54:13 -06:00
Adolfo Delorenzo	a8f48df305	docs: resolve LLM-03 conflict — BYO keys confirmed for v1 Phase 3	2026-03-23 20:06:30 -06:00
Adolfo Delorenzo	c76b1ee3ce	docs(state): record phase 3 context session	2026-03-23 20:06:09 -06:00
Adolfo Delorenzo	1672b4cc81	docs(03): capture phase context	2026-03-23 20:06:09 -06:00
Adolfo Delorenzo	c5a4515f8c	docs(phase-2): complete phase execution	2026-03-23 19:20:04 -06:00
Adolfo Delorenzo	43cf7d4e63	docs(02-06): complete escalation and WhatsApp routing re-wire plan summary - Created 02-06-SUMMARY.md documenting escalation wiring, WhatsApp outbound routing, and tier-2 scoping - Updated STATE.md: advanced progress to 100%, recorded metrics and decisions - Updated ROADMAP.md: Phase 2 marked Complete (6/6 plans)	2026-03-23 19:16:56 -06:00
Adolfo Delorenzo	bd217a4113	feat(02-06): re-wire escalation and WhatsApp outbound routing in pipeline - Move key imports to module level in tasks.py for testability and clarity - Pop WhatsApp extras (phone_number_id, bot_token) in handle_message before model_validate - Build unified extras dict and extract wa_id from sender.user_id - Change _process_message signature to accept extras dict - Add _build_response_extras() helper for channel-aware extras assembly - Replace all _update_slack_placeholder calls in _process_message with _send_response() - Add escalation pre-check: skip LLM when Redis escalation_status_key == 'escalated' - Add escalation post-check: check_escalation_rules after run_agent; call escalate_to_human when rule matches and agent.escalation_assignee is set - Add _build_conversation_metadata() helper (billing keyword v1 detection) - Add channel parameter to build_system_prompt(), build_messages_with_memory(), build_messages_with_media() for WhatsApp tier-2 business-function scoping - WhatsApp scoping appends 'You only handle: {topics}' when tool_assignments non-empty - Pass msg.channel to build_messages_with_memory() in _process_message - All 26 new tests pass; all existing escalation/WhatsApp tests pass (no regressions)	2026-03-23 19:15:20 -06:00
Adolfo Delorenzo	77c9cfc825	test(02-06): add failing tests for escalation wiring and WhatsApp outbound routing - Tests for handle_message WhatsApp extra extraction (phone_number_id, bot_token) - Tests for _send_response routing to Slack and WhatsApp - Tests for _process_message using _send_response (not _update_slack_placeholder directly) - Tests for escalation pre-check (skip LLM when already escalated) - Tests for escalation post-check (check_escalation_rules + escalate_to_human) - Tests for _build_conversation_metadata billing keyword extraction - Tests for build_system_prompt WhatsApp tier-2 scoping (Task 2) - Tests for build_messages_with_memory channel parameter passthrough	2026-03-23 19:08:59 -06:00
Adolfo Delorenzo	48d9ef0c29	docs(02-agent-features): create gap closure plan for escalation and WhatsApp outbound wiring	2026-03-23 19:03:24 -06:00
Adolfo Delorenzo	d921ed776a	docs(02-05): complete multimodal media support plan summary - Add 02-05-SUMMARY.md with full task documentation and deviations - Update STATE.md: advance to plan 5 of 5 in phase 02, add decisions - Update ROADMAP.md: phase 2 now 5/5 plans complete (Complete status)	2026-03-23 15:21:38 -06:00
Adolfo Delorenzo	669c0b52b3	feat(02-05): multimodal LLM interpretation with image_url content blocks - Add supports_vision(model_name) to builder.py — detects vision-capable models (claude-3, gpt-4o, gpt-4-vision, gemini-pro-vision, gemini-1.5, gemini-2) with provider prefix stripping support - Add generate_presigned_url(storage_key, expiry=3600) to builder.py — generates 1-hour MinIO presigned URLs via boto3 S3 client - Add build_messages_with_media() to builder.py — extends build_messages_with_memory() with media injection: IMAGE -> image_url blocks for vision models / text fallback for non-vision models, DOCUMENT -> text reference with presigned URL - image_url blocks use 'detail: auto' per OpenAI/LiteLLM multipart format - Add 27 unit tests in test_multimodal_messages.py (TDD)	2026-03-23 15:09:18 -06:00
Adolfo Delorenzo	9dd7c481a3	feat(02-05): Slack file_share extraction and channel-aware outbound routing - Add gateway/channels/slack_media.py with is_file_share_event, media_type_from_mime, build_slack_storage_key, build_attachment_from_slack_file, download_and_store_slack_file - Add _send_response() helper to orchestrator/tasks.py for channel-aware dispatch (Slack -> chat.update, WhatsApp -> send_whatsapp_message) - Add send_whatsapp_message import to orchestrator/tasks.py for WhatsApp outbound - Add boto3>=1.35.0 to gateway dependencies for MinIO S3 client - Add 23 unit tests in test_slack_media.py (TDD)	2026-03-23 15:06:45 -06:00
Adolfo Delorenzo	eba6c85188	docs(02-02): complete tool framework and audit logging plan - 02-02-SUMMARY.md: tool registry, executor, 4 built-in tools, immutable audit trail - STATE.md: progress 89%, decisions recorded, session updated - ROADMAP.md: phase 2 plan progress updated (4 of 5 summaries) - REQUIREMENTS.md: AGNT-04 and AGNT-06 marked complete	2026-03-23 15:02:27 -06:00
Adolfo Delorenzo	44fa7e6845	feat(02-02): wire tool-call loop into agent runner and orchestrator pipeline - runner.py: multi-turn tool-call loop (LLM -> tool -> observe -> respond) - runner.py: max 5 iterations guard against runaway tool chains - runner.py: confirmation gate — returns confirmation msg, stops loop - runner.py: audit logging for every LLM call via audit_logger - tasks.py: AuditLogger initialized at task start with session factory - tasks.py: tool registry built from agent.tool_assignments - tasks.py: pending tool confirmation flow via Redis (10 min TTL) - tasks.py: memory persistence skipped for confirmation request responses - llm-pool/router.py: LLMResponse model with content + tool_calls fields - llm-pool/router.py: tools parameter forwarded to litellm.acompletion() - llm-pool/main.py: CompleteRequest accepts optional tools list - llm-pool/main.py: CompleteResponse includes tool_calls field - Migration renamed to 004 (003 was already taken by escalation migration) - [Rule 1 - Bug] Renamed 003_phase2_audit_kb.py -> 004 to fix duplicate revision ID	2026-03-23 15:00:17 -06:00
Adolfo Delorenzo	d1bcdef0f5	docs(02-04): complete human escalation handoff plan - Summary with decisions, metrics, and self-check - STATE.md: advance progress to 78%, add decisions, record session - ROADMAP.md: update phase 2 plan progress (3 of 5 complete) - REQUIREMENTS.md: mark AGNT-05 complete	2026-03-23 14:55:22 -06:00
Adolfo Delorenzo	f49927888e	feat(02-02): tool registry, executor, and 4 built-in tools - ToolDefinition Pydantic model with JSON Schema parameters + handler - BUILTIN_TOOLS: web_search, kb_search, http_request, calendar_lookup - http_request requires_confirmation=True (outbound side effects) - get_tools_for_agent filters by agent.tool_assignments - to_litellm_format converts to OpenAI function-calling schema - execute_tool: jsonschema validation before handler call - execute_tool: confirmation gate for requires_confirmation=True - execute_tool: audit logging on every invocation (success + failure) - web_search: Brave Search API with BRAVE_API_KEY env var - kb_search: pgvector cosine similarity with HNSW index - http_request: 30s timeout, 1MB cap, GET/POST/PUT/DELETE only - calendar_lookup: Google Calendar events.list read-only - jsonschema dependency added to orchestrator pyproject.toml - [Rule 1 - Bug] Added missing execute_tool import in test	2026-03-23 14:54:14 -06:00
Adolfo Delorenzo	a025cadc44	feat(02-04): wire escalation into orchestrator pipeline - Add escalation pre-check in _process_message: assistant mode for escalated threads - Add escalation post-check after LLM response: calls escalate_to_human on rule match - Load Slack bot token unconditionally (needed for escalation DM, not just placeholders) - Add keyword-based conversation metadata detector (billing keywords, attempt counter) - Add no-op audit logger stub (replaced by real AuditLogger from Plan 02 when available) - Add escalation_assignee and natural_language_escalation fields to Agent model - Add Alembic migration 003 for new Agent columns	2026-03-23 14:53:45 -06:00
Adolfo Delorenzo	420294b8fe	test(02-02): add failing tool registry and executor unit tests - Tests for BUILTIN_TOOLS (4 tools present, correct fields, confirmation flags) - Tests for get_tools_for_agent filtering and to_litellm_format conversion - Tests for execute_tool: valid args, invalid args, unknown tool, confirmation flow - Tests for audit logger called on every invocation	2026-03-23 14:51:42 -06:00
Adolfo Delorenzo	4047b552a7	feat(02-04): implement escalation handler (rule evaluator, transcript, DM delivery) - check_escalation_rules: condition parser for 'keyword AND count > N' and NL phrases - build_transcript: formats messages as Slack mrkdwn, truncates at 3000 chars - escalate_to_human: opens DM, posts transcript, sets Redis key, logs audit event	2026-03-23 14:50:56 -06:00
Adolfo Delorenzo	30b9f60668	feat(02-02): audit model, KB model, migration, and audit logger - AuditEvent ORM model with tenant_id, action_type, latency_ms, metadata - KnowledgeBaseDocument and KBChunk ORM models for vector KB - Migration 003: audit_events (immutable via REVOKE), kb_documents, kb_chunks with HNSW index and RLS on all tables - AuditLogger with log_llm_call, log_tool_call, log_escalation methods - audit_events immutability enforced at DB level (UPDATE/DELETE rejected) - [Rule 1 - Bug] Fixed CAST(:metadata AS jsonb) for asyncpg compatibility	2026-03-23 14:50:51 -06:00
Adolfo Delorenzo	d489551130	test(02-04): add failing tests for escalation handler - Unit tests: rule matching, natural language escalation, transcript formatting - Integration tests: Slack API calls, Redis key, audit log, return value	2026-03-23 14:49:54 -06:00
Adolfo Delorenzo	df7a5a922f	test(02-02): add failing audit integration tests - Tests for AuditLogger.log_llm_call, log_tool_call, log_escalation - Tests for audit_events immutability (UPDATE/DELETE rejection) - Tests for RLS tenant isolation	2026-03-23 14:48:52 -06:00
Adolfo Delorenzo	e879d27e55	docs(02-01): complete two-layer memory plan — SUMMARY, STATE, ROADMAP, REQUIREMENTS - SUMMARY.md: memory system docs (Redis sliding window + pgvector HNSW) - STATE.md: 67% progress (6/9 plans), 3 key decisions, metrics logged - ROADMAP.md: Phase 2 plan progress updated (2 summaries complete) - REQUIREMENTS.md: AGNT-02 (conversational memory), AGNT-03 (cross-session recall) marked complete	2026-03-23 14:47:06 -06:00
Adolfo Delorenzo	45b957377f	feat(02-01): wire two-layer memory into orchestrator pipeline - builder.py: add build_messages_with_memory() — injects pgvector context as system message + sliding window history before current user turn - runner.py: accept optional messages parameter; fallback to simple build for backward compat (existing tests unaffected) - tasks.py: memory pipeline in _process_message — load short-term + retrieve long-term before LLM call; append turns to Redis + dispatch embed_and_store fire-and-forget after response - tasks.py: add embed_and_store Celery task (sync def + asyncio.run()) for async pgvector backfill — never blocks the LLM response pipeline - memory/embedder.py: lazy singleton SentenceTransformer (all-MiniLM-L6-v2) with embed_text() / embed_texts() helpers - All 202 tests pass (196 existing + 6 new memory integration tests)	2026-03-23 14:45:21 -06:00
Adolfo Delorenzo	2dc94682ff	docs(02-03): complete WhatsApp channel adapter plan - Create 02-03-SUMMARY.md documenting WhatsApp adapter implementation - Update STATE.md: advance progress to 56%, add 4 key decisions, record metrics - Update ROADMAP.md: Phase 2 plan progress updated - Mark CHAN-03, CHAN-04 requirements complete in REQUIREMENTS.md	2026-03-23 14:44:49 -06:00
Adolfo Delorenzo	6fea34db28	feat(02-03): WhatsApp adapter with business-function scoping and router registration - Register whatsapp_router in gateway main.py (GET + POST /whatsapp/webhook) - Implement is_clearly_off_topic() tier 1 keyword scoping gate - Implement build_off_topic_reply() canned redirect message builder - Full webhook handler: verify -> normalize -> tenant -> rate limit -> dedup -> scope -> media -> dispatch - Outbound delivery via send_whatsapp_message() and send_whatsapp_media() - Media download from Meta API and storage in MinIO with tenant-prefixed keys - 14 new passing scoping tests	2026-03-23 14:43:04 -06:00
Adolfo Delorenzo	28a5ee996e	feat(02-01): add two-layer memory system — Redis sliding window + pgvector long-term - ConversationEmbedding ORM model with Vector(384) column (pgvector) - memory_short_key, escalation_status_key, pending_tool_confirm_key in redis_keys.py - orchestrator/memory/short_term.py: RPUSH/LTRIM sliding window (get_recent_messages, append_message) - orchestrator/memory/long_term.py: pgvector HNSW cosine search (retrieve_relevant, store_embedding) - Migration 002: conversation_embeddings table, HNSW index, RLS with FORCE, SELECT/INSERT only - 10 unit tests (fakeredis), 6 integration tests (pgvector) — all passing - Auto-fix [Rule 3]: postgres image updated to pgvector/pgvector:pg16 (extension required)	2026-03-23 14:41:57 -06:00
Adolfo Delorenzo	370a860622	feat(02-03): add MediaAttachment model, WhatsApp normalizer, and signature verification - Add MediaType(StrEnum) and MediaAttachment(BaseModel) to shared/models/message.py - Add media: list[MediaAttachment] field to MessageContent - Add whatsapp_app_secret, whatsapp_verify_token, and MinIO settings to shared/config.py - Add normalize_whatsapp_event() to gateway/normalize.py (text, image, document support) - Create whatsapp.py adapter with verify_whatsapp_signature() and verify_hub_challenge() - 30 new passing tests (signature verification + normalizer)	2026-03-23 14:41:48 -06:00
Adolfo Delorenzo	b2e86f1046	fix(02-agent-features): revise plans based on checker feedback	2026-03-23 14:32:20 -06:00
Adolfo Delorenzo	7da5ffb92a	docs(02-agent-features): create phase plan	2026-03-23 14:23:11 -06:00
Adolfo Delorenzo	ac54d819f8	docs(02): add research and validation strategy	2026-03-23 14:16:42 -06:00
Adolfo Delorenzo	3fe334b702	docs(02-agent-features): research phase domain	2026-03-23 14:15:40 -06:00
Adolfo Delorenzo	e48fbaa3d4	docs(state): record phase 2 context session	2026-03-23 13:12:16 -06:00
Adolfo Delorenzo	1f070e2a64	docs(02): capture phase context	2026-03-23 13:12:09 -06:00
Adolfo Delorenzo	6f4445f982	docs(phase-1): complete phase execution	2026-03-23 10:39:30 -06:00
Adolfo Delorenzo	44f5b98890	docs(01-03): complete Channel Gateway + Message Router plan	2026-03-23 10:34:50 -06:00
Adolfo Delorenzo	74326dfc3d	feat(01-03): integration tests for Slack flow, rate limiting, and agent persona - tests/unit/test_ratelimit.py: 11 tests for Redis token bucket (CHAN-05) - allows requests under limit, rejects 31st request - per-tenant isolation, per-channel isolation - TTL key expiry and window reset - tests/integration/test_slack_flow.py: 15 tests for end-to-end Slack flow (CHAN-02) - normalization: bot token stripped, channel=slack, thread_id set - @mention: placeholder posted in-thread, Celery dispatched with placeholder_ts - DM flow: same pipeline triggered for channel_type=im - bot messages silently ignored (no infinite loop) - unknown workspace_id silently ignored - duplicate events (Slack retries) skipped via idempotency - tests/integration/test_agent_persona.py: 15 tests for persona in prompts (AGNT-01) - system prompt contains name, role, persona, AI transparency clause - model_preference forwarded to LLM pool - full messages array: [system, user] structure verified - tests/integration/test_ratelimit.py: 4 tests for rate limit integration - over-limit -> ephemeral rejection posted - over-limit -> Celery NOT dispatched, placeholder NOT posted - within-limit -> no rejection - ephemeral message includes actionable retry hint All 45 tests pass	2026-03-23 10:32:48 -06:00
Adolfo Delorenzo	6f30705e1a	feat(01-03): Channel Gateway (Slack adapter) and Message Router - gateway/normalize.py: normalize_slack_event -> KonstructMessage (strips bot mention) - gateway/channels/slack.py: register_slack_handlers for app_mention + DM events - rate limit check -> ephemeral rejection on exceeded - idempotency dedup (Slack retry protection) - placeholder 'Thinking...' message posted in-thread before Celery dispatch - auto-follow engaged threads with 30-minute TTL - HTTP 200 returned immediately; all LLM work dispatched to Celery - gateway/main.py: FastAPI on port 8001, /slack/events + /health - router/tenant.py: resolve_tenant workspace_id -> tenant_id (RLS-bypass query) - router/ratelimit.py: check_rate_limit Redis token bucket, RateLimitExceeded exception - router/idempotency.py: is_duplicate + mark_processed (SET NX, 24h TTL) - router/context.py: load_agent_for_tenant with RLS ContextVar setup - orchestrator/tasks.py: handle_message now extracts placeholder_ts/channel_id, calls _update_slack_placeholder via chat.update after LLM response - docker-compose.yml: gateway service on port 8001 - pyproject.toml: added redis, konstruct-router, konstruct-orchestrator deps	2026-03-23 10:27:59 -06:00
Adolfo Delorenzo	dcd89cc8fd	docs(01-04): complete portal plan — tenant/agent CRUD and Agent Designer - Create 01-04-SUMMARY.md documenting FastAPI portal API and Next.js portal - Update STATE.md: advance plan, record metrics, add decisions - Update ROADMAP.md: phase 1 plan progress (3/4 summaries) - Update REQUIREMENTS.md: mark PRTA-01, PRTA-02 complete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-23 10:22:22 -06:00
Adolfo Delorenzo	cec7180fb0	feat(01-04): Next.js 16 admin portal with Auth.js v5, tenant CRUD, and Agent Designer - Initialize Next.js 16 project in packages/portal/ with TypeScript, Tailwind 4, shadcn/ui - Auth.js v5 with Credentials provider calling FastAPI /auth/verify endpoint - proxy.ts (Next.js 16 replacement for middleware.ts) protects all routes - Login page with React Hook Form + zod validation (standard-schema resolver for zod v4 compat) - Agent Designer: prominent dedicated module with Identity, Personality, Configuration, Capabilities, Escalation, and Status sections; employee-centric language throughout - Tenant CRUD: list, create (slug auto-gen), view/edit, delete with confirmation - TanStack Query hooks for all API operations with proper cache invalidation - Route group (dashboard) provides shared Nav sidebar + QueryClientProvider - Update docker-compose.yml to add portal service on port 3000 - Deviations: middleware.ts renamed to proxy.ts in Next.js 16; zodResolver replaced with standardSchemaResolver for zod v4 + @hookform/resolvers v5 compatibility	2026-03-23 10:19:40 -06:00
Adolfo Delorenzo	0eae48699f	docs(01-02): complete LLM pool and orchestrator plan	2026-03-23 10:08:55 -06:00
Adolfo Delorenzo	8257c554d7	feat(01-02): Celery orchestrator — handle_message task, system prompt builder, LLM pool runner - Create orchestrator/main.py: Celery app with Redis broker/backend, task_acks_late=True, 10-min timeout - Create orchestrator/tasks.py: SYNC def handle_message (critical pattern: asyncio.run for async work) - Deserializes KonstructMessage, sets RLS context, loads agent from DB, calls run_agent - Retries up to 3x on deserialization failure - Create orchestrator/agents/builder.py: build_system_prompt assembles system_prompt + identity + persona + AI transparency clause - Create orchestrator/agents/runner.py: run_agent posts to llm-pool /complete via httpx, returns polite fallback on error - Add Celery[redis] dependency to orchestrator pyproject.toml - Create tests/integration/test_llm_fallback.py: 7 tests for fallback routing and 503 on total failure (LLM-01) - Create tests/integration/test_llm_providers.py: 12 tests verifying all three providers configured correctly (LLM-02) - All 19 integration tests pass	2026-03-23 10:06:44 -06:00
Adolfo Delorenzo	7b348b97e9	feat(01-04): FastAPI portal API endpoints with tenant/agent CRUD and auth - Add packages/shared/shared/api/portal.py with APIRouter at /api/portal - POST /auth/verify validates bcrypt credentials against portal_users table - POST /auth/register creates new portal users with hashed passwords - Tenant CRUD: GET/POST /tenants, GET/PUT/DELETE /tenants/{id} - Agent CRUD: full CRUD under /tenants/{tenant_id}/agents/{id} - Agent endpoints set RLS current_tenant_id context for policy compliance - Pydantic v2 schemas with slug validation (lowercase, hyphens, 2-50 chars) - Add bcrypt>=4.0.0 dependency to konstruct-shared - Integration tests: 38 tests covering all CRUD, validation, and isolation	2026-03-23 10:05:07 -06:00
Adolfo Delorenzo	ee2f88e13b	feat(01-02): LLM Backend Pool — LiteLLM Router with Ollama + Anthropic + OpenAI fallback - Create llm_pool/router.py: LiteLLM Router with fast (Ollama) and quality (Anthropic/OpenAI) model groups - Configure fallback chain: quality providers fail -> fast group - Pin LiteLLM to ==1.82.5 (avoid September 2025 OOM regression in later releases) - Create llm_pool/main.py: FastAPI service on port 8004 with /complete and /health endpoints - Add providers/__init__.py: reserved for future per-provider customization - Update docker-compose.yml: add llm-pool and celery-worker service stubs	2026-03-23 10:03:05 -06:00

63 Commits