# Portainer AI Templates (v2)

> **26 production-ready AI/ML Docker Compose stacks for Portainer** — filling the AI gap in the official v3 template library. Aligned with an AI infrastructure positioning strategy for Portainer.

## Background

The official [Portainer v3 templates](https://raw.githubusercontent.com/portainer/templates/v3/templates.json) contain **71 templates** with **zero pure AI/ML deployments**. This repository provides a curated, Portainer-compatible template set covering the entire AI infrastructure stack — from edge inference to distributed training to governed ML pipelines.

See [docs/AI_GAP_ANALYSIS.md](docs/AI_GAP_ANALYSIS.md) for the full gap analysis.

## Homepage Alignment

These templates map directly to the AI infrastructure positioning pillars:

| Mock-Up Pillar | Templates Covering It |
|---|---|
| **GPU-Aware Fleet Management** | Triton, vLLM, NVIDIA NIM, Ray Cluster, Ollama, LocalAI |
| **Model Lifecycle Governance** | MLflow + MinIO (Production MLOps), Prefect, BentoML, Label Studio |
| **Edge AI Deployment** | ONNX Runtime (CPU/edge profile), Triton, DeepStream |
| **Self-Service AI Stacks** | Open WebUI, Langflow, Flowise, n8n AI, Jupyter GPU |
| **LLM Fine-Tune** (diagram) | Ray Cluster (distributed training) |
| **RAG Pipeline** (diagram) | Qdrant, ChromaDB, Weaviate + Langflow/Flowise |
| **Vision Model** (diagram) | DeepStream, ComfyUI, Stable Diffusion WebUI |
| **Anomaly Detection** (diagram) | DeepStream (video analytics), Triton (custom models) |

## Quick Start

### Option A: Use as a Custom Template URL in Portainer

1. In Portainer, go to **Settings > App Templates**
2. Set the URL to:

   ```
   https://git.oe74.net/adelorenzo/portainer_scripts/raw/branch/master/ai-templates/portainer-ai-templates.json
   ```
3. Click **Save** — all 26 AI templates appear in your App Templates list

### Option B: Deploy Individual Stacks

```bash
cd stacks/ollama
docker compose up -d
```

## Template Catalog

### LLM Inference and Model Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 1 | **Ollama** | 11434 | Yes | Local LLM engine — Llama, Mistral, Qwen, Gemma, Phi |
| 2 | **Open WebUI + Ollama** | 3000 | Yes | ChatGPT-like UI bundled with Ollama backend |
| 3 | **LocalAI** | 8080 | Yes | Drop-in OpenAI API replacement |
| 4 | **vLLM** | 8000 | Yes | High-throughput serving with PagedAttention |
| 5 | **Text Gen WebUI** | 7860 | Yes | Comprehensive LLM interface (oobabooga) |
| 6 | **LiteLLM Proxy** | 4000 | No | Unified API gateway for 100+ LLM providers |
| 26 | **NVIDIA NIM** | 8000 | Yes | Enterprise TensorRT-LLM optimized inference |

### Production Inference Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 19 | **NVIDIA Triton** | 8000 | Yes | Multi-framework inference server (TensorRT, ONNX, PyTorch, TF) |
| 20 | **ONNX Runtime** | 8001 | Optional | Lightweight inference with GPU and CPU/edge profiles |
| 24 | **BentoML** | 3000 | Yes | Model packaging and serving with metrics |

### Image and Video Generation

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 7 | **ComfyUI** | 8188 | Yes | Node-based Stable Diffusion workflow engine |
| 8 | **Stable Diffusion WebUI** | 7860 | Yes | AUTOMATIC1111 interface for image generation |

### Industrial AI and Computer Vision

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 21 | **NVIDIA DeepStream** | 8554 | Yes | Video analytics for inspection, anomaly detection, smart factory |

### Distributed Training

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 22 | **Ray Cluster** | 8265 | Yes | Head + workers for LLM fine-tuning, distributed training, Ray Serve |

### AI Agents and Workflows

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 9 | **Langflow** | 7860 | No | Visual multi-agent and RAG pipeline builder |
| 10 | **Flowise** | 3000 | No | Drag-and-drop LLM chatflow builder |
| 11 | **n8n (AI-Enabled)** | 5678 | No | Workflow automation with AI agent nodes |

### Vector Databases

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 12 | **Qdrant** | 6333 | No | High-performance vector similarity search |
| 13 | **ChromaDB** | 8000 | No | AI-native embedding database |
| 14 | **Weaviate** | 8080 | No | Vector DB with built-in vectorization modules |

### ML Operations and Governance

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 15 | **MLflow** | 5000 | No | Experiment tracking and model registry (SQLite) |
| 25 | **MLflow + MinIO** | 5000 | No | Production MLOps: PostgreSQL + S3 artifact store |
| 23 | **Prefect** | 4200 | No | Governed ML pipeline orchestration with audit logging |
| 16 | **Label Studio** | 8080 | No | Multi-type data labeling platform |
| 17 | **Jupyter (GPU/PyTorch)** | 8888 | Yes | GPU-accelerated notebooks |

### Speech and Audio

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 18 | **Whisper ASR** | 9000 | Yes | Speech-to-text API server |

## GPU Requirements

Templates marked **GPU: Yes** require:

- An NVIDIA GPU with CUDA support
- The [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed
- Docker configured with the `nvidia` runtime

**Edge deployments (ONNX Runtime CPU profile):** No GPU required — runs on ARM or x86 with constrained CPU/memory limits.

For AMD GPUs (ROCm), modify the `deploy.resources` section to use ROCm-compatible images and remove the NVIDIA device reservation.
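As a point of reference, GPU-enabled compose stacks declare the GPU through a `deploy.resources` device reservation. The sketch below shows the standard Docker Compose NVIDIA reservation pattern using Ollama as an example — the image tag, volume name, and `count` are illustrative, not the exact contents of `stacks/ollama/`:

```yaml
# Minimal sketch of a GPU-enabled service (assumed values, not the
# verbatim stack definition). Requires the NVIDIA Container Toolkit
# on the host; the reservation block is what the ROCm note above
# tells you to replace for AMD GPUs.
services:
  ollama:
    image: ollama/ollama:latest      # illustrative tag
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama    # illustrative volume name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1               # or "all" for every GPU
              capabilities: [gpu]

volumes:
  ollama_data:
```

If a stack fails to see the GPU, a quick host-side smoke test is running `nvidia-smi` inside any CUDA base image with `docker run --rm --gpus all <cuda-image> nvidia-smi` before debugging the compose file itself.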
## File Structure

```
ai-templates/
├── portainer-ai-templates.json    # Portainer v3 template definition (26 templates)
├── README.md
├── docs/
│   └── AI_GAP_ANALYSIS.md         # Analysis of official templates gap
└── stacks/
    ├── ollama/                    # LLM inference
    ├── open-webui/
    ├── localai/
    ├── vllm/
    ├── text-generation-webui/
    ├── litellm/
    ├── nvidia-nim/                # v2: Enterprise inference
    ├── triton/                    # v2: Production inference serving
    ├── onnx-runtime/              # v2: Edge-friendly inference
    ├── bentoml/                   # v2: Model packaging + serving
    ├── deepstream/                # v2: Industrial computer vision
    ├── ray-cluster/               # v2: Distributed training
    ├── prefect/                   # v2: Governed ML pipelines
    ├── minio-mlops/               # v2: Production MLOps stack
    ├── comfyui/                   # Image generation
    ├── stable-diffusion-webui/
    ├── langflow/                  # AI agents
    ├── flowise/
    ├── n8n-ai/
    ├── qdrant/                    # Vector databases
    ├── chromadb/
    ├── weaviate/
    ├── mlflow/                    # ML operations
    ├── label-studio/
    ├── jupyter-gpu/
    └── whisper/                   # Speech
```

## Changelog

### v2 (March 2026)

- Added 8 templates to close the alignment gap with the AI infrastructure positioning:
  - **NVIDIA Triton Inference Server** — production multi-framework inference
  - **ONNX Runtime Server** — lightweight edge inference with CPU/GPU profiles
  - **NVIDIA DeepStream** — industrial computer vision and video analytics
  - **Ray Cluster (GPU)** — distributed training and fine-tuning
  - **Prefect** — governed ML pipeline orchestration
  - **BentoML** — model packaging and serving
  - **MLflow + MinIO** — production MLOps with S3 artifact governance
  - **NVIDIA NIM** — enterprise-optimized LLM inference

### v1 (March 2026)

- Initial 18 AI templates covering LLM inference, image generation, agents, vector DBs, MLOps, and speech

## License

These templates reference publicly available Docker images from their respective maintainers. Each tool has its own license — refer to the individual project documentation.

---

*Portainer AI Templates by Adolfo De Lorenzo — March 2026*