
# Portainer AI Templates (v2)

**26 production-ready AI/ML Docker Compose stacks for Portainer** — filling the AI gap in the official v3 template library and aligned with an AI infrastructure positioning strategy for Portainer.

## Background

The official Portainer v3 templates contain 71 templates with zero pure AI/ML deployments. This repository provides a curated, Portainer-compatible template set covering the entire AI infrastructure stack — from edge inference to distributed training to governed ML pipelines.

See `docs/AI_GAP_ANALYSIS.md` for the full gap analysis.

## Homepage Alignment

These templates map directly to the AI infrastructure positioning pillars:

| Mock-Up Pillar | Templates Covering It |
|----------------|-----------------------|
| GPU-Aware Fleet Management | Triton, vLLM, NVIDIA NIM, Ray Cluster, Ollama, LocalAI |
| Model Lifecycle Governance | MLflow + MinIO (Production MLOps), Prefect, BentoML, Label Studio |
| Edge AI Deployment | ONNX Runtime (CPU/edge profile), Triton, DeepStream |
| Self-Service AI Stacks | Open WebUI, Langflow, Flowise, n8n AI, Jupyter GPU |
| LLM Fine-Tune (diagram) | Ray Cluster (distributed training) |
| RAG Pipeline (diagram) | Qdrant, ChromaDB, Weaviate + Langflow/Flowise |
| Vision Model (diagram) | DeepStream, ComfyUI, Stable Diffusion WebUI |
| Anomaly Detection (diagram) | DeepStream (video analytics), Triton (custom models) |

## Quick Start

### Option A: Use as Custom Template URL in Portainer

1. In Portainer, go to **Settings > App Templates**.
2. Set the URL to:

   ```
   https://git.oe74.net/adelorenzo/portainer_scripts/raw/branch/master/ai-templates/portainer-ai-templates.json
   ```

3. Click **Save** — all 26 AI templates appear in your App Templates list.
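For orientation, the JSON at that URL follows Portainer's app-template format. A rough sketch of one Compose-stack entry is shown below — the field names come from Portainer's template schema, but the values here (title, description, stackfile path) are illustrative, not copied from the actual file:

```json
{
  "version": "3",
  "templates": [
    {
      "type": 3,
      "title": "Ollama",
      "description": "Local LLM engine",
      "categories": ["AI"],
      "platform": "linux",
      "repository": {
        "url": "https://git.oe74.net/adelorenzo/portainer_scripts",
        "stackfile": "ai-templates/stacks/ollama/docker-compose.yml"
      }
    }
  ]
}
```

`type: 3` marks a Compose stack entry; the `repository` block tells Portainer where to fetch the stack file from.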

### Option B: Deploy Individual Stacks

```shell
cd stacks/ollama
docker compose up -d
```

## Template Catalog

### LLM Inference and Model Serving

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 1 | Ollama | 11434 | Yes | Local LLM engine — Llama, Mistral, Qwen, Gemma, Phi |
| 2 | Open WebUI + Ollama | 3000 | Yes | ChatGPT-like UI bundled with Ollama backend |
| 3 | LocalAI | 8080 | Yes | Drop-in OpenAI API replacement |
| 4 | vLLM | 8000 | Yes | High-throughput serving with PagedAttention |
| 5 | Text Gen WebUI | 7860 | Yes | Comprehensive LLM interface (oobabooga) |
| 6 | LiteLLM Proxy | 4000 | No | Unified API gateway for 100+ LLM providers |
| 26 | NVIDIA NIM | 8000 | Yes | Enterprise TensorRT-LLM optimized inference |

### Production Inference Serving

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 19 | NVIDIA Triton | 8000 | Yes | Multi-framework inference server (TensorRT, ONNX, PyTorch, TF) |
| 20 | ONNX Runtime | 8001 | Optional | Lightweight inference with GPU and CPU/edge profiles |
| 24 | BentoML | 3000 | Yes | Model packaging and serving with metrics |

### Image and Video Generation

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 7 | ComfyUI | 8188 | Yes | Node-based Stable Diffusion workflow engine |
| 8 | Stable Diffusion WebUI | 7860 | Yes | AUTOMATIC1111 interface for image generation |

### Industrial AI and Computer Vision

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 21 | NVIDIA DeepStream | 8554 | Yes | Video analytics for inspection, anomaly detection, smart factory |

### Distributed Training

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 22 | Ray Cluster | 8265 | Yes | Head + workers for LLM fine-tuning, distributed training, Ray Serve |

### AI Agents and Workflows

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 9 | Langflow | 7860 | No | Visual multi-agent and RAG pipeline builder |
| 10 | Flowise | 3000 | No | Drag-and-drop LLM chatflow builder |
| 11 | n8n (AI-Enabled) | 5678 | No | Workflow automation with AI agent nodes |

### Vector Databases

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 12 | Qdrant | 6333 | No | High-performance vector similarity search |
| 13 | ChromaDB | 8000 | No | AI-native embedding database |
| 14 | Weaviate | 8080 | No | Vector DB with built-in vectorization modules |
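All three stores answer the same core query: rank stored embeddings by similarity to a query vector. A minimal sketch of that operation — cosine similarity over an in-memory store, in plain Python with made-up three-dimensional vectors (real embeddings have hundreds of dimensions) — illustrates what a nearest-neighbor search does under the hood:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the k stored (id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy "index": document IDs mapped to embedding vectors.
store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], store))  # doc-a ranks first, then doc-b
```

Production engines like Qdrant and Weaviate replace the linear scan with approximate nearest-neighbor indexes (e.g. HNSW) so queries stay fast at millions of vectors.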

### ML Operations and Governance

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 15 | MLflow | 5000 | No | Experiment tracking and model registry (SQLite) |
| 25 | MLflow + MinIO | 5000 | No | Production MLOps: PostgreSQL + S3 artifact store |
| 23 | Prefect | 4200 | No | Governed ML pipeline orchestration with audit logging |
| 16 | Label Studio | 8080 | No | Multi-type data labeling platform |
| 17 | Jupyter (GPU/PyTorch) | 8888 | Yes | GPU-accelerated notebooks |

### Speech and Audio

| # | Template | Port | GPU | Description |
|---|----------|------|-----|-------------|
| 18 | Whisper ASR | 9000 | Yes | Speech-to-text API server |

## GPU Requirements

Templates marked **GPU: Yes** require an NVIDIA GPU on the host, a recent NVIDIA driver, and the NVIDIA Container Toolkit installed so Docker can expose the GPU to containers.
Edge deployments (ONNX Runtime CPU profile): No GPU required — runs on ARM or x86 with constrained CPU/memory limits.

For AMD GPUs (ROCm), modify the deploy.resources section to use ROCm-compatible images and remove the NVIDIA device reservation.
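For reference, the NVIDIA device reservation these stacks typically declare is the standard Compose `deploy.resources` syntax. The sketch below uses an illustrative service and image name — this is the shape of the section you would edit for ROCm, not a verbatim copy of any stack file in this repository:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1              # or "all" to reserve every GPU
              capabilities: [gpu]
```

For AMD hardware, this reservation block is what gets removed, with ROCm device access (e.g. `/dev/kfd`, `/dev/dri`) passed through instead per the image's documentation.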

## File Structure

```
ai-templates/
├── portainer-ai-templates.json       # Portainer v3 template definition (26 templates)
├── README.md
├── docs/
│   └── AI_GAP_ANALYSIS.md            # Analysis of official templates gap
└── stacks/
    ├── ollama/                       # LLM Inference
    ├── open-webui/
    ├── localai/
    ├── vllm/
    ├── text-generation-webui/
    ├── litellm/
    ├── nvidia-nim/                   # v2: Enterprise inference
    ├── triton/                       # v2: Production inference serving
    ├── onnx-runtime/                 # v2: Edge-friendly inference
    ├── bentoml/                      # v2: Model packaging + serving
    ├── deepstream/                   # v2: Industrial computer vision
    ├── ray-cluster/                  # v2: Distributed training
    ├── prefect/                      # v2: Governed ML pipelines
    ├── minio-mlops/                  # v2: Production MLOps stack
    ├── comfyui/                      # Image generation
    ├── stable-diffusion-webui/
    ├── langflow/                     # AI agents
    ├── flowise/
    ├── n8n-ai/
    ├── qdrant/                       # Vector databases
    ├── chromadb/
    ├── weaviate/
    ├── mlflow/                       # ML operations
    ├── label-studio/
    ├── jupyter-gpu/
    └── whisper/                      # Speech
```

## Changelog

### v2 (March 2026)

- Added 8 templates to close the alignment gap with AI infrastructure positioning:
  - NVIDIA Triton Inference Server — production multi-framework inference
  - ONNX Runtime Server — lightweight edge inference with CPU/GPU profiles
  - NVIDIA DeepStream — industrial computer vision and video analytics
  - Ray Cluster (GPU) — distributed training and fine-tuning
  - Prefect — governed ML pipeline orchestration
  - BentoML — model packaging and serving
  - MLflow + MinIO — production MLOps with S3 artifact governance
  - NVIDIA NIM — enterprise-optimized LLM inference

### v1 (March 2026)

- Initial 18 AI templates covering LLM inference, image generation, agents, vector DBs, MLOps, and speech

## License

These templates reference publicly available Docker images from their respective maintainers. Each tool has its own license — refer to the individual project documentation.


*Portainer AI Templates by Adolfo De Lorenzo — March 2026*