# Portainer AI Templates (v2)
26 production-ready AI/ML Docker Compose stacks for Portainer — filling the AI gap in the official v3 template library. Aligned with an AI infrastructure positioning strategy for Portainer.
## Background
The official Portainer v3 templates contain 71 templates with zero pure AI/ML deployments. This repository provides a curated, Portainer-compatible template set covering the entire AI infrastructure stack — from edge inference to distributed training to governed ML pipelines.
See docs/AI_GAP_ANALYSIS.md for the full gap analysis.
## Homepage Alignment
These templates map directly to the AI infrastructure positioning pillars:
| Mock-Up Pillar | Templates Covering It |
| --- | --- |
| GPU-Aware Fleet Management | Triton, vLLM, NVIDIA NIM, Ray Cluster, Ollama, LocalAI |
| Model Lifecycle Governance | MLflow + MinIO (Production MLOps), Prefect, BentoML, Label Studio |
| Edge AI Deployment | ONNX Runtime (CPU/edge profile), Triton, DeepStream |
| Self-Service AI Stacks | Open WebUI, Langflow, Flowise, n8n AI, Jupyter GPU |
| LLM Fine-Tune (diagram) | Ray Cluster (distributed training) |
| RAG Pipeline (diagram) | Qdrant, ChromaDB, Weaviate + Langflow/Flowise |
| Vision Model (diagram) | DeepStream, ComfyUI, Stable Diffusion WebUI |
| Anomaly Detection (diagram) | DeepStream (video analytics), Triton (custom models) |
## Quick Start
### Option A: Use as Custom Template URL in Portainer
- In Portainer, go to Settings > App Templates
- Set the URL to:
- Click Save — all 26 AI templates appear in your App Templates list
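For reference, the served file follows Portainer's v3 app-template JSON schema. Below is a minimal sketch of a single compose-stack entry (`"type": 3`); the title, repository URL, and stackfile path are illustrative placeholders, not this repository's actual values:

```json
{
  "version": "3",
  "templates": [
    {
      "type": 3,
      "title": "Ollama",
      "description": "Local LLM engine",
      "categories": ["AI/ML"],
      "platform": "linux",
      "repository": {
        "url": "https://github.com/example/portainer-ai-templates",
        "stackfile": "stacks/ollama.yml"
      }
    }
  ]
}
```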
### Option B: Deploy Individual Stacks
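Any stack can also be deployed directly with Docker Compose, outside Portainer. A sketch, assuming the repository URL and a `stacks/` directory layout shown here (both illustrative):

```shell
# Clone the repository and bring up a single stack (paths are illustrative)
git clone https://github.com/<org>/portainer-ai-templates.git
cd portainer-ai-templates
docker compose -f stacks/ollama.yml up -d
```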
## Template Catalog
### LLM Inference and Model Serving
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 1 | Ollama | 11434 | Yes | Local LLM engine — Llama, Mistral, Qwen, Gemma, Phi |
| 2 | Open WebUI + Ollama | 3000 | Yes | ChatGPT-like UI bundled with Ollama backend |
| 3 | LocalAI | 8080 | Yes | Drop-in OpenAI API replacement |
| 4 | vLLM | 8000 | Yes | High-throughput serving with PagedAttention |
| 5 | Text Gen WebUI | 7860 | Yes | Comprehensive LLM interface (oobabooga) |
| 6 | LiteLLM Proxy | 4000 | No | Unified API gateway for 100+ LLM providers |
| 26 | NVIDIA NIM | 8000 | Yes | Enterprise TensorRT-LLM optimized inference |
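As a quick smoke test once the Ollama stack (#1) is running, its REST API on port 11434 can be exercised with curl. A sketch; the model name is an example and must be pulled before generation:

```shell
# Pull a model, then run a one-off (non-streaming) generation
curl http://localhost:11434/api/pull -d '{"name": "llama3.2"}'
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Hello", "stream": false}'
```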
### Production Inference Serving
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 19 | NVIDIA Triton | 8000 | Yes | Multi-framework inference server (TensorRT, ONNX, PyTorch, TF) |
| 20 | ONNX Runtime | 8001 | Optional | Lightweight inference with GPU and CPU/edge profiles |
| 24 | BentoML | 3000 | Yes | Model packaging and serving with metrics |
### Image and Video Generation
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 7 | ComfyUI | 8188 | Yes | Node-based Stable Diffusion workflow engine |
| 8 | Stable Diffusion WebUI | 7860 | Yes | AUTOMATIC1111 interface for image generation |
### Industrial AI and Computer Vision
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 21 | NVIDIA DeepStream | 8554 | Yes | Video analytics for inspection, anomaly detection, smart factory |
### Distributed Training
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 22 | Ray Cluster | 8265 | Yes | Head + workers for LLM fine-tuning, distributed training, Ray Serve |
### AI Agents and Workflows
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 9 | Langflow | 7860 | No | Visual multi-agent and RAG pipeline builder |
| 10 | Flowise | 3000 | No | Drag-and-drop LLM chatflow builder |
| 11 | n8n (AI-Enabled) | 5678 | No | Workflow automation with AI agent nodes |
### Vector Databases
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 12 | Qdrant | 6333 | No | High-performance vector similarity search |
| 13 | ChromaDB | 8000 | No | AI-native embedding database |
| 14 | Weaviate | 8080 | No | Vector DB with built-in vectorization modules |
### ML Operations and Governance
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 15 | MLflow | 5000 | No | Experiment tracking and model registry (SQLite) |
| 25 | MLflow + MinIO | 5000 | No | Production MLOps: PostgreSQL + S3 artifact store |
| 23 | Prefect | 4200 | No | Governed ML pipeline orchestration with audit logging |
| 16 | Label Studio | 8080 | No | Multi-type data labeling platform |
| 17 | Jupyter (GPU/PyTorch) | 8888 | Yes | GPU-accelerated notebooks |
### Speech and Audio
| # | Template | Port | GPU | Description |
| --- | --- | --- | --- | --- |
| 18 | Whisper ASR | 9000 | Yes | Speech-to-text API server |
## GPU Requirements
Templates marked **GPU: Yes** require:
- An NVIDIA GPU with a current driver installed on the host
- The NVIDIA Container Toolkit, so the Compose `deploy.resources` device reservation can expose the GPU to the container
Edge deployments (ONNX Runtime CPU profile): No GPU required — runs on ARM or x86 with constrained CPU/memory limits.
For AMD GPUs (ROCm), modify the deploy.resources section to use ROCm-compatible images and remove the NVIDIA device reservation.
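For NVIDIA hosts, the reservation these templates rely on follows the standard Compose device-request syntax. A minimal sketch; the service name and image are illustrative:

```yaml
services:
  inference:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1            # or "all" to reserve every GPU
              capabilities: [gpu]
```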
## File Structure
## Changelog
### v2 (March 2026)
- Added 8 templates to close the alignment gap with the AI infrastructure positioning:
  - NVIDIA Triton Inference Server — production multi-framework inference
  - ONNX Runtime Server — lightweight edge inference with CPU/GPU profiles
  - NVIDIA DeepStream — industrial computer vision and video analytics
  - Ray Cluster (GPU) — distributed training and fine-tuning
  - Prefect — governed ML pipeline orchestration
  - BentoML — model packaging and serving
  - MLflow + MinIO — production MLOps with S3 artifact governance
  - NVIDIA NIM — enterprise-optimized LLM inference
### v1 (March 2026)
- Initial 18 AI templates covering LLM inference, image generation, agents, vector DBs, MLOps, and speech
## License
These templates reference publicly available Docker images from their respective maintainers. Each tool has its own license — refer to the individual project documentation.
*Portainer AI Templates by Adolfo De Lorenzo — March 2026*