# Portainer AI Templates (v2)
> **26 production-ready AI/ML Docker Compose stacks for Portainer** — filling the AI gap in the official v3 template library. Aligned with an AI infrastructure positioning strategy for Portainer.

## Background

The official [Portainer v3 templates](https://raw.githubusercontent.com/portainer/templates/v3/templates.json) contain **71 templates** with **zero pure AI/ML deployments**. This repository provides a curated, Portainer-compatible template set covering the entire AI infrastructure stack — from edge inference to distributed training to governed ML pipelines.

See [docs/AI_GAP_ANALYSIS.md](docs/AI_GAP_ANALYSIS.md) for the full gap analysis.

## Homepage Alignment

These templates map directly to the AI infrastructure positioning pillars:

| Mock-Up Pillar | Templates Covering It |
|---|---|
| **GPU-Aware Fleet Management** | Triton, vLLM, NVIDIA NIM, Ray Cluster, Ollama, LocalAI |
| **Model Lifecycle Governance** | MLflow + MinIO (Production MLOps), Prefect, BentoML, Label Studio |
| **Edge AI Deployment** | ONNX Runtime (CPU/edge profile), Triton, DeepStream |
| **Self-Service AI Stacks** | Open WebUI, Langflow, Flowise, n8n AI, Jupyter GPU |
| **LLM Fine-Tune** (diagram) | Ray Cluster (distributed training) |
| **RAG Pipeline** (diagram) | Qdrant, ChromaDB, Weaviate + Langflow/Flowise |
| **Vision Model** (diagram) | DeepStream, ComfyUI, Stable Diffusion WebUI |
| **Anomaly Detection** (diagram) | DeepStream (video analytics), Triton (custom models) |

## Quick Start

### Option A: Use as Custom Template URL in Portainer

1. In Portainer, go to **Settings > App Templates**
2. Set the URL to:

   ```
   https://git.oe74.net/adelorenzo/portainer_scripts/raw/branch/master/ai-templates/portainer-ai-templates.json
   ```

3. Click **Save** — all 26 AI templates appear in your App Templates list
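
For reference, Portainer consumes a JSON file in its App Templates format. A single compose-stack entry in such a file looks roughly like the sketch below — the field values here are illustrative, not copied from this repository's `portainer-ai-templates.json`:

```json
{
  "version": "3",
  "templates": [
    {
      "type": 3,
      "title": "Ollama",
      "description": "Local LLM engine",
      "categories": ["AI"],
      "platform": "linux",
      "repository": {
        "url": "https://git.oe74.net/adelorenzo/portainer_scripts",
        "stackfile": "ai-templates/stacks/ollama/docker-compose.yml"
      }
    }
  ]
}
```

Here `type: 3` denotes a Docker Compose stack in Portainer's template schema, and `repository.stackfile` points at the compose file relative to the repository root.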

### Option B: Deploy Individual Stacks

```bash
cd stacks/ollama
docker compose up -d
```
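
For GPU-backed stacks, the compose files follow the standard Compose device-reservation pattern. A minimal sketch of what such a stack file typically contains — the image tag and volume name here are illustrative, check the actual file in `stacks/ollama/` for the shipped definition:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"                 # API port listed in the catalog below
    volumes:
      - ollama-data:/root/.ollama     # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia          # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]

volumes:
  ollama-data:
```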

## Template Catalog

### LLM Inference and Model Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 1 | **Ollama** | 11434 | Yes | Local LLM engine — Llama, Mistral, Qwen, Gemma, Phi |
| 2 | **Open WebUI + Ollama** | 3000 | Yes | ChatGPT-like UI bundled with Ollama backend |
| 3 | **LocalAI** | 8080 | Yes | Drop-in OpenAI API replacement |
| 4 | **vLLM** | 8000 | Yes | High-throughput serving with PagedAttention |
| 5 | **Text Gen WebUI** | 7860 | Yes | Comprehensive LLM interface (oobabooga) |
| 6 | **LiteLLM Proxy** | 4000 | No | Unified API gateway for 100+ LLM providers |
| 26 | **NVIDIA NIM** | 8000 | Yes | Enterprise TensorRT-LLM optimized inference |

### Production Inference Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 19 | **NVIDIA Triton** | 8000 | Yes | Multi-framework inference server (TensorRT, ONNX, PyTorch, TF) |
| 20 | **ONNX Runtime** | 8001 | Optional | Lightweight inference with GPU and CPU/edge profiles |
| 24 | **BentoML** | 3000 | Yes | Model packaging and serving with metrics |

### Image and Video Generation

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 7 | **ComfyUI** | 8188 | Yes | Node-based Stable Diffusion workflow engine |
| 8 | **Stable Diffusion WebUI** | 7860 | Yes | AUTOMATIC1111 interface for image generation |

### Industrial AI and Computer Vision

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 21 | **NVIDIA DeepStream** | 8554 | Yes | Video analytics for inspection, anomaly detection, smart factory |

### Distributed Training

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 22 | **Ray Cluster** | 8265 | Yes | Head + workers for LLM fine-tuning, distributed training, Ray Serve |

### AI Agents and Workflows

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 9 | **Langflow** | 7860 | No | Visual multi-agent and RAG pipeline builder |
| 10 | **Flowise** | 3000 | No | Drag-and-drop LLM chatflow builder |
| 11 | **n8n (AI-Enabled)** | 5678 | No | Workflow automation with AI agent nodes |

### Vector Databases

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 12 | **Qdrant** | 6333 | No | High-performance vector similarity search |
| 13 | **ChromaDB** | 8000 | No | AI-native embedding database |
| 14 | **Weaviate** | 8080 | No | Vector DB with built-in vectorization modules |

### ML Operations and Governance

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 15 | **MLflow** | 5000 | No | Experiment tracking and model registry (SQLite) |
| 25 | **MLflow + MinIO** | 5000 | No | Production MLOps: PostgreSQL + S3 artifact store |
| 23 | **Prefect** | 4200 | No | Governed ML pipeline orchestration with audit logging |
| 16 | **Label Studio** | 8080 | No | Multi-type data labeling platform |
| 17 | **Jupyter (GPU/PyTorch)** | 8888 | Yes | GPU-accelerated notebooks |

### Speech and Audio

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 18 | **Whisper ASR** | 9000 | Yes | Speech-to-text API server |

## GPU Requirements

Templates marked **GPU: Yes** require:

- NVIDIA GPU with CUDA support
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed
- Docker configured with `nvidia` runtime
**Edge deployments (ONNX Runtime CPU profile):** No GPU required — runs on ARM or x86 with constrained CPU/memory limits.
For AMD GPUs (ROCm), modify the `deploy.resources` section to use ROCm-compatible images and remove the NVIDIA device reservation.
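
Concretely, the NVIDIA-specific piece is the Compose device reservation; the usual ROCm approach is to drop that block and expose the AMD kernel devices directly instead. A hedged sketch under those assumptions — the service name and image tag below are placeholders, not values from these stack files:

```yaml
services:
  inference:                        # hypothetical service name
    image: example/rocm-image:tag   # swap in a ROCm-compatible image
    devices:                        # replaces the NVIDIA device reservation
      - /dev/kfd                    # ROCm compute interface
      - /dev/dri                    # GPU render nodes
    group_add:
      - video                       # grant the container GPU device access
```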

## File Structure

```
ai-templates/
├── portainer-ai-templates.json    # Portainer v3 template definition (26 templates)
├── README.md
├── docs/
│   └── AI_GAP_ANALYSIS.md         # Analysis of official templates gap
└── stacks/
    ├── ollama/                    # LLM Inference
    ├── open-webui/
    ├── localai/
    ├── vllm/
    ├── text-generation-webui/
    ├── litellm/
    ├── nvidia-nim/                # v2: Enterprise inference
    ├── triton/                    # v2: Production inference serving
    ├── onnx-runtime/              # v2: Edge-friendly inference
    ├── bentoml/                   # v2: Model packaging + serving
    ├── deepstream/                # v2: Industrial computer vision
    ├── ray-cluster/               # v2: Distributed training
    ├── prefect/                   # v2: Governed ML pipelines
    ├── minio-mlops/               # v2: Production MLOps stack
    ├── comfyui/                   # Image generation
    ├── stable-diffusion-webui/
    ├── langflow/                  # AI agents
    ├── flowise/
    ├── n8n-ai/
    ├── qdrant/                    # Vector databases
    ├── chromadb/
    ├── weaviate/
    ├── mlflow/                    # ML operations
    ├── label-studio/
    ├── jupyter-gpu/
    └── whisper/                   # Speech
```

## Changelog

### v2 (March 2026)

- Added 8 templates to close the alignment gap with the AI infrastructure positioning:
  - **NVIDIA Triton Inference Server** — production multi-framework inference
  - **ONNX Runtime Server** — lightweight edge inference with CPU/GPU profiles
  - **NVIDIA DeepStream** — industrial computer vision and video analytics
  - **Ray Cluster (GPU)** — distributed training and fine-tuning
  - **Prefect** — governed ML pipeline orchestration
  - **BentoML** — model packaging and serving
  - **MLflow + MinIO** — production MLOps with S3 artifact governance
  - **NVIDIA NIM** — enterprise-optimized LLM inference

### v1 (March 2026)

- Initial 18 AI templates covering LLM inference, image generation, agents, vector DBs, MLOps, and speech

## License

These templates reference publicly available Docker images from their respective maintainers. Each tool has its own license — refer to the individual project documentation.

---

*Portainer AI Templates by Adolfo De Lorenzo — March 2026*