# Portainer AI Templates (v2)
> **26 production-ready AI/ML Docker Compose stacks for Portainer** — filling the AI gap in the official v3 template library. Aligned with an AI infrastructure positioning strategy for Portainer.

## Background

The official [Portainer v3 templates](https://raw.githubusercontent.com/portainer/templates/v3/templates.json) contain **71 templates** with **zero pure AI/ML deployments**. This repository provides a curated, Portainer-compatible template set covering the entire AI infrastructure stack — from edge inference to distributed training to governed ML pipelines.

See [docs/AI_GAP_ANALYSIS.md](docs/AI_GAP_ANALYSIS.md) for the full gap analysis.

## Homepage Alignment

These templates map directly to the AI infrastructure positioning pillars:

| Mock-Up Pillar | Templates Covering It |
|---|---|
| **GPU-Aware Fleet Management** | Triton, vLLM, NVIDIA NIM, Ray Cluster, Ollama, LocalAI |
| **Model Lifecycle Governance** | MLflow + MinIO (Production MLOps), Prefect, BentoML, Label Studio |
| **Edge AI Deployment** | ONNX Runtime (CPU/edge profile), Triton, DeepStream |
| **Self-Service AI Stacks** | Open WebUI, Langflow, Flowise, n8n AI, Jupyter GPU |
| **LLM Fine-Tune** (diagram) | Ray Cluster (distributed training) |
| **RAG Pipeline** (diagram) | Qdrant, ChromaDB, Weaviate + Langflow/Flowise |
| **Vision Model** (diagram) | DeepStream, ComfyUI, Stable Diffusion WebUI |
| **Anomaly Detection** (diagram) | DeepStream (video analytics), Triton (custom models) |

## Quick Start

### Option A: Use as Custom Template URL in Portainer

1. In Portainer, go to **Settings > App Templates**
2. Set the URL to:

   ```
   https://git.oe74.net/adelorenzo/portainer_scripts/raw/branch/master/ai-templates/portainer-ai-templates.json
   ```

3. Click **Save** — all 26 AI templates appear in your App Templates list
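
For reference, Portainer consumes a JSON file in its App Templates format. A single compose-stack entry in such a file looks roughly like the sketch below — the field values here are illustrative, not copied from this repository's `portainer-ai-templates.json`:

```json
{
  "version": "3",
  "templates": [
    {
      "type": 3,
      "title": "Ollama",
      "description": "Local LLM engine",
      "categories": ["AI"],
      "platform": "linux",
      "repository": {
        "url": "https://git.oe74.net/adelorenzo/portainer_scripts",
        "stackfile": "ai-templates/stacks/ollama/docker-compose.yml"
      }
    }
  ]
}
```

Here `type: 3` denotes a Docker Compose stack in Portainer's template schema, and `repository.stackfile` points at the compose file relative to the repository root.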

### Option B: Deploy Individual Stacks

```bash
cd stacks/ollama
docker compose up -d
```
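
For GPU-backed stacks, the compose files follow the standard Compose device-reservation pattern. A minimal sketch of what such a stack file typically contains — the image tag and volume name here are illustrative, check the actual file in `stacks/ollama/` for the shipped definition:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"                 # API port listed in the catalog below
    volumes:
      - ollama-data:/root/.ollama     # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia          # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]

volumes:
  ollama-data:
```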

## Template Catalog

### LLM Inference and Model Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 1 | **Ollama** | 11434 | Yes | Local LLM engine — Llama, Mistral, Qwen, Gemma, Phi |
| 2 | **Open WebUI + Ollama** | 3000 | Yes | ChatGPT-like UI bundled with Ollama backend |
| 3 | **LocalAI** | 8080 | Yes | Drop-in OpenAI API replacement |
| 4 | **vLLM** | 8000 | Yes | High-throughput serving with PagedAttention |
| 5 | **Text Gen WebUI** | 7860 | Yes | Comprehensive LLM interface (oobabooga) |
| 6 | **LiteLLM Proxy** | 4000 | No | Unified API gateway for 100+ LLM providers |
| 26 | **NVIDIA NIM** | 8000 | Yes | Enterprise TensorRT-LLM optimized inference |

### Production Inference Serving

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 19 | **NVIDIA Triton** | 8000 | Yes | Multi-framework inference server (TensorRT, ONNX, PyTorch, TF) |
| 20 | **ONNX Runtime** | 8001 | Optional | Lightweight inference with GPU and CPU/edge profiles |
| 24 | **BentoML** | 3000 | Yes | Model packaging and serving with metrics |

### Image and Video Generation

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 7 | **ComfyUI** | 8188 | Yes | Node-based Stable Diffusion workflow engine |
| 8 | **Stable Diffusion WebUI** | 7860 | Yes | AUTOMATIC1111 interface for image generation |

### Industrial AI and Computer Vision

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 21 | **NVIDIA DeepStream** | 8554 | Yes | Video analytics for inspection, anomaly detection, smart factory |

### Distributed Training

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 22 | **Ray Cluster** | 8265 | Yes | Head + workers for LLM fine-tuning, distributed training, Ray Serve |

### AI Agents and Workflows

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 9 | **Langflow** | 7860 | No | Visual multi-agent and RAG pipeline builder |
| 10 | **Flowise** | 3000 | No | Drag-and-drop LLM chatflow builder |
| 11 | **n8n (AI-Enabled)** | 5678 | No | Workflow automation with AI agent nodes |

### Vector Databases

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 12 | **Qdrant** | 6333 | No | High-performance vector similarity search |
| 13 | **ChromaDB** | 8000 | No | AI-native embedding database |
| 14 | **Weaviate** | 8080 | No | Vector DB with built-in vectorization modules |

### ML Operations and Governance

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 15 | **MLflow** | 5000 | No | Experiment tracking and model registry (SQLite) |
| 25 | **MLflow + MinIO** | 5000 | No | Production MLOps: PostgreSQL + S3 artifact store |
| 23 | **Prefect** | 4200 | No | Governed ML pipeline orchestration with audit logging |
| 16 | **Label Studio** | 8080 | No | Multi-type data labeling platform |
| 17 | **Jupyter (GPU/PyTorch)** | 8888 | Yes | GPU-accelerated notebooks |

### Speech and Audio

| # | Template | Port | GPU | Description |
|---|---|---|---|---|
| 18 | **Whisper ASR** | 9000 | Yes | Speech-to-text API server |

## GPU Requirements

Templates marked **GPU: Yes** require:

- NVIDIA GPU with CUDA support
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed
- Docker configured with `nvidia` runtime
**Edge deployments (ONNX Runtime CPU profile):** No GPU required — runs on ARM or x86 with constrained CPU/memory limits.
For AMD GPUs (ROCm), modify the `deploy.resources` section to use ROCm-compatible images and remove the NVIDIA device reservation.
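
Concretely, the NVIDIA-specific piece is the Compose device reservation; the usual ROCm approach is to drop that block and expose the AMD kernel devices directly instead. A hedged sketch under those assumptions — the service name and image tag below are placeholders, not values from these stack files:

```yaml
services:
  inference:                        # hypothetical service name
    image: example/rocm-image:tag   # swap in a ROCm-compatible image
    devices:                        # replaces the NVIDIA device reservation
      - /dev/kfd                    # ROCm compute interface
      - /dev/dri                    # GPU render nodes
    group_add:
      - video                       # grant the container GPU device access
```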

## File Structure

```
ai-templates/
├── portainer-ai-templates.json    # Portainer v3 template definition (26 templates)
├── README.md
├── docs/
│   └── AI_GAP_ANALYSIS.md         # Analysis of official templates gap
└── stacks/
    ├── ollama/                    # LLM Inference
    ├── open-webui/
    ├── localai/
    ├── vllm/
    ├── text-generation-webui/
    ├── litellm/
    ├── nvidia-nim/                # v2: Enterprise inference
    ├── triton/                    # v2: Production inference serving
    ├── onnx-runtime/              # v2: Edge-friendly inference
    ├── bentoml/                   # v2: Model packaging + serving
    ├── deepstream/                # v2: Industrial computer vision
    ├── ray-cluster/               # v2: Distributed training
    ├── prefect/                   # v2: Governed ML pipelines
    ├── minio-mlops/               # v2: Production MLOps stack
    ├── comfyui/                   # Image generation
    ├── stable-diffusion-webui/
    ├── langflow/                  # AI agents
    ├── flowise/
    ├── n8n-ai/
    ├── qdrant/                    # Vector databases
    ├── chromadb/
    ├── weaviate/
    ├── mlflow/                    # ML operations
    ├── label-studio/
    ├── jupyter-gpu/
    └── whisper/                   # Speech
```

## Changelog

### v2 (March 2026)

- Added 8 templates to close the alignment gap with the AI infrastructure positioning:
  - **NVIDIA Triton Inference Server** — production multi-framework inference
  - **ONNX Runtime Server** — lightweight edge inference with CPU/GPU profiles
  - **NVIDIA DeepStream** — industrial computer vision and video analytics
  - **Ray Cluster (GPU)** — distributed training and fine-tuning
  - **Prefect** — governed ML pipeline orchestration
  - **BentoML** — model packaging and serving
  - **MLflow + MinIO** — production MLOps with S3 artifact governance
  - **NVIDIA NIM** — enterprise-optimized LLM inference

### v1 (March 2026)

- Initial 18 AI templates covering LLM inference, image generation, agents, vector DBs, MLOps, and speech

## License

These templates reference publicly available Docker images from their respective maintainers. Each tool has its own license — refer to the individual project documentation.

---

*Portainer AI Templates by Adolfo De Lorenzo — March 2026*