Microsoft Marketplace | cloud solutions, AI apps, and agents



Ollama with Open WebUI - CIS Level 1 hardened local LLM runtime on Ubuntu 24.04 with SBOM.

What is Ollama with Open WebUI

Ollama is an open-source (MIT-licensed) runtime for running large language models locally: it pulls quantized model weights from a public model library and exposes a REST API for inference. Open WebUI is a self-hosted web chat interface that connects to the local Ollama API. Your users can chat with the models in a browser, manage conversations, upload documents for retrieval, and configure prompt templates. No external AI service is required, and all data stays on your VM.
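
As an illustration of the REST API mentioned above, here is a minimal sketch of a non-streaming call to Ollama's /api/generate endpoint. It assumes the script runs on the VM itself, or on a workstation with an SSH tunnel forwarding local port 11434 to the VM (for example ssh -i key.pem -L 11434:127.0.0.1:11434 azureuser@<PUBLIC_IP>). The helper names and the model tag are illustrative, not part of this image.

```python
import json
import urllib.request

# Assumption: Ollama is reachable on localhost, either because this runs on
# the VM or because an SSH tunnel forwards port 11434 from your workstation.
OLLAMA_URL = "http://127.0.0.1:11434"


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for a single JSON object instead of a chunk stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the 'response' field."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Once a model has been pulled, calling generate("llama3.2:1b", "Say hello in one sentence.") should return the completion as a string. CPU inference can be slow, so the timeout is deliberately generous.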

Licensing note

Ollama is MIT-licensed. Open WebUI uses the source-available Open WebUI License (not OSI-approved). Self-hosted use is unrestricted; deployments above 50 end users in any 30-day window that also modify Open WebUI branding require a commercial Enterprise License from Open WebUI Inc. This image preserves the original branding.

Why self-host private LLMs

Running models on your own VM keeps every prompt and response inside your tenant; no third-party SaaS sees your queries or data. This suits workloads with data residency or compliance requirements (GDPR, HIPAA, ISO 27001), regulated industries, internal copilots over private documentation, and air-gapped scenarios. Models (Llama 3.2, Mistral, Gemma, Phi, Qwen, DeepSeek-R1 and 100+ more) are pulled on demand from the public catalogue.

What this VM image adds

Security hardening:

Authentication enabled by default - WEBUI_AUTH=True; first browser visit forces admin signup, subsequent users need a logged-in account
Suggested admin password generated per instance - written to /root/ollama-credentials.txt at first boot, derived from the instance vmId so two deployments never share the same value
nginx reverse proxy with TLS - Open WebUI proxied on port 443 with a self-signed certificate (4096-bit RSA, 10-year validity); replace it with a CA-issued certificate via certbot or by pasting in your own
Ollama API localhost-only - port 11434 bound to 127.0.0.1, never exposed externally; reach it only over an SSH tunnel for CLI / SDK access
Open WebUI internal-only - port 8080 reachable only from nginx on the same host; UFW explicitly denies it externally
Dedicated system users - ollama and open-webui run as non-root, no shell, locked home directories, UMask=0027
UFW firewall - only ports 22, 80, 443 open; 8080 and 11434 explicitly denied externally
fail2ban - SSH brute-force protection
AppArmor - mandatory access control
Trivy CVE scan - every image scanned before release
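
The port exposure described in the list above can be spot-checked from a client machine. The following is a rough sketch, not a tool shipped with the image: it attempts a TCP connection to each port and reports any that differ from the expected state. The function names are hypothetical, and the result for port 22 also depends on whether your NSG admits the machine you test from.

```python
import socket

# Expected external state, taken from the list above:
# True means reachable from outside, False means blocked.
EXPECTED = {22: True, 80: True, 443: True, 8080: False, 11434: False}


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposure_mismatches(host: str, expected=EXPECTED, probe=is_port_open) -> list:
    """Return the sorted ports whose observed state differs from the expected one."""
    return sorted(port for port, want in expected.items()
                  if probe(host, port) != want)
```

An empty list from exposure_mismatches("<PUBLIC_IP>") means the observed exposure matches the table. A result containing 8080 or 11434 would suggest that an internal port is reachable and the firewall rules deserve a closer look.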

OS hardening (CIS Level 1):

CIS Level 1 hardened - CIS Ubuntu 24.04 LTS Level 1 Benchmark via ansible-lockdown
auditd - system call auditing for critical paths
SSH hardening - PasswordAuthentication disabled, key-only access
Kernel hardening - SYN cookies, ASLR, rp_filter, TCP BBR, kexec disabled, unprivileged BPF disabled
/tmp as tmpfs - nosuid, nodev, noexec
Azure platform endpoints - egress rules pre-configured for IMDS (169.254.169.254) and the Azure WireServer (168.63.129.16)

Compliance artifacts (inside the VM):

SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json
CIS Conformance Report - OpenSCAP HTML at /etc/lynxroute/cis-report.html
Tailored CIS profile - /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md
Server credentials file - /root/ollama-credentials.txt with public IP, web UI URL, suggested admin password, and API access instructions
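
To make the SBOM artifact easier to work with, here is a small sketch that lists component names and versions from a CycloneDX JSON document. It relies only on the standard CycloneDX layout (a top-level "components" array whose entries carry "name" and an optional "version"). The helper names are illustrative, and reading the file under /etc/lynxroute may require elevated privileges.

```python
import json
from pathlib import Path

# Path stated in the listing above.
SBOM_PATH = Path("/etc/lynxroute/sbom.json")


def load_sbom(path: Path = SBOM_PATH) -> dict:
    """Read and parse the CycloneDX JSON document."""
    return json.loads(path.read_text(encoding="utf-8"))


def list_components(bom: dict) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs; version is '' when absent."""
    components = bom.get("components", [])
    return sorted((c["name"], c.get("version", "")) for c in components)
```

On the VM, list_components(load_sbom()) gives a flat inventory that can be diffed between image versions or matched against a vulnerability feed.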

Quick Start

Deploy VM from Azure Marketplace (Standard_D2s_v3 or larger; for GPU inference use Standard_NC4as_T4_v3)
Open the NSG: allow TCP 80 and 443 from your client networks, and SSH (TCP 22) from your management IPs only
SSH: ssh -i key.pem azureuser@<PUBLIC_IP>
Read connection details: sudo cat /root/ollama-credentials.txt
Open https://<PUBLIC_IP>/ in a browser - accept the self-signed certificate warning, complete admin signup on the first visit
Pull a model: in the web UI go to Settings > Admin Panel > Models, then search for and pull a model (e.g. llama3.2:1b for fast CPU inference, llama3.2:3b for higher quality, larger sizes for GPU SKUs)
Optional - replace the self-signed certificate before going live: sudo certbot --nginx -d your.domain.com

Models are not pre-loaded - pull what you need. Open WebUI initialises an embedding model on first boot (3-5 min); the nginx startup page auto-refreshes until ready.
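
For scripted deployments, the first-boot wait can be automated instead of watching the startup page. The sketch below polls Open WebUI's /health endpoint through nginx until it reports a true status. It assumes that nginx forwards /health to Open WebUI once the backend is up, which this listing does not confirm, and it skips certificate verification only because the default certificate is self-signed. Function names are illustrative.

```python
import json
import ssl
import time
import urllib.request


def insecure_tls_context() -> ssl.SSLContext:
    """TLS context that accepts the self-signed certificate (testing only)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def webui_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health yields JSON with status == true."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout,
                                    context=insecure_tls_context()) as resp:
            return json.loads(resp.read()).get("status") is True
    except (OSError, ValueError, AttributeError):
        # Connection errors, HTTP errors, or a non-JSON startup page.
        return False


def wait_until_ready(check, timeout: float = 600.0, interval: float = 10.0,
                     sleep=time.sleep) -> bool:
    """Call check() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A typical call would be wait_until_ready(lambda: webui_healthy("https://<PUBLIC_IP>")). After replacing the self-signed certificate, switch to a default verified context rather than the relaxed one used here.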

Ollama with Open WebUI - Hardened Local AI Inference

By Lynxroute