
Ollama with Open WebUI - Hardened Local AI Inference

Author: Lynxroute

Ollama with Open WebUI - CIS Level 1 hardened local LLM runtime on Ubuntu 24.04 with SBOM.

What is Ollama with Open WebUI

Ollama is an open-source runtime for running large language models locally - it pulls quantized model weights from a public model library and exposes a REST API for inference. Open WebUI is a self-hosted web chat interface that connects to the local Ollama API and lets users chat with models in a browser, manage conversations, upload documents for retrieval, and configure prompt templates. No external AI service is required; all data stays on your VM.
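
For example, once a model is pulled, a single request to the local Ollama REST API (port 11434, loopback-only on this image) returns a completion; the model tag below is just an illustration:

  curl http://127.0.0.1:11434/api/generate -d '{
    "model": "llama3.2:1b",
    "prompt": "Summarise this deployment in one sentence.",
    "stream": false
  }'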

Why self-host private LLMs

Running models on your own VM keeps every prompt, document, and generated response inside your tenant - no third-party SaaS sees your team's questions or your business data. This suits organisations subject to data-protection and compliance requirements (GDPR, HIPAA, ISO 27001), regulated industries that cannot send sensitive text to external services, internal copilots over private documentation, and offline or air-gapped scenarios. Models are pulled on demand from the public catalogue (Llama 3.2, Mistral, Gemma, Phi, Qwen, DeepSeek-R1 and 100+ others) and can be removed at any time.
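
In practice, the model lifecycle is a few ollama CLI commands on the VM; nothing is downloaded until you ask for it, and removal deletes the weights from disk:

  ollama pull llama3.2:1b    # download weights from the public library
  ollama list                # show installed models and their sizes
  ollama rm llama3.2:1b      # delete the weights from the VM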

What this VM image adds

Security hardening:

  • Authentication enabled by default - WEBUI_AUTH=True; the first browser visit forces admin signup, and subsequent visitors must sign in with an account
  • Suggested admin password generated per instance - written to /root/ollama-credentials.txt at first boot, derived from the instance vmId so two deployments never share the same value
  • nginx reverse proxy with TLS - Open WebUI proxied on port 443 with a self-signed certificate (4096-bit RSA, 10-year validity); replace with your own CA certificate via certbot or paste-in
  • Ollama API localhost-only - port 11434 bound to 127.0.0.1, never exposed externally; reach it over an SSH tunnel for CLI / SDK access (see the tunnel example after this list)
  • Open WebUI internal-only - port 8080 reachable only from nginx on the same host; UFW explicitly denies it externally
  • Dedicated system users - ollama and open-webui run as non-root, no shell, locked home directories, UMask=0027
  • UFW firewall - only ports 22, 80, 443 open; 8080 and 11434 explicitly denied externally
  • fail2ban - SSH brute-force protection
  • AppArmor - mandatory access control
  • Trivy CVE scan - every image scanned before release
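
Because the API is loopback-only, CLI or SDK access from a workstation goes through an SSH tunnel. A minimal sketch, using the placeholder key path, username, and IP from the Quick Start below:

  # forward local port 11434 to the VM's loopback-only Ollama API
  ssh -i key.pem -N -L 11434:127.0.0.1:11434 azureuser@<PUBLIC_IP> &
  # then talk to the API as if it were local, e.g. list installed models
  curl http://127.0.0.1:11434/api/tags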

OS hardening (CIS Level 1):

  • CIS Level 1 hardened - CIS Ubuntu 24.04 LTS Level 1 Benchmark via ansible-lockdown
  • auditd - system call auditing for critical paths
  • SSH hardening - PasswordAuthentication disabled, key-only access
  • Kernel hardening - SYN cookies, ASLR, rp_filter, TCP BBR, kexec disabled, unprivileged BPF disabled (spot-check with the sysctl commands after this list)
  • /tmp as tmpfs - nosuid, nodev, noexec
  • Azure IMDS endpoints - egress rules pre-configured (169.254.169.254, 168.63.129.16)
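
The kernel settings above can be spot-checked after deployment; these are the standard sysctl keys for those controls (a quick sanity check, not a conformance audit - the OpenSCAP report below is the authoritative record):

  sysctl net.ipv4.tcp_syncookies           # 1 = SYN cookies enabled
  sysctl kernel.randomize_va_space         # 2 = full ASLR
  sysctl net.ipv4.conf.all.rp_filter       # 1 or 2 = reverse-path filtering on
  sysctl net.ipv4.tcp_congestion_control   # bbr
  sysctl kernel.unprivileged_bpf_disabled  # 1 or 2 = unprivileged BPF off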

Compliance artifacts (inside the VM):

  • SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json (see the jq example after this list)
  • CIS Conformance Report - OpenSCAP HTML at /etc/lynxroute/cis-report.html
  • Tailored CIS profile - /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md
  • Server credentials file - /root/ollama-credentials.txt with public IP, web UI URL, suggested admin password, and API access instructions
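
The SBOM is plain JSON, so standard tooling reads it directly - for example with jq (assuming jq is installed on the VM, or the file is copied to a machine that has it):

  # count components, then list a few package names and versions
  jq '.components | length' /etc/lynxroute/sbom.json
  jq -r '.components[:5][] | "\(.name) \(.version)"' /etc/lynxroute/sbom.json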

Quick Start

  1. Deploy VM from Azure Marketplace (Standard_D2s_v3 or larger; for GPU inference use Standard_NC4as_T4_v3)
  2. Open the NSG: allow TCP 80 and 443 from your client networks; allow SSH 22 from your management IPs only
  3. SSH: ssh -i key.pem azureuser@<PUBLIC_IP>
  4. Read connection details: sudo cat /root/ollama-credentials.txt
  5. Open https://<PUBLIC_IP>/ in a browser - accept the self-signed certificate warning, complete admin signup on the first visit
  6. Pull a model: in the web UI go to Settings - Admin Panel - Models, search and pull (e.g. llama3.2:1b for fast CPU inference, llama3.2:3b for higher quality, larger sizes for GPU SKUs)
  7. Optional - replace the self-signed certificate before going live: sudo certbot --nginx -d your.domain.com

Models are not pre-loaded - the VM image stays small and you choose what to install. Open WebUI initialises a sentence-transformers embedding model on first boot, which takes 3-5 minutes; the nginx startup page auto-refreshes until the chat interface is ready.
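
To watch the first-boot initialisation instead of waiting on the browser page, check from an SSH session - the systemd unit names below are an assumption based on the service users listed above:

  systemctl status ollama open-webui nginx   # unit names assumed from the service users
  curl -k https://127.0.0.1/                 # serves the startup page until the UI is ready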