LocalAI - Hardened Self-Hosted OpenAI-Compatible API Server
por Lynxroute
LocalAI 4.3.6 - CIS Level 1 hardened self-hosted OpenAI-compatible API on Ubuntu 24.04 + SBOM
What is LocalAI
LocalAI is an open-source, OpenAI-compatible inference server delivered as a single statically linked Go binary. It exposes the same REST surface as OpenAI - /v1/chat/completions, /v1/embeddings, /v1/audio/transcriptions, /v1/audio/speech, /v1/images/generations, /v1/models, plus a built-in /models gallery API - so any client written for OpenAI (openai-python, openai-node, LangChain OpenAI provider, LlamaIndex, IDE plugins) talks to it unchanged with one base URL swap. Inference runs entirely on CPU using llama.cpp and Whisper.cpp backends. GGUF model files load from local disk under /var/lib/localai/models and persist across restarts - no external database, no Redis, no cloud egress. Bring your own GGUF model (LLaMA family, Mistral, Qwen, Phi, Gemma) or install one in two clicks from the bundled gallery. MIT license.
Why self-host LocalAI
Running LLM workloads on your own VM keeps prompts, embeddings, and generated content inside your Azure subscription - no per-call billing, no third-party model provider seeing your data. Ideal for teams with data residency requirements (GDPR, HIPAA), organisations evaluating private LLM deployments, and developers prototyping OpenAI-SDK clients against a locally controlled endpoint.
What this VM image adds
Security hardening:
- Per-instance LOCALAI_API_KEY - 24-character random key generated at first boot, enforced by LocalAI for every /v1/, /models, /embeddings, /audio, /images call. Never baked into the image.
- Single credential for SDK and Web UI - the same key authenticates OpenAI-SDK REST clients (Authorization: Bearer) and the Web UI "Login with API Token". One rotation surface.
- LocalAI bound to 127.0.0.1:8080 - the API process never listens on a public interface; only the hardened nginx reverse proxy is reachable externally.
- nginx TLS perimeter - port 443 with self-signed cert, HTTP redirects to HTTPS, security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy) applied.
- Certbot pre-installed - one command issues a Let's Encrypt certificate, no extra apt steps after deployment.
- UFW firewall - default deny; only 22, 80, 443 open. Internal 8080 explicitly blocked.
- fail2ban - SSH brute-force protection.
- AppArmor - mandatory access control.
- CVE scan - every image scanned with Trivy before release.
OS hardening (CIS Level 1):
- CIS Level 1 hardened - CIS Ubuntu 24.04 LTS Level 1 Benchmark via ansible-lockdown.
- auditd - system call auditing for critical paths.
- SSH hardening - PasswordAuthentication disabled, key-only access.
- Kernel hardening - SYN cookies, ASLR, rp_filter, TCP BBR.
- /tmp as tmpfs - nosuid, nodev, noexec.
- Azure IMDS endpoints - egress rules pre-configured (169.254.169.254, 168.63.129.16).
Compliance artifacts (inside the VM):
- SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json.
- CIS Conformance Report - OpenSCAP HTML at /etc/lynxroute/cis-report.html.
- Tailored CIS profile - /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md.
- Server credentials file - /root/localai-credentials.txt with the per-instance API key and public IP.
Quick Start
- Deploy VM from Azure Marketplace (Standard_D2s_v3 or larger recommended; CPU-only inference).
- Open NSG: TCP 443 from your client networks; SSH 22 from your management IPs only.
- SSH: ssh -i key.pem <username>@<PUBLIC_IP> (username set during VM creation, default: azureuser).
- Read credentials: sudo cat /root/localai-credentials.txt - contains the per-instance API key.
- Open https://<PUBLIC_IP>/ in your browser, accept the self-signed certificate, click "Login with API Token", paste the key.
- Install a model: Web UI "Models" tab - pick from the bundled gallery, or drop a GGUF file into /var/lib/localai/models and run sudo systemctl restart local-ai.
- REST API (OpenAI-compatible): point your OpenAI SDK base URL to https://<PUBLIC_IP>/v1 with header Authorization: Bearer <api-key>.
Models are NOT pre-loaded - install on demand. CPU-only - no GPU required. Replace the self-signed TLS certificate with a CA-signed one for production use (certbot pre-installed: sudo certbot --nginx -d your.domain.com).