AnythingLLM - Hardened Private AI Workspace for Documents
by Lynxroute
AnythingLLM - CIS Level 1 hardened private AI workspace on Ubuntu 24.04 LTS with SBOM + CIS Report.
What is AnythingLLM
AnythingLLM is an open-source, self-hosted private AI workspace for documents. It ingests PDFs, DOCX, HTML, text, audio transcripts, and code, embeds the content into a vector store, and serves a multi-user web UI and REST API for chat, retrieval-augmented generation (RAG), and AI agents. Operators get team workspaces with per-workspace document scope, role-based access, public chat embeds, and a built-in admin UI. Users, workspaces, chats, documents, and vectors persist in embedded SQLite and LanceDB on the host - no external database required.
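As an illustration of the REST API mentioned above, the following is a minimal sketch of building a chat request for one workspace. The route `/api/v1/workspace/<slug>/chat`, the `message`/`mode` body fields, and Bearer-token authentication are assumptions here, and `build_chat_request` is a hypothetical helper; check the API documentation served by your own instance before relying on any of them.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, workspace_slug: str,
                       message: str) -> urllib.request.Request:
    """Build (but do not send) a chat request scoped to one workspace."""
    # Assumed route and body shape; confirm against your instance's API docs.
    url = f"{base_url.rstrip('/')}/api/v1/workspace/{workspace_slug}/chat"
    body = json.dumps({"message": message, "mode": "query"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder host, key, and workspace slug; nothing is sent here.
    req = build_chat_request("https://your.domain.com", "YOUR_API_KEY",
                             "my-workspace", "Summarize the onboarding guide.")
    print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req, timeout=30)` only succeeds once the server presents a certificate your client trusts.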
Why self-host AnythingLLM
Self-hosting keeps every document, chat transcript, and provider API key inside your own tenant - no per-seat SaaS fees, no third-party access, no data leaving your region. It suits teams with data residency or compliance requirements (GDPR, HIPAA, ISO 27001), legal and consulting practices building internal knowledge bases, R&D groups querying proprietary research, and MSPs delivering private AI inside their own subscription.
What this VM image adds
Security hardening:
- Unique admin password generated per instance - admin account created at first boot with a per-VM password, password change required on first web login
- Three internal secrets generated per instance - JWT_SECRET, SIG_KEY, and SIG_SALT (used to encrypt provider keys at rest in SQLite), regenerated at first boot
- Container bound to loopback only - AnythingLLM listens on 127.0.0.1:3001, no exposure on the public interface
- Nginx reverse proxy with TLS - HTTP to HTTPS redirect, hardened cipher suite, security headers, WebSocket pass-through for chat streaming
- Provider API keys not baked in - operators configure their own provider after first login, keys stored encrypted at rest
- Anonymous telemetry disabled by default
- CVE scan - every image is scanned for vulnerabilities with Trivy before release
- UFW firewall - only ports 22 (SSH), 80, and 443 open
- fail2ban - SSH brute-force protection
- AppArmor - mandatory access control
- Certbot pre-installed - one command issues a Let's Encrypt certificate after you point a domain at the VM
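One way to spot-check the network exposure described in this list is to probe the VM from a client machine. The sketch below assumes Python 3 on the client; `is_port_open` and `check_exposure` are hypothetical helper names, and the results also depend on your NSG rules, not only on the image.

```python
import socket
import sys


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_exposure(host: str) -> dict:
    """Probe the ports named in the list above and report which ones answer."""
    # Expected from outside the VM: 22, 80, 443 reachable (subject to NSG rules),
    # 3001 not reachable because the container listens on 127.0.0.1 only.
    return {port: is_port_open(host, port) for port in (22, 80, 443, 3001)}


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    for port, reachable in check_exposure(target).items():
        print(f"{target}:{port} {'open' if reachable else 'closed/filtered'}")
```

Run it with the VM's public IP as the only argument. An open result for port 3001 from a remote client would suggest the loopback binding or firewall rules have been changed.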
OS hardening (CIS Level 1):
- CIS Level 1 hardened - CIS Ubuntu 24.04 LTS Level 1 Benchmark via ansible-lockdown
- auditd - system call auditing for critical paths
- SSH hardening - PasswordAuthentication disabled, key-only access
- Kernel hardening - SYN cookies, ASLR, rp_filter, TCP BBR
- /tmp as tmpfs - nosuid, nodev, noexec
- Azure platform endpoints - egress rules pre-configured for IMDS (169.254.169.254) and the Azure platform virtual IP (168.63.129.16)
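The kernel settings named in this list can be read back from /proc/sys on the VM. The sketch below is a minimal check, assuming Python 3 on the host. The expected values in `EXPECTED` are assumptions (the listing names the features, not the exact settings), and `read_sysctl` and `check_sysctls` are hypothetical helpers.

```python
from pathlib import Path
from typing import Optional

# Assumed target values; adjust to match the tailored CIS profile on your VM.
EXPECTED = {
    "net.ipv4.tcp_syncookies": "1",             # SYN cookies enabled
    "kernel.randomize_va_space": "2",           # full ASLR
    "net.ipv4.conf.all.rp_filter": "1",         # strict reverse-path filtering
    "net.ipv4.tcp_congestion_control": "bbr",   # TCP BBR
}


def read_sysctl(name: str, root: str = "/proc/sys") -> Optional[str]:
    """Read one sysctl by dotted name; return None if it cannot be read."""
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def check_sysctls(expected: dict = EXPECTED, root: str = "/proc/sys") -> dict:
    """Map each sysctl name to (actual value, expected value, matches?)."""
    result = {}
    for name, want in expected.items():
        actual = read_sysctl(name, root)
        result[name] = (actual, want, actual == want)
    return result


if __name__ == "__main__":
    for name, (actual, want, ok) in check_sysctls().items():
        print(f"{'OK  ' if ok else 'DIFF'} {name} = {actual} (expected {want})")
```

A DIFF line is not necessarily a problem; for example, an rp_filter value of 2 (loose mode) is also a valid setting.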
Compliance artifacts (inside the VM):
- SBOM - CycloneDX 1.6 at /etc/lynxroute/sbom.json with SHA-256 hash of the container image and NTIA-compliant supplier metadata
- CIS Conformance Report - OpenSCAP HTML at /etc/lynxroute/cis-report.html
- Tailored CIS profile - /usr/share/doc/lynxroute/CIS_TAILORED_PROFILE.md
- Server credentials file - /root/anythingllm-credentials.txt with web UI URL and the per-instance admin password
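To inspect the SBOM listed above, a short script can summarize it. The sketch below relies only on standard CycloneDX JSON fields (`bomFormat`, `specVersion`, `components`, `hashes`). Which component carries the container image hash is not stated in the listing, so the script simply reports every SHA-256 hash it finds; `summarize_sbom` is a hypothetical helper.

```python
import json
import os
import sys


def summarize_sbom(bom: dict) -> dict:
    """Summarize a CycloneDX JSON document: format, version, components, hashes."""
    components = bom.get("components", [])
    return {
        "format": bom.get("bomFormat"),
        "spec_version": bom.get("specVersion"),
        "component_count": len(components),
        # (component name, hash value) for every SHA-256 entry in the SBOM.
        "sha256": [
            (c.get("name"), h.get("content"))
            for c in components
            for h in c.get("hashes", [])
            if h.get("alg") == "SHA-256"
        ],
    }


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "/etc/lynxroute/sbom.json"
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            print(json.dumps(summarize_sbom(json.load(fh)), indent=2))
    else:
        print(f"SBOM not found at {path}")
```

Reading the file may require root privileges depending on its permissions, in which case run the script with sudo.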
AI-ready out of the box:
- Embedded vector store - LanceDB on the host bind mount, no external service required
- Native embedder pre-configured - Xenova/all-MiniLM-L6-v2, RAG works on first launch without any external API key
- Provider switch in admin UI - select your preferred provider and paste the API key from Settings, no shell access needed
- Storage on a host bind mount - /opt/anythingllm/storage, mountable on a separate data disk
Quick Start
- Deploy VM from Azure Marketplace (Standard_D2s_v3 or larger recommended)
- Open NSG: TCP 80 and 443 from your client networks, TCP 22 from your management IPs only
- SSH in as azureuser (default) and run sudo cat /root/anythingllm-credentials.txt to read the web UI URL and per-instance admin password
- Open https://<PUBLIC_IP>, accept the self-signed certificate, log in as admin, set a new password when prompted
- Settings → AI Providers: configure your preferred provider (paste your API key) - vector store and embedder are already wired up
- Create a workspace, upload documents, and start chatting
- Issue a public TLS certificate before sharing with users: sudo certbot --nginx -d your.domain.com
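After the certbot step, you may want to confirm from a client machine that the server now presents a publicly trusted certificate rather than the self-signed one. The sketch below is one way to do that with the Python standard library; `certificate_status` is a hypothetical helper, and the result depends on the CA bundle installed on the client.

```python
import socket
import ssl
import sys


def certificate_status(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return 'trusted', 'untrusted', or 'failed' for the TLS endpoint."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "trusted"
    except ssl.SSLCertVerificationError:
        # Self-signed, expired, or hostname-mismatched certificate.
        return "untrusted"
    except OSError:
        # Connection refused, timeout, DNS failure, or another TLS error.
        return "failed"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: pass the domain you gave to certbot as the only argument.
    print(sys.argv[1], certificate_status(sys.argv[1]))
```

Before a certificate is issued, a fresh VM would be expected to report "untrusted", since the quick start has you accept a self-signed certificate.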