Alluxio Enterprise Cluster for AKS
by Alluxio
Alluxio Enterprise Cluster provisions an Alluxio cluster on AKS through a Kubernetes custom resource.
What is Alluxio
Alluxio is a distributed data orchestration system that brings your data closer to your compute frameworks. It acts as a caching layer between your persistent storage (like Amazon S3, HDFS, or Azure Blob Storage) and your computation frameworks (like Spark, Presto, and PyTorch).
By caching frequently accessed data on the compute cluster, Alluxio dramatically speeds up data access, reduces network congestion, and eliminates I/O bottlenecks, which is especially critical for data-intensive applications like AI/ML training and large-scale data analytics.
Why Use Alluxio?
You should consider using Alluxio if you are experiencing any of the following challenges:
- Slow AI/ML Training: Your expensive GPUs are often idle, waiting for data to be fetched from slow object stores, leading to long training times and high costs.
- Slow Cold Starts When Deploying Models: When deploying new models for inference, the initial requests are slow because the model must be downloaded from a remote object store. This "cold start" problem leads to a poor user experience and can be a bottleneck for autoscaling.
- Data Silos: Your data is spread across multiple data centers or cloud providers, and you need a unified way to access it without complex data migration.
- High Egress Costs: You are paying high fees to your cloud provider for repeatedly reading the same data from object storage.
Alluxio solves these problems by:
- Accelerating Performance: By caching data, Alluxio can improve I/O performance by over 10x for both model training and deployment.
- Providing Seamless Data Access: Alluxio provides standard APIs like POSIX (FUSE), S3, and FSSpec, allowing your applications to connect to your data without any code changes.
- Enabling High Scalability: The distributed architecture can scale to handle billions of objects and thousands of clients.
- Reducing Costs: By reducing data egress and eliminating the need for specialized, high-performance storage hardware, Alluxio helps lower your total cost of ownership.
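As a minimal sketch of the "without any code changes" point, assume Alluxio is exposed to the application through a FUSE mount. Ordinary file I/O is then all the application needs. The mount point and file name in the usage comment are hypothetical and are not values defined by this offer.

```python
from pathlib import Path


def read_head(path: str, num_bytes: int = 1024) -> bytes:
    """Read the first num_bytes of a file using plain POSIX file I/O."""
    with open(path, "rb") as f:
        return f.read(num_bytes)


def list_files(directory: str) -> list[str]:
    """Return the names of the regular files in a directory, sorted by name."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


# Hypothetical usage against a FUSE mount (the path is an assumption):
# header = read_head("/mnt/alluxio/datasets/train-0000.parquet", 4)
```

Nothing in these functions is Alluxio-specific, which is the point: the same code works against a local disk or a mounted Alluxio path.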
Overview
This offering deploys the AlluxioCluster Custom Resource (CR) to automatically provision, configure, and manage an Alluxio Enterprise data orchestration system on Azure Kubernetes Service (AKS). Designed for high-performance AI and analytics workloads, Alluxio sits between compute frameworks and the underlying storage and serves cached data to the compute side.
⚠️ IMPORTANT PREREQUISITE: ALLUXIO OPERATOR REQUIRED ⚠️
You MUST install the Alluxio Operator extension on your target AKS cluster before deploying this offer. This deployment creates an AlluxioCluster custom resource, which relies entirely on the Alluxio Operator (alluxio.alluxio-operator-extensions) to reconcile its state and spin up the underlying pods. If the Operator is not detected on your cluster, this deployment will fail.
👉 If you haven't installed the Operator yet, please search for "Alluxio Operator" in the Azure Marketplace and deploy it first.
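One way to confirm the prerequisite before deploying is to check whether the cluster lists a CustomResourceDefinition for AlluxioCluster. The sketch below assumes that `kubectl` is installed and pointed at the target AKS cluster, and that the Operator registers a CRD whose name contains `alluxiocluster`. That name fragment is an assumption, so verify it against your Operator installation.

```python
import subprocess

# Assumption: the Operator registers a CRD whose name contains this fragment.
ASSUMED_CRD_FRAGMENT = "alluxiocluster"


def crd_listed(kubectl_output: str, fragment: str = ASSUMED_CRD_FRAGMENT) -> bool:
    """Return True if any line of `kubectl get crd -o name` output contains fragment."""
    return any(fragment in line.lower() for line in kubectl_output.splitlines())


def operator_crd_installed() -> bool:
    """List CRDs in the current kubectl context and look for the Alluxio one."""
    result = subprocess.run(
        ["kubectl", "get", "crd", "-o", "name"],
        capture_output=True,
        text=True,
        check=True,  # raise if kubectl itself fails
    )
    return crd_listed(result.stdout)
```

A listed CRD only shows that the resource type is registered. It does not prove that the Operator pods are healthy, so checking the Operator's pod status is still worthwhile.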
Key Features & Components Managed by this CRD:
- Automated Provisioning: Instantiates the complete Alluxio topology, including the Coordinator (Master), Distributed Workers, and CSI/FUSE daemonsets for seamless POSIX-like data access.
- Built-in High Availability: Automatically configures and integrates an embedded etcd cluster for Coordinator high availability and state management.
- Flexible Sizing Profiles: Choose from pre-defined "T-Shirt Sizing" deployment profiles (e.g., Standard) for quick setup, or select "Custom" to unlock advanced configuration.
- License Management: Securely injects and manages your Alluxio Enterprise License string through Azure's protected configuration settings.
How to get started:
- Sign a contract with Alluxio to obtain your license and the dedicated Docker images.
- Ensure the Alluxio Operator is successfully installed and running on your AKS cluster.
- Click Create to launch the deployment wizard.
- Select your target AKS cluster and target namespace.
- Input your Alluxio Enterprise license and Docker image tag.
- Review and deploy. The Alluxio Operator will automatically detect the new AlluxioCluster resource and orchestrate the underlying infrastructure.
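For orientation, the resource that the wizard creates in the last step might resemble the manifest built below. This is a sketch only: the `apiVersion` and every field name under `spec` (including the `alluxio.license` property key) are assumptions for illustration, and all values are placeholders. The wizard, not this code, produces the real resource.

```python
import json


def build_alluxio_cluster(name, namespace, image, image_tag,
                          license_string, worker_count=2):
    """Assemble an AlluxioCluster manifest as a plain dict.

    The apiVersion and all spec field names are illustrative assumptions.
    """
    return {
        "apiVersion": "k8s.alluxio.com/v1",  # assumed API group and version
        "kind": "AlluxioCluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "image": image,                 # assumed field name
            "imageTag": image_tag,          # assumed field name
            "properties": {"alluxio.license": license_string},  # assumed key
            "worker": {"count": worker_count},                  # assumed field
        },
    }


manifest = build_alluxio_cluster(
    name="alluxio-cluster",                  # placeholder
    namespace="alluxio",                     # placeholder
    image="<registry>/<alluxio-image>",      # placeholder
    image_tag="<image-tag>",                 # placeholder
    license_string="<license-string>",       # placeholder
)
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output has the shape of an applicable manifest once the assumed fields are replaced with the ones your Operator version actually defines.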