Kasten K10 Blog

All Things Kubernetes and Data Management

AKS and Storage: How to Design Storage for Cloud Native Applications

When building cloud native applications, teams often underestimate the performance requirements of the underlying storage solution. If you’re building services on a cloud provider such as Azure, you also have many options in terms of storage: Azure Managed Disks, Azure Files, Blob Storage, and even third-party services such as Azure NetApp Files.

In this first blog post in our 2-part series on AKS and Storage, I will elaborate a bit on different storage services provided by Azure for AKS, and how they differ in terms of features. We’ll also explore how to automatically provision storage directly from Kubernetes using the native API. 

Azure and Storage Support for AKS

Azure Kubernetes Service (AKS) is essentially a set of virtual machines in Azure, where Microsoft is responsible for the control plane, a.k.a. the core Kubernetes services and the orchestration of application workloads. So, with respect to storage services, it’s important to understand what options are available for virtual machines in Azure.


There are differences in the kinds of read/write operations, performance and built-in data protection capabilities Azure storage services support. I’ve provided a summary in the table below:

| Features/Service | Azure Managed Disks | Azure Files | Azure NetApp Files | Azure Blob Storage |
|---|---|---|---|---|
| Protocols support | Locally attached drive | SMB/NFS 4.1 (preview) | SMB/NFS (NFS v3 and v4.1) | Object / NFS (preview) |
| Read/write | Single read/write* | Multi read/write | Multi read/write | Multi read/write |
| Performance | Determined by disk size (HDD/SSD) | Determined by account type and size (HDD/SSD) | Determined by account type and size (HDD/SSD) | Determined by account type and size (HDD/SSD) |
| Integration with AKS | Native support | Native support | Not native; can be configured with Trident as the provisioner using CSI | Not native; can be configured using Blobfuse |

 

  • NOTE: Azure Managed Disks can also be configured as shared disks, which map an Azure managed premium disk using the SCSI protocol. This configuration enables multi read/write.
  • NOTE: Most storage services in Azure offer different redundancy levels, such as LRS (locally redundant) or GRS (geo-redundant). In most cases, premium-based services only support local redundancy.
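As an illustration, a shared managed disk can be requested through a storage class once the Azure Disk CSI driver (covered later in this post) is installed. This is a minimal sketch; the class name and maxShares value are assumptions for your own environment:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: managed-shared          # hypothetical class name
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS          # shared disks require a premium SKU
  maxShares: "2"                # number of nodes that may attach the disk at once
  cachingMode: None             # host caching is not supported on shared disks
allowVolumeExpansion: true
```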

The type of virtual machine instance you use for your node pools impacts how many data disks you can attach to your VMs. Here’s an example: Ddv4 and Ddsv4-series - Azure Virtual Machines | Microsoft Docs. If your applications require Azure disks for storage, strategize an appropriate node VM size. Storage capabilities, CPU and memory amounts all play a major role when deciding on a VM size.

For example, while both the Standard_B2ms and Standard_DS2_v2 VM sizes include a similar amount of CPU and memory resources, their potential storage performance is different:

| Node type and size | vCPU | Memory (GiB) | Max data disks | Max uncached disk IOPS | Max uncached throughput (MBps) |
|---|---|---|---|---|---|
| Standard_B2ms | 2 | 8 | 4 | 1,920 | 22.5 |
| Standard_DS2_v2 | 2 | 7 | 8 | 6,400 | 96 |
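If disk-attach limits or IOPS caps matter for your workload, the VM size is chosen when you create the cluster or add a node pool. A minimal sketch using the Azure CLI (the resource group, cluster, and pool names are placeholders):

```shell
# Add an AKS node pool with a VM size chosen for its storage profile.
# Resource group, cluster and pool names below are placeholders.
az aks nodepool add \
  --resource-group MyResourceGroup \
  --cluster-name MyManagedCluster \
  --name storagepool \
  --node-vm-size Standard_DS2_v2 \
  --node-count 3
```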


In addition, you can configure other NFS/block-based storage services that are supported by Kubernetes under the CNCF umbrella → CNCF Cloud Native Interactive Landscape.

Kubernetes and Working with Storage

When working with storage in Kubernetes, there are some key concepts to consider. You can have ephemeral volumes that live and die with a pod, or you can have persistent volumes, where the lifecycle is not attached to a pod. A Kubernetes volume can be a mounted storage service in Azure, such as a managed disk or Azure Files.
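For instance, an ephemeral volume can be declared inline in the pod spec. A minimal sketch using emptyDir, where the pod name and image are placeholders; the volume is created with the pod and deleted when the pod is removed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod            # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: scratch
          mountPath: /tmp/scratch
  volumes:
    - name: scratch
      emptyDir: {}             # ephemeral: lives and dies with the pod
```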

Another option is to use storage classes, which enable a Kubernetes administrator to describe the "classes" of storage they offer. Volumes for pods can either be statically created by a cluster administrator or dynamically created by the Kubernetes API server.

Let’s look at an example using AKS with Azure Files to automatically provision an Azure Files volume that we mount for a pod.


Example Using Azure Files 

First, we need to define the storage class in AKS, which can be done using Azure Cloud Shell or any terminal with kubectl configured against the cluster.

This storage class will use a built-in provisioner for Azure Files. Every AKS cluster is automatically set up with different storage classes: one for Azure Files and one for Azure Disks (the default storage class).

NOTE: You can view the different storage classes by running the command kubectl get storageclasses.

To create a storage class, we need to configure a name and a provisioner (here, the in-tree provisioner that is native to Kubernetes; I will discuss another option, the Container Storage Interface (CSI), later). The skuName parameter is also specific to the provisioner.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefiles
provisioner: kubernetes.io/azure-file
allowVolumeExpansion: true
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict
  - actimeo=30
parameters:
  skuName: Standard_LRS


We can save this as a YAML file and apply it using kubectl apply -f file.yaml. Then we can verify the storage class was created using the command kubectl get storageclasses.
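The steps above look like this from the CLI (the manifest file name is a placeholder):

```shell
# Create the storage class from the manifest, then list classes to verify.
kubectl apply -f azurefiles-sc.yaml
kubectl get storageclasses
```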

Next, we will create a persistent volume claim, which will use the storage class and AKS managed identity in Azure to automatically provision an Azure Fileshare. We will create this as a yaml file and use the same apply command to provision the volume: 

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvcazf
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefiles
  resources:
    requests:
      storage: 10Gi


If it succeeds, we will be notified through the CLI. 

persistentvolumeclaim/pvcazf created 

After a while, we will see the fileshare appear in the Azure portal. 

NOTE: Another option is to use the command kubectl get pvc nameofpvcvolume to list the provisioning status.

Note that by default, the provisioner will create all file shares within the same storage account. If we create multiple volumes, they will all appear within the same managed storage account. 
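Once the claim is bound, any pod can mount it like an ordinary volume. A minimal sketch referencing the pvcazf claim created above; the pod name, image, and mount path are assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo               # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: azurefiles-vol
          mountPath: /mnt/azure   # hypothetical mount path
  volumes:
    - name: azurefiles-vol
      persistentVolumeClaim:
        claimName: pvcazf      # the claim provisioned earlier
```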


Now that the volume is in place, we can deploy a service that will automatically mount the volume as a local share. As an example, I’ll use a Helm chart that deploys WordPress on top of my AKS cluster.

Using Helm parameters, we can define which storage classes the blueprints should use when being deployed. An example from Bitnami using WordPress can be found here → azure-marketplace-charts/bitnami/wordpress at master · bitnami/azure-marketplace-charts (github.com)

helm repo add azure-marketplace https://marketplace.azurecr.io/helm/v1/repo
helm install my-release azure-marketplace/wordpress --set global.storageClass=azurefiles

Once deployed, we can check our persistent volume claims.
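A quick check from the CLI (the claim names will vary by release name):

```shell
# List the claims the chart created; they should use the azurefiles class.
kubectl get pvc
```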


AKS and CSI Provider

Prior to CSI, Kubernetes provided direct storage integrations as part of the core Kubernetes binaries. This meant that if vendors wanted to add features or fixes for a specific provisioner, they would need to align with the Kubernetes release process.

CSI provides a more flexible approach to adding support and updates for new storage provisioners to Kubernetes, without needing to update or even touch the Kubernetes core.

Microsoft has also introduced CSI plugins for both Azure disks and Azure files. You can view the current status for other CSI drivers here → Drivers - Kubernetes CSI Developer Documentation (kubernetes-csi.github.io)

NOTE: Starting with Kubernetes version 1.21, Kubernetes will use CSI drivers only and by default, so you’ll only be able to enable CSI storage drivers on newly deployed clusters.

To use the new plugin, register for the CSI resource provider using the Azure CLI: 

az feature register --namespace "Microsoft.ContainerService" --name "EnableAzureDiskFileCSIDriver"


An Azure CLI extension must be in place before configuring a new cluster with the CSI drivers:

az extension add --name aks-preview


One of the new capabilities of Azure’s CSI drivers is support for snapshots, cloning and resizing of volumes directly from the Kubernetes API. They also support shared managed disks. 
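As an illustration of taking a snapshot directly through the Kubernetes API, here is a minimal sketch. The snapshot class and claim names are assumptions; your cluster needs the snapshot CRDs and a matching VolumeSnapshotClass installed:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pvc-snapshot-demo            # hypothetical snapshot name
spec:
  volumeSnapshotClassName: csi-azuredisk-vsc   # assumed class name; check your cluster
  source:
    persistentVolumeClaimName: mypvc           # placeholder for an existing claim
```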

Next, create a new cluster with the CSI storage drivers for Azure Disks and Azure Files installed. Use the custom headers command when doing the installation:

az aks create -g MyResourceGroup -n MyManagedCluster --network-plugin azure --aks-custom-headers EnableAzureDiskFileCSIDriver=true


NOTE:
If you have issues with provisioning PVC volumes using the CSI drivers, use the inspect command to determine the state of the provisioning:

kubectl describe pvc volumename


Finally, you can validate the Kubernetes nodes have valid CSI drivers by running this command: 

kubectl get csinodes


Up Next: How to Benchmark and Validate Performance

As you can see, different Kubernetes storage services have different configuration requirements. Performance will also vary depending on the volume size and other factors. In part 2 of this series, we’ll examine differences in performance among the various storage services, and how to benchmark and validate storage using Kubestr.

 

Marius Sandbu

Marius Sandbu works as a Cloud Evangelist for SopraSteria, focused on building company-wide expertise on the public cloud and overseeing the company’s technical public cloud initiatives, including building managed services. He has expertise across multiple platforms such as Google Cloud, Amazon Web Services, and Microsoft Azure, and is an SME across multiple areas such as end-user computing, infrastructure, governance, security, and hybrid solutions. Marius is also an authorized instructor for Microsoft, Citrix, Nutanix, and Veeam, and the author of several books related to networking.
