
Azure AKS Multitenancy Primer

By Chandrasekar Ganapathy posted 05-18-2023 15:40

  

Azure Kubernetes Service Multitenancy

 

This blog was inspired by a recent project that set out to build a multitenant SaaS application on Azure Kubernetes Service (AKS). Our scope was primarily the cloud management platform, built with Flux GitOps CD, which is a topic for a different blog; even so, understanding the nuances of multitenancy turned out to be key to a successful design.

 

1.     What is Multitenancy?

Multitenancy is a software architecture model in which a single instance of a system (infrastructure, code, and database) serves multiple tenants. The tenants can be individual users, teams, or even customers. Each tenant owns its data, configurations, and customizations but shares the underlying infrastructure, code, and database, and proper security and isolation techniques keep tenants from stepping on each other.

 

An analogy for multitenancy is an apartment complex with different units and tenants. Each tenant owns its unit and can raise a family of any size or furnish the rooms as it likes without bothering the neighbors. In the common areas, all tenants follow the shared homeowners' rules.

 

2.     Why Multitenancy?

Multitenancy brings cost efficiency through consolidation and provides the necessary scale. Building dedicated resources per client can be secure, but it becomes a management overhead.

 

3.     Multitenancy is hard

Realizing multitenancy is hard. Every tenant must be not only properly secured but also isolated. The right security policies restrict one tenant from accessing another tenant’s resources, but they do not prevent the activity of one tenant from affecting the others. This is the “noisy neighbor” problem: one tenant overuses resources and adversely impacts the others, resulting in a bad user experience. Tenants need protection from this too.

 

4.     Multitenancy in AKS

There are multiple levels of multitenancy that can be accomplished within AKS, depending on how the resources are isolated. The table below classifies them and rates the benefits across different categories, with green being good, amber being OK, and red being bad.

 


In the picture above, the Storage level is not illustrated because it is outside the purview of AKS. Real multitenancy in AKS happens at the Nodepool and Namespace levels and requires many controls at the resource level to enable it.

 

5.     Enforcing Isolation within AKS

AKS does not natively guarantee isolation; however, it provides the necessary resources to enforce tenant isolation.

 

All of this is done at the Namespace level. The Namespace is the foundational unit of isolation: it groups the resources that belong to one tenant and so creates the view of a tenant.
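As a minimal sketch of the namespace-per-tenant idea, the helper below builds a Namespace manifest as a plain Python dict. The `tenant-` name prefix, the `tenant` label key, and the function name are assumptions made for illustration; they are not taken from the project described here.

```python
import re

# Kubernetes requires namespace names to be valid RFC 1123 labels.
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def tenant_namespace(tenant_id: str) -> dict:
    """Return a Namespace manifest for one tenant (hypothetical naming scheme)."""
    tenant = tenant_id.lower()
    name = f"tenant-{tenant}"
    if len(name) > 63 or not _DNS_LABEL.match(name):
        raise ValueError(f"{name!r} is not a valid namespace name")
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Assumed label key; policies can later select tenants by it.
            "labels": {"tenant": tenant},
        },
    }
```

Serializing the returned dict to JSON or YAML gives a manifest that can be applied like any other; validating the name up front keeps a bad tenant identifier from failing later at the API server.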

 

1.      The Namespace itself is restricted by the Service Account, the identity Pods use to access the K8s API and the resources within the namespace, subject to the access control policies defined by Roles and RoleBindings.

 

2.      The Network Policy rules control the traffic in and out of the namespace, while the Limits and Quotas prevent noisy neighbor problems.

 

 


 

 

 

3.      Code – The Deployments, and the Pods and Containers within them, make up the tenant’s code.

 

4.      Data – The Storage Classes, Persistent Volumes and their volume mounts for disk storage, and StatefulSets for database stores make up the tenant’s data.

 

5.      Configuration – Configurations include the ConfigMaps, which house non-sensitive data, and the Secrets, which carry sensitive data such as TLS certs for cluster-wide communication.

 

6.      Ingress – Stepping out of the code, Ingress includes the Ingress rules that define routing, the annotations that adjust the behavior of the Ingress controller before the routing rules are applied, and the TLS certificates used for TLS termination at the Ingress controller.

 

7.      Service – Includes the ClusterIP and the endpoints of the Pods it load balances to.

 

8.      DNS Name – Each namespace is given a fully qualified domain name (FQDN) of the form below

 

mynamespace.myakscluster.eastus.cloudapp.azure.com

 

where

·       mynamespace is the namespace name

·       myakscluster is the AKS cluster name

·       eastus is the Azure region (East US)

 

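A service myservice within the namespace is addressed as follows (inside the cluster, Kubernetes DNS also resolves it as myservice.mynamespace.svc.cluster.local):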

 

myservice.mynamespace.myakscluster.eastus.cloudapp.azure.com
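The access control, network, and quota controls from items 1 and 2 above can be sketched as manifests built in Python. This is an illustration under stated assumptions, not the configuration used in the project: the resource names (`tenant-reader`, `same-namespace-only`, `tenant-quota`), the read-only rule set, and the quota figures are all invented for the example.

```python
def tenant_isolation_manifests(namespace: str, service_account: str) -> list:
    """Build Role, RoleBinding, NetworkPolicy and ResourceQuota for one tenant namespace."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "tenant-reader", "namespace": namespace},
        # Assumed rule set: read-only access to a few core resources.
        "rules": [{
            "apiGroups": [""],
            "resources": ["pods", "services", "configmaps"],
            "verbs": ["get", "list", "watch"],
        }],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tenant-reader-binding", "namespace": namespace},
        "subjects": [{
            "kind": "ServiceAccount",
            "name": service_account,
            "namespace": namespace,
        }],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role["metadata"]["name"],
        },
    }
    network_policy = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every Pod in the namespace
            "policyTypes": ["Ingress"],
            # A bare podSelector matches Pods in this namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        # Illustrative ceilings; real values depend on the tenant's plan.
        "spec": {"hard": {
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
            "pods": "50",
        }},
    }
    return [role, binding, network_policy, quota]
```

Two caveats apply to this sketch. The NetworkPolicy only takes effect if the cluster runs a network policy engine, and as written it also blocks traffic from an Ingress controller running in another namespace, so a real policy would need an extra rule to admit it.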

 

 


 

6.     Types of Multitenancy in AKS

Let us look at the types of multitenancy, based on the level of isolation, before getting into the different agents that help enable them within AKS. There are four types of multitenancy in AKS, listed in increasing order of isolation.

 

1.      Fully Shared Tenancy

2.      Horizontally Partitioned Tenancy

3.      Vertically Partitioned Tenancy

4.      Full Isolation Tenancy

 

For simplicity, the illustrations below consider the DB as the only state management layer. Enterprise applications could also have a caching layer, managed disks, an ESB, and unstructured storage such as Azure Blobs.

 

6.1      Fully Shared Tenancy

 

The App, Data and Infrastructure are all shared.

 



   

 

The following are the characteristics…

 

1.      Single AKS cluster, resulting in lower management overhead

2.      Namespaces are the level of isolation

3.      Storage instances are all shared

4.      Need for a strong security posture

5.      Need for quotas to reduce the noisy neighbor problem

 

Pros:

1.      Consolidation of resources and the best cost optimization model

2.      Brings a high level of resource utilization

3.      Reduces the overhead of managing multiple AKS clusters

4.      AKS and the Kubernetes API bring higher levels of scale

 

Cons:

1.      Complex, and requires a mature team to manage the ecosystem

2.      Needs a higher degree of skill to enforce security and isolation

3.      Higher risk of noisy neighbor issues

4.      Requires Infosec approval to keep the data of different customers in the same DB
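When the data of different customers lives in the same DB, every read and write has to be scoped to a tenant. The sketch below shows one common way to do that, a tenant discriminator column, using an in-memory SQLite database as a stand-in for the real store. The `orders` table, the `tenant_id` column, and the function names are assumptions for illustration only.

```python
import sqlite3


def open_shared_db() -> sqlite3.Connection:
    """Create one shared table that holds rows for every tenant."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (tenant_id TEXT NOT NULL, item TEXT NOT NULL)"
    )
    return conn


def add_order(conn: sqlite3.Connection, tenant_id: str, item: str) -> None:
    # Every row is stamped with the tenant that owns it.
    conn.execute(
        "INSERT INTO orders (tenant_id, item) VALUES (?, ?)", (tenant_id, item)
    )


def orders_for(conn: sqlite3.Connection, tenant_id: str) -> list:
    # Every query filters on tenant_id; parameters avoid SQL injection.
    rows = conn.execute(
        "SELECT item FROM orders WHERE tenant_id = ? ORDER BY rowid",
        (tenant_id,),
    )
    return [item for (item,) in rows]


conn = open_shared_db()
add_order(conn, "tenant-a", "widget")
add_order(conn, "tenant-b", "gadget")
add_order(conn, "tenant-a", "sprocket")
print(orders_for(conn, "tenant-a"))  # ['widget', 'sprocket']
```

The isolation here rests entirely on application code remembering the filter, which is one reason this model needs the security review noted in the cons above.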

 

6.2     Horizontally Partitioned Tenancy

 

This is the most widely adopted model, as it provides the necessary isolation at the data level while still consolidating at the App level.

 




 

 

This was the model adopted in the customer project too. Specifically, they used Azure SQL Database elastic pools, which are built for multitenant scenarios. See the reference architecture below.

 


 

The following are the characteristics…

1.      Single AKS cluster

2.      Namespaces are the level of isolation for tenants on the App side

3.      Data level isolation for each tenant

4.      Need for a strong security posture for the code

5.      Need for quotas to reduce the noisy neighbor problem
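The split between shared compute and per-tenant data can be sketched as a small placement record: each tenant maps to its own namespace and its own database, while all databases sit in one shared pool. The naming scheme (`tenant-<id>`, `db-<id>`), the default pool name, and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantPlacement:
    tenant_id: str
    namespace: str     # one namespace per tenant in the shared AKS cluster
    database: str      # one database per tenant
    elastic_pool: str  # the databases share one pool of resources


def place_tenant(tenant_id: str, elastic_pool: str = "shared-pool") -> TenantPlacement:
    """Derive a tenant's namespace and database from its id (assumed scheme)."""
    return TenantPlacement(
        tenant_id=tenant_id,
        namespace=f"tenant-{tenant_id}",
        database=f"db-{tenant_id}",
        elastic_pool=elastic_pool,
    )


def database_for_request(placements: dict, tenant_id: str) -> str:
    """Resolve the database for a request; unknown tenants are rejected."""
    try:
        return placements[tenant_id].database
    except KeyError:
        raise LookupError(f"unknown tenant {tenant_id!r}") from None


placements = {t: place_tenant(t) for t in ("acme", "globex")}
```

Resolving the database from a trusted tenant identifier on every request, rather than from anything the caller supplies directly, is what keeps one tenant from reaching another tenant's data in this layout.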

 

Pros:

1.      Consolidation of resources at the App level with isolation at the data level

2.      Brings a high level of resource utilization

3.      AKS and the Kubernetes API bring higher levels of scale

4.      An ideal model for multitenancy: customer data is physically isolated in separate DBs, yet App resources are shared overall

 

Cons:

1.      Complex, and requires a mature team to manage the ecosystem

2.      Needs a higher degree of skill to enforce security and isolation

3.      Higher risk of noisy neighbor issues

 

 

6.3     Vertically Partitioned Tenancy

 

This is also a frequently adopted multitenancy model, in which customers that do not want to co-exist with others are managed in separate clusters and DB instances. This provides the maximum isolation at the data and App levels for such cases.

 

1.      More than one AKS cluster, depending on the customers' demands for isolation

2.      Cluster and Namespace levels of isolation for tenants on the App side

3.      DB instance and database level isolation for tenant data, depending on the needs

4.      Need for a strong security posture, and for quotas to reduce noisy neighbors, where isolation is at the namespace level

5.      Significant management overhead, as there are multiple clusters

 

Pros:

1.      Higher levels of isolation result in better performance and fewer noisy neighbor issues

2.      Easier to set up and operate, as the security posture and policies need not be so complex

3.      An ideal model for customers that demand physically separate code and data

 

Cons:

1.      Higher cost due to multiple clusters

2.      Higher management overhead, even though operating can be easy

 

An alternative vertically partitioned model reduces the management overhead by using a single cluster with dedicated nodepools.
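One way to realize dedicated nodepools is to pin a tenant's Pods to its pool with a node selector and a matching toleration. The sketch below assumes that AKS labels each node with `kubernetes.azure.com/agentpool=<pool name>` and that the tenant's pool was created with a `tenant=<tenant id>:NoSchedule` taint; the taint key and the function name are choices made for this example.

```python
def pin_to_tenant_pool(pod_spec: dict, pool_name: str, tenant_id: str) -> dict:
    """Return a copy of a Pod spec that only schedules onto the tenant's node pool."""
    pinned = dict(pod_spec)
    # Node label assumed to be set by AKS on every node in the pool.
    pinned["nodeSelector"] = {
        **pod_spec.get("nodeSelector", {}),
        "kubernetes.azure.com/agentpool": pool_name,
    }
    # Assumed taint on the pool: tenant=<tenant_id>:NoSchedule.
    # The toleration lets this tenant's Pods in; the taint keeps others out.
    pinned["tolerations"] = list(pod_spec.get("tolerations", [])) + [{
        "key": "tenant",
        "operator": "Equal",
        "value": tenant_id,
        "effect": "NoSchedule",
    }]
    return pinned
```

The selector and the taint do different jobs: the selector keeps the tenant's Pods on its own nodes, while the taint keeps other tenants' Pods off them, so both are needed for the pool to be truly dedicated.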

 

 

 

6.4     Full Isolation Tenancy

 

This is technically not a multitenant model but rather a multi-instance model.


 

This can have the highest cost and management overhead but provides the greatest level of security due to the isolation. There are no noisy neighbor issues, and the impact of one tenant does not affect the others.

 


 

7.     Big Picture

 

More configuration is needed outside of the AKS platform for a successful multitenant implementation, including but not limited to:

 

·       Azure DNS A Record configuration for each tenant

·       Front Door profiles based on tenants

·       API Gateway/LB backend configurations

·       CI/CD pipeline setup

·       Azure Key Vault instances

·       Azure Monitor workspace

·       Azure AD Tenancy and strategy for scale

 

A reference architecture for such an implementation is provided below.

 



 


 

8.     Conclusion

 

While comprehensive, this document is by no means a complete guide to adopting AKS multitenancy within a customer organization, but it attempts to stay close to the truth and serve as a standard based on industry best practices. By weighing these inputs, we feel customers can benefit greatly from the right approach to multitenancy in AKS.

 

It will be interesting to fine-tune this for a multi-vendor situation, but that is a discussion for a different paper.

 

This document will be updated as new practices are standardized. Please reach out to Chandra Ganapathy, Chandra.Ganapathy@Hitachivantara.com, with any questions or comments, or if you would like to have Innersourcing adopted in the customer development process.


#CloudOperations