Azure provides many options for high availability and resilience in its services. Two such Azure features are availability zones and availability sets. In this article, you will learn about the differences between availability zones and sets and how each applies to your Azure environment.

Understanding Azure Availability Zones

Before getting started with availability zones, it is essential to understand what an Azure region is. An Azure region is a group of multiple data centers connected through a dedicated, low-latency network. Microsoft groups these data centers into multiple regions worldwide to provide Azure and other cloud services.

Examine the picture below. An Azure region contains multiple data centers in a geographic region. These data centers are connected through a dedicated network.

diagram showing data centers in an azure region

Within an Azure region that supports availability zones, Microsoft creates a minimum of three separate availability zones. An availability zone is one or more data centers grouped within an Azure region. Each zone consists of independent power, cooling, and networking infrastructure. If an issue affects one zone within an Azure region, the remaining zones in the region continue to provide the necessary capacity and infrastructure to keep the services online.

The availability zone network connectivity is a high-performance network with a round-trip latency of less than 2 ms. This network is responsible for keeping data synchronized between the zones and staying available when an issue occurs.

An updated Azure region diagram shows multiple data centers making up each availability zone.

diagram showing availability zones in an azure region

Azure services that are availability zone aware provide a higher level of resiliency and flexibility. You can configure these services in one of two ways.

First, the service can be zone redundant, meaning the service and data are automatically replicated across the availability zones in the region. Some zone redundant services include Azure Key Vault, Azure Storage accounts, and Azure virtual networks.

Second, the service can be zonal, where an instance of the service is pinned to a specific zone. Zonal services include virtual machines, Azure Backup, and Azure Site Recovery. Some services can be either zone redundant or zonal, such as managed disks, virtual machine scale sets, and application gateways (V2).

Understanding Azure Availability Sets

While availability zones apply to multiple Azure services, availability sets only apply to virtual machines. Availability sets enable Azure to understand how your application is built to provide a higher level of redundancy and availability. Two or more virtual machines in an availability set meet the requirements for a 99.95% Azure service-level agreement (SLA).
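To put the 99.95% figure in perspective, the short sketch below converts an SLA percentage into the downtime it permits. The function name and the 30-day and 365-day periods are illustrative assumptions, not part of any Azure tooling, and the actual SLA terms define how downtime is measured.

```python
# Sketch: convert an SLA percentage into a maximum downtime budget.
# The helper name and the chosen periods are illustrative assumptions.
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(sla_percent: float, days: int) -> float:
    """Return the downtime (in minutes) that an SLA permits over `days` days."""
    unavailable_fraction = 1 - sla_percent / 100
    return unavailable_fraction * days * MINUTES_PER_DAY


monthly = allowed_downtime_minutes(99.95, days=30)
yearly = allowed_downtime_minutes(99.95, days=365)

print(f"30-day month: {monthly:.1f} minutes")
print(f"365-day year: {yearly / 60:.2f} hours")
```

Under these assumptions, a 99.95% SLA leaves roughly 21.6 minutes of downtime in a 30-day month.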

Virtual machines in Azure are no different than virtual machines hosted on a virtualization platform in your on-premises data center. The virtual machine resides on a primary host server that provides CPU, memory, and disk space for the virtual machine. The server can host multiple virtual machines depending on its resource capabilities.

Let’s say you have three virtual machines hosting an application. If all three virtual machines are on the same host server, and that host server experiences a failure, the entire application goes down. All three virtual machines are no longer available since the host server is also down.

Azure availability sets ensure that virtual machines that provide a common service are not placed on the same host server. This redundancy is accomplished through update domains and fault domains.

You must place virtual machines in an availability set when you create them. You cannot add existing virtual machines to an availability set. Make the availability set first, then create the virtual machine and assign it to the availability set.

Update domains

Update domains are groups of virtual machines and the corresponding physical hardware (or host server) that can be rebooted simultaneously. When Microsoft performs planned maintenance on these physical hosts, only one update domain is rebooted at a time. This scheduling ensures that multiple hosts supporting virtual machines for the same application are not taken offline at the same time. Availability sets can have up to twenty update domains.

Examine the picture below. Your availability set has three virtual machines (VM1, VM2, and VM3) placed across three physical host servers or update domains (UD1, UD2, and UD3). This ensures that most virtual machines in an availability set are not taken offline during maintenance or an outage with a physical host.

diagram of virtual machines in update domains
Virtual machines placed across update domains

Fault domains

Servers are placed in racks that share common power sources and network switches. If a rack experiences a power outage or switch failure, all the physical hosts in that rack and the virtual machines they host can be taken offline. Fault domains separate virtual machines in an availability set across multiple racks of servers to avoid this issue.

For example, in the update domain diagram above, VM2 and VM3 are on physical hosts inside Server Rack 2. If Server Rack 2 experiences a hardware or power failure, two of three virtual machines in the availability set are taken offline.

Examine the updated diagram below. A third rack has been added with VM3 moved to it. This ensures that a single rack failure does not take the majority of virtual machines in the availability set offline.

Virtual machines placed across fault domains
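The effect of adding the third rack can be checked with a small sketch that counts how many virtual machines a single rack failure would take offline. The layouts mirror the two diagrams described above; placing VM1 in Rack 1 is an assumption (the text only states where VM2 and VM3 sit), and the function name is hypothetical.

```python
from collections import Counter


def worst_case_offline(vm_to_rack: dict[str, str]) -> int:
    """Return the largest number of VMs lost if any single rack fails."""
    return max(Counter(vm_to_rack.values()).values())


# Layout before the change: VM2 and VM3 share Server Rack 2.
# VM1 in Rack 1 is an assumption made for this illustration.
before = {"VM1": "Rack 1", "VM2": "Rack 2", "VM3": "Rack 2"}

# Layout after the change: VM3 moved to a third rack.
after = {"VM1": "Rack 1", "VM2": "Rack 2", "VM3": "Rack 3"}

print(worst_case_offline(before), "of", len(before), "VMs offline")  # 2 of 3
print(worst_case_offline(after), "of", len(after), "VMs offline")    # 1 of 3
```

With one virtual machine per rack, no single rack failure can remove more than one of the three virtual machines.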

Virtual machines also use disk fault domains. Managed disks attached to virtual machines reside in the same fault domains. Virtual machines in an availability set must have managed disks.

Virtual Machine Domain Placement

Availability sets can have up to twenty update domains and three fault domains. If you have more virtual machines than there are domains available, Azure assigns the virtual machines to domains in sequential order, returning to the first domain after the last one has been used.

For example, you create an availability set with three fault domains and five update domains. If you deploy ten virtual machines, here is how Azure places them across the different domains.

Virtual Machine    Fault Domain (3)    Update Domain (5)
VM01               1                   1
VM02               2                   2
VM03               3                   3
VM04               1                   4
VM05               2                   5
VM06               3                   1
VM07               1                   2
VM08               2                   3
VM09               3                   4
VM10               1                   5
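The sequential pattern in the table can be reproduced with a short sketch. This is only a model of the example above, not Azure's actual placement logic, and the function name `place_vms` is hypothetical.

```python
def place_vms(vm_count: int, fault_domains: int, update_domains: int):
    """Assign VMs to fault and update domains in sequential, wrap-around order.

    Returns a list of (vm_name, fault_domain, update_domain) tuples,
    with domains numbered from 1 as in the table above.
    """
    placements = []
    for i in range(vm_count):
        name = f"VM{i + 1:02d}"
        fault_domain = i % fault_domains + 1    # wraps after the last fault domain
        update_domain = i % update_domains + 1  # wraps after the last update domain
        placements.append((name, fault_domain, update_domain))
    return placements


for name, fd, ud in place_vms(10, fault_domains=3, update_domains=5):
    print(f"{name}  FD{fd}  UD{ud}")
```

Because the two counters wrap independently, VM04 lands back in fault domain 1 while continuing to update domain 4, which matches the fourth row of the table.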

Here is an updated diagram showing the above solution.

Availability set diagram with virtual machines placed in update domains and fault domains
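As a rough check on this layout, the sketch below counts how many of the ten virtual machines share each domain, which gives the worst-case impact of one rack failure or one maintenance reboot. It assumes the same sequential, wrap-around assignment as the example; the constant and variable names are illustrative.

```python
from collections import Counter

# Values from the example: three fault domains, five update domains, ten VMs.
FAULT_DOMAINS = 3
UPDATE_DOMAINS = 5
VM_COUNT = 10

# Count VMs per domain, assuming sequential wrap-around assignment.
fd_load = Counter(i % FAULT_DOMAINS + 1 for i in range(VM_COUNT))
ud_load = Counter(i % UPDATE_DOMAINS + 1 for i in range(VM_COUNT))

worst_fd = max(fd_load.values())  # VMs lost if the busiest rack fails
worst_ud = max(ud_load.values())  # VMs rebooted in one maintenance step

print("VMs per fault domain:", dict(fd_load))
print("VMs per update domain:", dict(ud_load))
print(f"Single rack failure leaves at least {VM_COUNT - worst_fd} VMs online")
print(f"Single update domain reboot leaves at least {VM_COUNT - worst_ud} VMs online")
```

In this model, fault domain 1 holds four virtual machines and each update domain holds two, so at least six virtual machines survive a rack failure and at least eight stay online during a maintenance reboot.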

FAQs

What is an availability zone?

An availability zone is one or more data centers grouped within an Azure region. Each zone consists of independent power, cooling, and networking infrastructure.

What is an availability set?

Availability sets enable Azure to understand how your application is built to provide a higher level of redundancy and availability. Availability sets place virtual machines across multiple update domains and fault domains.

What is a fault domain?

Fault domains separate virtual machines in an availability set across multiple racks of servers so that a single rack failure does not take all of them offline. Each server rack has independent power and networking hardware. Availability sets can have up to 3 fault domains.

What is an update domain?

Update domains are groups of virtual machines and the corresponding physical hardware (or host server) that can be rebooted simultaneously. Availability sets can have up to 20 update domains.

Summary

Availability zones provide high availability for many services by grouping data centers in a region with separate cooling, networking, and power. Availability sets only apply to virtual machines, ensuring that the virtual machines are not all placed on the same physical host or in the same physical rack. Knowing when to apply availability sets versus availability zones can increase your infrastructure availability when planning out your Azure solution.
