Configuring a resilient infrastructure is key to configuring SQL Server for high availability (HA). Whether you’re configuring an on-prem, cloud, or hybrid infrastructure, there are certain issues you need to address to ensure SQL Server availability 99.99% of the time or more—and they may not be the issues that first come to mind. While determining the right geographic distribution of physical or virtual servers is obviously important, just as important is ensuring that the individual instances of SQL Server running on those servers have ready access to the data in your databases.
Turns out that ensuring that can be challenging.
Different Storage Options for Different Environments
When configuring SQL Server Failover Cluster Instances (FCI), storage can be shared in an on-premises infrastructure configured for HA, so ensuring access to the SQL databases is relatively straightforward. If the primary server in the FCI goes offline, the cluster will failover to a secondary server, which will then be granted control of the shared storage (which often resides on a storage area network [SAN], an SMB share, or a shared VHDX, or some other familiar on-prem storage solution that supports SQL Server FCIs.
But these kinds of shared storage options don’t work in the cloud (or in a hybrid configuration) when the goal is to configure a SQL Server FCI for 99.99% availability. Shared storage options like Storage Spaces Direct, AWS FSx, or Azure Files all have constraints that preclude an availability guarantee above 99.95%. Microsoft recently announced Azure shared disks, but right now it is unclear whether this option can support an infrastructure designed for 99.99% availability. If all the VMs and Azure shared disks reside in the same datacenter, your availability SLAs may top out at 99.95%.
The good news is that there are ways to configure SQL Server FCIs in the cloud for 99.99% availability. Moreover, the configuration approaches that enable a highly available SQL Server FCI can also work on-prem or in a hybrid configuration — so configuring for HA with the cloud in mind, even if you’re deploying on-prem, puts you in a position to migrate your highly available SQL Server instance to the cloud if in the future you want to.
The Always On Availability Group Approach
One way to configure HA for your cloud-based SQL Server infrastructure is to create a SQL Server Always On Availability Group (AG). You would create an FCI whose VMs reside in at least two geographically distinct datacenters. You would then identify the databases you want to add to the availability group(s). When activated, the AG replicates your protected databases from the primary VM to the secondary VMs. If the primary VM goes offline, users are automatically redirected to the secondary VM where a secondary instance of SQL Server can begin working with its copy of the protected databases.
One advantage to using Always On Availability Groups is that the feature is native to SQL Server Enterprise Edition 2012 and later. But there lies one disadvantage, too: It requires use of the pricy Enterprise edition, which may be overkill for your application. The Standard edition of SQL Server 2016 and later supports Always On Basic Availability Groups, but the “basic” AG option supports only a single database per AG. There’s also a second disadvantage to using either flavor of AG: AGs replicate only the user-defined databases. The SQL Server system databases — containing passwords, jobs, agents, and so forth — are not replicated to the secondary VMs. This can create problems for you if the primary remains offline for any length of time.
The SANLess Cluster Approach
A second way to configure HA storage for your cloud-based SQL Server infrastructure is to create what’s called a SANless cluster. As with the AG approach, you’d create an FCI built of VMs in multiple datacenters. Instead of configuring shared storage as you would in an on-prem FCI, though, you’d employ a third-party product to manage storage for the cluster. In a SANless cluster configuration, the third-party software provides block-level replication of the SQL databases from the primary to the secondary cluster nodes. Unlike in an AG, though, the SANless cluster approach replicates all the databases to each node. And, unlike in an AG, you can configure a SANless cluster solution, supporting any number of databases, using either SQL Server Enterprise Edition or Standard Edition from SQL Server 2008 through 2019.
Building the right SQL Server high availability solution is key to configuring a SQL Server infrastructure that will deliver database availability of 99.99% or higher. But just remember that the approach that works on-prem does not always work in the cloud—while what works in the cloud will work just fine both on-prem and in a hybrid HA configuration.
By Dave Bermingham, Senior Technical Evangelist, SIOS Technology
Dave Bermingham is the Senior Technical Evangelist at SIOS Technology. He is recognized within the technology community as a high-availability expert and has been honored to be elected a Microsoft MVP for the past 10 years: 6 years as a Cluster MVP and 4 years as a Cloud and Datacenter Management MVP. Dave holds numerous technical certifications and has more than thirty years of IT experience, including in finance, healthcare, and education.