vROps HA / CA clusters

Have you recently cast your beautiful eyes over the sizing guides for vRealize Operations 8.X? Of course you have; it’s mandatory reading. A real tribute to the works of H.G. Wells & H. P. Lovecraft.

Recently I was reviewing a vROps 8 design and the author had added a section on Continuous Availability (CA). The bit that caught my eye was a comparison table between CA and HA. Mostly specifically the maximum number of nodes. Something didn’t add up. Lets take a closer look:

vRealize Operations 8.0.1 HA Cluster Sizing
Figure 1. vRealize Operations HA Cluster Sizing
vRealize Operations 8.0.1 CA cluster sizing table.
Figure 2. vRealize Operations CA Cluster Sizing

Lets review the design guidance for a CA cluster:

Continuous Availability (CA) allows the cluster nodes to be stretched across two fault domains, with the ability to experience up to one fault domain failure and to recover without causing cluster downtime.  CA requires an equal number of nodes in each fault domain and a witness node, in a third site, to monitor split brain scenarios. 

Every CA fault domain must be balanced, therefore every CA cluster must have at least 2 fault domains.

So far that’s all fairly easy to follow but the numbers don’t align. An extra-large vROps HA cluster has a maximum number of nodes of 6 (the final column in Figure 1). The maximum number of nodes in a single fault domain is 4 (the final column in Figure 2). The minimum number of fault domains in a CA cluster is 2, therefore the total number of nodes in a CA cluster is 8.

Surely this is a mistake?

I asked the Borg collective for assimilation. They said no but did tell me, and I paraphrase:

The maximum number of nodes for vROps 8.X are different depending on if you are using HA or CA although the overall objects / metrics maximums are unchanged.

So, in conclusion, there is no increase in the objects or metrics that a CA cluster can support compared to a HA cluster. So the total supported* capacity remains the same, you can just have more nodes to support the CA fault domain capability.

*Obviously you can go bigger, but VMware support can tell you off.

*EDIT*

I’ve edited this post since it was originally posted to add some additional context and tidy up some of the language.