[ClusterLabs] How to set up "active-active" cluster by balancing multiple exports across servers?
Billy Wilson
billy_wilson at byu.edu
Tue Jan 12 13:04:12 EST 2021
I'm having trouble setting up what seems like it should be a
straightforward NFS-HA design. It is similar to what Christoforos
Christoforou attempted to do earlier in 2020
(https://www.mail-archive.com/users@clusterlabs.org/msg09671.html).
My goal is to balance multiple NFS exports across two nodes to
effectively have an "active-active" configuration. Each export should
only be available from one node at a time, but they should be able to
freely fail back and forth to balance between the two nodes.
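In my head, the balancing would be expressed with opposite location
preferences plus some stickiness, roughly like this (untested sketch;
scores are illustrative):
"""
# Prefer opposite nodes so the two groups normally split across the pair
pcs constraint location ha1 prefers host1=100
pcs constraint location ha2 prefers host2=100
# Keep stickiness below the preference so a group fails back once its
# preferred node recovers
pcs resource defaults resource-stickiness=50
"""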
I'm also hoping to isolate each exported filesystem to its own set of
underlying disks, to prevent heavy IO on one exported filesystem from
affecting another one. So each filesystem to be exported should be
backed by a unique volume group.
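Concretely, each export would get its own VG on its own disks, along
these lines (device names are placeholders):
"""
# One volume group per export, carved from dedicated disks
vgcreate alice_vg /dev/mapper/alice_disk
lvcreate -l 100%FREE -n alice_lv alice_vg
mkfs.xfs /dev/alice_vg/alice_lv

vgcreate bob_vg /dev/mapper/bob_disk
lvcreate -l 100%FREE -n bob_lv bob_vg
mkfs.xfs /dev/bob_vg/bob_lv
"""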
I've set up two nodes with fencing, an ethmonitor clone, and the
following two resource groups.
"""
* Resource Group: ha1:
* alice_lvm (ocf::heartbeat:LVM-activate): Started host1
* alice_xfs (ocf::heartbeat:Filesystem): Started host1
* alice_nfs (ocf::heartbeat:nfsserver): Started host1
* alice_ip (ocf::heartbeat:IPaddr2): Started host1
* alice_nfsnotify (ocf::heartbeat:nfsnotify): Started host1
* alice_login01 (ocf::heartbeat:exportfs): Started host1
* alice_login02 (ocf::heartbeat:exportfs): Started host1
* Resource Group: ha2:
* bob_lvm (ocf::heartbeat:LVM-activate): Started host2
* bob_xfs (ocf::heartbeat:Filesystem): Started host2
* bob_nfs (ocf::heartbeat:nfsserver): Started host2
* bob_ip (ocf::heartbeat:IPaddr2): Started host2
* bob_nfsnotify (ocf::heartbeat:nfsnotify): Started host2
* bob_login01 (ocf::heartbeat:exportfs): Started host2
* bob_login02 (ocf::heartbeat:exportfs): Started host2
"""
We had an older storage appliance that used Red Hat HA on RHEL 6 (back
when it still used RGManager and not Pacemaker), and it was capable of
load-balanced NFS-HA like this.
The problem with this approach using Pacemaker is that the "nfsserver"
resource agent only wants one instance per host. During a failover
event, both "nfsserver" RAs will try to bind-mount their own NFS
shared-info directory onto /var/lib/nfs/, and only one of them can
claim it.
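As far as I can tell, the start logic effectively boils down to this
(paraphrased, not the actual RA code):
"""
# Each nfsserver instance bind-mounts its own shared-info directory
# over the single system-wide /var/lib/nfs
mount --bind /exports/alice/nfsinfo /var/lib/nfs   # first instance wins
mount --bind /exports/bob/nfsinfo /var/lib/nfs     # shadows the first
"""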
If I convert everything to a single resource group as Christoforos did,
then the cluster is active-passive, and all the resources fail as a
single unit. Having one node serve all the exports while the other
sits idle isn't ideal.
I'd like to eventually have something like this:
"""
* Resource Group: ha1:
* alice_lvm (ocf::heartbeat:LVM-activate): Started host1
* alice_xfs (ocf::heartbeat:Filesystem): Started host1
* charlie_lvm (ocf::heartbeat:LVM-activate): Started host1
* charlie_xfs (ocf::heartbeat:Filesystem): Started host1
* ha1_nfs (ocf::heartbeat:nfsserver): Started host1
* alice_ip (ocf::heartbeat:IPaddr2): Started host1
* charlie_ip (ocf::heartbeat:IPaddr2): Started host1
* ha1_nfsnotify (ocf::heartbeat:nfsnotify): Started host1
* alice_login01 (ocf::heartbeat:exportfs): Started host1
* alice_login02 (ocf::heartbeat:exportfs): Started host1
* charlie_login01 (ocf::heartbeat:exportfs): Started host1
* charlie_login02 (ocf::heartbeat:exportfs): Started host1
* Resource Group: ha2:
* bob_lvm (ocf::heartbeat:LVM-activate): Started host2
* bob_xfs (ocf::heartbeat:Filesystem): Started host2
* david_lvm (ocf::heartbeat:LVM-activate): Started host2
* david_xfs (ocf::heartbeat:Filesystem): Started host2
* ha2_nfs (ocf::heartbeat:nfsserver): Started host2
* bob_ip (ocf::heartbeat:IPaddr2): Started host2
* david_ip (ocf::heartbeat:IPaddr2): Started host2
* ha2_nfsnotify (ocf::heartbeat:nfsnotify): Started host2
* bob_login01 (ocf::heartbeat:exportfs): Started host2
* bob_login02 (ocf::heartbeat:exportfs): Started host2
* david_login01 (ocf::heartbeat:exportfs): Started host2
* david_login02 (ocf::heartbeat:exportfs): Started host2
"""
Or even this:
"""
* Resource Group: alice_research:
* alice_lvm (ocf::heartbeat:LVM-activate): Started host1
* alice_xfs (ocf::heartbeat:Filesystem): Started host1
* alice_nfs (ocf::heartbeat:nfsserver): Started host1
* alice_ip (ocf::heartbeat:IPaddr2): Started host1
* alice_nfsnotify (ocf::heartbeat:nfsnotify): Started host1
* alice_login01 (ocf::heartbeat:exportfs): Started host1
* alice_login02 (ocf::heartbeat:exportfs): Started host1
* Resource Group: charlie_research:
* charlie_lvm (ocf::heartbeat:LVM-activate): Started host1
* charlie_xfs (ocf::heartbeat:Filesystem): Started host1
* charlie_nfs (ocf::heartbeat:nfsserver): Started host1
* charlie_ip (ocf::heartbeat:IPaddr2): Started host1
* charlie_nfsnotify (ocf::heartbeat:nfsnotify): Started host1
* charlie_login01 (ocf::heartbeat:exportfs): Started host1
* charlie_login02 (ocf::heartbeat:exportfs): Started host1
* Resource Group: bob_research:
* bob_lvm (ocf::heartbeat:LVM-activate): Started host2
* bob_xfs (ocf::heartbeat:Filesystem): Started host2
* bob_nfs (ocf::heartbeat:nfsserver): Started host2
* bob_ip (ocf::heartbeat:IPaddr2): Started host2
* bob_nfsnotify (ocf::heartbeat:nfsnotify): Started host2
* bob_login01 (ocf::heartbeat:exportfs): Started host2
* bob_login02 (ocf::heartbeat:exportfs): Started host2
* Resource Group: david_research:
* david_lvm (ocf::heartbeat:LVM-activate): Started host2
* david_xfs (ocf::heartbeat:Filesystem): Started host2
* david_nfs (ocf::heartbeat:nfsserver): Started host2
* david_ip (ocf::heartbeat:IPaddr2): Started host2
* david_nfsnotify (ocf::heartbeat:nfsnotify): Started host2
* david_login01 (ocf::heartbeat:exportfs): Started host2
* david_login02 (ocf::heartbeat:exportfs): Started host2
"""
Is there a way to have a load-balanced NFS-HA solution with
Pacemaker/Corosync? Can I make a clone set of the nfsserver resource
while the rest of the resources fail back and forth, or find some
other workaround? Do I
need to modify the existing resource agent?
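To make the clone idea concrete, I'm picturing something like this,
though I have no idea whether the agent tolerates it (entirely
untested; resource names are made up):
"""
# One cloned NFS daemon per node; each export group is colocated with
# and ordered after the clone instead of carrying its own nfsserver
pcs resource create nfsd ocf:heartbeat:nfsserver clone
pcs constraint order start nfsd-clone then start ha1
pcs constraint order start nfsd-clone then start ha2
pcs constraint colocation add ha1 with nfsd-clone
pcs constraint colocation add ha2 with nfsd-clone
"""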
Thanks,
Billy