[ClusterLabs] How to set up "active-active" cluster by balancing multiple exports across servers?

Vladislav Bogdanov bubble at hoster-ok.com
Wed Jan 13 11:46:05 EST 2021


Hi.

I would run nfsserver and nfsnotify as a separate cloned group and make
both of the other groups colocated/ordered with it. That way the NFS
server is just a per-host service, and you attach the exports (with
their LVs, filesystems, and IP addresses) to it.
The NFS server in Linux is an in-kernel creature, not a userspace
process, and it is not designed to have several instances bound to
different addresses. But with the approach above you can work around
that.
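
Something along these lines with pcs, as a rough, untested sketch (the
resource and group names and the nfs_shared_infodir path are just
examples, and your pcs version may name the clone differently):

"""
# Per-host NFS server + notify, cloned so one instance runs on each node
pcs resource create nfs_daemon ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/srv/nfsinfo --group nfs_base
pcs resource create nfs_notify ocf:heartbeat:nfsnotify --group nfs_base
pcs resource clone nfs_base

# Keep only the LVM-activate/Filesystem/IPaddr2/exportfs resources in
# the per-export groups, then tie each group to the clone:
pcs constraint order start nfs_base-clone then ha1
pcs constraint colocation add ha1 with nfs_base-clone INFINITY
pcs constraint order start nfs_base-clone then ha2
pcs constraint colocation add ha2 with nfs_base-clone INFINITY
"""

With that in place, ha1 and ha2 can fail back and forth independently,
while every node always runs its own NFS server.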

On Tue, 2021-01-12 at 11:04 -0700, Billy Wilson wrote:
> I'm having trouble setting up what seems like it should be a
> straightforward NFS-HA design. It is similar to what Christoforos 
> Christoforou attempted to do earlier in 2020 
> (https://www.mail-archive.com/users@clusterlabs.org/msg09671.html).
> 
> My goal is to balance multiple NFS exports across two nodes to 
> effectively have an "active-active" configuration. Each export should
> only be available from one node at a time, but they should be able to
> freely fail back and forth to balance between the two nodes.
> 
> I'm also hoping to isolate each exported filesystem to its own set of
> underlying disks, to prevent heavy IO on one exported filesystem from
> affecting another one. So each filesystem to be exported should be 
> backed by a unique volume group.
> 
> I've set up two nodes with fencing, an ethmonitor clone, and the 
> following two resource groups.
> 
> """
>    * Resource Group: ha1:
>      * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
>      * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
>      * alice_nfs    (ocf::heartbeat:nfsserver):    Started host1
>      * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
>      * alice_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
>      * alice_login01    (ocf::heartbeat:exportfs):    Started host1
>      * alice_login02    (ocf::heartbeat:exportfs):    Started host1
>    * Resource Group: ha2:
>      * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
>      * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
>      * bob_nfs    (ocf::heartbeat:nfsserver):    Started host2
>      * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
>      * bob_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
>      * bob_login01    (ocf::heartbeat:exportfs):    Started host2
>      * bob_login02    (ocf::heartbeat:exportfs):    Started host2
> """
> 
> We had an older storage appliance that used Red Hat HA on RHEL 6 (back
> when it still used RGManager and not Pacemaker), and it was capable of
> load-balanced NFS-HA like this.
> 
> The problem with this approach using Pacemaker is that the "nfsserver"
> resource agent only wants one instance per host. During a failover
> event, both "nfsserver" RAs will try to bind mount the NFS shared info
> directory to /var/lib/nfs/. Only one will claim the directory.
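> 
> For illustration, each instance effectively does something like the
> following on start (the paths here are made up), so the second mount
> simply stacks on top of the first and hides it:
> 
> """
> mount --bind /srv/alice_state/nfsinfo /var/lib/nfs  # first instance claims it
> mount --bind /srv/bob_state/nfsinfo /var/lib/nfs    # shadows the first mount
> """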
> 
> If I convert everything to a single resource group as Christoforos did,
> then the cluster is active-passive, and all the resources fail over as
> a single unit. Having one node serve all the exports while the other
> sits idle doesn't seem ideal.
> 
> I'd like to eventually have something like this:
> 
> """
>    * Resource Group: ha1:
>      * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
>      * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
>      * charlie_lvm    (ocf::heartbeat:LVM-activate):    Started host1
>      * charlie_xfs    (ocf::heartbeat:Filesystem):    Started host1
>      * ha1_nfs    (ocf::heartbeat:nfsserver):    Started host1
>      * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
>      * charlie_ip    (ocf::heartbeat:IPaddr2):    Started host1
>      * ha1_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
>      * alice_login01    (ocf::heartbeat:exportfs):    Started host1
>      * alice_login02    (ocf::heartbeat:exportfs):    Started host1
>      * charlie_login01    (ocf::heartbeat:exportfs):    Started host1
>      * charlie_login02    (ocf::heartbeat:exportfs):    Started host1
>    * Resource Group: ha2:
>      * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
>      * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
>      * david_lvm    (ocf::heartbeat:LVM-activate):    Started host2
>      * david_xfs    (ocf::heartbeat:Filesystem):    Started host2
>      * ha2_nfs    (ocf::heartbeat:nfsserver):    Started host2
>      * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
>      * david_ip    (ocf::heartbeat:IPaddr2):    Started host2
>      * ha2_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
>      * bob_login01    (ocf::heartbeat:exportfs):    Started host2
>      * bob_login02    (ocf::heartbeat:exportfs):    Started host2
>      * david_login01    (ocf::heartbeat:exportfs):    Started host2
>      * david_login02    (ocf::heartbeat:exportfs):    Started host2
> """
> 
> Or even this:
> 
> """
>    * Resource Group: alice_research:
>      * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
>      * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
>      * alice_nfs    (ocf::heartbeat:nfsserver):    Started host1
>      * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
>      * alice_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
>      * alice_login01    (ocf::heartbeat:exportfs):    Started host1
>      * alice_login02    (ocf::heartbeat:exportfs):    Started host1
>    * Resource Group: charlie_research:
>      * charlie_lvm    (ocf::heartbeat:LVM-activate):    Started host1
>      * charlie_xfs    (ocf::heartbeat:Filesystem):    Started host1
>      * charlie_nfs    (ocf::heartbeat:nfsserver):    Started host1
>      * charlie_ip    (ocf::heartbeat:IPaddr2):    Started host1
>      * charlie_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
>      * charlie_login01    (ocf::heartbeat:exportfs):    Started host1
>      * charlie_login02    (ocf::heartbeat:exportfs):    Started host1
>    * Resource Group: bob_research:
>      * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
>      * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
>      * bob_nfs    (ocf::heartbeat:nfsserver):    Started host2
>      * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
>      * bob_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
>      * bob_login01    (ocf::heartbeat:exportfs):    Started host2
>      * bob_login02    (ocf::heartbeat:exportfs):    Started host2
>    * Resource Group: david_research:
>      * david_lvm    (ocf::heartbeat:LVM-activate):    Started host2
>      * david_xfs    (ocf::heartbeat:Filesystem):    Started host2
>      * david_nfs    (ocf::heartbeat:nfsserver):    Started host2
>      * david_ip    (ocf::heartbeat:IPaddr2):    Started host2
>      * david_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
>      * david_login01    (ocf::heartbeat:exportfs):    Started host2
>      * david_login02    (ocf::heartbeat:exportfs):    Started host2
> """
> 
> Is there a way to have a load-balanced NFS-HA solution with
> Pacemaker/Corosync? Can I make a clone set of the nfsserver resource
> while the rest fail back and forth, or find some other workaround? Do I
> need to modify the existing resource agent?
> 
> 
> Thanks,
> Billy



