[ClusterLabs] Help with tweaking an active/passive NFS cluster
Andrei Borzenkov
arvidjaar at gmail.com
Wed Apr 5 03:36:34 EDT 2023
On Fri, Mar 31, 2023 at 12:42 AM Ronny Adsetts
<ronny.adsetts at amazinginternet.com> wrote:
>
> Hi,
>
> I wonder if someone more familiar with the workings of pacemaker/corosync would be able to assist in solving an issue.
>
> I have a 3-node NFS cluster which exports several iSCSI LUNs. The LUNs are presented to the nodes via multipathd.
>
> This all works fine except that I can't stop just one export. Sometimes I need to take a single filesystem offline for maintenance for example. Or if there's an issue and a filesystem goes offline and can't come back.
>
> There's a trimmed down config below but essentially I want all the NFS exports on one node but I don't want any of the exports to block. So it's OK to stop (or fail) a single export.
>
> My config has a group for each export and filesystem and another group for the NFS server and VIP. I then co-locate them together.
>
> Cut-down config to limit the number of exports:
>
> node 1: nfs-01
> node 2: nfs-02
> node 3: nfs-03
> primitive NFSExportAdminHomes exportfs \
> params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/adminhomes" fsid=dcfd1bbb-c026-4d6d-8541-7fc29d6fef1a \
> op monitor timeout=20 interval=10 \
> op_params interval=10
> primitive NFSExportArchive exportfs \
> params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/archive" fsid=3abb6e34-bff2-4896-b8ff-fc1123517359 \
> op monitor timeout=20 interval=10 \
> op_params interval=10 \
> meta target-role=Started
> primitive NFSExportDBBackups exportfs \
> params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/dbbackups" fsid=df58b9c0-593b-45c0-9923-155b3d7d9483 \
> op monitor timeout=20 interval=10 \
> op_params interval=10
> primitive NFSFSAdminHomes Filesystem \
> params device="/dev/mapper/adminhomes-part1" directory="/srv/adminhomes" fstype=xfs \
> op start interval=0 timeout=120 \
> op monitor interval=60 timeout=60 \
> op_params OCF_CHECK_LEVEL=20 \
> op stop interval=0 timeout=240
> primitive NFSFSArchive Filesystem \
> params device="/dev/mapper/archive-part1" directory="/srv/archive" fstype=xfs \
> op start interval=0 timeout=120 \
> op monitor interval=60 timeout=60 \
> op_params OCF_CHECK_LEVEL=20 \
> op stop interval=0 timeout=240 \
> meta target-role=Started
> primitive NFSFSDBBackups Filesystem \
> params device="/dev/mapper/dbbackups-part1" directory="/srv/dbbackups" fstype=xfs \
> op start timeout=60 interval=0 \
> op monitor interval=20 timeout=40 \
> op stop timeout=60 interval=0 \
> op_params OCF_CHECK_LEVEL=20
> primitive NFSIP-01 IPaddr2 \
> params ip=172.16.40.17 cidr_netmask=24 nic=ens14 \
> op monitor interval=30s
> group AdminHomes NFSFSAdminHomes NFSExportAdminHomes \
> meta target-role=Started
> group Archive NFSFSArchive NFSExportArchive \
> meta target-role=Started
> group DBBackups NFSFSDBBackups NFSExportDBBackups \
> meta target-role=Started
> group NFSServerIP NFSIP-01 NFSServer \
> meta target-role=Started
> colocation NFSMaster inf: NFSServerIP AdminHomes Archive DBBackups
This is effectively equivalent to defining a group: all of the resources
must run on the same node, with a strict dependency chain in the listed
order. As with a group, if an earlier resource cannot be started, none of
the following resources are started either.
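For placement purposes that single constraint behaves much like lumping
all of the underlying primitives into one big group, roughly along these
lines (purely illustrative, not a suggested change):

group Everything NFSIP-01 NFSServer \
    NFSFSAdminHomes NFSExportAdminHomes \
    NFSFSArchive NFSExportArchive \
    NFSFSDBBackups NFSExportDBBackups

which is exactly the single chain of dependencies you are trying to get
away from.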
> property cib-bootstrap-options: \
> have-watchdog=false \
> dc-version=2.0.1-9e909a5bdd \
> cluster-infrastructure=corosync \
> cluster-name=nfs-cluster \
> stonith-enabled=false \
> last-lrm-refresh=1675344768
> rsc_defaults rsc-options: \
> resource-stickiness=200
>
>
> The problem is that if one export fails, none of the following exports will be attempted. Reading the docs, that's to be expected as each item in the colocation needs the preceding item to succeed.
>
> I tried changing the colocation line like so to remove the dependency:
>
> colocation NFSMaster inf: NFSServerIP ( AdminHomes Archive DBBackups )
>
1. The ( AdminHomes Archive DBBackups ) part creates a resource set with
sequential=false. The documentation for "sequential" is one of the most
obscure I have seen, but judging by "the individual members within any
one set may or may not be colocated relative to each other (determined
by the set's sequential property)" and "A colocated set with
sequential="false" makes sense only if there is another set in the
constraint. Otherwise, the constraint has no effect", the members of a
set with sequential=false are not colocated with each other on the same
node.
2. The dependency is backwards. You colocate NFSServerIP *with* the set
( AdminHomes Archive DBBackups ), while you actually want to colocate
the set ( AdminHomes Archive DBBackups ) *with* NFSServerIP.
So the following may work:
colocation NFSMaster inf: ( AdminHomes Archive DBBackups ) ( NFSServerIP )
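Either way, it helps to check what the crm shell shorthand actually put
into the CIB, since the resource sets and their sequential flags show up
in the XML. Assuming the constraint keeps the id NFSMaster, something
like:

crm configure show xml NFSMaster

should display the generated rsc_colocation element with its
resource_set children.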
The pacemaker behavior is rather puzzling, though. According to the
documentation, "in order for any member of one set in the constraint to
be active, all members of sets listed after it must also be active (and
naturally on the same node)", but in your case not all members of that
set end up active on the same node as NFSServerIP, which would imply
that NFSServerIP (the sole member of an implicit set) should not be
active at all.
Anyway, an alternative is to define a separate colocation constraint for
each group, which likely makes the configuration clearer.
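Something along these lines (a sketch only; the constraint ids are made
up, and each group is colocated *with* NFSServerIP rather than the other
way around):

colocation AdminHomes-with-nfs inf: AdminHomes NFSServerIP
colocation Archive-with-nfs inf: Archive NFSServerIP
colocation DBBackups-with-nfs inf: DBBackups NFSServerIP

With one constraint per group, stopping or failing any single export
group should no longer drag the others (or NFSServerIP) down with it,
and a restarted group should still follow NFSServerIP.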
> but this gave me two problems:
>
> 1. Issuing a "resource stop DBBackups" took everything offline briefly
>
> 2. Issuing a "resource start DBBackups" brought it back on a different node to NFSServerIP
>
> I'm very obviously missing something here.
>
> Could someone kindly point me in the right direction?
>
> TIA.
>
> Ronny
>
> --
> Ronny Adsetts
> Technical Director
> Amazing Internet Ltd, London
> t: +44 20 8977 8943
> w: www.amazinginternet.com
>
> Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
> Registered in England. Company No. 4042957
>