[ClusterLabs] Resolving cart before the horse with mounted filesystems.

Mon May 3 02:10:21 EDT 2021

On 03.05.2021 06:27, Matthew Schumacher wrote:
> On 4/30/21 12:08 PM, Matthew Schumacher wrote:
>> On 4/30/21 11:51 AM, Ken Gaillot wrote:
>>> On Fri, 2021-04-30 at 16:20 +0000, Strahil Nikolov wrote:
>>>> Ken meant to use a 'Filesystem' resource for mounting that NFS server
>>>> and then clone that resource.
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>
>> I'm currently working on understanding and implementing this
>> suggestion from Andrei:
>>
>> Which is exactly what clones are for. Clone NFS mount and order
>> VirtualDomain after clone. Just do not forget to set interleave=true so
>> VirtualDomain considers only local clone instance.
> 
> I tried to use this config, but it's not working for me.
> 
> I have a group that puts together a ZFS mount (which starts an NFS
> share), configures some iscsi stuff, and binds a failover IP address:
> 
> group IP-ZFS-iSCSI fence-datastore zfs-datastore ZFSiSCSI failover-ip
> 
> Then, I made a mount to that NFS server as a resource:
> 
> primitive mount-datastore-nfs Filesystem \
>     params device="<ip>:/datastore" directory="/datastore" fstype=nfs \
>     op monitor timeout=40s interval=20s
> 
> Then I made a clone of this:
> 
> clone clone-mount-datastore-nfs mount-datastore-nfs \
>     meta interleave=true target-role=Started
> 
> So, in theory, the ZFS/NFS server is mounted on all of the nodes with
> the clone config.  Now I define some orders to make sure stuff comes up
> in order:
> 
> order mount-datastore-before-vm-testvm Mandatory: \
>     clone-mount-datastore-nfs vm-testvm
> order zfs-datastore-before-mount-datastore Mandatory: \
>     IP-ZFS-iSCSI clone-mount-datastore-nfs
> 
> In theory, when a node comes online, it should check that IP-ZFS-iSCSI
> is running somewhere in the cluster, then check that the local instance
> of mount-datastore-nfs has the NFS mount we need, and only then start
> vm-testvm.  That doesn't work, however.  If I kill pacemaker on one
> node, the node is fenced and rebooted, and when it comes back I see
> this in the log:
> 
> 
> # grep -v  pacemaker /var/log/pacemaker/pacemaker.log
> May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: Configuration file /datastore/vm/testvm/testvm.xml not readable during probe.
> May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: environment is invalid, resource considered stopped
> May 03 03:02:42  Filesystem(mount-datastore-nfs)[1442]:    INFO: Running start for 172.25.253.110:/dev/datastore-nfs-stub on /datastore
> May 03 03:02:45  VirtualDomain(vm-testvm)[2576]:    INFO: Virtual domain testvm currently has no state, retrying.
> May 03 03:02:46  VirtualDomain(vm-testvm)[2576]:    INFO: Domain testvm already stopped.
> 

It is impossible to comment based on a couple of random lines from the
log. You need to provide the full logs from the DC and from the node in
question, starting from the moment pacemaker was restarted.

But the obvious answer is that pacemaker runs probes when it starts, and
these probes run asynchronously, so a probe of vm-testvm can run before
/datastore is mounted. So this may simply be the output of the resource
agent performing a probe, in which case the result is correct: the probe
found that the domain was not running.
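
To make that reading of the excerpt concrete, here is a rough Python sketch
that groups the quoted lines by resource agent invocation (agent, resource,
PID) and flags an invocation as a probe when one of its messages says so.
The line layout in the regular expression and the "during probe" heuristic
are assumptions taken from these five lines only, and the helper names are
made up for illustration; this is not a general pacemaker log parser.

```python
import re

# Sample lines copied from the quoted excerpt (wrapped lines rejoined).
LOG = """\
May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: Configuration file /datastore/vm/testvm/testvm.xml not readable during probe.
May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: environment is invalid, resource considered stopped
May 03 03:02:42  Filesystem(mount-datastore-nfs)[1442]:    INFO: Running start for 172.25.253.110:/dev/datastore-nfs-stub on /datastore
May 03 03:02:45  VirtualDomain(vm-testvm)[2576]:    INFO: Virtual domain testvm currently has no state, retrying.
May 03 03:02:46  VirtualDomain(vm-testvm)[2576]:    INFO: Domain testvm already stopped.
"""

# Assumed layout: "<Mon DD HH:MM:SS>  <Agent>(<resource>)[<pid>]:  <LEVEL>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<agent>\w+)\((?P<rsc>[^)]+)\)\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>\w+):\s+(?P<msg>.*)$"
)


def group_by_invocation(text):
    """Group agent log lines by (agent, resource, pid), in order of first appearance."""
    groups = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that does not look like an agent line
        key = (m["agent"], m["rsc"], int(m["pid"]))
        groups.setdefault(key, []).append((m["ts"], m["msg"]))
    return groups


def looks_like_probe(entries):
    """Heuristic (assumption): the agent itself mentions 'during probe'."""
    return any("during probe" in msg for _, msg in entries)


for (agent, rsc, pid), entries in group_by_invocation(LOG).items():
    kind = "probe" if looks_like_probe(entries) else "other operation"
    first_ts = entries[0][0]
    print(f"{first_ts}  {agent}({rsc}) pid={pid}: {kind}")
```

On these lines the only VirtualDomain invocation that runs before the
Filesystem start at 03:02:42 is PID 1300 at 03:02:41, and it is the one
that identifies itself as a probe. The second VirtualDomain invocation
(PID 2576) comes after the mount was started. Which operation PID 2576
actually was cannot be told from the excerpt, which is why the full logs
are needed.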

> Looks like the VirtualDomain resource vm-testvm is started before the
> Filesystem resource clone-mount-datastore-nfs even though I have this:
> 
> order mount-datastore-before-vm-testvm Mandatory: \
>     clone-mount-datastore-nfs vm-testvm
> 
> I'm not sure what I'm missing.  I need to make sure this NFS mount is
> started on the local node before starting the VirtualDomain resource on
> that same node.  Should I use the resource instead of the clone in the
> order statement?  Like this:
> 
> order mount-datastore-before-vm-testvm Mandatory: \
>     mount-datastore-nfs vm-testvm
> 
> Any suggestions appreciated.
> 
> Matt