[ClusterLabs] resource fails manual failover

Tue Dec 12 10:17:13 EST 2023

On Tue, 2023-12-12 at 16:50 +0300, Artem wrote:
> Is there a detailed explanation for resource monitor and start
> timeouts and intervals with examples, for dummies?

No, though Pacemaker Explained has some reference information:

https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#resource-operations

> 
> My resource is configured as follows:
> [root@lustre-mds1 ~]# pcs resource show MDT00
> Warning: This command is deprecated and will be removed. Please use
> 'pcs resource config' instead.
> Resource: MDT00 (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: MDT00-instance_attributes
>     device=/dev/mapper/mds00
>     directory=/lustre/mds00
>     force_unmount=safe
>     fstype=lustre
>   Operations:
>     monitor: MDT00-monitor-interval-20s
>       interval=20s
>       timeout=40s
>     start: MDT00-start-interval-0s
>       interval=0s
>       timeout=60s
>     stop: MDT00-stop-interval-0s
>       interval=0s
>       timeout=60s
> 
> I triggered a manual failover with the following command:
> crm_resource --move -r MDT00 -H lustre-mds1
> 
> The resource tried to start there but moved back, with entries like
> these in pacemaker.log:
> Dec 12 15:53:23  Filesystem(MDT00)[1886100]:    INFO: Running start for /dev/mapper/mds00 on /lustre/mds00
> Dec 12 15:53:45  Filesystem(MDT00)[1886100]:    ERROR: Couldn't mount device [/dev/mapper/mds00] as /lustre/mds00
> 
> I tried again with the same result:
> Dec 12 16:11:04  Filesystem(MDT00)[1891333]:    INFO: Running start for /dev/mapper/mds00 on /lustre/mds00
> Dec 12 16:11:26  Filesystem(MDT00)[1891333]:    ERROR: Couldn't mount device [/dev/mapper/mds00] as /lustre/mds00
> 
> Why can't it move?

The error is outside the cluster software, in the mount attempt itself.
The resource agent logged the ERROR above, so if you can't find more
information in the system logs you may want to look at the agent code
to see what it's doing around that message.
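
If it helps when searching the system logs, here is a minimal sketch that
prints every line of a syslog-style file that falls inside the time window
of the failed start. The function name log_window and the
/var/log/messages path are assumptions for illustration, so adjust them
for your distribution. On systemd hosts, journalctl -k with --since and
--until covers the kernel messages for the same window.

```shell
# log_window FILE FROM TO
# Print lines whose third field (HH:MM:SS, as in "Dec 12 15:53:23 ...")
# falls between FROM and TO inclusive. This is a plain string comparison,
# so it only works for zero-padded times within a single day.
log_window() {
    awk -v from="$2" -v to="$3" '$3 >= from && $3 <= to' "$1"
}

# Hypothetical usage on the node where the start failed:
#   log_window /var/log/messages 15:53:23 15:53:45
```

Anything the kernel or the mount helper reported between the agent's
"Running start" and "Couldn't mount" lines should show up in that window.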

> 
> Does this 20 sec interval (between start and error) have anything to
> do with monitor interval settings?

No. The monitor interval says when to schedule another recurring
monitor check after the previous one completes. The first monitor isn't
scheduled until after the start succeeds.
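
To make the arithmetic concrete, here is a small sketch that uses the
timestamps from your log and the start timeout from your configuration.
The function names are made up for illustration, and it assumes both
timestamps are on the same day.

```shell
# Convert HH:MM:SS to seconds since midnight.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Seconds elapsed between two same-day HH:MM:SS timestamps.
elapsed_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

start_timeout=60   # timeout=60s on the start operation of MDT00

# "Running start" and "Couldn't mount" timestamps from the first attempt.
gap=$(elapsed_seconds 15:53:23 15:53:45)
echo "start ran for ${gap}s of a ${start_timeout}s timeout"

if [ "$gap" -lt "$start_timeout" ]; then
    echo "start returned before the timeout expired"
fi
```

Both attempts come out at 22 seconds, well inside the 60s start timeout,
which suggests the agent returned the failure itself rather than being
cut off by Pacemaker. No monitor had been scheduled at that point, so the
20s monitor interval plays no part.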

> 
> [root@lustre-mgs ~]# pcs constraint show --full
> Location Constraints:
>   Resource: MDT00
>     Enabled on:
>       Node: lustre-mds1 (score:100) (id:location-MDT00-lustre-mds1-100)
>       Node: lustre-mds2 (score:100) (id:location-MDT00-lustre-mds2-100)
>     Disabled on:
>       Node: lustre-mgs (score:-INFINITY) (id:location-MDT00-lustre-mgs--INFINITY)
>       Node: lustre1 (score:-INFINITY) (id:location-MDT00-lustre1--INFINITY)
>       Node: lustre2 (score:-INFINITY) (id:location-MDT00-lustre2--INFINITY)
>       Node: lustre3 (score:-INFINITY) (id:location-MDT00-lustre3--INFINITY)
>       Node: lustre4 (score:-INFINITY) (id:location-MDT00-lustre4--INFINITY)
> Ordering Constraints:
>   start MGT then start MDT00 (kind:Optional) (id:order-MGT-MDT00-Optional)
>   start MDT00 then start OST1 (kind:Optional) (id:order-MDT00-OST1-Optional)
>   start MDT00 then start OST2 (kind:Optional) (id:order-MDT00-OST2-Optional)
> 
> Regarding the ordering constraints: OST1 and OST2 are already started
> while I'm exercising the MDT00 failover.
> 
-- 
Ken Gaillot <kgaillot at redhat.com>