[ClusterLabs] Resolving cart before the horse with mounted filesystems.

Andrei Borzenkov arvidjaar at gmail.com
Mon May 3 10:19:55 EDT 2021


On 03.05.2021 16:12, Matthew Schumacher wrote:
...

> 
> You are right Andrei.  Looking at the logs:
> 
...
> May 03 03:02:41 node2 pacemaker-controld  [1283] (do_lrm_rsc_op) info:
> Performing key=7:1:7:b8b0100c-2951-4d07-83da-27cfc1225718
> op=vm-testvm_monitor_0

This is a probe operation (the _monitor_0 suffix means a one-time monitor
with interval 0, run to discover whether the resource is already active on
the node).

> May 03 03:02:41 node2 pacemaker-controld  [1283] (action_synced_wait)
>     info: VirtualDomain_meta-data_0[1288] exited with status 0
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_process_request)
>     info: Forwarding cib_modify operation for section status to all
> (origin=local/crmd/8)
> May 03 03:02:41 node2 pacemaker-execd     [1280]
> (process_lrmd_get_rsc_info)     info: Agent information for
> 'fence-datastore' not in cache
> May 03 03:02:41 node2 pacemaker-execd     [1280]
> (process_lrmd_rsc_register)     info: Cached agent information for
> 'fence-datastore'
> May 03 03:02:41 node2 pacemaker-controld  [1283] (do_lrm_rsc_op) info:
> Performing key=8:1:7:b8b0100c-2951-4d07-83da-27cfc1225718
> op=fence-datastore_monitor_0
> May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: Configuration
> file /datastore/vm/testvm/testvm.xml not readable during probe.
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: Diff: --- 0.1608.23 2
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: Diff: +++ 0.1608.24 (null)
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: +  /cib:  @num_updates=24
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: ++ /cib/status/node_state[@id='2']: <transient_attributes id="2"/>
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: ++ <instance_attributes id="status-2">
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: ++                                       <nvpair
> id="status-2-.node-unfenced" name="#node-unfenced" value="1620010887"/>
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: ++ </instance_attributes>
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_perform_op)    
> info: ++ </transient_attributes>
> May 03 03:02:41 node2 pacemaker-based     [1278] (cib_process_request)
>     info: Completed cib_modify operation for section status: OK (rc=0,
> origin=node1/attrd/16, version=0.1608.24)
> May 03 03:02:41  VirtualDomain(vm-testvm)[1300]:    INFO: environment is
> invalid, resource considered stopped
> 
> When node2 comes back from being fenced (testing a hard failure), it
> checks the status of vm-testvm because I previously did a "crm resource
> move vm-testvm node2" so it's trying to put the VirtualDomain resource
> back on node2, but calling monitor finds that the config file is missing
> because the NFS mount isn't up yet, so it assumes the resource is
> stopped (it's not),

The resource must be stopped on node2. How could it be started if the node
has just rebooted? Do you start resources manually, outside of pacemaker?

> then it's confused:
> 
> May 03 03:02:45  VirtualDomain(vm-testvm)[2576]:    INFO: Virtual domain
> testvm currently has no state, retrying.
> May 03 03:02:46  VirtualDomain(vm-testvm)[2576]:    INFO: Domain testvm
> already stopped.
> 
> Eventually it does end up stopped on node1 and started on node2.
> 

It does exactly what you told it to do.

> Is there a way to configure the order so that we don't even run monitor
> until the dependent resource is running?
> 

This has been asked before, for the same reason. No, there is not. The goal
of the monitor is to find out whether the resource is active or not. If the
prerequisite resources are not there, the resource cannot be active.
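
What you can do is make the start (not the probe) wait for the mount:
manage the NFS mount as a cluster resource and order/colocate the VM after
it. A sketch in crm shell syntax, assuming a Filesystem resource named
fs-datastore (that name and the NFS export are made up; the rest mirrors
your resource names):

  # hypothetical Filesystem resource for the NFS mount backing /datastore
  primitive fs-datastore ocf:heartbeat:Filesystem \
      params device="nfsserver:/export/datastore" directory="/datastore" fstype="nfs"
  # start the VM only after the mount is up, and only on the node that has it
  order ord-datastore-before-vm Mandatory: fs-datastore:start vm-testvm:start
  colocation col-vm-with-datastore inf: vm-testvm fs-datastore

Probes will still run (and still report "not running") while the mount is
missing, but pacemaker will not attempt to start vm-testvm until
fs-datastore is active.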

> Is there a way to have a delayed start?
> 
> At the end of the day, the way VirtualDomain works has been very
> troublesome for me.  The second that the config file isn't available
> pacemaker thinks that the domain is down and starts kicking the stool
> from under things, even if the domain is running just fine. It seems to

You misunderstand what happens. Probes check whether a specific resource is
running on a specific node (which allows pacemaker to skip the resource
start if it is already active, e.g. after the pacemaker service was
restarted). Then pacemaker recomputes the resource distribution. It does
this every time something changes. So when node2 came back and pacemaker
re-evaluated resource placement, node2 became the preferred choice. The
choice is preferred because "crm resource move vm-testvm node2" creates a
constraint that says exactly that: resource vm-testvm MUST run on node2
whenever node2 is available to run resources.
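
For reference, "crm resource move" puts a location constraint into the
configuration; crmsh typically gives it an id of the form
cli-prefer-<resource>, so in your case it will look roughly like:

  location cli-prefer-vm-testvm vm-testvm inf: node2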

Pacemaker did exactly what you told it to do.

See "crm resource clear" for the way to remove such constraints.
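
For example (using the resource name from your logs):

  crm resource clear vm-testvm

After that the cluster is again free to place vm-testvm according to the
remaining constraints and scores instead of forcing it onto node2.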

> me that reading the config file is a poor way to test if it's working as
> it surely can be up even if the config file is missing, and because it's
> generated lots of false positives for me. I wonder why it was written
> this way.  Wouldn't it make more sense for monitor to get a status from
> virsh, 

monitor does call virsh. The configuration file is checked earlier, during
validation.
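
Roughly (a simplified sketch of the order of operations, not the actual
resource agent code): validation runs first, and only a readable config
ever reaches virsh:

  # sketch only: variable names simplified
  if [ ! -r "${OCF_RESKEY_config}" ]; then
      # during a probe an unreadable config is reported as "not running"
      ocf_is_probe && exit $OCF_NOT_RUNNING
      exit $OCF_ERR_INSTALLED
  fi
  virsh domstate "${DOMAIN_NAME}"   # the real status check

That matches the "environment is invalid, resource considered stopped"
message in your probe log.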

> and that we don't bother to look for the config file unless we
> are starting, and if its missing, we return failed to start?
> 

Because without the config file the resource agent does not know what to
start and what to monitor.
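
Everything the agent knows about the domain, including its name, comes out
of the file named in its config parameter, e.g. (the config path is the one
from your logs; the hypervisor and monitor values are just placeholders):

  primitive vm-testvm ocf:heartbeat:VirtualDomain \
      params config="/datastore/vm/testvm/testvm.xml" hypervisor="qemu:///system" \
      op monitor interval=30s timeout=60s

so with that file unreadable there is nothing for the agent to query.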

