[ClusterLabs] OCF_TIMEOUT - Does it recover by itself?

Salatiel Filho salatiel.filho at gmail.com
Tue Apr 26 14:20:53 EDT 2022


I have a question about OCF_TIMEOUT. Sometimes my cluster shows
this in pcs status:
Failed Resource Actions:
  * fence-server02_monitor_60000 on server01 'OCF_TIMEOUT' (198):
    call=419, status='Timed Out', exitreason='',
    last-rc-change='2022-04-26 14:47:32 -03:00', queued=0ms, exec=20004ms

I can see in the same pcs status output that the fence device is
started. Does that mean the monitor failed at some moment in the past
and the device is OK now, or do I have to do something to recover it?

# pcs status
Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: server02 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
  * Last updated: Tue Apr 26 14:52:56 2022
  * Last change:  Tue Apr 26 14:37:22 2022 by hacluster via crmd on server01
  * 2 nodes configured
  * 11 resource instances configured

Node List:
  * Online: [ server01 server02 ]

Full List of Resources:
  * fence-server01    (stonith:fence_vmware_rest):     Started server02
  * fence-server02    (stonith:fence_vmware_rest):     Started server01
...

Is "pcs resource cleanup" the right way to remove those messages?
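
To make the "some moment in the past" part concrete, here is a rough
sketch that compares the last-rc-change timestamp of the failed action
with the "Last updated" time from the status output above. The function
names are made up for illustration, and it assumes both timestamps are
in the same local time zone (the UTC offset is simply dropped):

```python
import re
from datetime import datetime

# Failed action entry and "Last updated" value copied from the output above.
failed_action = (
    "fence-server02_monitor_60000 on server01 'OCF_TIMEOUT' (198): "
    "call=419, status='Timed Out', exitreason='', "
    "last-rc-change='2022-04-26 14:47:32 -03:00', queued=0ms, exec=20004ms"
)
last_updated = "Tue Apr 26 14:52:56 2022"


def parse_failed_action(line):
    """Return (failure time, exec time in ms) from a failed action entry."""
    rc_change = re.search(r"last-rc-change='([^']+)'", line).group(1)
    exec_ms = int(re.search(r"exec=(\d+)ms", line).group(1))
    # Assumption: keep only 'YYYY-MM-DD HH:MM:SS' and ignore the UTC offset,
    # because "Last updated" carries no time zone at all.
    failed_at = datetime.strptime(rc_change[:19], "%Y-%m-%d %H:%M:%S")
    return failed_at, exec_ms


def failure_age_seconds(line, status_time):
    """Seconds between the recorded failure and the status snapshot."""
    failed_at, _ = parse_failed_action(line)
    reported_at = datetime.strptime(status_time, "%a %b %d %H:%M:%S %Y")
    return int((reported_at - failed_at).total_seconds())


print(failure_age_seconds(failed_action, last_updated))  # 324
```

So the recorded timeout is about five and a half minutes older than
the status snapshot, and the monitor ran for 20004 ms before it was
reported as timed out.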

Atenciosamente/Kind regards,
Salatiel

