[ClusterLabs] failure-timeout not working in corosync 2.0.1
Antony Stone
Antony.Stone at ha.open.source.it
Wed Mar 31 15:39:29 EDT 2021
Hi.
I've pared my configuration down to almost a bare minimum to demonstrate the
problem I'm having.
I have two questions:
1. What command can I use to find out what pacemaker thinks my cluster.cib file
really means?
I know what I put in it, but I want to see what pacemaker has understood from
it, to make sure that pacemaker has the same idea about how to manage my
resources as I do.
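(For what it's worth, the candidates I'm aware of are:

    cibadmin --query        # dump the live CIB as raw XML
    crm configure show      # crmsh's rendering of the configuration
    crm_simulate -sL        # the scores pacemaker computes from the live CIB

but I'm not certain which of these, if any, reflects pacemaker's actual
interpretation rather than just echoing back what I wrote.)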
2. Can anyone tell me what the problem is with the following cluster.cib
(wrapped here with continuation lines for readability; in the actual file
each stanza is a single line, four lines of text in all):
primitive IP-float4 IPaddr2 \
        params ip=10.1.0.5 cidr_netmask=24 \
        meta migration-threshold=3 \
        op monitor interval=10 timeout=30 on-fail=restart failure-timeout=180
primitive IPsecVPN lsb:ipsecwrapper \
        meta migration-threshold=3 \
        op monitor interval=10 timeout=30 on-fail=restart failure-timeout=180
group Everything IP-float4 IPsecVPN resource-stickiness=100
property cib-bootstrap-options: \
        stonith-enabled=no \
        no-quorum-policy=stop \
        start-failure-is-fatal=false \
        cluster-recheck-interval=60s
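(I assume "crm_verify -LV" would flag any outright errors in the live CIB,
but I don't know whether it checks where a given attribute is allowed to
appear, which is partly why I'm asking question 1.)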
My problem is that "failure-timeout" is not being honoured. A resource
failure simply never times out, so three failures (over a fortnight, if
that's how long it takes to accumulate three) mean that the resources move.
I want a failure to be forgotten about after 180 seconds (or at least, soon
after that - 240 seconds would be fine, if cluster-recheck-interval means that
180 can't quite be achieved).
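If I've understood the fail-count machinery correctly, something like

    crm_failcount --query --resource=IP-float4

(or "crm_failcount -G -r IP-float4" - I'm not sure which spelling my
version prefers) should drop back to 0 once the failure-timeout has
expired and the next cluster recheck has run. It never does.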
Somehow or other, _far_ more than 180 seconds go by, and I *still* have:
fail-count=1 last-failure='Wed Mar 31 21:23:11 2021'
as part of the output of "crm status -f" (the above timestamp is BST, so
that's 70 minutes ago now).
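I know I can clear the failure by hand - presumably with

    crm resource cleanup IP-float4

or its lower-level equivalent "crm_resource --cleanup --resource IP-float4",
if I have the options right - but the whole point of failure-timeout is
that I shouldn't have to.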
Thanks for any help,
Antony.
--
Don't procrastinate - put it off until tomorrow.
Please reply to the list;
please *don't* CC me.