[ClusterLabs] Q: Should a cleanup reset the failcount also?

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Feb 3 04:04:29 EST 2021


Hi!

I'm wondering:
I had a failed clone resource. After fixing the problem, I performed a cleanup, but the fail-counts weren't reset to zero (I thought a cleanup did that in older versions of Pacemaker):

Before:
Full List of Resources:
  * Clone Set: cln_iotw-md10 [prm_iotw-md10]:
    * Started: [ h19 ]
    * Stopped: [ h16 h18 ]
Migration Summary:
  * Node: h16:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Feb  2 16:09:51 2021'
  * Node: h19:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb  2 16:02:02 2021'
  * Node: h18:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Feb  2 16:09:51 2021'
Failed Resource Actions:
  * prm_iotw-md10_start_0 on h16 'error' (1): call=180, status='complete', exitreason='', last-rc-change='2021-02-02 16:09:51 +01:00', queued=0ms, exec=330ms
  * prm_iotw-md10_start_0 on h18 'error' (1): call=109, status='complete', exitreason='', last-rc-change='2021-02-02 16:09:51 +01:00', queued=0ms, exec=350ms

Cleanup:
h16:~ # crm_resource -C -r prm_iotw-md10 -n start -N h18
Cleaned up prm_iotw-md10:0 on h18
Cleaned up prm_iotw-md10:1 on h18
Cleaned up prm_iotw-md10:2 on h18
Waiting for 1 reply from the controller. OK
h16:~ # crm_resource -C -r prm_iotw-md10 -n start -N h16
Cleaned up prm_iotw-md10:0 on h16
Cleaned up prm_iotw-md10:1 on h16
Cleaned up prm_iotw-md10:2 on h16
Waiting for 1 reply from the controller. OK

After:
Full List of Resources:
  * Clone Set: cln_iotw-md10 [prm_iotw-md10]:
    * Started: [ h16 h18 h19 ]
Migration Summary:
  * Node: h16:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb  2 16:02:03 2021'
  * Node: h19:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb  2 16:02:02 2021'
  * Node: h18:
    * prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb  2 16:02:03 2021'
Failed Resource Actions:


Note: The output was truncated to the relevant parts.
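
To compare the "Before" and "After" states more easily, the per-node fail-counts can be pulled out of the Migration Summary with a small filter. This is only a rough sketch: the function name failcounts is made up for illustration, and it assumes the summary lines look exactly like the ones quoted above (a "* Node: <name>:" line followed by "* <resource>: ... fail-count=<n> ..." lines).

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): read a crm_mon-style
# Migration Summary on stdin and print "<node> <fail-count>" for one resource.
# Assumes the line layout shown in the output quoted in this message.
failcounts() {
    rsc="$1"
    awk -v rsc="$rsc" '
        # Remember the node name from lines like "  * Node: h16:"
        /^ *\* Node: / {
            node = $3
            sub(/:$/, "", node)
            next
        }
        # On the matching resource line, extract the digits after "fail-count="
        index($0, "* " rsc ": ") && match($0, /fail-count=[0-9]+/) {
            print node, substr($0, RSTART + 11, RLENGTH - 11)
        }
    '
}

# Possible usage on a cluster node (assumes crm_mon accepts -1 and --failcounts):
#   crm_mon -1 --failcounts | failcounts prm_iotw-md10
```

Fed the "After" summary above, this should list each of h16, h19 and h18 with a count of 1, which makes the remaining non-zero counts easy to spot.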

Regards,
Ulrich




