[ClusterLabs] Q: Should a cleanup reset the failcount also?
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Feb 3 04:04:29 EST 2021
Hi!
I'm wondering:
I had a failed clone resource. After fixing the problem, I performed a cleanup, but the fail-counts weren't reset (I thought older versions of Pacemaker used to reset them):
Before:
Full List of Resources:
* Clone Set: cln_iotw-md10 [prm_iotw-md10]:
* Started: [ h19 ]
* Stopped: [ h16 h18 ]
Migration Summary:
* Node: h16:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Feb 2 16:09:51 2021'
* Node: h19:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb 2 16:02:02 2021'
* Node: h18:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1000000 last-failure='Tue Feb 2 16:09:51 2021'
Failed Resource Actions:
* prm_iotw-md10_start_0 on h16 'error' (1): call=180, status='complete', exitreason='', last-rc-change='2021-02-02 16:09:51 +01:00', queued=0ms, exec=330ms
* prm_iotw-md10_start_0 on h18 'error' (1): call=109, status='complete', exitreason='', last-rc-change='2021-02-02 16:09:51 +01:00', queued=0ms, exec=350ms
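As an aside, the same counts should be queryable directly with crm_failcount, if I read the man page correctly (an untested sketch, using the resource and node names from the output above):

h16:~ # crm_failcount --query -r prm_iotw-md10 -N h16
h16:~ # crm_failcount --query -r prm_iotw-md10 -N h18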
Cleanup:
h16:~ # crm_resource -C -r prm_iotw-md10 -n start -N h18
Cleaned up prm_iotw-md10:0 on h18
Cleaned up prm_iotw-md10:1 on h18
Cleaned up prm_iotw-md10:2 on h18
Waiting for 1 reply from the controller. OK
h16:~ # crm_resource -C -r prm_iotw-md10 -n start -N h16
Cleaned up prm_iotw-md10:0 on h16
Cleaned up prm_iotw-md10:1 on h16
Cleaned up prm_iotw-md10:2 on h16
Waiting for 1 reply from the controller. OK
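For completeness: if cleanup is no longer supposed to touch the counts, I could presumably clear them explicitly. If I read the man pages correctly, either of the following should do it (a sketch, not run here):

h16:~ # crm_failcount --delete -r prm_iotw-md10 -N h16
h16:~ # crm_resource --refresh -r prm_iotw-md10 -N h16

As far as I understand, --refresh erases the resource's whole operation history on that node, while crm_failcount --delete only removes the fail count itself.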
After:
Full List of Resources:
* Clone Set: cln_iotw-md10 [prm_iotw-md10]:
* Started: [ h16 h18 h19 ]
Migration Summary:
* Node: h16:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb 2 16:02:03 2021'
* Node: h19:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb 2 16:02:02 2021'
* Node: h18:
* prm_iotw-md10: migration-threshold=1000000 fail-count=1 last-failure='Tue Feb 2 16:02:03 2021'
Failed Resource Actions:
Note: the output above was truncated to the relevant parts.
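One more observation: on h16 the fail-count dropped from 1000000 to 1, and last-failure moved from 16:09:51 back to 16:02:03. That would make sense if fail counts are kept per operation, so that my cleanup with "-n start" only reset the count for the start operation and left an older (monitor?) failure in place. If so, something like the following should show the remaining per-operation count (a guess, based on the documented --operation option of crm_failcount):

h16:~ # crm_failcount --query -r prm_iotw-md10 -N h16 --operation monitor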
Regards,
Ulrich