[ClusterLabs] Pacemaker's "stonith too many failures" log is not accurate

井上 和徳 inouekazu at intellilink.co.jp
Wed May 17 05:28:33 EDT 2017


Hi,
I'm testing Pacemaker-1.1.17-rc1.
The failure count reported in the "Too many failures (10) to fence" log message does not match the actual number of failures.

"Too many failures (10) to fence" is only output after the 11th fencing failure.
Incidentally, stonith-max-attempts has not been set, so it has its default value of 10.

[root@x3650f log]# egrep "Requesting fencing|error: Operation reboot|Stonith failed|Too many failures"
## Requesting fencing: 1st time
May 12 05:51:47 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:52:52 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.8415167d: No data available
May 12 05:52:52 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 2nd time
May 12 05:52:52 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:53:56 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.53d3592a: No data available
May 12 05:53:56 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 3rd time
May 12 05:53:56 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:55:01 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.9177cb76: No data available
May 12 05:55:01 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 4th time
May 12 05:55:01 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:56:05 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.946531cb: No data available
May 12 05:56:05 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 5th time
May 12 05:56:05 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:57:10 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.278b3c4b: No data available
May 12 05:57:10 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 6th time
May 12 05:57:10 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:58:14 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.7a49aebb: No data available
May 12 05:58:14 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 7th time
May 12 05:58:14 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 05:59:19 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.83421862: No data available
May 12 05:59:19 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 8th time
May 12 05:59:19 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 06:00:24 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.afd7ef98: No data available
May 12 06:00:24 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 9th time
May 12 06:00:24 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 06:01:28 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.3b033dbe: No data available
May 12 06:01:28 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 10th time
May 12 06:01:28 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 06:02:33 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.5447a345: No data available
May 12 06:02:33 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
## 11th time
May 12 06:02:33 rhel73-1 crmd[5269]:  notice: Requesting fencing (reboot) of node rhel73-2
May 12 06:03:37 rhel73-1 stonith-ng[5265]:   error: Operation reboot of rhel73-2 by rhel73-1 for crmd.5269@rhel73-1.db50c21a: No data available
May 12 06:03:37 rhel73-1 crmd[5269]: warning: Too many failures (10) to fence rhel73-2, giving up
May 12 06:03:37 rhel73-1 crmd[5269]:  notice: Transition aborted: Stonith failed
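
For anyone who wants to check the same thing on their own logs, the mismatch can be confirmed mechanically. The following is only a rough sketch, not part of Pacemaker: the function name count_fence_failures is made up for illustration, the regular expressions assume the message wording shown in the excerpt above, and the sample input is synthetic (one failure line repeated 11 times with a shortened client ID) rather than the real log.

```python
import re

# Patterns assume the message wording seen in the log excerpt above.
FAILURE_RE = re.compile(r"error: Operation reboot of (\S+) by")
GIVEUP_RE = re.compile(r"Too many failures \((\d+)\) to fence (\S+), giving up")


def count_fence_failures(log_text):
    """Return (failures seen before giving up, count reported in the warning).

    The second element is None if no "Too many failures" line is found.
    Hypothetical helper for illustration only.
    """
    failures = 0
    for line in log_text.splitlines():
        if FAILURE_RE.search(line):
            failures += 1
        match = GIVEUP_RE.search(line)
        if match:
            return failures, int(match.group(1))
    return failures, None


# Synthetic sample: the same failure line repeated 11 times, then the warning.
failure_line = (
    "May 12 05:52:52 rhel73-1 stonith-ng[5265]:   error: Operation reboot "
    "of rhel73-2 by rhel73-1 for crmd.5269: No data available"
)
giveup_line = (
    "May 12 06:03:37 rhel73-1 crmd[5269]: warning: Too many failures (10) "
    "to fence rhel73-2, giving up"
)
sample = "\n".join([failure_line] * 11 + [giveup_line])

actual, reported = count_fence_failures(sample)
print(actual, reported)  # prints "11 10"
```

Run against the real log, the two numbers would be expected to agree if the message were accurate. With the excerpt above they differ by one.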

Regards,
Kazunori INOUE



