[ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.
renayama19661014 at ybb.ne.jp
Thu Feb 9 10:24:22 UTC 2017
> 1. Return a "hard" error such as OCF_ERR_ARGS or OCF_ERR_PERM. When
> Pacemaker gets one of these errors from an agent, it will ban the
> resource from that node (until the failure is cleared).
The first suggestion does not work well.
Even if the RA returns OCF_ERR_ARGS or OCF_ERR_PERM, the check in question runs in the RA's pre-promote notify handling.
Pacemaker does not record the result of a notify (pre-promote) action in the CIB.
Because the error is not recorded in the CIB, pengine cannot treat it as a "hard" error.
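For reference, here is a minimal sketch of what "hard" means for the two return codes named above. The numeric values are the standard OCF return codes; the helper name is_hard_error is hypothetical and is not part of the pgsql RA or of Pacemaker. It only classifies a code and does not change the point above that notify results are not recorded.

```shell
#!/bin/sh
# Standard OCF return code values.
OCF_ERR_GENERIC=1
OCF_ERR_ARGS=2
OCF_ERR_PERM=4

# Hypothetical helper: succeeds only for the two "hard" codes named in this thread.
is_hard_error() {
    case "$1" in
        "$OCF_ERR_ARGS"|"$OCF_ERR_PERM") return 0 ;;
        *) return 1 ;;
    esac
}

# Print the classification of each code.
for rc in "$OCF_ERR_GENERIC" "$OCF_ERR_ARGS" "$OCF_ERR_PERM"; do
    if is_hard_error "$rc"; then
        echo "rc=$rc: hard"
    else
        echo "rc=$rc: not hard"
    fi
done
```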
> 2. Use crm_resource --ban instead. This would ban the resource from that
> node until the user removes the ban with crm_resource --clear (or by
> deleting the ban constraint from the configuration).
The second suggestion works well.
I intend to adopt the second suggestion.
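As a rough sketch of the ban/clear pair described in the second suggestion, the commands would look like the following. The helpers only print the command lines (a dry run), and the resource name "pgsql" and node name "node1" are placeholders I chose for illustration.

```shell
#!/bin/sh
# Dry-run sketch: these helpers only print the commands; nothing is executed.
# "pgsql" and "node1" are placeholder names, not taken from a real cluster.

# Print the command that bans a resource from a node.
ban_command() {
    # $1 = resource id, $2 = node name
    printf 'crm_resource --ban --resource %s --node %s --quiet\n' "$1" "$2"
}

# Print the command that removes the ban again.
clear_command() {
    # $1 = resource id, $2 = node name
    printf 'crm_resource --clear --resource %s --node %s\n' "$1" "$2"
}

ban_command pgsql node1
clear_command pgsql node1
```

The ban stays in place until the clear command is run or the constraint is deleted from the configuration.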
As another method, crm_resource -F also seems to be usable. What do you think?
With crm_resource -F, I think there is no problem with last-failure either, because it lets the cluster handle a simulated failure.
So crm_resource -F would probably work, but I will adopt crm_resource -B because the RA wants to stop the pgsql resource completely.
pgsql RA:
```
if [ "$cmp_location" != "$my_master_baseline" ]; then
    ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
    # Ban this resource from this node instead of setting fail-count
    exec_with_retry 0 $CRM_RESOURCE -B -r $OCF_RESOURCE_INSTANCE -N $NODENAME -Q
fi
```
I will test the behavior a little more and then send a patch.
----- Original Message -----
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
> To: users at clusterlabs.org; kgaillot at redhat.com
> Date: 2017/2/6, Mon 17:44
> Subject: [ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 02.02.2017 at 19:33 in message
> <91a83571-9930-94fd-e635-96283067105c at redhat.com>:
>> On 02/02/2017 12:23 PM, renayama19661014 at ybb.ne.jp wrote:
>>> Hi All,
>>> Because of the following fix, users can no longer set a value other than zero in crm_failcount.
>>> - [Fix: tools: implement crm_failcount command-line options correctly]
>>> However, pgsql RA sets INFINITY in a script.
>>> ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
>>> exec_with_retry 0 $CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -U $NODENAME -v
>>> return $OCF_ERR_GENERIC
>>> This seems to affect only the pgsql RA.
>>> Could you revise crm_failcount so that a value other than zero can be set?
>>> If that is not possible, we will modify the pgsql RA to use crm_attribute.
>>> Best Regards,
>>> Hideo Yamauchi.
>> Hmm, I didn't realize that was used. I changed it because it's not a
>> good idea to set fail-count without also changing last-failure and
>> having a failed op in the LRM history. I'll have to think about what
>> the best alternative is.
> The question also is whether the RA can achieve the same effect some other way. I
> thought CRM sets the failcount, not the RA...
>> Users mailing list: Users at clusterlabs.org
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org