[ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.
renayama19661014 at ybb.ne.jp
Thu Feb 9 05:24:22 EST 2017
Hi Ken,
> 1. Return a "hard" error such as OCF_ERR_ARGS or OCF_ERR_PERM. When
> Pacemaker gets one of these errors from an agent, it will ban the
> resource from that node (until the failure is cleared).
The first suggestion does not work well.
Even if the RA returns OCF_ERR_ARGS or OCF_ERR_PERM, this happens in the RA's pre-promote notify handling.
Pacemaker does not record an error from a notify (pre-promote) action in the CIB.
* https://github.com/ClusterLabs/pacemaker/blob/master/crmd/lrm.c#L2411
Because the error is not recorded in the CIB, pengine cannot treat it as a "hard" error.
> 2. Use crm_resource --ban instead. This would ban the resource from that
> node until the user removes the ban with crm_resource --clear (or by
> deleting the ban constraint from the configuration).
The second suggestion works well.
I intend to adopt the second suggestion.
As another method, I think crm_resource -F could also be used. What do you think?
With crm_resource -F, the cluster handles a simulated failure, so I think there would be no problem with last-failure either.
crm_resource -F may be usable, but I will adopt crm_resource -B because the RA wants to stop the pgsql resource completely.
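For reference, the ban from the second suggestion and the later manual clear could look roughly like the sketch below. The function names ban_resource and clear_ban, the resource id "pgsql" and the node name "node1" are hypothetical and only for illustration. CRM_RESOURCE is set to "echo crm_resource" here so that the commands are printed instead of executed; on a cluster node it would point at the real binary.

```shell
#!/bin/sh
# Sketch only. CRM_RESOURCE is an echo stub so nothing touches a cluster;
# replace it with the real crm_resource path on a cluster node.
CRM_RESOURCE="echo crm_resource"

# Hypothetical helper: ban resource $1 from node $2 (same as -B -r -N -Q).
ban_resource() {
    $CRM_RESOURCE --ban --resource "$1" --node "$2" --quiet
}

# Hypothetical helper: remove the ban again (same as -U -r -N).
clear_ban() {
    $CRM_RESOURCE --clear --resource "$1" --node "$2"
}

# Example resource id and node name (assumptions, not from a real cluster).
ban_resource pgsql node1
clear_ban pgsql node1
```

The ban stays in place until somebody runs the clear step, which is the behavior wanted here because the resource should not start on that node again without operator intervention.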
``` @pgsql RA
pgsql_pre_promote() {
(snip)
    if [ "$cmp_location" != "$my_master_baseline" ]; then
        ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
        exec_with_retry 0 $CRM_RESOURCE -B -r $OCF_RESOURCE_INSTANCE -N $NODENAME -Q
        return $OCF_ERR_GENERIC
    fi
(snip)
CRM_FAILCOUNT="${HA_SBIN_DIR}/crm_failcount"
CRM_RESOURCE="${HA_SBIN_DIR}/crm_resource"
```
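The ban command above is wrapped in a retry helper. To illustrate the idea of retrying a cluster command until it succeeds, here is a minimal sketch. The function retry_cmd is a hypothetical stand-in and not the RA's actual exec_with_retry; its argument order (attempt limit, delay in seconds, then the command) is my own assumption.

```shell
#!/bin/sh
# Hypothetical retry helper for illustration; NOT the pgsql RA's exec_with_retry.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_cmd() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while :; do
        # Run the command exactly as given and remember its exit status.
        "$@"
        retry_rc=$?
        [ "$retry_rc" -eq 0 ] && return 0
        # Stop once the attempt limit is reached and report the last status.
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "giving up after $retry_attempt attempts: $*" >&2
            return "$retry_rc"
        fi
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}
```

With such a helper, a call like `retry_cmd 5 1 crm_resource -B -r "$OCF_RESOURCE_INSTANCE" -N "$NODENAME" -Q` would try the ban up to five times, one second apart, and return the last exit status if all attempts fail.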
I will test the behavior a little more and then send a patch.
Best Regards,
Hideo Yamauchi.
----- Original Message -----
> From: Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>
> To: users at clusterlabs.org; kgaillot at redhat.com
> Cc:
> Date: 2017/2/6, Mon 17:44
> Subject: [ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.
>
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 02.02.2017 at 19:33 in message
> <91a83571-9930-94fd-e635-96283067105c at redhat.com>:
>> On 02/02/2017 12:23 PM, renayama19661014 at ybb.ne.jp wrote:
>>> Hi All,
>>>
>>> Since the following fix, users can no longer set any value other than zero
>>> with crm_failcount.
>>>
>>> - [Fix: tools: implement crm_failcount command-line options correctly]
>>> - https://github.com/ClusterLabs/pacemaker/commit/95db10602e8f646eefed335414e40a994498cafd#diff-6e58482648938fd488a920b9902daac4
>>>
>>> However, the pgsql RA sets the fail count to INFINITY in its script.
>>>
>>> ```
>>> (snip)
>>> CRM_FAILCOUNT="${HA_SBIN_DIR}/crm_failcount"
>>> (snip)
>>> ocf_exit_reason "My data is newer than new master's one. New master's location : $master_baseline"
>>> exec_with_retry 0 $CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -U $NODENAME -v INFINITY
>>> return $OCF_ERR_GENERIC
>>> (snip)
>>> ```
>>>
>>> As far as I can tell, only the pgsql RA is affected.
>>>
>>> Could you change crm_failcount so that values other than zero can be set again?
>>> If that is not possible, we will modify the pgsql RA to use crm_attribute instead.
>>>
>>> Best Regards,
>>> Hideo Yamauchi.
>>
>> Hmm, I didn't realize that was used. I changed it because it's not a
>> good idea to set fail-count without also changing last-failure and
>> having a failed op in the LRM history. I'll have to think about what the
>> best alternative is.
>
> The question also is whether the RA can achieve the same effect otherwise. I
> thought the CRM sets the failcount, not the RA...
>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org