[Pacemaker] clone resource doesn't stop during node standby

Tue Mar 23 12:37:31 EDT 2010

On Fri, Mar 19, 2010 at 2:32 AM, Junko IKEDA <ikedaj at intellilink.co.jp> wrote:
> Hi,
>
>>> # crm_mon -1
>>>
>>> ============
>>> Stack: openais
>>> Current DC: cspm01 - partition with quorum
>>> Version: 1.0.8-2a76c6ac04bc stable-1.0 tip
>>> 2 Nodes configured, 2 expected votes
>>> 2 Resources configured.
>>> ============
>>>
>>> Node cspm02: standby
>>> Online: [ cspm01 ]
>>>
>>>    Resource Group: UMgroup01
>>>        UmDummy03      (ocf::heartbeat:Dummy): Started cspm01
>>>        UmDummy04      (ocf::heartbeat:Dummy01):       Started cspm01 (unmanaged) FAILED
>>>    Clone Set: clnUMgroup01
>>>        Started: [ cspm01 cspm02 ]
>>>
>>> Failed actions:
>>>       UmDummy04_monitor_10000 (node=cspm01, call=13, rc=7, status=complete): not running
>>>       UmDummy04_stop_0 (node=cspm01, call=14, rc=1, status=complete): unknown error
>>>
>>>
>>> It seems that the constraint settings prevent the stop action because
>>> there is an unmanaged resource.
>>
>> No. UmDummy04 is unmanaged because it failed to stop when we asked it to.
>> This in turn prevents UmDummy03 from being stopped (as required by
>> the semantics of group resources).
>
> UmDummy03 and UmDummy04 have been running on cspm01,
> and its peer node, cspm02, is now in standby.
> But the clone instance on cspm02 was not stopped even though the node is in standby.
> It should be stopped, shouldn't it?

No.

The host it's running on is irrelevant.

You have an ordering constraint that tells clnUMgroup01 not to stop
until UMgroup01 has.
If UMgroup01 doesn't stop, neither can clnUMgroup01.
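
The dependency chain above can be sketched as a toy model in Python. This is
only an illustration of the reasoning, not how Pacemaker's policy engine is
implemented: the function name stoppable, the stop_after mapping and the
blocked set are all hypothetical, and the actual constraint in the poster's
configuration is not shown in this thread.

```python
# Toy model of stop ordering. NOT Pacemaker code; all names are hypothetical.

def stoppable(to_stop, stop_after, blocked):
    """Return the set of resources from to_stop that can actually stop.

    stop_after maps a resource to the resources that must stop first.
    blocked holds resources whose stop failed (shown as unmanaged).
    """
    stopped = set()
    progress = True
    while progress:
        progress = False
        for rsc in to_stop:
            if rsc in stopped or rsc in blocked:
                continue
            # A resource may stop only once everything it waits on has stopped.
            if all(dep in stopped for dep in stop_after.get(rsc, ())):
                stopped.add(rsc)
                progress = True
    return stopped


# Assumed model of the situation in this thread:
# - group members stop in reverse order, so UmDummy03 waits for UmDummy04
# - the group counts as stopped only when both members are stopped
# - the ordering constraint makes the clone wait for the group
stop_after = {
    "UmDummy03": ["UmDummy04"],
    "UMgroup01": ["UmDummy03", "UmDummy04"],
    "clnUMgroup01": ["UMgroup01"],
}
to_stop = ["clnUMgroup01", "UMgroup01", "UmDummy03", "UmDummy04"]

# UmDummy04 failed to stop, so nothing that depends on it can stop either.
with_failure = stoppable(to_stop, stop_after, blocked={"UmDummy04"})
print(sorted(with_failure))  # -> []

# Without the failed stop, the whole chain can be stopped.
without_failure = stoppable(to_stop, stop_after, blocked=set())
print(sorted(without_failure))
```

In this model the node a resource runs on never appears, which mirrors the
point that the host is irrelevant: only the ordering chain matters.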

That's why we have stonith: to ensure there is a way for the cluster to
continue when resources fail in this way.