[Pacemaker] [Problem]Reboot by the error of the clone resource influences the resource of other nodes.

Andrew Beekhof andrew at beekhof.net
Fri Apr 1 03:20:49 EDT 2011


The clone instance numbers for anonymous clones are an implementation
detail and nothing should be inferred from them.
Did anything actually get moved, or did just the numbers change?
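
One rough way to check this from two status listings is to strip the
":N" suffix from each clone instance and compare only which base resource
is started on which node. The sketch below assumes the per-node listing
format quoted further down in this thread. The names parse_status and
compare are made up for illustration and are not part of Pacemaker. It
only compares placement; whether an instance was actually stopped and
started again still has to be confirmed in the logs.

```python
import re

# Assumed line shapes, taken from the per-node listing quoted in this thread.
NODE_RE = re.compile(r"^Node\s+(\S+)\s+\(")
RSC_RE = re.compile(r"^\s+(\S+)\s+\((\S+)\)\s+(\S+)")

BEFORE = """\
Node srv01 (45f985d7-e7c8-4834-b01b-16b99526672b): online
        main_rsc        (ocf::pacemaker:Dummy) Started
        prmDummy1:0     (ocf::pacemaker:Dummy) Started
        prmPingd:0      (ocf::pacemaker:ping) Started
Node srv02 (ed7fdcbf-9c17-4f31-8a27-a831a6b39ed5): online
        prmDummy1:1     (ocf::pacemaker:Dummy) Started
        main_rsc2       (ocf::pacemaker:Dummy) Started
        prmPingd:1      (ocf::pacemaker:ping) Started
Node srv03 (e2ffc1ed-3ebe-47e2-b51b-b0f04b454311): online
        prmDummy1:2     (ocf::pacemaker:Dummy) Started
        prmPingd:2      (ocf::pacemaker:ping) Started
"""

AFTER = """\
Node srv01 (45f985d7-e7c8-4834-b01b-16b99526672b): online
Node srv02 (ed7fdcbf-9c17-4f31-8a27-a831a6b39ed5): online
        prmDummy1:1     (ocf::pacemaker:Dummy) Started
        prmPingd:0      (ocf::pacemaker:ping) Started
Node srv03 (e2ffc1ed-3ebe-47e2-b51b-b0f04b454311): online
        main_rsc        (ocf::pacemaker:Dummy) Started
        prmDummy1:2     (ocf::pacemaker:Dummy) Started
        prmPingd:1      (ocf::pacemaker:ping) Started
"""


def parse_status(text):
    """Return {node: {base_name: instance_number_or_None}} for started resources."""
    placement = {}
    node = None
    for line in text.splitlines():
        m = NODE_RE.match(line)
        if m:
            node = m.group(1)
            placement[node] = {}
            continue
        m = RSC_RE.match(line)
        if m and node is not None and m.group(3) == "Started":
            # "prmPingd:1" -> base name "prmPingd", instance "1"
            name, _, instance = m.group(1).partition(":")
            placement[node][name] = instance or None
    return placement


def compare(before, after):
    """Split differences into placement changes and pure renumbering."""
    placement_changes, renumbered = [], []
    for node in sorted(set(before) | set(after)):
        old, new = before.get(node, {}), after.get(node, {})
        for name in sorted(set(old) | set(new)):
            if name in old and name in new:
                # Same base resource on the same node: only the number may differ.
                if old[name] != new[name]:
                    renumbered.append((node, name, old[name], new[name]))
            else:
                placement_changes.append((node, name, "gone" if name in old else "new"))
    return placement_changes, renumbered


placement_changes, renumbered = compare(parse_status(BEFORE), parse_status(AFTER))
for entry in renumbered:
    print("renumbered only:", entry)
for entry in placement_changes:
    print("placement changed:", entry)
```

With the two listings from this thread, prmPingd on srv02 and srv03 shows
up as renumbered only, while main_rsc shows up as a placement change.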

On Thu, Mar 31, 2011 at 10:07 AM,  <renayama19661014 at ybb.ne.jp> wrote:
> Hi Vladislav,
>
> Thank you for your comment.
>
> We see this problem with both 1.0.10 and the tip of 1.0.
>
> It may, however, have been present since a considerably earlier version.
>
> Let's wait for Andrew's comment.
>
> Best Regards,
> Hideo Yamauchi.
>
> --- On Thu, 2011/3/31, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>
>> Hi,
>>
>> 31.03.2011 04:15, renayama19661014 at ybb.ne.jp wrote:
>> [...]
>> > Node srv01 (45f985d7-e7c8-4834-b01b-16b99526672b): online
>> >         main_rsc        (ocf::pacemaker:Dummy) Started
>> >         prmDummy1:0     (ocf::pacemaker:Dummy) Started
>> >         prmPingd:0      (ocf::pacemaker:ping) Started
>> > Node srv02 (ed7fdcbf-9c17-4f31-8a27-a831a6b39ed5): online
>> >         prmDummy1:1     (ocf::pacemaker:Dummy) Started
>> >         main_rsc2       (ocf::pacemaker:Dummy) Started
>> >         prmPingd:1      (ocf::pacemaker:ping) Started
>> > Node srv03 (e2ffc1ed-3ebe-47e2-b51b-b0f04b454311): online
>> >         prmDummy1:2     (ocf::pacemaker:Dummy) Started
>> >         prmPingd:2      (ocf::pacemaker:ping) Started
>> [...]
>> > Node srv01 (45f985d7-e7c8-4834-b01b-16b99526672b): online
>> > Node srv02 (ed7fdcbf-9c17-4f31-8a27-a831a6b39ed5): online
>> >         prmDummy1:1     (ocf::pacemaker:Dummy) Started     ---------> :1(funny)
>> >         prmPingd:0      (ocf::pacemaker:ping) Started      ---------> :0(funny)
>> > Node srv03 (e2ffc1ed-3ebe-47e2-b51b-b0f04b454311): online
>> >         main_rsc        (ocf::pacemaker:Dummy) Started
>> >         prmDummy1:2     (ocf::pacemaker:Dummy) Started     ---------> :2(funny)
>> >         prmPingd:1      (ocf::pacemaker:ping) Started      ---------> :1(funny)
>> >
>> > We think the restart of pingd on the srv02 node is unnecessary.
>> > Is there a way to resolve this problem?
>>
>> I observe this problem too (with the latest 1.1 tip):
>> pengine unnecessarily decides to swap anonymous clone instances between
>> nodes when it rearranges cluster resources. This causes all dependent
>> resources on those nodes to be stopped and started again.
>>
>> In your case it swapped
>> srv02:prmPingd:1,srv03:prmPingd:2 <-> srv02:prmPingd:0,srv03:prmPingd:1
>>
>> In my case I often see something like this:
>>
>> Jan 17 09:18:58 v02-a pengine: [29790]: notice: LogActions: Move
>> resource libvirtd:0#011(Started v02-c -> v02-d)
>> Jan 17 09:18:58 v02-a pengine: [29790]: notice: LogActions: Move
>> resource libvirtd:1#011(Started v02-d -> v02-a)
>> Jan 17 09:18:58 v02-a pengine: [29790]: notice: LogActions: Move
>> resource libvirtd:2#011(Started v02-a -> v02-b)
>> Jan 17 09:18:58 v02-a pengine: [29790]: notice: LogActions: Move
>> resource libvirtd:3#011(Started v02-b -> v02-c)
>>
>> I contacted Andrew about this directly some time ago (with an hb_report),
>> but didn't have the energy to raise this problem on the ML (which is what
>> he actually asked me to do) :( .
>>
>> I suspect this is 1.1-specific, but that is only a feeling.
>>
>> Maybe somebody familiar with Mercurial can bisect to find when this bug
>> was introduced?
>>
>> Best,
>> Vladislav
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>
>
