[Pacemaker] Resource starts on wrong node?

Dan Frincu df.cluster at gmail.com
Wed Sep 21 08:24:27 EDT 2011


Hi,

On Wed, Sep 21, 2011 at 3:03 PM, Hans Lammerts <j.lammerts at chello.nl> wrote:
>  Dan,
>
> Thanks for the swift reply. I didn't know Pacemaker was doing a sort
> of load balancing across nodes. Maybe I should read the documentation
> in more detail.
>
> Regarding the versions: I would like to have the newest versions, but
> what I've done until now is just install what's available from the
> CentOS repositories. Indeed I would like to upgrade, since I also
> sometimes experience the issue that several heartbeat daemons start
> looping when I change something in the config. Something that's
> supposed to be fixed in a later version of corosync/heartbeat/pacemaker.
>

Have a look at http://clusterlabs.org/wiki/RHEL for how to add the
repos for EL6. Unfortunately, afaics, only Pacemaker is available as a
newer version (1.1.5); corosync is still at 1.2.3.

I'd also recommend building corosync RPMs from the tarball
(http://www.corosync.org/), but that's just my personal preference;
some prefer pre-built binaries.
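
If you want to go that route, a rough sketch (assuming you've
downloaded the tarball from the site above and have rpm-build plus the
corosync build dependencies installed; the version number below is
just an example):

# assumes the tarball ships a spec file, as corosync releases do
rpmbuild -ta corosync-1.4.1.tar.gz
# the built packages land under ~/rpmbuild/RPMS/<arch>/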

>
> About what you said: is there a limited number of resources that can
> run on one node before pacemaker decides it is going to run a
> subsequent resource on another node?

The algorithm is basically round robin. By default it doesn't make any
assumptions about the "importance" of the resources: the first
resource goes to the first node, the second to the second node, the
third to the first node, the fourth to the second node, and so on.
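
If you want to see the reasoning behind a placement decision, you can
dump the allocation scores from the live cluster; on your 1.1.x
versions the tool for that is ptest (later releases replace it with
crm_simulate):

# -s shows allocation scores, -L works against the live CIB
ptest -sL

Each resource gets a score per node and ends up on the node with the
highest one.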

>
> Wouldn't it be best to always use the colocation and order directives
> to prevent this from happening?
>

It all depends on the purpose of the cluster. If it fits the needs of
your setup, then yes, use colocation and ordering. There really isn't
a "one size fits all" scenario.

Regards,
Dan

>
> Thanks again,
>
> Hans
>
> -----Original message-----
> To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>;
> From: Dan Frincu <df.cluster at gmail.com>
> Sent: Wed 21-09-2011 12:44
> Subject: Re: [Pacemaker] Resource starts on wrong node ?
> Hi,
>
> On Wed, Sep 21, 2011 at 1:02 PM, Hans Lammerts <j.lammerts at chello.nl> wrote:
>> Hi all,
>>
>> Just started to configure a two node cluster (CentOS 6) with drbd
>> 8.4.0-31.el6, corosync 1.2.3 and pacemaker 1.1.2.
>
> Strange choice of versions; if it's a new setup, why not go for
> corosync 1.4.1 and pacemaker 1.1.5?
>
>>
>> I created three DRBD filesystems, and started to add them to the crm
>> config one by one. Everything went OK. After adding these resources
>> they start on node1, and when I set node1 in standby, these three
>> DRBD resources fail over nicely to the second node. And vice versa.
>> So far so good.
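>>
>> (I put the node in standby from the crm shell, along these lines:
>> crm node standby cl1    # everything moves off cl1
>> crm node online cl1     # cl1 becomes eligible again
>> and watched the resources move in crm_mon.)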
>>
>>
>> Next, I added one extra resource, that is supposed to put an IP alias
>> on eth0. This also works, but strangely enough the alias is set on
>> eth0 of the second node, where I would have expected it to start on
>> the first node (just as the three drbd resources did). Why the....
>> does Pacemaker decide that this resource is to be started on the
>> second node? I cannot grasp the reason why.
>
> Because it tries to load balance resources across the available nodes.
> You have several resources running on one node and didn't specify any
> restrictions on the mysqlip, so it chose the second node, which had
> fewer resources on it. You can override the behavior with constraints;
> see below.
>
>>
>> Hope anyone can tell me what I'm doing wrong.
>>
>> Thanks,
>>
>> Hans
>>
>> Just to be sure, I'll show my config below:
>>
>> node cl1 \
>>         attributes standby="off"
>> node cl2 \
>>         attributes standby="off"
>> primitive drbd0 ocf:linbit:drbd \
>>         params drbd_resource="mysql" drbdconf="/etc/drbd.conf" \
>>         op start interval="0" timeout="240s" \
>>         op monitor interval="20s" timeout="20s" \
>>         op stop interval="0" timeout="100s"
>> primitive drbd1 ocf:linbit:drbd \
>>         params drbd_resource="www" drbdconf="/etc/drbd.conf" \
>>         op start interval="0" timeout="240s" \
>>         op monitor interval="20s" timeout="20s" \
>>         op stop interval="0" timeout="100s"
>> primitive drbd2 ocf:linbit:drbd \
>>         params drbd_resource="zarafa" drbdconf="/etc/drbd.conf" \
>>         op start interval="0" timeout="240s" \
>>         op monitor interval="20s" timeout="20s" \
>>         op stop interval="0" timeout="100s"
>> primitive mysqlfs ocf:heartbeat:Filesystem \
>>         params device="/dev/drbd0" fstype="ext4" directory="/var/lib/mysql" \
>>         op start interval="0" timeout="60s" \
>>         op monitor interval="20s" timeout="40s" \
>>         op stop interval="0" timeout="60s" \
>>         meta target-role="Started"
>> primitive mysqlip ocf:heartbeat:IPaddr2 \
>>         params ip="192.168.2.30" nic="eth0" cidr_netmask="24" \
>>         op start interval="0s" timeout="60s" \
>>         op monitor interval="5s" timeout="20s" \
>>         op stop interval="0s" timeout="60s" \
>>         meta target-role="Started"
>> primitive wwwfs ocf:heartbeat:Filesystem \
>>         params device="/dev/drbd1" fstype="ext4" directory="/var/www" \
>>         op start interval="0" timeout="60s" \
>>         op monitor interval="20s" timeout="40s" \
>>         op stop interval="0" timeout="60s"
>> primitive zarafafs ocf:heartbeat:Filesystem \
>>         params device="/dev/drbd2" fstype="ext4" directory="/var/lib/zarafa" \
>>         op start interval="0" timeout="60s" \
>>         op monitor interval="20s" timeout="40s" \
>>         op stop interval="0" timeout="60s"
>> ms ms_drbd0 drbd0 \
>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
>> ms ms_drbd1 drbd1 \
>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> ms ms_drbd2 drbd2 \
>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> colocation fs2_on_drbd inf: wwwfs ms_drbd1:Master
>> colocation fs3_on_drbd inf: zarafafs ms_drbd2:Master
>> colocation fs_on_drbd inf: mysqlfs ms_drbd0:Master
>> order fs2_after_drbd inf: ms_drbd1:promote wwwfs:start
>> order fs3_after_drbd inf: ms_drbd2:promote zarafafs:start
>> order fs_after_drbd inf: ms_drbd0:promote mysqlfs:start
>>
>
> You can either set a location constraint for mysqlip, or use a
> colocation and an ordering constraint for it.
>
> e.g.: colocation mysqlip_on_drbd inf: mysqlip ms_drbd0:Master
> order mysqlip_after_drbd inf: ms_drbd0:promote mysqlip:start
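>
> The location-constraint variant would look something like this (the
> constraint id and the score are example values; any positive score is
> a preference, inf: pins the resource to the node):
>
> # mysqlip_prefers_cl1 is an example id
> location mysqlip_prefers_cl1 mysqlip 100: cl1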
>
>> property $id="cib-bootstrap-options" \
>>         dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
>>         cluster-infrastructure="openais" \
>>         expected-quorum-votes="2" \
>>         no-quorum-policy="ignore" \
>>         stonith-enabled="false"
>> rsc_defaults $id="rsc-options" \
>>         resource_stickyness="INFINITY" \
>
> I wouldn't set INFINITY; it can cause problems. I'd give it a value
> of 500 or 1000.
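>
> By the way, the attribute name crm knows is resource-stickiness; with
> the resource_stickyness spelling the policy engine won't pick it up
> as stickiness at all. Something like this (value as per the
> recommendation above, keeping your migration-threshold):
>
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="1000" \
>         migration-threshold="1"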
>
> Regards,
> Dan
>
>>         migration-threshold="1"
>
> --
> Dan Frincu
> CCNA, RHCE



-- 
Dan Frincu
CCNA, RHCE



