[Pacemaker] Unnecessarily Failover when restarting network!!!

Mon Mar 1 05:34:09 EST 2010

On Thu, Feb 25, 2010 at 7:32 PM, Jayakrishnan <jayakrishnanlll at gmail.com> wrote:
> Hi,
> One more question...
> I managed to get everything working with Heartbeat-Pacemaker (2.99 - 1.0.5).
> I have a cluster IP, pingd, a PostgreSQL LSB resource and an LSB resource
> for Slony replication successfully configured. But when I restart the
> network via
>
> # /etc/init.d/networking restart
>
> a split-brain happens. I have increased the monitor intervals and the
> dampening on all resources, and even the values in my ha.cf file, but the
> split-brain still happens. Please advise me!

I guess the network restart is permanently affecting Heartbeat's
communication or membership mechanisms.

> As I have specified dampening as 90s, the resources stay on the master
> for 90s and then get started on the slave. That is, for 90s everything is
> intact and then it gets shifted. What could be the reason?
> STONITH would not help here, because the resources (maybe pingd) are not
> supposed to fail over at all if I restart the network on my active machine.
>
> Please advise...
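
The 90 second delay is consistent with dampen="90": a changed pingd
value is held back for the dampen period before the cluster acts on it.
Below is a minimal Python sketch of that behaviour under a simplified
assumption (every change restarts the timer, and only the value still
current when the timer expires is written). The function name and the
two timelines are hypothetical illustrations, not Pacemaker code.

```python
DAMPEN = 90        # dampen="90" from the quoted pingd primitive
MULTIPLIER = 100   # multiplier="100" from the same primitive


def written_value(changes, now, initial, dampen=DAMPEN):
    """Return the pingd value the cluster acts on at time `now`.

    `changes` is a time-sorted list of (seconds, value) observations.
    Model assumption: each change restarts a `dampen`-second timer and
    a value is written only if it survives that long unchanged.
    """
    written = initial
    for i, (t, value) in enumerate(changes):
        next_change = changes[i + 1][0] if i + 1 < len(changes) else float("inf")
        flush_at = t + dampen
        if flush_at <= now and flush_at <= next_change:
            written = value
    return written


both_up = 2 * MULTIPLIER  # both host_list entries reachable

# Connectivity is lost at t=0 and pingd never sees it come back.
lost_for_good = [(0, 0)]
# Connectivity is lost at t=0 and seen again 10 seconds later.
brief_outage = [(0, 0), (10, both_up)]

for t in (30, 89, 90, 120):
    print(t,
          written_value(lost_for_good, t, both_up),
          written_value(brief_outage, t, both_up))
```

In this model the value only drops to 0 when the outage outlasts the
dampen period. A move after exactly 90s would therefore suggest that
pingd on the active node never sees the ping hosts again after the
restart, rather than that the dampening is too short.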
>
> ha.cf
>
> autojoin none
> keepalive 2
> deadtime 60
> warntime 50
> initdead 64
> udpport    694
> bcast eth1
> auto_failback off
> node node1
> node node2
> crm respawn
> use_logd yes
> -----------------------------------
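
A rough way to read the timers in this ha.cf: with keepalive 2 and
deadtime 60, a peer is only declared dead after about 30 consecutive
missed heartbeats. If the networking restart finishes faster than that
(an assumption, nothing here measures it), a split-brain points to the
bcast link on eth1 not carrying heartbeats again afterwards. A small
Python sketch of that reasoning; the function names are made up for
illustration and this is not Heartbeat's own logic.

```python
# Timer values copied from the quoted ha.cf (seconds).
KEEPALIVE = 2
WARNTIME = 50
DEADTIME = 60


def peer_state(silence_seconds, warntime=WARNTIME, deadtime=DEADTIME):
    """Classify a peer by how long it has been silent (simplified model)."""
    if silence_seconds >= deadtime:
        return "dead"
    if silence_seconds >= warntime:
        return "late"
    return "alive"


def missed_heartbeats(silence_seconds, keepalive=KEEPALIVE):
    """Number of keepalive intervals that fit into the silent period."""
    return silence_seconds // keepalive


for silence in (10, 55, 60):
    print(silence, peer_state(silence), missed_heartbeats(silence))
```

Under this reading, raising deadtime further only delays the
split-brain; it does not help if heartbeats never resume.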
>
> My crm-cli
>
> node $id="3952b93e-786c-47d4-8c2f-a882e3d3d105" node2 \
>     attributes standby="off"
> node $id="ac87f697-5b44-4720-a8af-12a6f2295930" node1 \
>     attributes standby="off"
> primitive pgsql lsb:postgresql-8.4 \
>     meta target-role="Started" \
>     op monitor interval="120s" timeout="60s"
> primitive pingd ocf:heartbeat:pingd \
>     params name="pingd" dampen="90" multiplier="100" host_list="192.168.10.1 192.168.10.69" \
>     op monitor interval="65s" timeout="20s"
> primitive slony-fail lsb:slony_failover \
>     meta target-role="Started"
> primitive vir-ip ocf:heartbeat:IPaddr2 \
>     params ip="192.168.10.10" nic="eth0" cidr_netmask="24" broadcast="192.168.10.255" \
>     op monitor interval="65s" timeout="20s" \
>     meta target-role="Started"
> clone pgclone pgsql \
>     meta notify="true" globally-unique="false" interleave="true" target-role="Started"
> clone pingclone pingd \
>     meta globally-unique="false"
> location ip-on-pingd vir-ip \
>     rule $id="ip-on-pingd-rule" -inf: not_defined pingd or pingd number:lte 0
> colocation slony-with-ip inf: slony-fail vir-ip
> order ip-b4-slony inf: vir-ip slony-fail
> property $id="cib-bootstrap-options" \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false" \
>     dc-version="1.0.5-3840e6b5a305ccb803d29b468556739e75532d56" \
>     cluster-infrastructure="Heartbeat" \
>     last-lrm-refresh="1267090743"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="1000"
> ==========================================
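
For reference, the ip-on-pingd rule above can be read like this: pingd
publishes multiplier times the number of reachable host_list entries,
so with the two hosts listed the attribute is 0, 100 or 200, and vir-ip
is banned from a node only when the attribute is missing or 0. A small
Python sketch of that reading follows. The function names are
hypothetical; the real evaluation is done inside Pacemaker.

```python
import math

MULTIPLIER = 100                               # multiplier="100"
HOST_LIST = ["192.168.10.1", "192.168.10.69"]  # host_list from the config


def pingd_value(reachable, multiplier=MULTIPLIER):
    """Attribute value pingd would publish for the reachable hosts."""
    return multiplier * len(reachable)


def ip_on_pingd_score(node_attrs):
    """Score the rule adds for vir-ip on a node with these attributes.

    Mirrors: -inf: not_defined pingd or pingd number:lte 0
    """
    value = node_attrs.get("pingd")
    if value is None or int(value) <= 0:
        return -math.inf
    return 0


print(ip_on_pingd_score({}))                                     # attribute missing
print(ip_on_pingd_score({"pingd": pingd_value([])}))             # no host reachable
print(ip_on_pingd_score({"pingd": pingd_value(HOST_LIST[:1])}))  # one host reachable
```

Losing only one of the two ping hosts leaves the score untouched, so
this rule by itself only forces vir-ip away from a node that sees
neither host.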
> --
> Thanks & Regards,
>
> Jayakrishnan. L
>
> Visit: www.jayakrishnan.bravehost.com