[Pacemaker] Preventing Automatic Failback

Michael Monette mmonette at 2keys.ca
Mon Jan 20 09:22:25 EST 2014


Hi, 

I posted this question before, but it was a bit unclear.

I have 2 nodes running DRBD with PostgreSQL.

When node-1 fails, everything fails over to node-2. But when node-1 recovers, the cluster tries to fail back to node-1, and all the services running on node-2 get disrupted. (Things don't ACTUALLY fail back to node-1: they try, fail, and then all services on node-2 are simply restarted, which is very annoying.) This does not happen if I perform the same test on node-2! I can reboot node-2, things fail over to node-1, and node-2 comes back online and waits until it is needed (this is what I want!). It seems to affect only my node-1s.

I have tried setting resource stickiness and everything else I can think of, but whenever the primary recovers, it always disrupts the services running on node-2.

I also tried removing things from this config to isolate the problem. At one point I removed the atlassian_jira and drbd2_var primitives and had only failover-ip and drbd1_opt, but I still had the same problem. Hopefully someone can pinpoint the cause for me. If I can't avoid this, I would at least like this "bug", or whatever it is, to happen on node-2 instead of on the active nodes.


Here is my config:

node node-1.comp.com \
        attributes standby="off"
node node-1.comp.com \
        attributes standby="off"
primitive atlassian_jira lsb:jira \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="240"
primitive drbd1_opt ocf:heartbeat:Filesystem \
        params device="/dev/drbd1" directory="/opt/atlassian" fstype="ext4"
primitive drbd2_var ocf:heartbeat:Filesystem \
        params device="/dev/drbd2" directory="/var/atlassian" fstype="ext4"
primitive drbd_data ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"
primitive failover-ip ocf:heartbeat:IPaddr2 \
        params ip="10.199.0.13"
group jira_services drbd1_opt drbd2_var failover-ip atlassian_jira
ms ms_drbd_data drbd_data \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation jira_services_on_drbd inf: atlassian_jira ms_drbd_data:Master
order jira_services_after_drbd inf: ms_drbd_data:promote jira_services:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-14.el6_5.1-368c726" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1390183165" \
        default-resource-stickiness="INFINITY"
rsc_defaults $id="rsc-options" \
        resource-stickiness="INFINITY"
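
Note that the colocation constraint above names only atlassian_jira, while the order constraint names the whole jira_services group. A variant that colocates the entire group with the DRBD master might look like the line below. This is an untested sketch that reuses the resource names from the config above, not a confirmed fix for the failback behaviour:

colocation jira_services_on_drbd inf: jira_services ms_drbd_data:Master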

Thanks

Mike



