[Pacemaker] Infinite fail-count and migration-threshold after node fail-back

Dan Frincu dfrincu at streamwide.ro
Thu Nov 11 10:59:40 EST 2010


Hi,

Pavlos Parissis wrote:
> On 11 November 2010 13:04, Dan Frincu <dfrincu at streamwide.ro> wrote:
>   
>> Hi,
>>
>> Andrew Beekhof wrote:
>>
>> On Mon, Oct 11, 2010 at 9:40 AM, Dan Frincu <dfrincu at streamwide.ro> wrote:
>>
>>
>> Hi all,
>>
>> I've managed to make this setup work. Basically, with
>> symmetric-cluster="false" and the resources' locations specified manually,
>> the resources will always obey the location constraints and (as far as I
>> could see) disregard the rsc_defaults resource-stickiness values.
>>
>>
>> This definitely should not be the case.
>> Possibly your stickiness setting is being eclipsed by the combination
>> of the location constraint scores.
>> Try INFINITY instead.
>>
>>
>>
>> I understand your point and I also believe this to be the case. However,
>> I've noticed that with symmetric-cluster="false" I need to add 2 location
>> constraints per resource, which overcrowds the config, and since I want
>> (and hope) to move to a config with multiple servers and resources, each
>> with specific rules, adding location constraints for every resource is an
>> overhead I'd rather avoid, if possible.
>>     
>
> From the documentation [1]
> 6.2.2. Asymmetrical "Opt-In" Clusters
> To create an opt-in cluster, start by preventing resources from
> running anywhere by default
> crm_attribute --attr-name symmetric-cluster --attr-value false
> Then start enabling nodes. The following fragment says that the web
> server prefers sles-1, the database prefers sles-2 and both can
> failover to sles-3 if their most preferred node fails.
>
>   <constraints>
>     <rsc_location id="loc-1" rsc="Webserver" node="sles-1" score="200"/>
>     <rsc_location id="loc-2" rsc="Webserver" node="sles-3" score="0"/>
>     <rsc_location id="loc-3" rsc="Database" node="sles-2" score="200"/>
>     <rsc_location id="loc-4" rsc="Database" node="sles-3" score="0"/>
>   </constraints>
> Example 6.1. Example set of opt-in location constraints
>
> At the moment you have symmetric-cluster=false, so you need to add
> location constraints in order to get your resources running.
> Below is my conf and it works as expected: pbx_service_01 starts on
> node-01 and never fails back once it has failed over to node-03 and
> node-01 comes back online, due to resource-stickiness="1000". But take a
> look at the scores in the location constraints - very low compared to
> 1000. I could have also set them to inf.
>   
Yes, but you don't have groups defined in your setup. With groups, the
scores of all active member resources are added together, so the group's
effective stickiness is cumulative.
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch-advanced-resources.html#id2220530

For example:

root at cluster1:~# ptest -sL
Allocation scores:
group_color: all allocation score on cluster1: 0
group_color: all allocation score on cluster2: -1000000
group_color: virtual_ip_1 allocation score on cluster1: 1000
group_color: virtual_ip_1 allocation score on cluster2: -1000000
group_color: virtual_ip_2 allocation score on cluster1: 1000
group_color: virtual_ip_2 allocation score on cluster2: 0
group_color: Failover_Alert allocation score on cluster1: 1000
group_color: Failover_Alert allocation score on cluster2: 0
group_color: fs_home allocation score on cluster1: 1000
group_color: fs_home allocation score on cluster2: 0
group_color: fs_mysql allocation score on cluster1: 1000
group_color: fs_mysql allocation score on cluster2: 0
group_color: fs_storage allocation score on cluster1: 1000
group_color: fs_storage allocation score on cluster2: 0
group_color: httpd allocation score on cluster1: 1000
group_color: httpd allocation score on cluster2: 0
group_color: mysqld allocation score on cluster1: 1000
group_color: mysqld allocation score on cluster2: 0
clone_color: ms_drbd_home allocation score on cluster1: 9000
clone_color: ms_drbd_home allocation score on cluster2: -1000000
clone_color: drbd_home:0 allocation score on cluster1: 1100
clone_color: drbd_home:0 allocation score on cluster2: 0
clone_color: drbd_home:1 allocation score on cluster1: 0
clone_color: drbd_home:1 allocation score on cluster2: 1100
native_color: drbd_home:0 allocation score on cluster1: 1100
native_color: drbd_home:0 allocation score on cluster2: 0
native_color: drbd_home:1 allocation score on cluster1: -1000000
native_color: drbd_home:1 allocation score on cluster2: 1100
drbd_home:0 promotion score on cluster1: 18100
drbd_home:1 promotion score on cluster2: -1000000
clone_color: ms_drbd_mysql allocation score on cluster1: 10100
clone_color: ms_drbd_mysql allocation score on cluster2: -1000000
clone_color: drbd_mysql:0 allocation score on cluster1: 1100
clone_color: drbd_mysql:0 allocation score on cluster2: 0
clone_color: drbd_mysql:1 allocation score on cluster1: 0
clone_color: drbd_mysql:1 allocation score on cluster2: 1100
native_color: drbd_mysql:0 allocation score on cluster1: 1100
native_color: drbd_mysql:0 allocation score on cluster2: 0
native_color: drbd_mysql:1 allocation score on cluster1: -1000000
native_color: drbd_mysql:1 allocation score on cluster2: 1100
drbd_mysql:0 promotion score on cluster1: 20300
drbd_mysql:1 promotion score on cluster2: -1000000
clone_color: ms_drbd_storage allocation score on cluster1: 11200
clone_color: ms_drbd_storage allocation score on cluster2: -1000000
clone_color: drbd_storage:0 allocation score on cluster1: 1100
clone_color: drbd_storage:0 allocation score on cluster2: 0
clone_color: drbd_storage:1 allocation score on cluster1: 0
clone_color: drbd_storage:1 allocation score on cluster2: 1100
native_color: drbd_storage:0 allocation score on cluster1: 1100
native_color: drbd_storage:0 allocation score on cluster2: 0
native_color: drbd_storage:1 allocation score on cluster1: -1000000
native_color: drbd_storage:1 allocation score on cluster2: 1100
drbd_storage:0 promotion score on cluster1: 22500
drbd_storage:1 promotion score on cluster2: -1000000
native_color: virtual_ip_1 allocation score on cluster1: 12300
native_color: virtual_ip_1 allocation score on cluster2: -1000000
native_color: virtual_ip_2 allocation score on cluster1: 8000
native_color: virtual_ip_2 allocation score on cluster2: -1000000
native_color: Failover_Alert allocation score on cluster1: 7000
native_color: Failover_Alert allocation score on cluster2: -1000000
native_color: fs_home allocation score on cluster1: 6000
native_color: fs_home allocation score on cluster2: -1000000
native_color: fs_mysql allocation score on cluster1: 5000
native_color: fs_mysql allocation score on cluster2: -1000000
native_color: fs_storage allocation score on cluster1: 4000
native_color: fs_storage allocation score on cluster2: -1000000
native_color: mysqld allocation score on cluster1: 4000
native_color: mysqld allocation score on cluster2: -1000000
native_color: httpd allocation score on cluster1: 16000
native_color: httpd allocation score on cluster2: -1000000
drbd_home:0 promotion score on cluster1: 1000000
drbd_home:1 promotion score on cluster2: -1000000
drbd_mysql:0 promotion score on cluster1: 1000000
drbd_mysql:1 promotion score on cluster2: -1000000
drbd_storage:0 promotion score on cluster1: 1000000
drbd_storage:1 promotion score on cluster2: -1000000
clone_color: ping_gw_clone allocation score on cluster1: 0
clone_color: ping_gw_clone allocation score on cluster2: 0
clone_color: ping_gw:0 allocation score on cluster1: 1000
clone_color: ping_gw:0 allocation score on cluster2: 0
clone_color: ping_gw:1 allocation score on cluster1: 0
clone_color: ping_gw:1 allocation score on cluster2: 1000
native_color: ping_gw:0 allocation score on cluster1: 1000
native_color: ping_gw:0 allocation score on cluster2: 0
native_color: ping_gw:1 allocation score on cluster1: -1000000
native_color: ping_gw:1 allocation score on cluster2: 1000
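
As a rough illustration (my reading of the scores above): with
resource-stickiness="1000" and, say, eight active group members, the group
accumulates around 8 x 1000 = 8000 points of stickiness, which easily
outweighs a 200-point location preference, so the group stays where it is
after the preferred node comes back. If you do want the preferred node to
win, a minimal sketch with the crm shell - using a hypothetical group name
"my_group" and node "node-01", not taken from either config above:

  # inspect the current placement scores (same tool as above)
  ptest -sL | grep my_group

  # raise the location preference above the group's accumulated stickiness,
  # or simply use inf so the preference always wins
  crm configure location prefer-node-01 my_group inf: node-01

  # alternatively, lower the default stickiness so fail-back happens
  crm configure rsc_defaults resource-stickiness=100

The opposite also holds: if you never want automatic fail-back, keep the
stickiness high (or inf) and the constraint scores low.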

Regards,

Dan

> location PrimaryNode-drbd_01 ms-drbd_01 100: node-01
> location PrimaryNode-drbd_02 ms-drbd_02 100: node-02
> location PrimaryNode-pbx_service_01 pbx_service_01 200: node-01
> location PrimaryNode-pbx_service_02 pbx_service_02 200: node-02
> location SecondaryNode-drbd_01 ms-drbd_01 0: node-03
> location SecondaryNode-drbd_02 ms-drbd_02 0: node-03
> location SecondaryNode-pbx_service_01 pbx_service_01 10: node-03
> location SecondaryNode-pbx_service_02 pbx_service_02 10: node-03
> location fencing-on-node-01 pdu 1: node-01
> location fencing-on-node-02 pdu 1: node-02
> location fencing-on-node-03 pdu 1: node-03
> rsc_defaults $id="rsc-options"   resource-stickiness="1000"
>
>
> [1]http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch06s02s02.html
>

-- 
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania
