[ClusterLabs] Order/Colocation Dependency On Cloned Resource

Vladislav Bogdanov bubble at hoster-ok.com
Tue Mar 17 05:45:30 UTC 2015


16.03.2015 20:49, Thomas Meagher wrote:
> Hello, I am having some issues ordering a resource that depends on a
> clone.  Specifically, I have a two node cluster with two virtual IPs.
> Each vip has a slight preference for its own "home" node via a location
> constraint, and they have mandatory colocation and order constraints on
> a cloned (anonymous) webserver.  The problem I am seeing is when a node
> comes back after being down (let's say node2 goes down).  Both vips fail
> over to node1 (still up), but when node2 comes back the vip is moved
> back immediately.  This results in vip2 being unavailable until the
> webserver actually comes back up on node2.
>
> * node2 goes down.
> * vip2 moves to node1
> * node2 comes back up
> * everything starts/moves IMMEDIATELY to node2 (same transition)
>      * cloned tomcat webserver starting up
>      * vip2 is stopped on node1
> * webserver becomes started on node2
> * vip2 finally becomes available again on node2

At first glance this correlates with 
http://oss.clusterlabs.org/pipermail/pacemaker/2015-January/023300.html

Playing with pe-inputs and crm_simulate, I found that migratable 
resources may behave differently. In particular, if I add 
'allow-migrate=true' to the VIP definition in the pe-input, crm_simulate 
shows that it will be migrated only after the downstream clone is 
started on the destination node. I haven't tried that in real life yet 
(it would require adding migration (pseudo-)support to the IPaddr2 RA), 
but it could be a valid workaround and is on my todo list.
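
As a rough sketch of that experiment (the pe-input file name and the
resource name are taken from the quoted logs and config below; treat the
exact commands as an illustration, not a tested fix):

# Replay the saved transition input, after editing it to add
# allow-migrate=true to the VIP's meta attributes, and compare the
# order of the computed actions:
crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-66.bz2

# On a live cluster the equivalent change would be something like this
# (crm shell syntax); note that IPaddr2 currently has no
# migrate_to/migrate_from support, so on its own this is not expected
# to change anything yet:
crm resource meta vip1 set allow-migrate true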

>
> The vip resources have colocation and ordering constraints on the cloned
> webserver, so I am confused why they are immediately moved when the node
> comes back online, instead of waiting to move them until the webserver
> resource finishes starting on node2.  Does ordering on a cloned resource
> only take into account that the resource is started "somewhere"? Should
> my webserver clone be globally unique?  Is there a different way to
> enforce what I am trying to achieve here?  I want to make sure the
> cloned resource is started on the target node before dependent resources
> are started/moved there.
>
> Relevant Config (crm format):
>
> clone tomcat-clone tomcat-group
> location loc-vip1 vip1 rule 100: nodeNumber eq 1
> location loc-vip2 vip2 rule 100: nodeNumber eq 2
> order order-vip1 inf: tomcat-clone vip1
> order order-vip2 inf: tomcat-clone vip2
> colocation colo-vip1 inf: vip1 tomcat-clone
> colocation colo-vip2 inf: vip2 tomcat-clone
>
> Relevant Logs:
>
> Mar 16 17:27:24 localhost corosync[3198]: [TOTEM ] A new membership
> (172.20.70.151:100) was formed. Members joined: 2
> Mar 16 17:27:24 localhost corosync[3198]: [QUORUM] Members[2]: 1 2
> Mar 16 17:27:24 localhost corosync[3198]: [MAIN  ] Completed service
> synchronization, ready to provide service.
> Mar 16 17:27:24 localhost pacemakerd[3241]: notice:
> crm_update_peer_state: pcmk_quorum_notification: Node module-2[2] -
> state is now member (was lost)
>
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Move  vip1
>        (Started module-1 -> module-2)
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Start
> postgres:1 (module-2)
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Start
> ethmonitor:1       (module-2)
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Start
> fsmonitor:1        (module-2)
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Start
> tomcat-instance:1  (module-2)
> Mar 16 17:27:31 localhost pengine[3268]: notice: LogActions: Start
> ClusterMonitor:1   (module-2)
> Mar 16 17:27:33 localhost pengine[3268]: notice: process_pe_message:
> Calculated Transition 26: /var/lib/pacemaker/pengine/pe-input-66.bz2
>
> Mar 16 17:27:33 localhost crmd[3269]: notice: te_rsc_command: Initiating
> action 41: stop vip1_stop_0 on module-1 (local)
> Mar 16 17:27:33 localhost crmd[3269]: notice: te_rsc_command: Initiating
> action 117: notify postgres_pre_notify_start_0 on module-1 (local)
> Mar 16 17:27:33 localhost crmd[3269]: notice: te_rsc_command: Initiating
> action 80: start ethmonitor_start_0 on module-2
> Mar 16 17:27:33 localhost crmd[3269]: notice: te_rsc_command: Initiating
> action 102: start tomcat-instance_start_0 on module-2
>
> Mar 16 17:28:23 localhost pengine[3268]: notice: LogActions: Start
> vip1       (module-2)
>
> Mar 16 17:29:55 localhost crmd[3269]: notice: te_rsc_command: Initiating
> action 39: start vip1_start_0 on module-2
>




