<div dir="ltr">Thank you Andrew, perfect!<div>The more I use Pacemaker, the trickier it gets when working with third-party tools (although crmsh is great). Completing a Pacemaker configuration obviously involves some manual XML tuning. Often the issues come from crmsh rather than from Pacemaker itself...<br>
</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-06-27 7:43 GMT+02:00 Andrew Beekhof <span dir="ltr"><<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
On 25 Jun 2014, at 7:36 pm, Sékine Coulibaly <<a href="mailto:scoulibaly@gmail.com">scoulibaly@gmail.com</a>> wrote:<br>
<br>
> Hi all,<br>
><br>
> My setup is as follows: Red Hat 6.3 (yes, I know, this is quite old), Pacemaker 1.1.7, Corosync 1.4.1.<br>
><br>
> I noticed something strange, since it doesn't comply with what I read (and understood) from the following resources:<br>
> 1. <a href="http://crmsh.nongnu.org/crm.8.html#cmdhelp_configure_order" target="_blank">http://crmsh.nongnu.org/crm.8.html#cmdhelp_configure_order</a><br>
> 2. <a href="http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.manual_config.html" target="_blank">http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.manual_config.html</a><br>
><br>
><br>
> The strange behaviour I experience is related to ordering.<br>
> Let's have this very basic case :<br>
><br>
> One ZooKeeper process and one PostgreSQL master. I want them to run together on the master node. Since they bind to the VIP, I want the VIP to be started before ZK and PostgreSQL.<br>
> So I set up:<br>
><br>
> primitive VIP73 ocf:heartbeat:IPaddr2 \<br>
> params ip="192.168.73.222" broadcast="192.168.73.255" nic="eth1" cidr_netmask="24" iflabel="VIP73" \<br>
> op monitor interval="10s" timeout="20s"<br>
><br>
> primitive POSTGRESQL ocf:custom:postgresql \<br>
> params repmgr_conf="/var/lib/pgsql/repmgr/repmgr.conf" pgctl="/usr/pgsql-9.2/bin/pg_ctl" pgdata="/opt/custom/pgdata" \<br>
> op start interval="0" timeout="90s" \<br>
> op stop interval="0" timeout="60s" \<br>
> op promote interval="0" timeout="120s" \<br>
> op monitor interval="53s" role="Master" \<br>
> op monitor interval="60s" role="Slave"<br>
><br>
> primitive ZK ocf:custom:zookeeper \<br>
> op monitor interval="5s" timeout="10s" \<br>
> op start interval="0" timeout="10s" \<br>
> op stop interval="0" timeout="10s"<br>
><br>
> ms MS_POSTGRESQL POSTGRESQL \<br>
> meta clone-max="2" target-role="Started" resource-stickiness="100" notify="true"<br>
><br>
> I then add an ordering such as this one :<br>
> order VIP_last inf: VIP73 ZK MS_POSTGRESQL:promote<br>
><br>
> I expect the VIP to be brought up, then ZK to be started, and then the PostgreSQL master to be promoted. Instead, all the resources seem to be started in parallel.<br>
<br>
</div></div>Yep, that'll do that.<br>
I think you need brackets around the group or something; my memory is a little hazy.<br>
That's the trigger for crmsh to set sequential=true, which gives the behaviour you're looking for.<br>
<br>
The default is quite silly.<br>
<br>
Possibly it's easier to just edit the XML.<br>
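<br>
To illustrate (a sketch only, not tested on this cluster): another way to avoid depending on how crmsh builds the resource set is to split the chain into two pairwise constraints. The constraint IDs below are made up for the example; the resource names are the ones from the configuration above.<br>
<br>
order VIP_before_ZK inf: VIP73 ZK<br>
order ZK_before_PG_promote inf: ZK:start MS_POSTGRESQL:promote<br>
<br>
With pairwise constraints there is no set involved, so each line should translate to a plain first/then rule in the CIB.<br>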
<div><div class="h5"><br>
><br>
> Here is a /var/log/messages extract, taken just after a corosync restart.<br>
> It can be seen at Jun 25 05:19:52 that VIP73 and POSTGRESQL are both started simultaneously.<br>
> Did I get something wrong here, or is there something I didn't understand?<br>
><br>
><br>
> Thank you !<br>
><br>
> Jun 25 05:19:52 clustera pengine[33832]: info: determine_online_status: Node clustera is online<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: native_print: VIP73#011(ocf::heartbeat:IPaddr2):#011Stopped<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: clone_print: Master/Slave Set: MS_POSTGRESQL [POSTGRESQL]<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: short_print: Stopped: [ POSTGRESQL:0 POSTGRESQL:1 ]<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: native_print: ZK#011(ocf::custom:zookeeper):#011Stopped<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: native_color: Resource POSTGRESQL:1 cannot run anywhere<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: master_color: MS_POSTGRESQL: Promoted 0 instances of a possible 1 to master<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: RecurringOp: Start recurring monitor (10s) for VIP73 on clustera<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: RecurringOp: Start recurring monitor (60s) for POSTGRESQL:0 on clustera<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: RecurringOp: Start recurring monitor (60s) for POSTGRESQL:0 on clustera<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: RecurringOp: Start recurring monitor (5s) for ZK on clustera<br>
> Jun 25 05:19:52 clustera pengine[33832]: notice: LogActions: Start VIP73#011(clustera)<br>
> Jun 25 05:19:52 clustera pengine[33832]: notice: LogActions: Start POSTGRESQL:0#011(clustera)<br>
> Jun 25 05:19:52 clustera pengine[33832]: info: LogActions: Leave POSTGRESQL:1#011(Stopped)<br>
> Jun 25 05:19:52 clustera pengine[33832]: notice: LogActions: Start ZK#011(clustera)<br>
><br>
> ...<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: te_rsc_command: Initiating action 3: probe_complete probe_complete on clustera (local) - no waiting<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: te_rsc_command: Initiating action 7: start VIP73_start_0 on clustera (local)<br>
> Jun 25 05:19:52 clustera attrd[33831]: info: find_hash_entry: Creating hash entry for probe_complete<br>
> Jun 25 05:19:52 clustera attrd[33831]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: rsc:VIP73:5: start<br>
> Jun 25 05:19:52 clustera attrd[33831]: notice: attrd_perform_update: Sent update 4: probe_complete=true<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: te_rsc_command: Initiating action 9: start POSTGRESQL:0_start_0 on clustera (local)<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: rsc:POSTGRESQL:0:6: start<br>
> Jun 25 05:19:52 clustera IPaddr2(VIP73)[33953]: INFO: ip -f inet addr add 192.168.73.222/24 brd 192.168.73.255 dev eth1 label eth1:VIP73<br>
> Jun 25 05:19:52 clustera avahi-daemon[2065]: Registering new address record for 192.168.73.222 on eth1.IPv4.<br>
> Jun 25 05:19:52 clustera IPaddr2(VIP73)[33953]: INFO: ip link set eth1 up<br>
> Jun 25 05:19:52 clustera IPaddr2(VIP73)[33953]: INFO: /usr/lib64/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp-192.168.73.222 eth1 192.168.73.222 auto not_used not_used<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: POSTGRESQL:0: Starting<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: ACTION=start<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: Managed VIP73:start process 33953 exited with return code 0.<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: process_lrm_event: LRM operation VIP73_start_0 (call=5, rc=0, cib-update=29, confirmed=true) ok<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: te_rsc_command: Initiating action 8: monitor VIP73_monitor_10000 on clustera (local)<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: rsc:VIP73:7: monitor<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: te_rsc_command: Initiating action 35: start ZK_start_0 on clustera (local)<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: rsc:ZK:8: start<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RA_VERSION_MAJOR=1<br>
> Jun 25 05:19:52 clustera zookeeper(ZK)[34058]: INFO: [ZK] No pid file found<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RA_VERSION_MINOR=0<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_clone=0<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_clone_max=2<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_clone_node_max=1<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_globally_unique=false<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_master_max=1<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_master_node_max=1<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: Managed VIP73:monitor process 34057 exited with return code 0.<br>
> Jun 25 05:19:52 clustera crmd[33833]: info: process_lrm_event: LRM operation VIP73_monitor_10000 (call=7, rc=0, cib-update=30, confirmed=false) ok<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_name=start<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify=true<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_active_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_active_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_demote_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_demote_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_inactive_resource=POSTGRESQL:0<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: POSTGRESQL:1<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_master_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_master_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_promote_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_promote_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_slave_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_slave_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_start_resource=POSTGRESQL:0<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_start_uname=clustera<br>
> Jun 25 05:19:52 clustera lrmd: [33830]: info: RA output: (ZK:start:stdout) Starting zookeeper ...<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_stop_resource=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_notify_stop_uname=<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_CRM_meta_timeout=90000<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_crm_feature_set=3.0.6<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_pgctl=/usr/pgsql-9.2/bin/pg_ctl<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_pgdata=/opt/custom/pgdata<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESKEY_repmgr_conf=/var/lib/pgsql/repmgr/repmgr.conf<br>
> Jun 25 05:19:52 clustera avahi-daemon[2065]: Invalid legacy unicast query packet.<br>
> Jun 25 05:19:52 clustera avahi-daemon[2065]: Received response from host 192.168.72.1 with invalid source port 50070 on interface 'eth0.0'<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESOURCE_INSTANCE=POSTGRESQL:0<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESOURCE_PROVIDER=custom<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_RESOURCE_TYPE=postgresql<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: OCF_ROOT=/usr/lib/ocf<br>
> Jun 25 05:19:52 clustera postgresql(POSTGRESQL:0)[33954]: INFO: Run as postgres: /usr/pgsql-9.2/bin/pg_ctl start -w -D /opt/custom/pgdata -l /var/log/custom/custom/ha/postgresql/postgres_ha.log -o '-c config_file=/opt/custom/pgdata/postgresql.conf' -o '-p 5432'<br>
> Jun 25 05:19:53 clustera avahi-daemon[2065]: Invalid legacy unicast query packet.<br>
> Jun 25 05:19:53 clustera avahi-daemon[2065]: Invalid legacy unicast query packet.<br>
> Jun 25 05:19:53 clustera avahi-daemon[2065]: Received response from host 192.168.72.1 with invalid source port 50070 on interface 'eth0.0'<br>
> Jun 25 05:19:53 clustera avahi-daemon[2065]: Received response from host 192.168.72.1 with invalid source port 50070 on interface 'eth0.0'<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: RA output: (ZK:start:stdout) STARTED<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: RA output: (ZK:start:stdout) ---------------------------------------------------------#012Starting ZK restore#012---------------------------------------------------------<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: RA output: (ZK:start:stderr) ls: cannot access /var/custom/custom/ha/snapshots/zk_dump_*.xml: No such file or directory<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: RA output: (ZK:start:stdout) [ZK] Last snapshot : <NONE>#012[ZK] WARNING: Nothing to restore<br>
> Jun 25 05:19:53 clustera zookeeper(ZK)[34058]: INFO: [ZK] Nothing to restore<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: RA output: (ZK:start:stdout) ---------------------------------------------------------<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: Managed ZK:start process 34058 exited with return code 0.<br>
> Jun 25 05:19:53 clustera crmd[33833]: info: process_lrm_event: LRM operation ZK_start_0 (call=8, rc=0, cib-update=31, confirmed=true) ok<br>
> Jun 25 05:19:53 clustera crmd[33833]: info: te_rsc_command: Initiating action 36: monitor ZK_monitor_5000 on clustera (local)<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: rsc:ZK:9: monitor<br>
> Jun 25 05:19:53 clustera lrmd: [33830]: info: Managed ZK:monitor process 34390 exited with return code 0.<br>
> Jun 25 05:19:53 clustera crmd[33833]: info: process_lrm_event: LRM operation ZK_monitor_5000 (call=9, rc=0, cib-update=32, confirmed=false) ok<br>
> Jun 25 05:19:53 clustera postgresql(POSTGRESQL:0)[33954]: INFO: waiting for server to start.... done server started<br>
> Jun 25 05:19:54 clustera avahi-daemon[2065]: Received response from host 192.168.72.1 with invalid source port 50070 on interface 'eth0.0'<br>
> Jun 25 05:19:56 clustera avahi-daemon[2065]: Received response from host 192.168.72.1 with invalid source port 50070 on interface 'eth0.0'<br>
> Jun 25 05:19:56 clustera lrmd: [33830]: info: RA output: (VIP73:start:stderr) ARPING 192.168.73.222 from 192.168.73.222 eth1#012Sent 5 probes (5 broadcast(s))#<br>
</div></div>> _______________________________________________<br>
> Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
> <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
><br>
> Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
> Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
<br>
<br></blockquote></div><br></div>