[Pacemaker] Weird behavior of PCS command while defining DRBD resources

Tue Jan 7 21:17:58 EST 2014

On 27 Nov 2013, at 10:21 pm, Muhammad Kamran Azeem <kamranazeem at gmail.com> wrote:

> Apologies for double post. In my initial post, I forgot to set the subject properly.
> 
> 
> Hello List,
> 
> I am new here. I worked with Linux HA during 2006-2008, then moved into HPC, and came back to HA a month ago, only to realize that a lot has changed.
> 
> My setup:
> 
> Two KVM machines vdb1 (192.168.122.11), vdb2 (192.168.122.12)
> ClusterIP: 192.168.122.10 
> Fedora 19 (64 bit). PCS, CoroSync, PaceMaker, DRBD
> 
> Note: I use the names node1 and node2 for vdb1 and vdb2 for explanations.
> 
> I am trying to set up a test cluster, following http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Clusters_from_Scratch/_configure_the_cluster_for_drbd.html
> 
> First, the status:
> 
> [root@vdb1 drbd.d]# pcs status
> Cluster name: MySQLCluster
> Last updated: Tue Nov 26 14:05:33 2013
> Last change: Mon Nov 25 17:25:59 2013 via crm_resource on vdb2.example.com
> Stack: corosync
> Current DC: vdb1.example.com (1) - partition with quorum
> Version: 1.1.9-3.fc19-781a388
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> 
> Online: [ vdb1.example.com vdb2.example.com ]
> 
> Full list of resources:
> 
>  ClusterIP	(ocf::heartbeat:IPaddr2):	Started vdb1.example.com 
>  Apache	(ocf::heartbeat:apache):	Started vdb1.example.com 
> 
> [root@vdb1 drbd.d]#
> 
> 
> My DRBD disks are: 
> 
> [root@vdb1 drbd.d]# drbd-overview
>   1:MySQLDisk/0   Connected Secondary/Secondary UpToDate/UpToDate C r----- 
>   2:ApacheDisk/0  Connected Secondary/Secondary UpToDate/UpToDate C r----- 
> [root@vdb1 drbd.d]#
> 
> 
> Now, the guide suggests creating a small config file, defining the new resources in it, and then pushing it to the CIB. Extract from the guide:
> # pcs cluster cib drbd_cfg
> # pcs -f drbd_cfg resource create WebData ocf:linbit:drbd \
>          drbd_resource=wwwdata op monitor interval=60s
> # pcs -f drbd_cfg resource master WebDataClone WebData \
>          master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
>          notify=true
> 
> 
> I decided to run the commands directly against the live cluster, without the config file method:
> 
> # pcs resource create p_ApacheDisk ocf:linbit:drbd \
> >         drbd_resource=ApacheDisk op monitor interval=60s
> 
> # pcs resource master MasterApacheDisk p_ApacheDisk \
> >          master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
> >          notify=true
> 
> (I changed the names of resources a bit)
> 
> I get the following errors:
> 
> [root@vdb2 ~]# pcs resource create p_ApacheDisk ocf:linbit:drbd \
> >         drbd_resource=ApacheDisk op monitor interval=60s
> 
> 
> [root@vdb2 ~]# pcs status
> Cluster name: MySQLCluster
> Last updated: Wed Nov 27 11:50:35 2013
> Last change: Wed Nov 27 11:49:36 2013 via cibadmin on vdb2.example.com
> Stack: corosync
> Current DC: vdb1.example.com (1) - partition with quorum
> Version: 1.1.9-3.fc19-781a388
> 2 Nodes configured, unknown expected votes
> 3 Resources configured.
> 
> Online: [ vdb1.example.com vdb2.example.com ]
> 
> Full list of resources:
> 
>  ClusterIP	(ocf::heartbeat:IPaddr2):	Started vdb1.example.com 
>  Apache	(ocf::heartbeat:apache):	Started vdb1.example.com 
>  p_ApacheDisk	(ocf::linbit:drbd):	Stopped 
> 
> Failed actions:
>     p_ApacheDisk_monitor_0 (node=vdb1.example.com, call=27, rc=6, status=complete, last-rc-change=Wed Nov 27 11:49:36 2013
> , queued=23ms, exec=0ms
> ): not configured
>     p_ApacheDisk_monitor_0 (node=vdb2.example.com, call=15, rc=6, status=complete, last-rc-change=Wed Nov 27 11:49:36 2013
> , queued=22ms, exec=1ms
> ): not configured
> 
> 
> I got the following in /var/log/messages on the DC (node1):
> 
> Nov 27 11:49:36 vdb1 cib[538]:   notice: cib:diff: Diff: --- 0.43.13
> Nov 27 11:49:36 vdb1 cib[538]:   notice: cib:diff: Diff: +++ 0.44.1 f4b87d9dee145747f86583cb5eb8276b
> Nov 27 11:49:36 vdb1 stonith-ng[539]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Nov 27 11:49:36 vdb1 pengine[542]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Nov 27 11:49:36 vdb1 pengine[542]:   notice: LogActions: Start   p_ApacheDisk#011(vdb2.example.com)
> Nov 27 11:49:36 vdb1 pengine[542]:   notice: process_pe_message: Calculated Transition 92: /var/lib/pacemaker/pengine/pe-input-74.bz2
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: te_rsc_command: Initiating action 8: monitor p_ApacheDisk_monitor_0 on vdb2.example.com
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: te_rsc_command: Initiating action 6: monitor p_ApacheDisk_monitor_0 on vdb1.example.com (local)
> Nov 27 11:49:36 vdb1 drbd(p_ApacheDisk)[9807]: ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset.
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: process_lrm_event: LRM operation p_ApacheDisk_monitor_0 (call=27, rc=6, cib-update=124, confirmed=true) not configured
> Nov 27 11:49:36 vdb1 crmd[543]:  warning: status_from_rc: Action 8 (p_ApacheDisk_monitor_0) on vdb2.example.com failed (target: 7 vs. rc: 6): Error
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: te_rsc_command: Initiating action 7: probe_complete probe_complete on vdb2.example.com - no waiting
> Nov 27 11:49:36 vdb1 crmd[543]:  warning: status_from_rc: Action 6 (p_ApacheDisk_monitor_0) on vdb1.example.com failed (target: 7 vs. rc: 6): Error
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: te_rsc_command: Initiating action 5: probe_complete probe_complete on vdb1.example.com (local) - no waiting
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: run_graph: Transition 92 (Complete=4, Pending=0, Fired=0, Skipped=3, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-74.bz2): Stopped
> Nov 27 11:49:36 vdb1 pengine[542]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Nov 27 11:49:36 vdb1 pengine[542]:    error: unpack_rsc_op: Preventing p_ApacheDisk from re-starting anywhere in the cluster : operation monitor failed 'not configured' (rc=6)
> Nov 27 11:49:36 vdb1 pengine[542]:    error: unpack_rsc_op: Preventing p_ApacheDisk from re-starting anywhere in the cluster : operation monitor failed 'not configured' (rc=6)
> Nov 27 11:49:36 vdb1 pengine[542]:   notice: process_pe_message: Calculated Transition 93: /var/lib/pacemaker/pengine/pe-input-75.bz2
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: run_graph: Transition 93 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-75.bz2): Complete
> Nov 27 11:49:36 vdb1 crmd[543]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> 
> 
> I then executed the second command and got this:
> 
> [root@vdb2 drbd.d]# pcs resource master MasterApacheDisk p_ApacheDisk \
> >          master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
> >          notify=true
> 
> [root@vdb2 drbd.d]# pcs status
> Cluster name: MySQLCluster
> Last updated: Wed Nov 27 11:52:54 2013
> Last change: Wed Nov 27 11:52:39 2013 via cibadmin on vdb1.example.com
> Stack: corosync
> Current DC: vdb1.example.com (1) - partition with quorum
> Version: 1.1.9-3.fc19-781a388
> 2 Nodes configured, unknown expected votes
> 4 Resources configured.
> 
> Online: [ vdb1.example.com vdb2.example.com ]
> 
> Full list of resources:
> 
>  ClusterIP	(ocf::heartbeat:IPaddr2):	Started vdb1.example.com 
>  Apache	(ocf::heartbeat:apache):	Started vdb1.example.com 
>  Master/Slave Set: MasterApacheDisk [p_ApacheDisk]
>      Stopped: [ vdb1.example.com vdb2.example.com ]
> 
> Failed actions:
>     p_ApacheDisk_monitor_0 (node=vdb1.example.com, call=27, rc=6, status=complete, last-rc-change=Wed Nov 27 11:49:36 2013
> , queued=23ms, exec=0ms
> ): not configured
>     p_ApacheDisk_monitor_0 (node=vdb2.example.com, call=15, rc=6, status=complete, last-rc-change=Wed Nov 27 11:49:36 2013
> , queued=22ms, exec=1ms
> ): not configured
> 
> [root@vdb2 drbd.d]#
> 
> 
> I got the following in /var/log/messages:
> 
> Nov 27 11:52:39 vdb1 cib[538]:   notice: cib:diff: Diff: --- 0.44.3
> Nov 27 11:52:39 vdb1 cib[538]:   notice: cib:diff: Diff: +++ 0.45.1 d62a8fd52495c636c2bc012ac156d3e2
> Nov 27 11:52:39 vdb1 crmd[543]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Nov 27 11:52:39 vdb1 stonith-ng[539]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Nov 27 11:52:39 vdb1 pengine[542]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Nov 27 11:52:39 vdb1 pengine[542]:    error: unpack_rsc_op: Preventing MasterApacheDisk from re-starting anywhere in the cluster : operation monitor failed 'not configured' (rc=6)
> Nov 27 11:52:39 vdb1 pengine[542]:    error: unpack_rsc_op: Preventing MasterApacheDisk from re-starting anywhere in the cluster : operation monitor failed 'not configured' (rc=6)
> Nov 27 11:52:39 vdb1 pengine[542]:   notice: process_pe_message: Calculated Transition 94: /var/lib/pacemaker/pengine/pe-input-76.bz2
> Nov 27 11:52:39 vdb1 crmd[543]:   notice: run_graph: Transition 94 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-76.bz2): Complete
> Nov 27 11:52:39 vdb1 crmd[543]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> 
> 
> I have tried them the way the guide specifies (i.e. by using an intermediate config file), and that works. But my question is: why can't I run the two commands individually?

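Because the file method sends both commands to the cluster as a single update.
Running them separately first tells the cluster about the drbd resource without specifying that it is a master/slave resource with its special meta parameters (clone-max and so on).
The drbd agent checks for those parameters and returns an error when they are missing; that is the "expected clone-max -le 2, but found unset" line in your log.
As your second log shows, the 'not configured' (rc=6) result is still on record after you add the master, so the cluster keeps refusing to start the resource anywhere.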
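
For illustration, here is a rough sketch of the single-update approach using your resource names. It is untested on your cluster and makes a few assumptions: the PCS and CIB_FILE variables and the stage_drbd_master function are hypothetical helpers of mine, and the push subcommand is spelled "pcs cluster cib-push" in later pcs releases, while older ones (possibly the Fedora 19 build) may use "pcs cluster push cib" instead, so check "pcs cluster --help" first. By default the script only prints the commands; set PCS=pcs to run them for real.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Export PCS=pcs on a cluster node to actually run the commands.
PCS="${PCS:-echo pcs}"
# Scratch file that holds the offline copy of the CIB (name is arbitrary).
CIB_FILE="${CIB_FILE:-drbd_cfg}"

stage_drbd_master() {
    # 1. Dump the live CIB into a scratch file.
    $PCS cluster cib "$CIB_FILE"

    # 2. Define the DRBD primitive in the file only (-f), not in the cluster.
    $PCS -f "$CIB_FILE" resource create p_ApacheDisk ocf:linbit:drbd \
        drbd_resource=ApacheDisk op monitor interval=60s

    # 3. Wrap it in a master/slave set, still only in the file, so the
    #    clone-max and related meta parameters exist before any probe runs.
    $PCS -f "$CIB_FILE" resource master MasterApacheDisk p_ApacheDisk \
        master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
        notify=true

    # 4. Push both changes to the cluster as one update.
    #    ASSUMPTION: older pcs may need "cluster push cib" here instead.
    $PCS cluster cib-push "$CIB_FILE"
}

stage_drbd_master
```

If the failed probes from the earlier attempt are still listed under "Failed actions", something like "pcs resource cleanup p_ApacheDisk" should clear them, though I have not verified that against your pcs version.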

> Are there any limitations in doing so? If someone can explain the above errors and the reasons behind them, I would really appreciate it. Thank you for your time.
> 
> Regards,
> K
> 
> -- 
> http://www.wbitt.com , http://techsnail.com
> Computer bugs are like shipwrecks. They are not found, unless they want to be found.
