Hi Andrew,<br>I am adding the log messages which I get when I commit the crm configuration, along with the crm_verify -LV output, for your consideration. My crm configuration is attached. It shows that the resources cannot run anywhere. What should I do?<br>
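Also, about doing the comparison as a number rather than as a string: is the raw CIB XML below roughly what the constraint should look like? This is only my guess from the documentation, and the expression ids are made up by me.<br>

```xml
<rsc_location id="vir-ip-with-pingd" rsc="vir-ip">
  <!-- boolean-op="or": the rule matches if either expression is true -->
  <rule id="vir-ip-with-pingd-rule" score="-INFINITY" boolean-op="or">
    <!-- matches when the pingd attribute is not set on the node at all -->
    <expression id="vir-ip-with-pingd-expr-1" attribute="pingd" operation="not_defined"/>
    <!-- type="number" asks for a numeric comparison instead of a string one -->
    <expression id="vir-ip-with-pingd-expr-2" attribute="pingd" operation="lte" value="0" type="number"/>
  </rule>
</rsc_location>
```

(I assume I would load this with something like cibadmin -o constraints --replace, but please correct me if that is wrong.)<br>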
<br>crm_verify -LV snippet<br>-------------------------------<br>root@node1:~# crm_verify -LV<br>crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource vir-ip cannot run anywhere<br>crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource slony-fail cannot run anywhere<br>
crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource slony-fail2 cannot run anywhere<br>Warnings found during check: config may not be valid<br>root@node1:~# crm_verify -LV<br>crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource vir-ip cannot run anywhere<br>
crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource slony-fail cannot run anywhere<br>crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource slony-fail2 cannot run anywhere<br>Warnings found during check: config may not be valid<br>
--------------------------------------------------------------<br><br>Log snippet<br>-------------------------------------------------<br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <cib admin_epoch="0" epoch="285" num_updates="33" ><br>
Feb 23 11:25:48 node1 crmd: [1629]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change<br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <configuration ><br>
Feb 23 11:25:48 node1 crmd: [1629]: info: need_abort: Aborting on change to admin_epoch<br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <constraints ><br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]<br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <rsc_location id="vir-ip-with-pingd" ><br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.<br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <rule score="-1000" id="vir-ip-with-pingd-rule" /><br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_pe_invoke: Query 187: Requesting the current CIB: S_POLICY_ENGINE<br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </rsc_location><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </constraints><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </configuration><br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </cib><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <cib admin_epoch="0" epoch="286" num_updates="1" ><br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <configuration ><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <constraints ><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <rsc_location id="vir-ip-with-pingd" ><br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <rule score="-INFINITY" id="vir-ip-with-pingd-rule" /><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </rsc_location><br>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </constraints><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </configuration><br>Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </cib><br>
Feb 23 11:25:48 node1 cib: [1625]: info: cib_process_request: Operation complete: op cib_replace for section constraints (origin=local/cibadmin/2, version=0.286.1): ok (rc=0)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1266904548-176, seq=12, quorate=1<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_config: On loss of CCM Quorum: Ignore<br>Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0<br>
Feb 23 11:25:48 node1 pengine: [6277]: info: determine_online_status: Node node2 is online<br>Feb 23 11:25:48 node1 pengine: [6277]: info: determine_online_status: Node node1 is online<br>Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_rsc_op: slony-fail2_monitor_0 on node1 returned 0 (ok) instead of the expected value: 7 (not running)<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_rsc_op: Operation slony-fail2_monitor_0 found resource slony-fail2 active on node1<br>Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_rsc_op: pgsql:1_monitor_0 on node1 returned 0 (ok) instead of the expected value: 7 (not running)<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_rsc_op: Operation pgsql:1_monitor_0 found resource pgsql:1 active on node1<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: vir-ip (ocf::heartbeat:IPaddr2): Started node1<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: slony-fail (lsb:slony_failover): Started node1<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: clone_print: Clone Set: pgclone<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: print_list: Started: [ node2 node1 ]<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: slony-fail2 (lsb:slony_failover2): Started node1<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: clone_print: Clone Set: pingclone<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: print_list: Started: [ node2 node1 ]<br>
Feb 23 11:25:48 node1 pengine: [6277]: info: native_merge_weights: vir-ip: Rolling back scores from slony-fail<br>Feb 23 11:25:48 node1 pengine: [6277]: info: native_merge_weights: vir-ip: Rolling back scores from slony-fail2<br>
Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource vir-ip cannot run anywhere<br>Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource slony-fail cannot run anywhere<br>Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource slony-fail2 cannot run anywhere<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource vir-ip(node1)<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource slony-fail (node1)<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pgsql:0 (Started node2)<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pgsql:1 (Started node1)<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource slony-fail2 (node1)<br>Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pingd:0 (Started node2)<br>
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pingd:1 (Started node1)<br>Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:slony-fail:41: stop<br>Feb 23 11:25:48 node1 cib: [10242]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-8.raw<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]<br>Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:slony-fail2:42: stop<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: unpack_graph: Unpacked transition 22: 4 actions in 4 synapses<br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_te_invoke: Processing graph 22 (ref=pe_calc-dc-1266904548-176) derived from /var/lib/pengine/pe-warn-101.bz2<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 11: stop slony-fail_stop_0 on node1 (local)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=11:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=slony-fail_stop_0 )<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 28: stop slony-fail2_stop_0 on node1 (local)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=28:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=slony-fail2_stop_0 )<br>
Feb 23 11:25:48 node1 lrmd: [10244]: WARN: For LSB init script, no additional parameters are needed.<br>Feb 23 11:25:48 node1 lrmd: [10243]: WARN: For LSB init script, no additional parameters are needed.<br>Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation slony-fail_stop_0 (call=41, rc=0, cib-update=188, confirmed=true) complete ok<br>
Feb 23 11:25:48 node1 cib: [10242]: info: write_cib_contents: Wrote version 0.286.0 of the CIB to disk (digest: aaddbe7aeaf08365be5bbbdb4931295e)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action slony-fail_stop_0 (11) confirmed on node1 (rc=0)<br>
Feb 23 11:25:48 node1 pengine: [6277]: WARN: process_pe_message: Transition 22: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-101.bz2<br>Feb 23 11:25:48 node1 pengine: [6277]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation slony-fail2_stop_0 (call=42, rc=0, cib-update=189, confirmed=true) complete ok<br>Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:vir-ip:43: stop<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action slony-fail2_stop_0 (28) confirmed on node1 (rc=0)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 10: stop vir-ip_stop_0 on node1 (local)<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=10:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=vir-ip_stop_0 )<br>Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation vir-ip_monitor_15000 (call=31, rc=-2, cib-update=0, confirmed=true) Cancelled unknown exec error<br>
Feb 23 11:25:48 node1 cib: [10242]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.gwOpFZ (digest: /var/lib/heartbeat/crm/cib.UtFyLu)<br>Feb 23 11:25:48 node1 IPaddr2[10249]: [10285]: INFO: ip -f inet addr delete 192.168.10.10/24 dev eth0<br>
Feb 23 11:25:48 node1 IPaddr2[10249]: [10287]: INFO: ip -o -f inet addr show eth0<br>Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation vir-ip_stop_0 (call=43, rc=0, cib-update=190, confirmed=true) complete ok<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action vir-ip_stop_0 (10) confirmed on node1 (rc=0)<br>Feb 23 11:25:48 node1 crmd: [1629]: info: te_pseudo_action: Pseudo action 6 fired and confirmed<br>Feb 23 11:25:48 node1 crmd: [1629]: info: run_graph: ====================================================<br>
Feb 23 11:25:48 node1 crmd: [1629]: notice: run_graph: Transition 22 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-warn-101.bz2): Complete<br>Feb 23 11:25:48 node1 crmd: [1629]: info: te_graph_trigger: Transition 22 is now complete<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: notify_crmd: Transition 22 status: done - <null><br>Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]<br>
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: Starting PEngine Recheck Timer<br>Feb 23 11:27:11 node1 cib: [1625]: info: cib_stats: Processed 88 operations (8295.00us average, 0% utilization) in the last 10min<br>
------------------------------------------------------------<br><br>crm_mon snippet<br>------------------------------------------<br>============<br>Last updated: Tue Feb 23 11:27:56 2010<br>Stack: Heartbeat<br>Current DC: node1 (ac87f697-5b44-4720-a8af-12a6f2295930) - partition with quorum<br>
Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56<br>2 Nodes configured, unknown expected votes<br>5 Resources configured.<br>============<br><br>Online: [ node2 node1 ]<br><br>Clone Set: pgclone<br> Started: [ node2 node1 ]<br>
Clone Set: pingclone<br> Started: [ node2 node1 ]<br>-------------------------------------------------------------<br><br><br><div class="gmail_quote">On Tue, Feb 23, 2010 at 9:38 AM, Jayakrishnan <span dir="ltr"><<a href="mailto:jayakrishnanlll@gmail.com">jayakrishnanlll@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Sir,<br>I am afraid to ask, but how can I tell pacemaker to compare as a number instead of as a string?<br>
I changed -inf: to -10000 in the pingd location constraint, but the same problem persists.<br>I also changed the global resource-stickiness to 10000, but it is still not working.<br>
<br>With thanks,<br><font color="#888888">Jayakrishnan.L</font><div><div></div><div class="h5"><br><br><div class="gmail_quote">On Tue, Feb 23, 2010 at 1:04 AM, Andrew Beekhof <span dir="ltr"><<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div>On Mon, Feb 22, 2010 at 6:46 PM, Jayakrishnan <<a href="mailto:jayakrishnanlll@gmail.com" target="_blank">jayakrishnanlll@gmail.com</a>> wrote:<br>
> Sir,<br>
> I have set up a 2-node cluster with heartbeat 2.99 and pacemaker 1.0.5. I am<br>
> using Ubuntu 9.10. Both packages are installed from the Ubuntu karmic<br>
> repository.<br>
> My packages are:<br>
><br>
> heartbeat 2.99.2+sles11r9-5ubuntu1<br>
> heartbeat-common 2.99.2+sles11r9-5ubuntu1<br>
> heartbeat-common-dev 2.99.2+sles11r9-5ubuntu1<br>
> heartbeat-dev 2.99.2+sles11r9-5ubuntu1<br>
> libheartbeat2 2.99.2+sles11r9-5ubuntu1<br>
> libheartbeat2-dev 2.99.2+sles11r9-5ubuntu1<br>
> pacemaker-heartbeat 1.0.5+hg20090813-0ubuntu4<br>
> pacemaker-heartbeat-dev 1.0.5+hg20090813-0ubuntu4<br>
><br>
> My ha.cf file and crm configuration are attached to the mail.<br>
><br>
> I am making a postgres database cluster with slony replication. eth1 is my<br>
> heartbeat link, a cross over cable is connected between the servers in eth1.<br>
> eth0 is my external network where my cluster IP get assigned.<br>
> server1--> hostname node1<br>
> node 1 192.168.10.129 eth1<br>
> 192.168.1.1-->eth0<br>
><br>
><br>
> server2 --> hostname node2<br>
> node2 192.168.10.130 eth1<br>
> 192.168.1.2 --> eth0<br>
><br>
> Now when I pull out my eth1 cable, I need to fail over to the other<br>
> node. For that I have configured pingd as follows, but it is not working. My<br>
> resources are not starting at all when I give the rule as<br>
> rule -inf: not_defined pingd or pingd lte 0<br>
<br>
</div></div>You need to get 1.0.7 or tell pacemaker to do the comparison as a<br>
number instead of as a string.<br>
<div><div></div><div><br>
><br>
> I tried changing -inf: to inf:, and then the resources started, but<br>
> failover does not take place when I pull out the eth1 cable.<br>
><br>
> Please check my configuration and kindly point out what I am missing.<br>
> Please note that I am using a default resource-stickiness of INFINITY, which is<br>
> compulsory for slony replication.<br>
><br>
> My ha.cf file<br>
> ------------------------------------------------------------------<br>
><br>
> autojoin none<br>
> keepalive 2<br>
> deadtime 15<br>
> warntime 10<br>
> initdead 64<br>
> bcast eth1<br>
> auto_failback off<br>
> node node1<br>
> node node2<br>
> crm respawn<br>
> use_logd yes<br>
> ____________________________________________<br>
><br>
> My crm configuration<br>
><br>
> node $id="3952b93e-786c-47d4-8c2f-a882e3d3d105" node2 \<br>
> attributes standby="off"<br>
> node $id="ac87f697-5b44-4720-a8af-12a6f2295930" node1 \<br>
> attributes standby="off"<br>
> primitive pgsql lsb:postgresql-8.4 \<br>
> meta target-role="Started" resource-stickiness="inherited" \<br>
> op monitor interval="15s" timeout="25s" on-fail="standby"<br>
> primitive pingd ocf:pacemaker:pingd \<br>
> params name="pingd" hostlist="192.168.10.1 192.168.10.75" \<br>
> op monitor interval="15s" timeout="5s"<br>
> primitive slony-fail lsb:slony_failover \<br>
> meta target-role="Started"<br>
> primitive slony-fail2 lsb:slony_failover2 \<br>
> meta target-role="Started"<br>
> primitive vir-ip ocf:heartbeat:IPaddr2 \<br>
> params ip="192.168.10.10" nic="eth0" cidr_netmask="24"<br>
> broadcast="192.168.10.255" \<br>
> op monitor interval="15s" timeout="25s" on-fail="standby" \<br>
> meta target-role="Started"<br>
> clone pgclone pgsql \<br>
> meta notify="true" globally-unique="false" interleave="true"<br>
> target-role="Started"<br>
> clone pingclone pingd \<br>
> meta globally-unique="false" clone-max="2" clone-node-max="1"<br>
> location vir-ip-with-pingd vir-ip \<br>
> rule $id="vir-ip-with-pingd-rule" inf: not_defined pingd or pingd<br>
> lte 0<br>
> colocation ip-with-slony inf: slony-fail vir-ip<br>
> colocation ip-with-slony2 inf: slony-fail2 vir-ip<br>
> order ip-b4-slony2 inf: vir-ip slony-fail2<br>
> order slony-b4-ip inf: vir-ip slony-fail<br>
> property $id="cib-bootstrap-options" \<br>
> dc-version="1.0.5-3840e6b5a305ccb803d29b468556739e75532d56" \<br>
> cluster-infrastructure="Heartbeat" \<br>
> no-quorum-policy="ignore" \<br>
> stonith-enabled="false" \<br>
> last-lrm-refresh="1266851027"<br>
> rsc_defaults $id="rsc-options" \<br>
> resource-stickiness="INFINITY"<br>
><br>
> _____________________________________<br>
><br>
> My crm status:<br>
> __________________________<br>
><br>
> crm(live)# status<br>
><br>
><br>
> ============<br>
> Last updated: Mon Feb 22 23:15:56 2010<br>
> Stack: Heartbeat<br>
> Current DC: node2 (3952b93e-786c-47d4-8c2f-a882e3d3d105) - partition with<br>
> quorum<br>
> Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56<br>
> 2 Nodes configured, unknown expected votes<br>
> 5 Resources configured.<br>
> ============<br>
><br>
> Online: [ node2 node1 ]<br>
><br>
> Clone Set: pgclone<br>
> Started: [ node1 node2 ]<br>
> Clone Set: pingclone<br>
> Started: [ node2 node1 ]<br>
><br>
> ============================<br>
><br>
> Please help me out.<br>
> --<br></div></div></blockquote></div><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Regards,<br><br>Jayakrishnan. L<br><br>Visit: <a href="http://www.jayakrishnan.bravehost.com">www.jayakrishnan.bravehost.com</a><br><br>