<div dir="ltr">Hi,<div><br></div><div>I used the blackbox feature, which showed the reason for the failure.</div><div>Because I am cross-compiling Pacemaker on a build machine and later moving the binaries to the target, a few binaries were missing. After fixing that and a bunch of other errors and warnings, I am able to get Pacemaker started, though it is not yet running completely fine.</div><div><br></div><div>The node is not getting added:</div><div>airv_cu        cib:    error: xml_log:<span class="" style="white-space:pre"> </span>Element node failed to validate attributes<br></div><div><br></div><div>I suppose it is because of this error:</div><div>crmd:    error: node_list_update_callback:<span class="" style="white-space:pre">     </span>Node update 4 failed: Update does not conform to the configured schema (-203)<br></div><div><br></div><div>I suspect this is caused by validate-with="pacemaker-0.7" in the CIB. In another installation this is set to "pacemaker-2.0".</div><div><br></div><div><div><div>[root@airv_cu pacemaker]# pcs cluster cib</div><div>&lt;cib crm_feature_set="3.0.10" validate-with="pacemaker-0.7" epoch="3" num_updates="0" admin_epoch="0" cib-last-written="Fri May  6 09:28:10 2016" have-quorum="1"&gt;</div><div>  &lt;configuration&gt;</div><div>    &lt;crm_config&gt;</div><div>      &lt;cluster_property_set id="cib-bootstrap-options"&gt;</div><div>        &lt;nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="true"/&gt;</div><div>        &lt;nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.14-5a6cdd1"/&gt;</div><div>        &lt;nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/&gt;</div><div>      &lt;/cluster_property_set&gt;</div><div>    &lt;/crm_config&gt;</div><div>    &lt;nodes/&gt;</div><div>    &lt;resources/&gt;</div><div>    &lt;constraints/&gt;</div><div>  &lt;/configuration&gt;</div><div>  &lt;status/&gt;</div><div>&lt;/cib&gt;</div></div></div><div><br></div><div>Any idea why or where this is being set to 0.7? 
I am using the latest Pacemaker from GitHub.</div><div><br></div><div><div>[root@airv_cu pacemaker]# pacemakerd --version</div><div>Pacemaker 1.1.14</div><div>Written by Andrew Beekhof</div></div><div><br></div><div>I am attaching the corosync.log and corosync.conf files.</div><div><br></div><div>-Thanks</div><div>Nikhil</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 5, 2016 at 10:21 PM, Ken Gaillot <span dir="ltr">&lt;<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 05/05/2016 11:25 AM, Nikhil Utane wrote:<br>

> Thanks Ken for your quick response as always.<br>

><br>

> But what if I don't want to use quorum? I just want to bring up<br>

> pacemaker + corosync on 1 node to check that it all comes up fine.<br>

> I added corosync_votequorum as you suggested. Additionally I also added<br>

> these 2 lines:<br>

><br>

> expected_votes: 2<br>

> two_node: 1<br>

<br>

</span>There's actually nothing wrong with configuring a single-node cluster.<br>

You can list just one node in corosync.conf and leave off the above.<br>

<span class=""><br>

> However still pacemaker is not able to run.<br>

<br>

</span>There must be other issues involved. Even if pacemaker doesn't have<br>

quorum, it will still run, it just won't start resources.<br>

<span class=""><br>

> [root@airv_cu root]# pcs cluster start<br>

> Starting Cluster...<br>

> Starting Pacemaker Cluster Manager[FAILED]<br>

><br>

> Error: unable to start pacemaker<br>

><br>

> Corosync.log:<br>

</span>> *May 05 16:15:20 [16294] airv_cu pacemakerd:     info:<br>

> pcmk_quorum_notification: Membership 240: quorum still lost (1)*<br>

<span class="im HOEnZb">> May 05 16:15:20 [16259] airv_cu corosync debug   [QB    ] Free'ing<br>

> ringbuffer: /dev/shm/qb-cmap-request-16259-16294-21-header<br>

> May 05 16:15:20 [16294] airv_cu pacemakerd:   notice:<br>

> crm_update_peer_state_iter:       pcmk_quorum_notification: Node<br>

> airv_cu[181344357] - state is now member (was (null))<br>

> May 05 16:15:20 [16294] airv_cu pacemakerd:     info:<br>

> pcmk_cpg_membership:      Node 181344357 joined group pacemakerd<br>

> (counter=0.0)<br>

> May 05 16:15:20 [16294] airv_cu pacemakerd:     info:<br>

> pcmk_cpg_membership:      Node 181344357 still member of group<br>

> pacemakerd (peer=airv_cu, counter=0.0)<br>

> May 05 16:15:20 [16294] airv_cu pacemakerd:  warning: pcmk_child_exit:<br>

>  The cib process (16353) can no longer be respawned, shutting the<br>

> cluster down.<br>

> May 05 16:15:20 [16294] airv_cu pacemakerd:   notice:<br>

> pcmk_shutdown_worker:     Shutting down Pacemaker<br>

><br>

> The log and conf file is attached.<br>

><br>

> -Regards<br>

> Nikhil<br>

><br>

> On Thu, May 5, 2016 at 8:04 PM, Ken Gaillot &lt;<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a><br>

</span><div class="HOEnZb"><div class="h5">> &lt;mailto:<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a>&gt;&gt; wrote:<br>

><br>

>     On 05/05/2016 08:36 AM, Nikhil Utane wrote:<br>

>     > Hi,<br>

>     ><br>

>     > Continuing with my adventure to run Pacemaker & Corosync on our<br>

>     > big-endian system, I managed to get past the corosync issue for now. But<br>

>     > facing an issue in running Pacemaker.<br>

>     ><br>

>     > Seeing following messages in corosync.log.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The cib process (20000) can no<br>

>     > longer be respawned, shutting the cluster down.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The stonith-ng process (20001)<br>

>     > can no longer be respawned, shutting the cluster down.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The lrmd process (20002) can no<br>

>     > longer be respawned, shutting the cluster down.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The attrd process (20003) can<br>

>     > no longer be respawned, shutting the cluster down.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The pengine process (20004) can<br>

>     > no longer be respawned, shutting the cluster down.<br>

>     >  pacemakerd:  warning: pcmk_child_exit:  The crmd process (20005) can no<br>

>     > longer be respawned, shutting the cluster down.<br>

>     ><br>

>     > I see following error before these messages. Not sure if this is the cause.<br>

>     > May 05 11:26:24 [19998] airv_cu pacemakerd:    error:<br>

>     > cluster_connect_quorum:   Corosync quorum is not configured<br>

>     ><br>

>     > I tried removing the quorum block (which is anyways blank) from the conf<br>

>     > file but still had the same error.<br>

><br>

>     Yes, that is the issue. Pacemaker can't do anything if it can't ask<br>

>     corosync about quorum. I don't know what the issue is at the corosync<br>

>     level, but your corosync.conf should have:<br>

><br>

>     quorum {<br>

>         provider: corosync_votequorum<br>

>     }<br>

><br>

><br>

>     > Attaching the log and conf files. Please let me know if there is any<br>

>     > obvious mistake or how to investigate it further.<br>

>     ><br>

>     > I am using pcs cluster start command to start the cluster<br>

>     ><br>

>     > -Thanks<br>

>     > Nikhil<br>

</div></div></blockquote></div><br></div>