All Concerned;

I have been getting slapped around all day with this problem - I can't solve it.

The system is only half done - I have not yet implemented the NFS portion - but the DRBD part is not yet cooperating with corosync.
It appears to be working OK, but when I stop corosync on the DC, the other node does not start DRBD?

Here is how I am setting things up....

Configure quorum and stonith (from the example I am following at http://docs.homelinux.org/doku.php?id=create_high-available_drbd_device_with_pacemaker):

property no-quorum-policy="ignore"
property stonith-enabled="false"
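
(For the record, those two properties were entered through the crm shell, roughly like this - I am including it only in case my syntax is off:)

crm configure property no-quorum-policy="ignore"
crm configure property stonith-enabled="false"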

On wms1, configure the DRBD resource:

primitive drbd_drbd0 ocf:linbit:drbd \
    params drbd_resource="drbd0" \
    op monitor interval="30s"
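
(Side note: some DRBD-on-Pacemaker examples I have seen define separate monitor operations for the Master and Slave roles rather than a single one, along the lines of the variant below. I have not tried it yet, and the 29s/31s intervals are just placeholders:)

primitive drbd_drbd0 ocf:linbit:drbd \
    params drbd_resource="drbd0" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"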

Configure the DRBD Master/Slave set:

ms ms_drbd_drbd0 drbd_drbd0 \
    meta master-max="1" master-node-max="1" \
    clone-max="2" clone-node-max="1" \
    notify="true"
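
(Independent of Pacemaker, the DRBD-level state can be checked by hand on each node - drbd0 is my resource name:)

cat /proc/drbd          # connection state and Primary/Secondary roles
drbdadm role drbd0      # prints something like "Primary/Secondary"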

Configure the filesystem mountpoint:

primitive fs_ftpdata ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" \
    directory="/mnt/drbd0" fstype="ext3"
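
(My understanding - and maybe this is where I am going wrong - is that fs_ftpdata eventually has to be tied to the DRBD master with colocation and ordering constraints, roughly like the untested sketch below; the constraint names are just ones I made up:)

colocation fs_on_drbd_master inf: fs_ftpdata ms_drbd_drbd0:Master
order fs_after_drbd_promote inf: ms_drbd_drbd0:promote fs_ftpdata:start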

When I check the status on the DC....

[root@wms2 ~]# crm
crm(live)# status
============
Last updated: Wed May 30 23:58:43 2012
Last change: Wed May 30 23:52:42 2012 via cibadmin on wms1
Stack: openais
Current DC: wms2 - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ wms1 wms2 ]

 Master/Slave Set: ms_drbd_drbd0 [drbd_drbd0]
     Masters: [ wms2 ]
     Slaves: [ wms1 ]
 fs_ftpdata (ocf::heartbeat:Filesystem): Started wms2

[root@wms2 ~]# mount -l | grep drbd
/dev/drbd0 on /mnt/drbd0 type ext3 (rw)

So I stop corosync on wms2, but on the other node...

[root@wms1 ~]# crm
crm(live)# status
============
Last updated: Thu May 31 00:12:17 2012
Last change: Wed May 30 23:52:42 2012 via cibadmin on wms1
Stack: openais
Current DC: wms1 - partition WITHOUT quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
3 Resources configured.
============

Online: [ wms1 ]
OFFLINE: [ wms2 ]

 Master/Slave Set: ms_drbd_drbd0 [drbd_drbd0]
     Masters: [ wms1 ]
     Stopped: [ drbd_drbd0:1 ]

fs_ftpdata is not running and /dev/drbd0 never gets mounted on wms1?

Any ideas?

I tailed /var/log/cluster/corosync.log and get this....

May 31 00:02:36 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 22 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:03:06 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 25 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:03:10 wms1 crmd: [1268]: WARN: cib_rsc_callback: Resource update 15 failed: (rc=-41) Remote node did not respond
May 31 00:03:36 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 28 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:04:06 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 31 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:04:10 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 34 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:04:10 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 37 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:04:10 wms1 attrd: [1266]: WARN: attrd_cib_callback: Update 40 for master-drbd_drbd0:0=5 failed: Remote node did not respond
May 31 00:08:02 wms1 cib: [1257]: info: cib_stats: Processed 58 operations (0.00us average, 0% utilization) in the last 10min
May 31 00:08:02 wms1 cib: [1264]: info: cib_stats: Processed 117 operations (256.00us average, 0% utilization) in the last 10min

[root@wms2 ~]# tail /var/log/cluster/corosync.log
May 31 00:02:16 corosync [pcmk ] info: update_member: Node wms2 now has process list: 00000000000000000000000000000002 (2)
May 31 00:02:16 corosync [pcmk ] notice: pcmk_shutdown: Shutdown complete
May 31 00:02:16 corosync [SERV ] Service engine unloaded: Pacemaker Cluster Manager 1.1.6
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync extended virtual synchrony service
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync configuration service
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync cluster config database access v1.01
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync profile loading service
May 31 00:02:16 corosync [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
May 31 00:02:16 corosync [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:1858.
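
(If it would help, I can also post the output of a config check and a full one-shot status from wms1 - as I understand it, these should show any constraint problems and fail counts:)

crm_verify -L -V     # sanity-check the live CIB
crm_mon -1rf         # one-shot status, including inactive resources and fail counts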

The example that I am working from talks about doing the following....

group services fs_drbd0

But this fails miserably - "services" being undefined? (I also notice that fs_drbd0 does not match the name of my filesystem primitive, fs_ftpdata, so maybe that is part of the problem.)

--
Steven Silk
CSC
303 497 3112