[Pacemaker] Question about Dual Primary DRBD + OCFS2

rvm at free.fr
Wed Mar 24 07:33:25 EDT 2010


Hi, and thanks for your answer.
Here is my hb_report, covering two tests:
* node standby / node online = DRBD reported as Inconsistent (resolved by drbdadm verify or by rebooting the node)
* ifdown / kill of dlm/corosync = instant reboot => I can't find any trace of this problem except my screen dump.
I hope you'll see something interesting in my logs.

Thank you again for your help,
Regards


----- Original Message -----
From: "Andrew Beekhof" <andrew at beekhof.net>
To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
Sent: Tuesday, 23 March 2010, 20:12:58 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Pacemaker] Question about Dual Primary DRBD + OCFS2

We'd need a stack trace, that screen dump doesn't help much I'm afraid.
Try using hb_report to grab the logs etc.  It also includes backtraces
from any cores it finds.

On Tue, Mar 23, 2010 at 6:55 PM,  <rvm at free.fr> wrote:
> Hi,
>
> Some tests today...
> If I switch off my network interface (ifdown eth0) or if I kill (-9) corosync, I get a segfault of dlm_controld and the node reboots.
> Is this normal? Are my tests too harsh?
>
> Thanks a lot ;-)
>
> Regards
>
> ----- Original Message -----
> From: rvm at free.fr
> To: pacemaker at oss.clusterlabs.org
> Sent: Monday, 22 March 2010, 18:03:49 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> Subject: [Pacemaker] Question about Dual Primary DRBD + OCFS2
>
> Hi all,
>
> Following this doc http://www.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2, I've just installed 2 nodes (with some minor adjustments) and now I'm testing my setup.
> If I put one node in standby and bring it online again, the other node sees this node as "Inconsistent". The node that has just come back from standby considers itself UpToDate.
> I don't have this problem when I reboot a node (reboot).
> I think the problem is this (from my log):
> ERROR: r0: Called drbdadm -c /etc/drbd.conf secondary r0
> State change failed: (-12) Device is held open by someone
>
> I have no STONITH system :-( Is that a problem for my tests?
>
> Thanks to all, and sorry for my English.
> Regards.
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: report.tar.bz2
Type: application/x-bzip
Size: 125591 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20100324/78c4f918/attachment-0001.bin>

