[Pacemaker] stonithd segfault

Andrew Beekhof andrew at beekhof.net
Wed May 8 18:19:22 EDT 2013


On 08/05/2013, at 10:33 PM, Pavel <free.lan.c2.718r at gmail.com> wrote:

> Hello everyone
> 
> Can anyone please assist me with the following problem? In syslog I get the following messages:
> 
> kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]

We need the full stack trace. We can't use the core file ourselves, so you'll have to open it with gdb and type "where".
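
In the meantime, the kernel line itself can be decoded a little. Below is a rough Python sketch, not anything shipped with Pacemaker: decode_segfault() and gdb_backtrace_command() are names made up for illustration, and the binary and core paths are placeholders you would have to replace with your own. It assumes the usual x86 page-fault error bits (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

# The syslog line from the report, copied verbatim.
LINE = ("kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed "
        "sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]")

# All numeric fields in this kernel message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: split a kernel segfault line into its fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_address": int(m["addr"], 16),
        # Instruction pointer relative to where the object is mapped.
        "offset_in_object": int(m["ip"], 16) - int(m["base"], 16),
        # Assumed x86 page-fault error bits.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


def gdb_backtrace_command(binary, core):
    """Hypothetical helper: argv for a non-interactive gdb 'where'."""
    return ["gdb", "-batch", "-ex", "where", binary, core]


info = decode_segfault(LINE)
print("object:        ", info["object"])
print("fault address: ", hex(info["fault_address"]))
print("offset:        ", hex(info["offset_in_object"]))
print("access:        ", info["access"], "(user mode)" if info["user_mode"] else "")

# Placeholder paths; substitute the real stonithd binary and core file.
print(" ".join(gdb_backtrace_command("/path/to/stonithd", "/path/to/core")))
```

A fault address of 0 on a user-mode read suggests a NULL pointer dereference, but the offset alone doesn't tell us which function that is, which is why the symbolic trace from gdb is still needed.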

> pacemakerd[2025]: notice: pcmk_child_exit: Child process stonith-ng terminated with signal 11 (pid=2029, core=128)
> 
> Then pacemakerd tries to respawn stonith-ng, but it fails again, and this repeats indefinitely.
> 
> I have found a very similar problem in the mailing list archives, but it was already fixed and was related to Heartbeat only, while I'm using Corosync.
> 
> What I have noticed is that this is somehow related to the DRBD resource that I configure. With an empty configuration (no RAs) or with other RAs (IPaddr2, ...), stonithd runs without any problem.
> At the same time, despite the issue, the DRBD Master/Slave resource seems to work correctly.
> 
> Here is my configuration:
> 
>> node $id="1" fio-node1 \
>>    attributes standby="off"
>> node $id="2" fio-node2 \
>>    attributes standby="off"
>> rsc_template drbd-r ocf:linbit:drbd \
>>    op start interval="0" timeout="240" \
>>    op promote interval="0" timeout="90" \
>>    op demote interval="0" timeout="90" \
>>    op notify interval="0" timeout="90" \
>>    op stop interval="0" timeout="100" \
>>    op monitor interval="20" role="Slave" timeout="20" \
>>    op monitor interval="10" role="Master" timeout="20"
>> primitive drbd-r1 @drbd-r \
>>    params drbd_resource="r1"
>> ms ms-r1 drbd-r1 \
>>    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> property $id="cib-bootstrap-options" \
>>    dc-version="1.1.9-2a917dd" \
>>    cluster-infrastructure="corosync" \
>>    stonith-enabled="false" \
>>    last-lrm-refresh="1366018562"
> 
> and here is drbd.conf:
> 
>> include "drbd.d/global_common.conf";
>> include "drbd.d/*.res";
>> 
>> resource r1 {
>>    device /dev/drbd1;
>>    disk /dev/vg-bio/lv1;
>>    meta-disk internal;
>>    on fio-node1 {
>>    address 172.17.68.128:7789;
>>    }
>>    on fio-node2 {
>>    address 172.17.68.129:7789;
>>    }
>> }
> 
> You can download the full configuration (cib, corosync.conf, drbd.conf, drbd.d/global_common.conf) here: http://up.iteam.ua/download/152101/50aa518a439747e72/
> 
> I'm using Pacemaker 1.1.9 with Corosync 2.3.0 and crmsh 1.2.5, all built from source on Ubuntu Server 12.10 x64.
> The build options are:
> pacemaker: ./configure --with-corosync --with-cs-quorum --without-ais --without-heartbeat --without-cman --with-snmp
> corosync: ./configure --disable-rdma --disable-testagents --disable-dbus --enable-snmp --enable-qdevices
> crmsh: ./configure
> 
> Any help or guidance is highly appreciated. Thanks!




