[Pacemaker] Master Slave resource

Thu Dec 9 12:22:29 EST 2010

On Thu, Dec 9, 2010 at 3:06 PM, ruslan usifov <ruslan.usifov at gmail.com> wrote:
> Hi! Thanks for your reply!
>
> I made some mistakes in the configuration, and now I get a segfault again on
> the latest sources (pacemaker-1-0_b0266dd5ffa9):
>
> Dec  9 16:51:29 storage0 kernel: [  407.923417] pengine[891]: segfault at 8
> ip b77289b8 sp bfe38120 error 4 in libpengine.so.3.0.0[b771d000+33000]
> Dec  9 16:52:32 storage0 kernel: [  470.943739] pengine[958]: segfault at 8
> ip b78479b8 sp bff05cd0 error 4 in libpengine.so.3.0.0[b783c000+33000]
> Dec  9 16:53:35 storage0 kernel: [  534.044403] pengine[962]: segfault at 8
> ip b77fc9b8 sp bfd1fd50 error 4 in libpengine.so.3.0.0[b77f1000+33000]

That's not good at all.
Version? Stack trace?

> I did the following:
>       order o1 inf: ms_drbd_web iscsi
>
> After that, pacemaker segfaults on one node (in my case storage1).
> As I understand it, pacemaker tries to commit the bad changes and crashes.
> How can I discard these changes?
>
>
>
> 2010/12/9 Andrew Beekhof <andrew at beekhof.net>
>>
>> On Wed, Dec 8, 2010 at 12:26 PM, ruslan usifov <ruslan.usifov at gmail.com>
>> wrote:
>> > hello
>> >
>> > I have a 2-node cluster with the following configuration:
>> > node storage0
>> > node storage1
>> > primitive drbd_web ocf:linbit:drbd \
>> >         params drbd_resource="web" \
>> >         op monitor interval="30s" timeout="60s"
>> > primitive iscsi_ip ocf:heartbeat:IPaddr2 \
>> >         params ip="192.168.17.19" nic="eth1:1" cidr_netmask="24" \
>> >         op monitor interval="10s" \
>> >         meta target-role="Started"
>> > primitive iscsi_web_target ocf:heartbeat:iSCSITarget \
>> >         params iqn="iqn.2010-06.playrix.local:san.web" implementation="iet" \
>> >         op monitor interval="10s" timeout="30s" depth="0" \
>> >         meta target-role="Started"
>> > primitive iscsi_web_target_lun1 ocf:heartbeat:iSCSILogicalUnit \
>> >         params lun="1" path="/dev/drbd1" target_iqn="iqn.2010-06.playrix.local:san.web" implementation="iet" \
>> >         op monitor interval="10s" timeout="30s"
>> > group iscsi iscsi_ip iscsi_web_target iscsi_web_target_lun1
>> > ms ms_drbd_web drbd_web \
>> >         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
>> > colocation iscsi_on_drbd inf: ms_drbd_web:Master iscsi
>> > order iscsi_target_after_drbd inf: ms_drbd_web:promote iscsi_web_target
>> > order iscsi_target_lun_after_iscsi_target inf: iscsi_web_target iscsi_web_target_lun1
>> > property $id="cib-bootstrap-options" \
>> >         dc-version="1.0.10-b0266dd5ffa9c51377c68b1f29d6bc84367f51dd" \
>> >         cluster-infrastructure="openais" \
>> >         expected-quorum-votes="2" \
>> >         stonith-enabled="false" \
>> >         no-quorum-policy="ignore"
>> > rsc_defaults $id="rsc-options" \
>> >         resource-stickiness="100"
>> >
>> >
>> > After some trouble with pacemaker (a segfault in the older version in
>> > Ubuntu), I cannot get ms_drbd_web to work. It always shows only Slave
>> > status:
>>
>> This says only promote drbd where iscsi group is running:
>>    colocation iscsi_on_drbd inf: ms_drbd_web:Master iscsi
>>
>> And since the iscsi group is only partially active, drbd won't be made a master.
>> Perhaps you want:
>>    colocation iscsi_on_drbd inf: iscsi ms_drbd_web:Master
>>
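As a minimal sketch of that change, assuming the resource names from the configuration quoted above: the dependent resource goes first and the resource it depends on goes second. The lines starting with # are annotations for the reader and can be dropped if your crm shell does not accept them.

```
# iscsi may only run on the node where ms_drbd_web is Master;
# promoting drbd no longer depends on where the iscsi group runs
colocation iscsi_on_drbd inf: iscsi ms_drbd_web:Master
```

With the arguments in the original order the dependency runs the other way: the Master role has to follow the iscsi group.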
>> > ============
>> > Last updated: Tue Dec  7 14:00:13 2010
>> > Stack: openais
>> > Current DC: storage0 - partition with quorum
>> > Version: 1.0.10-b0266dd5ffa9c51377c68b1f29d6bc84367f51dd
>> > 2 Nodes configured, 2 expected votes
>> > 2 Resources configured.
>> > ============
>> >
>> > Online: [ storage1 storage0 ]
>> >
>> >  Master/Slave Set: ms_drbd_web
>> >      Slaves: [ storage0 storage1 ]
>> >  Resource Group: iscsi
>> >      iscsi_ip   (ocf::heartbeat:IPaddr2):       Started storage1
>> >      iscsi_web_target   (ocf::heartbeat:iSCSITarget):   Started storage1
>> >      iscsi_web_target_lun1      (ocf::heartbeat:iSCSILogicalUnit):      Stopped
>> >
>> > Failed actions:
>> >     iscsi_web_target_lun1_monitor_0 (node=storage0, call=5, rc=5, status=complete): not installed
>> >     iscsi_web_target_monitor_0 (node=storage0, call=4, rc=5, status=complete): not installed
>> >     iscsi_web_target_lun1_start_0 (node=storage1, call=13, rc=1, status=complete): unknown error
>> >
>> > But there is no split-brain situation (nothing about it in the logs).
>> > If I select the master myself,
>> >
>> > drbdadm -- --overwrite-data-of-peer primary all
>> >
>> > pacemaker still switches back to [Slave, Slave].
>> >
>> > I found the following error in my corosync.log:
>> >
>> > ERROR: clone_rsc_order_rh_non_clone: Unknown action: iscsi_web_target_demote_0
>> >
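That error probably comes from the order constraint iscsi_target_after_drbd. When no action is given for the second resource, it inherits the action of the first, so the constraint asks for iscsi_web_target to be promoted (and demoted in the reverse direction), which a plain primitive cannot do. A minimal sketch with an explicit action, assuming the intent is to start the target only after drbd has been promoted:

```
order iscsi_target_after_drbd inf: ms_drbd_web:promote iscsi_web_target:start
```

The resource names are the ones from the configuration quoted above; only the trailing :start is new.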
>> >
>> > What am I doing wrong, and how can I get drbd working again?
>> >
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>> >
>> >