[Pacemaker] R: Stonith external/sbd problem

Vit Pelcak vpelcak at suse.cz
Thu Apr 29 10:06:54 EDT 2010


On 29.4.2010 15:46, Nicola Sabatelli wrote:
>
> I have configured everything exactly as described in the SBD_Fencing documentation.
>
> That is:
>
> /etc/sysconfig/sbd
>
> SBD_DEVICE="/dev/mapper/mpath1p1"
>
> SBD_OPTS="-W"
>
> And I start the daemon in this manner:
>
> /usr/sbin/sbd -d /dev/mapper/mpath1p1 -D -W watch
>
> Is this correct?
>

IMHO no.

You must first initialize the sbd device:

sbd -d /dev/shared_disk create

and then allocate a slot on it for your machine:

sbd -d /dev/shared_disk allocate your_machine
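
To make the order explicit, here is a rough sketch of the whole sequence as I understand it. The helper name sbd_plan is hypothetical, and /dev/shared_disk and your_machine are placeholders, not values from this cluster. It only prints the commands instead of running them, because create rewrites the sbd header on the device. The list step and the final watch line (taken from the quoted command) are my assumption about a sensible check and start order.

```shell
#!/bin/sh
# Hypothetical helper: prints (does NOT run) the sbd commands for one node.
# $1 = shared block device, $2 = node name; both are placeholders.
sbd_plan() {
    device=$1
    node=$2
    # 1. Initialize the sbd header on the shared device (once, from one node).
    echo "sbd -d $device create"
    # 2. Reserve a slot on the device for this node.
    echo "sbd -d $device allocate $node"
    # 3. Show the allocated slots to confirm the node is listed.
    echo "sbd -d $device list"
    # 4. Only then start the watcher, as in the quoted command line.
    echo "sbd -d $device -D -W watch"
}

sbd_plan /dev/shared_disk your_machine
```

Review the printed commands, then run them by hand as root, substituting your own device (for example the multipath device from /etc/sysconfig/sbd) and node name.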


> Ciao, Nicola.
>
> ------------------------------------------------------------------------
>
> *From:* Vit Pelcak [mailto:vpelcak at suse.cz]
> *Sent:* Thursday, 29 April 2010 15:02
> *To:* pacemaker at oss.clusterlabs.org
> *Subject:* Re: [Pacemaker] Stonith external/sbd problem
>
>  
>
> cat /etc/sysconfig/sbd
>
> SBD_DEVICE="/dev/sda1"
> SBD_OPTS="-W"
>
>
> sbd -d /dev/shared_disk create
> sbd -d /dev/shared_disk allocate your_machine
>
>
> On 29.4.2010 14:55, Michael Brown wrote:
>
> Oh, I forgot a piece: I had similar trouble until I actually properly
> started sbd, and then it worked.
>
> M.
>
> ------------------------------------------------------------------------
>
> *From*: Michael Brown
> *To*: pacemaker at oss.clusterlabs.org
> <mailto:pacemaker at oss.clusterlabs.org>
> *Sent*: Thu Apr 29 08:53:32 2010
> *Subject*: Re: [Pacemaker] Stonith external/sbd problem
>
> I just set this up myself and it worked fine for me.
>
> Did you follow the guide? You need to configure the sbd daemon to run
> on bootup with appropriate options before external/sbd can use it.
>
> M.
>
> ------------------------------------------------------------------------
>
> *From*: Nicola Sabatelli
> *To*: pacemaker at oss.clusterlabs.org
> <mailto:pacemaker at oss.clusterlabs.org>
> *Sent*: Thu Apr 29 08:47:04 2010
> *Subject*: [Pacemaker] Stonith external/sbd problem
>
> I have a problem with the STONITH plugin external/sbd.
>
> I have configured the system according to the instructions I found at
> http://www.linux-ha.org/wiki/SBD_Fencing. The device I use is
> configured with multipath software because the disk
> resides on a storage system.
>
> I have created a resource on my cluster using the clone directive.
>
> But when I try to start the resource I get these errors:
>
>  
>
> from ha-log file:
>
>  
>
> Apr 29 14:37:51 clover-h stonithd: [16811]: info: external_run_cmd:
> Calling '/usr/lib64/stonith/plugins/external/sbd status' returned 256
>
> Apr 29 14:37:51 clover-h stonithd: [16811]: CRIT: external_status:
> 'sbd status' failed with rc 256
>
> Apr 29 14:37:51 clover-h stonithd: [10615]: WARN: start
> stonith_external_sbd_LOCK_LUN:0 failed, because its hostlist is empty
>
>  
>
> from crm_verify:
>
>  
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: main: =#=#=#=#= Getting
> XML =#=#=#=#=
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: main: Reading XML from:
> live cluster
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: unpack_config: On loss
> of CCM Quorum: Ignore
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: unpack_config: Node
> scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status:
> Node clover-a.rsr.rupar.puglia.it is online
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing
> failed op stonith_external_sbd_LOCK_LUN:1_start_0 on
> clover-a.rsr.rupar.puglia.it: unknown error (1)
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: find_clone: Internally
> renamed stonith_external_sbd_LOCK_LUN:0 on
> clover-a.rsr.rupar.puglia.it to stonith_external_sbd_LOCK_LUN:2 (ORPHAN)
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: determine_online_status:
> Node clover-h.rsr.rupar.puglia.it is online
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: unpack_rsc_op: Processing
> failed op stonith_external_sbd_LOCK_LUN:0_start_0 on
> clover-h.rsr.rupar.puglia.it: unknown error (1)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print: 
> Master/Slave Set: ms_drbd_1
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: short_print:     
> Stopped: [ res_drbd_1:0 res_drbd_1:1 ]
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
> res_Filesystem_TEST        (ocf::heartbeat:Filesystem):    Stopped
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:
> res_IPaddr2_ip_clover      (ocf::heartbeat:IPaddr2):       Stopped
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: clone_print:  Clone
> Set: cl_external_sbd_1
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:     
> stonith_external_sbd_LOCK_LUN:0       (stonith:external/sbd): Started
> clover-h.rsr.rupar.puglia.it FAILED
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: native_print:     
> stonith_external_sbd_LOCK_LUN:1       (stonith:external/sbd): Started
> clover-a.rsr.rupar.puglia.it FAILED
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount:
> cl_external_sbd_1 has failed 1000000 times on clover-h.rsr.rupar.puglia.it
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness:
> Forcing cl_external_sbd_1 away from clover-h.rsr.rupar.puglia.it after
> 1000000 failures (max=1000000)
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: get_failcount:
> cl_external_sbd_1 has failed 1000000 times on clover-a.rsr.rupar.puglia.it
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: common_apply_stickiness:
> Forcing cl_external_sbd_1 away from clover-a.rsr.rupar.puglia.it after
> 1000000 failures (max=1000000)
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
> ms_drbd_1: Rolling back scores from res_Filesystem_TEST
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> res_drbd_1:0 cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> res_drbd_1:1 cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
> ms_drbd_1: Rolling back scores from res_Filesystem_TEST
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1:
> Promoted 0 instances of a possible 1 to master
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: master_color: ms_drbd_1:
> Promoted 0 instances of a possible 1 to master
>
> crm_verify[18607]: 2010/04/29_14:39:27 info: native_merge_weights:
> res_Filesystem_TEST: Rolling back scores from res_IPaddr2_ip_clover
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> res_Filesystem_TEST cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> res_IPaddr2_ip_clover cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> stonith_external_sbd_LOCK_LUN:0 cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 WARN: native_color: Resource
> stonith_external_sbd_LOCK_LUN:1 cannot run anywhere
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave
> resource res_drbd_1:0  (Stopped)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave
> resource res_drbd_1:1  (Stopped)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave
> resource res_Filesystem_TEST   (Stopped)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Leave
> resource res_IPaddr2_ip_clover (Stopped)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop
> resource stonith_external_sbd_LOCK_LUN:0       
> (clover-h.rsr.rupar.puglia.it)
>
> crm_verify[18607]: 2010/04/29_14:39:27 notice: LogActions: Stop
> resource stonith_external_sbd_LOCK_LUN:1       
> (clover-a.rsr.rupar.puglia.it)
>
> Warnings found during check: config may not be valid
>
>  
>
> and from crm_mon:
>
>  
>
> ============
>
> Last updated: Thu Apr 29 14:39:57 2010
>
> Stack: Heartbeat
>
> Current DC: clover-h.rsr.rupar.puglia.it
> (e39bb201-2a6f-457a-a308-be6bfe71309c) - partition with quorum
>
> Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
>
> 2 Nodes configured, unknown expected votes
>
> 4 Resources configured.
>
> ============
>
>  
>
> Online: [ clover-h.rsr.rupar.puglia.it clover-a.rsr.rupar.puglia.it ]
>
>  
>
>  Clone Set: cl_external_sbd_1
>
>      stonith_external_sbd_LOCK_LUN:0    (stonith:external/sbd):
> Started clover-h.rsr.rupar.puglia.it FAILED
>
>      stonith_external_sbd_LOCK_LUN:1    (stonith:external/sbd):
> Started clover-a.rsr.rupar.puglia.it FAILED
>
>  
>
> Operations:
>
> * Node clover-a.rsr.rupar.puglia.it:
>
>    stonith_external_sbd_LOCK_LUN:1: migration-threshold=1000000
> fail-count=1000000
>
>     + (24) start: rc=1 (unknown error)
>
> * Node clover-h.rsr.rupar.puglia.it:
>
>    stonith_external_sbd_LOCK_LUN:0: migration-threshold=1000000
> fail-count=1000000
>
>     + (25) start: rc=1 (unknown error)
>
>  
>
> Failed actions:
>
>     stonith_external_sbd_LOCK_LUN:1_start_0
> (node=clover-a.rsr.rupar.puglia.it, call=24, rc=1, status=complete):
> unknown error
>
>     stonith_external_sbd_LOCK_LUN:0_start_0
> (node=clover-h.rsr.rupar.puglia.it, call=25, rc=1, status=complete):
> unknown error
>
>  
>
>  
>
>  
>
>  
>
> Ciao, Nicola.
>
>  
>
>  
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org <mailto:Pacemaker at oss.clusterlabs.org>
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>  
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>
>  
>
>
