[Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

Andrew Beekhof andrew at beekhof.net
Thu Feb 4 10:30:26 EST 2010


Have you seen:
   http://www.clusterlabs.org/doc/crm_fencing.html

That should answer most of your questions.
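For the archives, a minimal two-node fencing setup in crm shell might look something like the following. This is a sketch only: stonith:external/ipmi is just one common plugin choice, and the IPMI addresses and credentials below are placeholders that must match your actual management hardware.

```shell
# Fencing must be enabled cluster-wide for STONITH resources to act
# (it is the default, but be explicit rather than rely on it).
crm configure property stonith-enabled="true"

# One fencing device per node.  The location constraints ban each
# device from the node it fences, so a machine is never asked to
# shoot itself.  Hostnames here are from the original post; the
# ipaddr/userid/passwd values are invented placeholders.
crm configure primitive stonith-mq001 stonith:external/ipmi \
    params hostname="mq001.back.live.cwwtf.local" \
           ipaddr="192.168.100.1" userid="admin" passwd="secret" \
    op monitor interval="60s"
crm configure primitive stonith-mq002 stonith:external/ipmi \
    params hostname="mq002.back.live.cwwtf.local" \
           ipaddr="192.168.100.2" userid="admin" passwd="secret" \
    op monitor interval="60s"
crm configure location l-stonith-mq001 stonith-mq001 -inf: mq001.back.live.cwwtf.local
crm configure location l-stonith-mq002 stonith-mq002 -inf: mq002.back.live.cwwtf.local
```

Note that with no-quorum-policy="ignore" on a two-node cluster, fencing is the only thing standing between a communication failure and divergent data, so test the devices by actually crashing a node before trusting them in production.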

On Thu, Feb 4, 2010 at 11:43 AM, Tom Pride <tom.pride at gmail.com> wrote:
> Hi there,
>
> I have successfully configured a 2 node DRBD pacemaker cluster using the
> instructions provided by LINBIT here:
> http://www.drbd.org/users-guide-emb/ch-pacemaker.html.  The cluster works
> perfectly and I can migrate the resources back and forth between the two
> nodes without a problem.  However, when simulating certain cluster
> communication failures, I am having problems preventing the DRBD cluster
> from entering a split-brain state.  I have been led to believe that STONITH
> will help prevent split-brain situations, but the LINBIT instructions do not
> provide any guidance on how to configure STONITH in the Pacemaker cluster.
> The only thing I can find in LINBIT's documentation is the resource-fencing
> options within /etc/drbd.conf, of which I have configured:
>
>
> resource r0 {
>   disk {
>     fencing resource-only;
>   }
>   handlers {
>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>   }
> }
>
> I'm still at a loss to understand what actually triggers DRBD to run the
> above fencing scripts, or how to tell whether it has run them.
>
> I've searched the internet high and low for example pacemaker configs that
> show you how to configure STONITH resources for DRBD, but I can't find
> anything useful.
>
> Whilst hunting the Internet I did find this howto:
> http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1
> which spells out how to configure a DRBD Pacemaker cluster and even states
> the following: "STONITH is disabled in this [example] configuration though
> it is highly-recommended in any production environment to eliminate the risk
> of divergent data."  Infuriatingly, it doesn't tell you how to configure
> STONITH!
>
> Could someone please, please, please give me some pointers or some helpful
> examples on how to go about configuring STONITH, and/or modifying my
> Pacemaker configuration in any other way, to get it into a production-ready
> state?  My current configuration is listed below:
>
> The cluster is built on two Red Hat EL 5.3 servers running the following
> software versions:
> drbd-8.3.6-1
> pacemaker-1.0.5-4.1
> openais-0.80.5-15.1
>
>
> root@mq001:~# crm configure show
> node mq001.back.live.cwwtf.local
> node mq002.back.live.cwwtf.local
> primitive activemq-emp lsb:bbc-activemq-emp
> primitive activemq-forge-services lsb:bbc-activemq-forge-services
> primitive activemq-social lsb:activemq-social
> primitive drbd_activemq ocf:linbit:drbd \
>     params drbd_resource="r0" \
>     op monitor interval="15s"
> primitive fs_activemq ocf:heartbeat:Filesystem \
>     params device="/dev/drbd1" directory="/drbd" fstype="ext3"
> primitive ip_activemq ocf:heartbeat:IPaddr2 \
>     params ip="172.23.8.71" nic="eth0"
> group activemq fs_activemq ip_activemq activemq-forge-services activemq-emp activemq-social
> ms ms_drbd_activemq drbd_activemq \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
> colocation activemq_on_drbd inf: activemq ms_drbd_activemq:Master
> order activemq_after_drbd inf: ms_drbd_activemq:promote activemq:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     no-quorum-policy="ignore" \
>     last-lrm-refresh="1260809203"
>
> /etc/drbd.conf
>
> global {
>   usage-count no;
> }
> common {
>   protocol C;
> }
> resource r0 {
>   disk {
>     fencing resource-only;
>   }
>   handlers {
>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>   }
>   syncer {
>     rate 40M;
>   }
>   on mq001.back.live.cwwtf.local {
>     device    /dev/drbd1;
>     disk      /dev/cciss/c0d0p1;
>     address   172.23.8.69:7789;
>     meta-disk internal;
>   }
>   on mq002.back.live.cwwtf.local {
>     device    /dev/drbd1;
>     disk      /dev/cciss/c0d0p1;
>     address   172.23.8.70:7789;
>     meta-disk internal;
>   }
> }
>
>
> root@mq001:~# cat /etc/ais/openais.conf
> totem {
>   version: 2
>   token: 3000
>   token_retransmits_before_loss_const: 10
>   join: 60
>   consensus: 1500
>   vsftype: none
>   max_messages: 20
>   clear_node_high_bit: yes
>   secauth: on
>   threads: 0
>   rrp_mode: passive
>   interface {
>     ringnumber: 0
>     bindnetaddr: 172.59.60.0
>     mcastaddr: 239.94.1.1
>     mcastport: 5405
>   }
>   interface {
>     ringnumber: 1
>     bindnetaddr: 172.23.8.0
>     mcastaddr: 239.94.2.1
>     mcastport: 5405
>   }
> }
> logging {
>   to_stderr: yes
>   debug: on
>   timestamp: on
>   to_file: no
>   to_syslog: yes
>   syslog_facility: daemon
> }
> amf {
>   mode: disabled
> }
> service {
>   ver:       0
>   name:      pacemaker
>   use_mgmtd: yes
> }
> aisexec {
>   user:   root
>   group:  root
> }
>
> Many Thanks,
> Tom
>
>
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>
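On the question of what triggers the handlers: with `fencing resource-only`, DRBD invokes the fence-peer script when it loses the replication link while the resource is (or is about to become) Primary. crm-fence-peer.sh then adds a location constraint that keeps the Master role off the disconnected peer until resync completes, at which point the after-resync-target handler (crm-unfence-peer.sh) removes it again. One way to see whether this has happened is sketched below; the constraint id prefix is what current versions of the script use by default, so check your copy of the script if the grep finds nothing.

```shell
# Constraints planted by crm-fence-peer.sh have ids beginning with
# "drbd-fence-by-handler"; if one shows up here, the handler has fired:
crm configure show | grep drbd-fence-by-handler

# The handler also logs through syslog, so its invocations appear in:
grep -i fence-peer /var/log/messages
```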



