[Pacemaker] Cluster split brain on vmware VSphere
    Torresani, Roberto 
    roberto.torresani at unitn.it
       
    Mon Jun 21 07:47:48 UTC 2010
    
    
  
Sure.
Give me some time (I'm very busy at the moment).
In the next few days I will share the scripts (awfully written... ;-)) with you.
Regards,
Roberto 
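
For reference, the fix reported in the quoted thread below was to raise the
totem token timeout from 1000 to 10000 ms and let corosync derive consensus
as 1.2 * token. A minimal sketch of the resulting totem section follows. It
assumes the interface settings stay as in the configuration quoted further
down and that every option not shown is left at its default; the thread does
not spell out exactly which other options were kept.

```
totem {
        version: 2
        secauth: off
        threads: 0
        # Raised from 1000 ms; this is the change reported to cure the split brain.
        token: 10000
        # consensus is intentionally not set, so corosync calculates it
        # as 1.2 * token = 12000 ms. If it is set explicitly, it has to be
        # at least that large:
        # consensus: 12000
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.206.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
```

Keeping the old explicit consensus: 4800 together with token: 10000 is the
combination that, per Dejan's remark in the thread, corosync would refuse to
start with.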
> -----Original Message-----
> From: Koch, Sebastian [mailto:Sebastian.Koch at netzwerk.de] 
> Sent: Friday, June 11, 2010 4:30 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> 
> Hi,
> 
> I read your entry, in which you wrote that you used some adapted
> Xen STONITH scripts to get it up and running under VMware.
> Could you share your experiences and how you solved that issue?
> 
> Thanks in advance.
> 
> Sebastian Koch
>                                                          
> -----Original Message-----
> From: Torresani, Roberto [mailto:roberto.torresani at unitn.it] 
> Sent: Friday, June 11, 2010 3:22 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> 
>  
> 
> > -----Original Message-----
> > From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm] 
> > Sent: Wednesday, June 09, 2010 2:23 PM
> > To: The Pacemaker cluster resource manager
> > Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> > 
> > Hi,
> > 
> > On Wed, Jun 09, 2010 at 12:11:09PM +0200, Torresani, Roberto wrote:
> > > Well... it seems to be SOLVED!!!
> > > Thank you Dejan.
> > > In the next few days I will load the cluster and then see how it behaves.
> > > 
> > > I simply raised the token value to 10000 msec and left all the others at the defaults.
> > 
> > You should also raise the consensus value to 12000. corosync
> > would even refuse to start in this case.
> 
> 
> Yes, I simply let corosync determine the value as 1.2 * token = 12000.
> 
> Thank you again
> Best regards
> Roberto
> 
> 
> 
> > 
> > Thanks,
> > 
> > Dejan
> > 
> > > 
> > > Thank you again.
> > > Regards,
> > > Roberto
> > > 
> > >  
> > > 
> > > > -----Original Message-----
> > > > From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm] 
> > > > Sent: Tuesday, June 08, 2010 6:42 PM
> > > > To: The Pacemaker cluster resource manager
> > > > Subject: Re: [Pacemaker] Cluster split brain on vmware VSphere
> > > > 
> > > > Hi,
> > > > 
> > > > On Mon, Jun 07, 2010 at 02:57:57PM +0200, Torresani, Roberto wrote:
> > > > > Sorry for having chosen the wrong ML...
> > > > 
> > > > That's no problem. There's just a better chance of getting help on
> > > > the other list.
> > > > 
> > > > > Here is the corosync.conf used by one cluster, the other one is
> > > > > just the same provided by the epel repository packages.
> > > > > 
> > > > > I will try to raise the token value to 10000 as you suggest. Is
> > > > > there a theoretical basis or a best practice for setting this value?
> > > > 
> > > > No, but 5000 should be OK for most. Ultimately, it depends on
> > > > your network. I forgot what exactly the case was here, but it
> > > > seems like you had some heavy processing (backup?) which used
> > > > most of the resources. That may be really hard to predict. You can
> > > > use sar or similar to monitor the load.
> > > > 
> > > > Thanks,
> > > > 
> > > > Dejan
> > > > 
> > > > > I will keep you informed as it goes, and open a thread on the
> > > > > corosync ml if necessary.
> > > > > 
> > > > > Thank you.
> > > > > 
> > > > > 
> > > > > # Please read the corosync.conf.5 manual page
> > > > > compatibility: whitetank
> > > > > 
> > > > > totem {
> > > > >         version: 2
> > > > >         secauth: off
> > > > >         threads: 0
> > > > >         token:          1000
> > > > >         hold: 180
> > > > >         token_retransmits_before_loss_const: 20
> > > > >         join:           60
> > > > >         consensus:      4800
> > > > >         vsftype:        none
> > > > >         max_messages:   20
> > > > >         interface {
> > > > >                 ringnumber: 0
> > > > >                 bindnetaddr: 192.168.206.0
> > > > >                 mcastaddr: 226.94.1.1
> > > > >                 mcastport: 5405
> > > > >         }
> > > > > }
> > > > > 
> > > > > logging {
> > > > >         fileline: off
> > > > >         to_stderr: yes
> > > > >         to_logfile: yes
> > > > >         to_syslog: yes
> > > > >         logfile: /tmp/corosync.log
> > > > >         debug: off
> > > > >         timestamp: on
> > > > >         logger_subsys {
> > > > >                 subsys: AMF
> > > > >                 debug: off
> > > > >         }
> > > > > }
> > > > > 
> > > > > amf {
> > > > >         mode: disabled
> > > > > }
> > > > > 
> > > > > aisexec {
> > > > >     user:  root
> > > > >     group: root
> > > > > }
> > > > > 
> > > > > service {
> > > > >     name: pacemaker
> > > > >     ver: 0
> > > > > }
> > > > > _______________________________________________
> > > > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > > > 
> > > > > Project Home: http://www.clusterlabs.org
> > > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
    
    