[ClusterLabs] Tuchanka

Fri Oct 2 09:15:37 EDT 2020

On Fri, 2 Oct 2020 15:18:18 +0300
Олег Самойлов <splarv at ya.ru> wrote:

> > On 29 Sep 2020, at 11:34, Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
> > wrote:
> > 
> > 
> > Vagrant uses VirtualBox by default, which supports softdog, but it also
> > supports many other virtualization platforms, including e.g. libvirt/KVM,
> > where you can use a virtualized watchdog card.
> >   
> >>   
> > 
> > Vagrant can use Chef, Ansible, Salt, Puppet, and others to provision VMs:
> > 
> >  https://www.vagrantup.com/docs/provisioning
> > 
> > 
> > There are many Vagrant images available:
> > https://app.vagrantup.com/boxes/search
> > There are so many because building a Vagrant image is easy. I built some
> > when RH8 wasn't available yet. So if you need a special box, e.g. with some
> > predefined setup, you can build it quite fast.
> 
> My English is poor, so I'll try to find other words. My primary task was to
> create a prototype for an automatic deployment system. So I used only the
> same technique that will be used on the real hardware servers: Red Hat DVD
> image + kickstart. I also wanted to test that deployment itself. That's why
> I do not use any special image for the virtual machines.

How exactly is using a Vagrant box you built yourself different from using
VirtualBox, where you clone (I suppose) an existing VM you built?

> > Watchdog is kind of a self-fencing method. Cluster with quorum+watchdog, or
> > SBD+watchdog or quorum+SBD+watchdog are fine...without "active" fencing.  
> 
> quorum+watchdog or SBD+watchdog are useless. Quorum+SBD+watchdog is a
> solution, but it also has some drawbacks, so it is not perfect or fine yet.

Well, by "SBD", I meant "Storage Based Death": using shared storage to send a
poison pill to other nodes. Not just the sbd daemon, which is used for both SBD
and the watchdog. Sorry for the shortcut and the confusion.

> I'll write about it below.
>   
> >>> Now, with regard to your multi-site clusters and how you deal with them
> >>> using quorum, did you read the chapter about the Cluster Ticket Registry
> >>> in the Pacemaker doc? See:
> >>> 
> >>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/ch15.html    
> >> 
> >> Yep, I read the whole documentation two years ago. Yep, the ticket system
> >> looked interesting at first glance, but I didn't see how to use it with
> >> PAF. :)
> > 
> > It could be interesting to have detailed feedback about that. Could you
> > share your experience?  
> 
> Heh, I don't have any experience with the ticket system because I can't even
> imagine how to use it with PAF.

OK

> As for Pacemaker without STONITH, the idea was simple: quorum + SBD as the
> watchdog daemon.

(this is what I described as "quorum+watchdog", again sorry for the
confusion :))
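
To make the "quorum+watchdog" self-fencing idea concrete, here is a minimal sketch of the decision a watchdog feeder could take on each tick. It is only an illustration under stated assumptions, not how sbd is implemented: the function names (`quorum_threshold`, `should_feed_watchdog`) and the `daemons_healthy` flag are hypothetical, and quorum is taken to be a simple majority of the expected votes.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return expected_votes // 2 + 1


def should_feed_watchdog(votes_seen: int, expected_votes: int,
                         daemons_healthy: bool) -> bool:
    """Decide whether this node may keep resetting its watchdog timer.

    Hypothetical helper: if this returns False, the feeder simply stops
    feeding the watchdog, the timer expires, and the hardware resets the
    node. That is the "self-fencing" part.
    """
    if not daemons_healthy:
        # A frozen or dead cluster daemon must lead to a reset.
        return False
    return votes_seen >= quorum_threshold(expected_votes)
```

In a three-node cluster, a node that only sees its own vote gets `False` and is reset by the watchdog, while the two nodes on the other side of the partition keep quorum and keep running.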

> It is described more precisely in the README. My test system shows that
> this mostly works. :)
> 
> What are the possible caveats? First of all, softdog is not good for this
> (only for testing), and the system will depend heavily on the reliability of
> the watchdog device.

+1

> SBD is not good as a watchdog daemon. In my version it does not check
> that corosync and the Pacemaker processes are not frozen (for instance by
> kill -STOP). It looks like the check for corosync has already been
> done: https://github.com/ClusterLabs/sbd/pull/83

Good.
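
For readers wondering what "frozen by kill -STOP" looks like from the outside, here is a minimal Linux-only sketch that reads the process state from /proc. This is not how sbd performs its health checks; the helper names (`parse_state`, `is_frozen`) are hypothetical and it only illustrates the symptom being discussed.

```python
from pathlib import Path

# States in /proc/<pid>/stat that mean the process is not running:
# "T" = stopped by a signal such as SIGSTOP, "t" = stopped by a tracer.
STOPPED_STATES = {"T", "t"}


def parse_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so take everything after the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def is_frozen(pid: int) -> bool:
    """True if the process with this PID is currently stopped (Linux only)."""
    stat_line = Path(f"/proc/{pid}/stat").read_text()
    return parse_state(stat_line) in STOPPED_STATES
```

A process that has received SIGSTOP shows state "T" here while still appearing in the process table, which is why a check that only looks for the PID is not enough.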

> I don't know about checking all the Pacemaker processes.

This is moving in the right direction, I would say:

  https://lists.clusterlabs.org/pipermail/users/2020-August/027602.html

The main Pacemaker process is now checked by sbd. Maybe other processes will be
included in future releases as "more in-depth health checks", as written in
this email.

Regards,