[ClusterLabs] Tuchanka

Олег Самойлов splarv at ya.ru
Fri Sep 25 10:20:28 EDT 2020


Sorry for the late reply. I was on leave, and after that I had some problems at work.

> On 3 Sep 2020, at 17:23, Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote:
> 
> Hi,
> 
> Thanks for sharing.
> 
> I had a very quick glance at your project. I wonder if you were aware of some
> existing projects/scripts that would have save you a lot of time. Or maybe you
> know them but they did not fit your needs? Here are some pointers:
> 
> # PAF vagrant files
> 
>  PAF repository have 3 different Vagrant files able to build 3 different kind
>  of clusters using libvirt.
>  I'm sure you can use Vagrant with Virtualbox for your needs.
> 
>  for a demo:
>  https://blog.ioguix.net/postgresql/2019/01/24/Build-a-PostreSQL-Automated-Failover-in-5-minutes.html

Vagrant was my second attempt, after Docker. I did not use it because I did not know that it can be used with libvirt and real virtual machines. I need a real VM because my schemes need a watchdog device, at least softdog.

Also, one of my tasks was to create a prototype of an automatic installation system that could later be converted to Ansible, Salt, Puppet or Chef (the sysadmins did not know which to prefer). So the prototype of the automatic installation system was written in pure bash. Installation is performed from the standard CentOS installation DVD image (or possibly another Red Hat compatible one) using Red Hat's so-called "kickstart" (implemented by VirtualBox). But Vagrant, as far as I understand, needs a special preinstalled Linux image, so it cannot be used for prototyping an automatic installation system for real servers.

As for the automatic test system, yes, I think it could be rewritten to work with libvirt instead of VirtualBox; I see no reason why not. The "PAF vagrant files" do not include an automatic test system, only the possibility of manual testing. An automatic test system is important for finding low-probability instabilities, for checking new software versions, and for experimenting with setup parameters. So I can say that I already passed this step 2 years ago.
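
To illustrate why automation matters for rare failures, here is a toy sketch (not code from my project, which is written in bash). It runs randomly chosen scenarios many times and counts how often each one fails. The function `run_campaign` and the two stand-in scenarios are hypothetical names invented for this example; a real scenario would break the cluster and check that it recovers.

```python
import random


def run_campaign(tests, iterations, seed=0):
    """Run randomly chosen tests many times and count failures per test."""
    rng = random.Random(seed)      # fixed seed makes a campaign repeatable
    names = sorted(tests)          # stable order so the seed is meaningful
    failures = {name: 0 for name in names}
    for _ in range(iterations):
        name = rng.choice(names)
        if not tests[name]():      # a test returns True when the cluster recovered
            failures[name] += 1
    return failures


# Hypothetical stand-ins for real failure scenarios.
def stable_scenario():
    return True


_flaky_rng = random.Random(1)


def flaky_scenario():
    # Simulates an instability that shows up in roughly 1% of runs.
    return _flaky_rng.random() >= 0.01


if __name__ == "__main__":
    scenarios = {"stable": stable_scenario, "flaky": flaky_scenario}
    print(run_campaign(scenarios, 10_000))
```

A failure that appears once in a hundred runs is easy to miss by hand, but a loop like this makes it show up in the counters.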
 
> # CTS
> 
>  Cluster Test Suite is provided with pacemaker to run some pre-defined failure
>  scenario against any kind of pacemaker-based cluster. I use it for basic
>  tests with PAF and wrote some doc about how to run it from one of the Vagrant
>  environment provided in PAF repo.
> 
>  See:
>  https://github.com/ClusterLabs/PAF/blob/master/extra/vagrant/README.md#cluster-test-suite

Interesting, this looks like an attempt to achieve the same goal by a different method. The differences are:

CTS uses Vagrant, while I imitate a kickstart automatic installation on real servers. CTS is written in Python; I use bash. They concentrate on testing pacemaker functionality, for instance starting and stopping nodes in different orders, while I concentrate on tests that imitate hardware failures (for instance unlink) or other catastrophic failures (out of space, etc.). They wrote a universal pacemaker test, but my tests are more specific to the pacemaker+PAF+PostgreSQL cluster. They use STONITH-based clusters, while I use quorum-based clusters to survive a blackout of a whole datacenter. Using clusters without STONITH is forbidden in the Red Hat documentation and is not recommended in the ClusterLabs documentation; that is why I created this test bed, to test such non-STONITH clusters. And I have a pretty tmux UI, well suited for a presentation. :)
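
The quorum idea can be shown with a toy sketch (an illustration only, not how pacemaker or corosync computes quorum internally). The layout of one vote in each of three datacenters and the names `has_quorum`, `sites` and `surviving_votes` are assumptions made up for this example.

```python
def has_quorum(votes_alive: int, votes_total: int) -> bool:
    """Strict majority rule: more than half of all votes must be present."""
    return 2 * votes_alive > votes_total


# Hypothetical layout: one vote in each of three datacenters.
sites = {"dc1": 1, "dc2": 1, "dc3": 1}


def surviving_votes(failed_site: str) -> int:
    """Votes that remain when one whole datacenter goes dark."""
    return sum(votes for name, votes in sites.items() if name != failed_site)


if __name__ == "__main__":
    total = sum(sites.values())
    for failed in sites:
        alive = surviving_votes(failed)
        print(f"{failed} down: {alive}/{total} votes, quorum={has_quorum(alive, total)}")
```

With only two sites of equal weight the survivor would hold exactly half of the votes and fail the strict-majority check, which is why a layout like this needs a third vote somewhere.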

I am glad to see that there is a parallel project, CTS. This is not bad, IMHO; they complement each other in some ways.

> # ra-tester:
> 
>  Damien Ciabrini from RH give a talk about ra-tester, which seems to extends
>  CTS with customs test, but I hadn't time to give it a look yet. Slides are
>  available here:
>  https://wiki.clusterlabs.org/wiki/File:CL2020-slides-Ciabrini-ra-tester.pdf

As far as I can see, this is a Python framework that is meant to improve CTS in some way.

> Now, in regard with your multi-site clusters and how you deal with it using
> quorum, did you read the chapter about the Cluster Ticket Registry in Pacemaker
> doc ? See:
> 
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/ch15.html

Yep, I read the whole documentation two years ago. The ticket system looked interesting at first glance, but I did not see a way to use it with PAF. :)


