[ClusterLabs] Ubuntu 18.04 and corosync-qdevice

Nickle, Richard rnickle at holycross.edu
Fri Aug 9 11:46:42 EDT 2019


I've built a two-node DRBD cluster with SBD and STONITH, following advice
from ClusterLabs, LINBIT, and Beekhof's blog on SBD.

I still cannot get automated failover when I take one of the nodes down.  I
thought that perhaps I needed an odd number of quorum votes, so I attempted
to follow the corosync-qdevice instructions here:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/s1-quorumdev-haar

Ubuntu's init.d scripts don't work right out of the box, but I was able to
fix that.  corosync-qdevice starts but immediately exits with status 1, so
the qdevice never shows up in the cluster.

$ sudo pcs property
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: hanfsweb
>  dc-version: 1.1.18-2b07d5c5a9
>  have-watchdog: true
>  no-quorum-policy: stop
>  stonith-enabled: true
>  stonith-timeout: 120s
>  stonith-watchdog-timeout: 10
>

$ sudo pcs quorum status
> Quorum information
> ------------------
> Date:             Fri Aug  9 11:34:55 2019
> Quorum provider:  corosync_votequorum
> Nodes:            2
> Node ID:          1
> Ring ID:          1/464
> Quorate:          Yes
> Votequorum information
> ----------------------
> Expected votes:   3
> Highest expected: 3
> Total votes:      2
> Quorum:           2 Activity blocked
> Flags:            WaitForAll
>
> Membership information
> ----------------------
>     Nodeid      Votes    Qdevice Name
>          1          1         NR hanfsweb2.holycross.edu (local)
>          2          1         NR hanfsweb4.holycross.edu
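
For reference, the numbers above are consistent with the usual votequorum
majority rule, quorum = floor(expected_votes / 2) + 1.  Here is a minimal
sketch of that arithmetic, assuming the rule applies unchanged here; the
variable names are mine, and the values are copied from the output above
(NR means the qdevice is not registered, so it contributes no vote):

```shell
# Majority rule as I understand votequorum: quorum = floor(expected / 2) + 1.
expected_votes=3   # "Expected votes" from the output above
total_votes=2      # "Total votes": two nodes, qdevice not registered (NR)

quorum=$(( expected_votes / 2 + 1 ))
echo "quorum: $quorum"

# With one node down, only a single vote remains.
remaining=$(( total_votes - 1 ))
if [ "$remaining" -ge "$quorum" ]; then
    state="quorate"
else
    state="not quorate"
fi
echo "after losing one node: $remaining vote(s), $state"
```

If that reading is right, the surviving node loses quorum as long as the
qdevice vote is missing, and no-quorum-policy: stop would then stop resources
rather than fail them over.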




'corosync-qdevice' does not generate *ANY* debug output:

$ sudo corosync-qdevice -f -d


But strace shows that it is using IPC and sending messages:

$ sudo strace corosync-qdevice -f -d 2>&1 | tail -15
> openat(AT_FDCWD, "/dev/shm/qb-votequorum-event-12248-24916-30-header",
> O_RDWR) = 9
> ftruncate(9, 8248)                      = 0
> mmap(NULL, 8248, PROT_READ|PROT_WRITE, MAP_SHARED, 9, 0) = 0x7fbf6df67000
> openat(AT_FDCWD, "/dev/shm/qb-votequorum-event-12248-24916-30-data",
> O_RDWR) = 10
> ftruncate(10, 1052672)                  = 0
> getpid()                                = 24916
> sendto(11, "<30>Aug  9 11:44:56 corosync-qde"..., 102, MSG_NOSIGNAL, NULL,
> 0) = 102
> mmap(NULL, 2105344, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> 0x7fbf6a4c7000
> mmap(0x7fbf6a4c7000, 1052672, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED,
> 10, 0) = 0x7fbf6a4c7000
> mmap(0x7fbf6a5c8000, 1052672, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED,
> 10, 0) = 0x7fbf6a5c8000
> close(10)                               = 0
> close(9)                                = 0
> sendto(8, "\20", 1, MSG_NOSIGNAL, NULL, 0) = 1
> exit_group(1)                           = ?
> +++ exited with 1 +++
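
The sendto() whose payload starts with "<30>" looks like a syslog message, so
the missing error text may have gone to syslog instead of the terminal.  A
small sketch decoding that prefix, assuming the standard syslog priority
encoding (PRI = facility * 8 + severity):

```shell
# Decode the syslog PRI value seen in the strace output ("<30>").
pri=30
facility=$(( pri / 8 ))   # 3 is the "daemon" facility
severity=$(( pri % 8 ))   # 6 is the "info" severity
echo "facility=$facility severity=$severity"
```

If that is what is happening, /var/log/syslog may hold corosync-qdevice
entries from around 11:44:56, though I have not confirmed this.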


I can't tell which version of corosync-qdevice Ubuntu 18.04 ships, but my
Corosync is 2.4.3.
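
One way to get at the package version might be to ask dpkg.  This is a sketch
only: pkg_version is a hypothetical helper name, and I am assuming the binary
package is called corosync-qdevice.

```shell
# Hypothetical helper: print the installed version of a Debian/Ubuntu package,
# or "unknown" if dpkg-query is unavailable or the package is not installed.
pkg_version() {
    v=""
    if command -v dpkg-query >/dev/null 2>&1; then
        v=$(dpkg-query -W -f='${Version}' "$1" 2>/dev/null) || v=""
    fi
    echo "${v:-unknown}"
}

pkg_version corosync-qdevice
```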

Thanks,

Rick