[ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

Jason Gauthier jagauthier at gmail.com
Tue Jun 19 06:44:59 EDT 2018


On Tue, Jun 19, 2018 at 3:25 AM Christine Caulfield <ccaulfie at redhat.com> wrote:
>
> On 19/06/18 02:46, Jason Gauthier wrote:
> > Greetings,
> >
> >    I've just discovered corosync-qdevice and corosync-qnetd
> > (thanks, Ken Gaillot). Setup was pretty quick.
> >
> > I enabled qnetd on a host outside the cluster and followed the steps
> > presented by corosync-qdevice-net-certutil. However, when I run
> > corosync-qdevice, it exits. Even with -f -d, it prints no output
> > at all.
> >
>
> It sounds like the first time you ran it (without -d -f)
> corosync-qdevice started up and daemonised itself. The second time you
> tried (with -d -f) it couldn't run because there was already one
> running. There's a good argument for it printing an error if it's
> already running I think!
>

The process doesn't stay running. The qnetd output below shows that it
launches, connects, and then disconnects. I've rebooted several times
since then (testing stonith). I can provide strace output if that would
be helpful.
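
One way to rule out a leftover instance is to look for the process by name. The following is only a minimal sketch, assuming a Linux /proc layout; `find_pids` and its `proc_root` parameter are hypothetical names chosen for illustration, not part of corosync.

```python
import os

def find_pids(name, proc_root="/proc"):
    """Return PIDs whose /proc/<pid>/comm matches name (Linux only)."""
    # The kernel truncates comm to 15 characters, so compare against
    # the name truncated to the same length.
    wanted = name[:15]
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                comm = fh.read().strip()
        except OSError:
            continue  # process exited while we were scanning
        if comm == wanted:
            pids.append(int(entry))
    return sorted(pids)
```

Calling `find_pids("corosync-qdevice")` right after starting the daemon, and again a few seconds later, should show whether it exited or is still running in the background.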

>
> > But if I run qnetd with -f -d, I can see the qdevices connecting.
> >
> > Jun 18 21:19:32 debug   Initializing nss
> > Jun 18 21:19:32 debug   Initializing local socket
> > Jun 18 21:19:32 debug   Creating listening socket
> > Jun 18 21:19:32 debug   Registering algorithms
> > Jun 18 21:19:32 debug   QNetd ready to provide service
> > Jun 18 21:19:36 debug   New client connected
> > Jun 18 21:19:36 debug     cluster name = zeta
> > Jun 18 21:19:36 debug     tls started = 1
> > Jun 18 21:19:36 debug     tls peer certificate verified = 1
> > Jun 18 21:19:36 debug     node_id = 1084772368
> > Jun 18 21:19:36 debug     pointer = 0x55b1b0416d70
> > Jun 18 21:19:36 debug     addr_str = ::ffff:192.168.80.16:51024
> > Jun 18 21:19:36 debug     ring id = (40a85010.88ac)
> > Jun 18 21:19:36 debug     cluster dump:
> > Jun 18 21:19:36 debug       client = ::ffff:192.168.80.16:51024, node_id = 1084772368
> > Jun 18 21:19:36 debug   Client ::ffff:192.168.80.16:51024 (cluster zeta, node_id 1084772368) sent initial node list.
> > Jun 18 21:19:36 debug     msg seq num 4
> > Jun 18 21:19:36 debug     node list:
> > Jun 18 21:19:36 error   ffsplit: Received empty config node list for client ::ffff:192.168.80.16:51024
> > Jun 18 21:19:36 error   Algorithm returned error code. Sending error reply.
> > Jun 18 21:19:36 debug   Client ::ffff:192.168.80.16:51024 (cluster zeta, node_id 1084772368) sent membership node list.
> > Jun 18 21:19:36 debug     msg seq num 5
> > Jun 18 21:19:36 debug     ring id = (40a85010.88ac)
> > Jun 18 21:19:36 debug     node list:
> > Jun 18 21:19:36 debug       node_id = 1084772368, data_center_id = 0, node_state = not set
> > Jun 18 21:19:36 debug       node_id = 1084772369, data_center_id = 0, node_state = not set
> > Jun 18 21:19:36 debug   Algorithm result vote is Ask later
> > Jun 18 21:19:36 debug   Client ::ffff:192.168.80.16:51024 (cluster zeta, node_id 1084772368) sent quorum node list.
> > Jun 18 21:19:36 debug     msg seq num 6
> > Jun 18 21:19:36 debug     quorate = 1
> > Jun 18 21:19:36 debug     node list:
> > Jun 18 21:19:36 debug       node_id = 1084772368, data_center_id = 0, node_state = member
> > Jun 18 21:19:36 debug       node_id = 1084772369, data_center_id = 0, node_state = member
> > Jun 18 21:19:36 debug   Algorithm result vote is No change
> > Jun 18 21:19:36 debug   Client closed connection
> > Jun 18 21:19:36 debug   Client ::ffff:192.168.80.16:51024 (init_received 1, cluster zeta, node_id 1084772368) disconnect
> > Jun 18 21:19:36 debug   ffsplit: Membership for cluster zeta is now stable
> > Jun 18 21:19:36 debug   ffsplit: No quorate partition was selected
> > Jun 18 21:19:36 debug   ffsplit: No client gets NACK
> > Jun 18 21:19:36 debug   ffsplit: No client gets ACK
> >
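
To pull only the error lines out of a capture like the one above, something along these lines may help. It is a rough sketch: it assumes the `Mon DD HH:MM:SS level   message` layout seen in this output, and `qnetd_errors` is a made-up helper name, not a corosync tool.

```python
import re

# Matches lines such as "Jun 18 21:19:36 error   Algorithm returned ..."
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\w+)\s+(.*)$")

def qnetd_errors(lines):
    """Return (timestamp, message) for every line logged at level 'error'."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group(2) == "error":
            found.append((m.group(1), m.group(3)))
    return found
```

Lines that do not start with a timestamp are skipped, so the quote markers need to be stripped before feeding mail text to it.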
> > Since it's categorized as a daemon, I thought this would stay running,
> > and keep a constant connection.
> >
> > The quorum section of corosync.conf looks like this:
> > quorum {
> >         # Enable and configure quorum subsystem (default: off)
> >         # see also corosync.conf.5 and votequorum.5
> > #       two_node: 1
> >         provider: corosync_votequorum
> >         expected_votes: 3
> >         device {
> >             votes: 1
> >             model: net
> >             net {
> >               host: delta
> >               }
> >         }
> > }
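
The "Received empty config node list" error in the qnetd output suggests the node list taken from the configuration arrived empty. One thing that may be worth checking (an assumption, not something confirmed in this thread) is whether corosync.conf has a top-level nodelist section at all. The sketch below lists the top-level sections of a config; `top_level_sections` is a hypothetical helper, and the parsing is deliberately naive: it only counts braces and treats everything after `#` as a comment.

```python
def top_level_sections(conf_text):
    """Return names of top-level 'name {' sections in corosync.conf-style text."""
    sections = []
    depth = 0
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            if depth == 0:
                sections.append(line[:-1].strip())
            depth += 1  # entering a (possibly nested) section
        elif line == "}":
            depth -= 1  # leaving a section
    return sections
```

For example, `"nodelist" in top_level_sections(open("/etc/corosync/corosync.conf").read())` reports whether such a section exists; the path is the usual default and may differ on a given system.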
> >
> > Thanks!
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > https://lists.clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >
>


