[ClusterLabs] corosync - CS_ERR_BAD_HANDLE when multiple nodes are starting up

Jan Friesse jfriesse at redhat.com
Thu Oct 1 03:23:51 EDT 2015


Hi,

Thomas Lamprecht wrote:
> Hello,
>
> we are using corosync version needle (2.3.5) for our cluster filesystem
> (pmxcfs).
> The situation is the following. First we start up pmxcfs, which is
> a FUSE filesystem. If there is a cluster configuration, we also start
> corosync.
> This allows the filesystem to exist on one-node 'clusters' or to be
> forced into a local mode. We use CPG to send our messages to all
> members; the filesystem lives in RAM and all fs operations are sent
> 'over the wire'.
>
> The problem is now the following:
> When we restart all nodes (3 in my test case) at the same time, in
> about 1 out of 10 cases I get only CS_ERR_BAD_HANDLE back when calling

I'm not sure I understand what you are doing. You are restarting all
nodes and get CS_ERR_BAD_HANDLE? I mean, if you are restarting all
nodes, which node returns CS_ERR_BAD_HANDLE? Or are you restarting
just pmxcfs? Or just corosync?

> cpg_mcast_joined to send out the data, but only on one node.
> corosync-quorumtool shows that we have quorum, and the logs also
> show a healthy connection to the corosync cluster. The failing handle is
> initialized once at the initialization of our filesystem. Should it be
> reinitialized on every reconnect?

Again, I'm unsure what you mean by reconnect. After a Corosync shutdown
you have to reconnect (I believe this is not the case here, because you
are getting the error only with 10% probability).
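
A rough sketch of what "reconnect after a Corosync shutdown" could look
like on the client side. Everything here is an assumption made for
illustration: the rc_ names and the ops table are hypothetical, the
numeric status values are stand-ins for cs_error_t from
<corosync/corotypes.h>, and it assumes that a dead connection surfaces
as an error such as CS_ERR_LIBRARY or CS_ERR_BAD_HANDLE. Real code would
put cpg_initialize(), cpg_join(), cpg_mcast_joined() and cpg_finalize()
behind the function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in status codes; real code uses cs_error_t from <corosync/corotypes.h>. */
enum { RC_OK = 1, RC_ERR_LIBRARY = 2, RC_ERR_BAD_HANDLE = 9 };

/* Hypothetical operations table wrapping the CPG calls. */
struct rc_ops {
    int (*connect)(uint64_t *handle);              /* initialize + join */
    int (*send)(uint64_t handle, const char *msg); /* multicast one message */
    void (*disconnect)(uint64_t handle);           /* finalize */
};

/* Send once; if the handle looks dead, reconnect and retry exactly one time. */
int rc_send_with_reconnect(const struct rc_ops *ops, uint64_t *handle,
                           const char *msg)
{
    int rc;

    assert(ops != NULL && handle != NULL);
    rc = ops->send(*handle, msg);
    if (rc != RC_ERR_LIBRARY && rc != RC_ERR_BAD_HANDLE)
        return rc;             /* success, or an error unrelated to the handle */

    ops->disconnect(*handle);  /* release the stale handle */
    rc = ops->connect(handle); /* try to obtain a fresh one */
    if (rc != RC_OK)
        return rc;             /* daemon still down; the caller retries later */
    return ops->send(*handle, msg);
}

/* Stub "daemon" for illustration only: just the handle with value 2 is live. */
int rc_stub_connect(uint64_t *handle) { *handle = 2; return RC_OK; }
int rc_stub_send(uint64_t handle, const char *msg)
{
    (void)msg;
    return handle == 2 ? RC_OK : RC_ERR_BAD_HANDLE;
}
void rc_stub_disconnect(uint64_t handle) { (void)handle; }
```

The retry is limited to a single attempt on purpose, so that a daemon
which stays down results in an error for the caller rather than a loop.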

> Restarting the filesystem solves this problem. The strange thing is that
> it isn't clearly reproducible and often works just fine.
>
> Are there some known problems or steps we should look for?

Hard to tell, but generally:
- Make sure cpg_initialize really returns CS_OK. If not, the returned
handle is invalid.
- Make sure there is no memory corruption and the handle is really valid
(valgrind may be helpful).
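
A minimal sketch of the first check, so that a failed initialization
can never leave a handle that looks usable. The sk_ names and the
sk_conn struct are hypothetical, and the numeric status values are
stand-ins for cs_error_t from <corosync/corotypes.h>; in real code the
initializer passed in would be a thin wrapper around cpg_initialize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in status codes; real code uses cs_error_t from <corosync/corotypes.h>. */
enum { SK_OK = 1, SK_ERR_TRY_AGAIN = 6, SK_ERR_BAD_HANDLE = 9 };

/* Hypothetical wrapper that records whether the handle may be used. */
struct sk_conn {
    uint64_t handle; /* stands in for cpg_handle_t */
    int valid;       /* set only after a successful initialization */
};

/* Initializer signature; a thin wrapper around cpg_initialize() would fit. */
typedef int (*sk_init_fn)(uint64_t *handle);

/* Open the connection; trust the handle only if init returned SK_OK. */
int sk_conn_open(struct sk_conn *c, sk_init_fn init)
{
    int rc;

    assert(c != NULL && init != NULL);
    c->handle = 0;
    c->valid = 0;
    rc = init(&c->handle);
    if (rc == SK_OK)
        c->valid = 1;
    return rc; /* the caller should log and handle anything but SK_OK */
}

/* Guard to call before every send: refuse to use an unverified handle. */
int sk_conn_check(const struct sk_conn *c)
{
    assert(c != NULL);
    return c->valid ? SK_OK : SK_ERR_BAD_HANDLE;
}

/* Stub initializers for illustration only. */
int sk_stub_init_ok(uint64_t *handle) { *handle = 42; return SK_OK; }
int sk_stub_init_busy(uint64_t *handle) { (void)handle; return SK_ERR_TRY_AGAIN; }
```

Logging the return code of the initializer at startup would also show
whether the 1-in-10 failures coincide with an initialization that did
not return CS_OK.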

Regards,
   Honza
