[ClusterLabs] Antw: Re: Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Fri Mar 3 09:11:21 EST 2017
>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 14:22 in
message
<CAE7pJ3A=oTkWwaz9t0JFTAL1t5G7hmhxpv-ywTUG4JFD9MymAw at mail.gmail.com>:
> Was your cluster in maintenance state?
No, it wasn't. Should it have been?
>
> 2017-03-03 13:59 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>> Hello!
>>
>> After updating and rebooting the second of three nodes (SLES11 SP4), I see a
>> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying"
>> message when I expected the node to join the cluster. What could be the
>> reasons for this?
>> In fact this seems to have killed cluster communication, because I saw that
>> "DLM start" timed out. The other nodes were unable to use DLM during that
>> time (while the node could not join).
>>
>> I saw that corosync starts before the firewall in SLES11 SP4; maybe that's a
>> problem.
>>
>> I tried an "rcopenais stop" on the problem node, which in turn caused a node
>> fence (DLM stop timed out, too), and then the other nodes were able to
>> communicate again. During boot the problem node was able to join the cluster
>> as before. In the meantime I had also updated the third node without a
>> problem, so it looks like a rare race condition to me.
>> Any insights?
>>
>> Could the problem be related to one of these messages?
>> crmd[3656]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> corosync[3646]: [pcmk ] info: update_member: 0x64bc90 Node 739512325 ((null)) born on: 3352
>> stonith-ng[3652]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> crmd[3656]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
>> cib[3651]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> cib[3651]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512321] - state is now member (was (null))
>>
>> crmd: info: crm_get_peer: Created entry 8a7d6859-5ab1-404b-95a0-ba28064763fb/0x7a81f0 for node (null)/739512321 (2 total)
>> crmd: info: crm_get_peer: Cannot obtain a UUID for node 739512321/(null)
>> crmd: info: crm_update_peer: plugin_handle_membership: Node (null): id=739512321 state=member addr=r(0) ip(172.20.16.1) r(1) ip(10.2.2.1) (new) votes=0 born=0 seen=3352 proc=00000000000000000000000000000000
>>
>>
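
A side note on the node IDs in the messages above: the values look like ring 0
IPv4 addresses with the top bit cleared, which is what corosync produces when
no nodeid is configured and clear_node_high_bit is set to yes. That is an
assumption about this cluster's configuration, not something the logs state,
but 739512321 does decode to the 172.20.16.1 shown in the crm_update_peer
line. A minimal Python sketch (the function name nodeid_to_ipv4 is made up for
illustration):

```python
import ipaddress


def nodeid_to_ipv4(nodeid: int, restore_high_bit: bool = False) -> str:
    """Interpret a 32-bit node ID as an IPv4 address.

    Assumption: the node ID was auto-generated from the ring 0 address.
    With restore_high_bit=True the top bit is set again, undoing what
    clear_node_high_bit would have removed.
    """
    if restore_high_bit:
        nodeid |= 0x80000000
    return str(ipaddress.IPv4Address(nodeid))


# Node IDs taken from the log messages quoted above.
for nodeid in (739512321, 739512325, 739512330):
    print(nodeid, nodeid_to_ipv4(nodeid, restore_high_bit=True))
```

If the assumption holds, this maps the three IDs to 172.20.16.1, 172.20.16.5
and 172.20.16.10, which may help to tell which node each "Could not obtain a
node name" message refers to.
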
>> Regards,
>> Ulrich
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org