[ClusterLabs] Antw: Re: Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Fri Mar 3 09:11:21 EST 2017


>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 14:22 in
message
<CAE7pJ3A=oTkWwaz9t0JFTAL1t5G7hmhxpv-ywTUG4JFD9MymAw at mail.gmail.com>:
> Was your cluster in maintenance state?

No, it wasn't. Should it have been?

> 
> 2017-03-03 13:59 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>> Hello!
>>
>> After updating and rebooting the second of three nodes (SLES11 SP4), I see a
>> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying"
>> message when I expected the node to join the cluster. What can be the
>> reasons for this?
>> In fact this seems to have killed cluster communication, because I saw that
>> "DLM start" timed out. The other nodes were unable to use DLM during that
>> time (while the node could not join).
>>
>> I saw that corosync starts before the firewall in SLES11 SP4; maybe that's a
>> problem.
>>
>> I tried an "rcopenais stop" on the problem node, which in turn caused a node
>> fence (DLM stop timed out, too), and then the other nodes were able to
>> communicate again. During boot the problem node was able to join the cluster
>> as before. In the meantime I had also updated the third node without a
>> problem, so it looks like a rare race condition to me.
>> Any insights?
>>
>> Could the problem be related to one of these messages?
>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> corosync[3646]:  [pcmk  ] info: update_member: 0x64bc90 Node 739512325 ((null)) born on: 3352
>> stonith-ng[3652]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
>> cib[3651]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>> cib[3651]:   notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512321] - state is now member (was (null))
>>
>> crmd:     info: crm_get_peer:     Created entry 8a7d6859-5ab1-404b-95a0-ba28064763fb/0x7a81f0 for node (null)/739512321 (2 total)
>> crmd:     info: crm_get_peer:     Cannot obtain a UUID for node 739512321/(null)
>> crmd:     info: crm_update_peer:  plugin_handle_membership: Node (null): id=739512321 state=member addr=r(0) ip(172.20.16.1) r(1) ip(10.2.2.1)  (new) votes=0 born=0 seen=3352 proc=00000000000000000000000000000000
>>
>>
>> Regards,
>> Ulrich
>>
>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org 
>> http://lists.clusterlabs.org/mailman/listinfo/users 
>>
>> Project Home: http://www.clusterlabs.org 
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>> Bugs: http://bugs.clusterlabs.org 
> 
> 
> 
> -- 
>   .~.
>   /V\
>  //  \\
> /(   )\
> ^`~'^
> 
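
A side note on the node IDs in the quoted log lines: assuming they were auto-generated by corosync from the ring 0 IPv4 address with the high bit cleared (an assumption; the thread does not show the corosync configuration), they can be decoded back into addresses. The sketch below uses a made-up helper name, candidate_ips, and only Python's standard ipaddress module. It shows that 739512321 is consistent with the ip(172.20.16.1) reported in the crm_update_peer line.

```python
import ipaddress

HIGH_BIT = 0x80000000  # top bit of a 32-bit value

def candidate_ips(nodeid):
    """Return the IPv4 addresses a node ID could stand for:
    the value as-is, and the value with the high bit restored.
    Hypothetical helper; assumes the ID was derived from an IPv4 address."""
    return [
        str(ipaddress.IPv4Address(nodeid)),
        str(ipaddress.IPv4Address(nodeid | HIGH_BIT)),
    ]

# Node IDs taken from the quoted log messages.
for nodeid in (739512321, 739512325, 739512330):
    print(nodeid, candidate_ips(nodeid))
# 739512321 ['44.20.16.1', '172.20.16.1']
# 739512325 ['44.20.16.5', '172.20.16.5']
# 739512330 ['44.20.16.10', '172.20.16.10']
```

If that assumption holds, the three IDs would correspond to 172.20.16.1, 172.20.16.5 and 172.20.16.10 on the first ring, which may help match the "(null)" node names in the messages to actual hosts.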







