[ClusterLabs] Antw: Re: Antw: Re: Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Fri Mar 3 10:02:42 EST 2017
>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 15:35 in
message
<CAE7pJ3BVwnbWoPRQzg8K=NnNxUzxO16dsL2KyYsMuVS3FWWbMg at mail.gmail.com>:
> I think it is a good idea to put your cluster in maintenance mode when
> you do an update.
You should know that I stopped the cluster services on the node in order to install updates (and reboot). This caused all resources to be moved away from that node. I think it would be counter-productive to boot the node with resources running in maintenance mode. Do you disagree?
>
> 2017-03-03 15:11 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 14:22 in
>> message
>> <CAE7pJ3A=oTkWwaz9t0JFTAL1t5G7hmhxpv-ywTUG4JFD9MymAw at mail.gmail.com>:
>>> Was your cluster in maintenance mode?
>>
>> No, it wasn't. Should it have been?
>>
>>>
>>> 2017-03-03 13:59 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>> Hello!
>>>>
>>>> After updating and rebooting the second of three nodes (SLES11 SP4) I see a
>>>> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying"
>>>> message at the point where I expected the node to join the cluster. What
>>>> can be the reasons for this?
>>>> In fact this seems to have killed cluster communication, because I saw that
>>>> "DLM start" timed out. The other nodes were unable to use DLM during that
>>>> time (while the node could not join).
>>>>
>>>> I saw that corosync starts before the firewall in SLES11 SP4; maybe that's a
>>>> problem.
>>>>
>>>> I tried an "rcopenais stop" on the problem node, which in turn caused a node
>>>> fence (DLM stop timed out, too), and then the other nodes were able to
>>>> communicate again. During boot the problem node was able to join the cluster
>>>> as before. In the meantime I had also updated the third node without a
>>>> problem, so it looks like a rare race condition to me.
>>>> Any insights?
>>>>
>>>> Could the problem be related to one of these messages?
>>>> crmd[3656]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>> corosync[3646]: [pcmk ] info: update_member: 0x64bc90 Node 739512325 ((null)) born on: 3352
>>>> stonith-ng[3652]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>> crmd[3656]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
>>>> cib[3651]: notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>> cib[3651]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512321] - state is now member (was (null))
>>>>
>>>> crmd: info: crm_get_peer: Created entry 8a7d6859-5ab1-404b-95a0-ba28064763fb/0x7a81f0 for node (null)/739512321 (2 total)
>>>> crmd: info: crm_get_peer: Cannot obtain a UUID for node 739512321/(null)
>>>> crmd: info: crm_update_peer: plugin_handle_membership: Node (null): id=739512321 state=member addr=r(0) ip(172.20.16.1) r(1) ip(10.2.2.1) (new) votes=0 born=0 seen=3352 proc=00000000000000000000000000000000
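A note on reading the node IDs in these messages: the last line pairs id=739512321 with ring 0 address 172.20.16.1, and the two values agree if the nodeid is the 32-bit IPv4 address with its highest bit cleared. That is what corosync's totem option clear_node_high_bit does when no nodeid is configured explicitly; whether this cluster uses that option is an assumption here, not something the logs state. The following is only a minimal sketch for mapping such IDs back to candidate addresses, and the function name is made up for illustration:

```python
import ipaddress

HIGH_BIT = 0x80000000  # most significant bit of a 32-bit IPv4 address


def nodeid_to_ipv4_candidates(nodeid):
    """Return the IPv4 addresses that could have produced this nodeid.

    Assumption: the nodeid was auto-generated from the ring 0 IPv4 address,
    optionally with the high bit cleared (clear_node_high_bit: yes).
    """
    # Candidate 1: the nodeid taken literally as a 32-bit address.
    candidates = [str(ipaddress.IPv4Address(nodeid))]
    # Candidate 2: the same value with the high bit restored, which is
    # only meaningful if the high bit is currently clear.
    if nodeid < HIGH_BIT:
        candidates.append(str(ipaddress.IPv4Address(nodeid | HIGH_BIT)))
    return candidates


if __name__ == "__main__":
    # Node IDs taken from the log messages quoted above.
    for nodeid in (739512321, 739512325, 739512330):
        print(nodeid, nodeid_to_ipv4_candidates(nodeid))
```

For 739512321 the second candidate is 172.20.16.1, matching the addr= field in the crm_update_peer line, so the other two IDs can be mapped to their hosts the same way even though the node names show up as (null).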
>>>>
>>>>
>>>> Regards,
>>>> Ulrich
>>>>
>>>> _______________________________________________
>>>> Users mailing list: Users at clusterlabs.org
>>>> http://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org
>>>
>>>
>>>
>>> --
>>> .~.
>>> /V\
>>> // \\
>>> /( )\
>>> ^`~'^