[ClusterLabs] Re: Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Mar 6 08:14:29 UTC 2017


>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 17:43 in
message
<CAE7pJ3Aj8_xYDKYPcmYqomMKm2ku3s04vsskciyJ6+TVniiRrQ at mail.gmail.com>:
> use something like standby?

Hi!

What is the benefit of using "standby" compared to stopping the whole cluster stack on the node when you intend to update the cluster software and the kernel, and then reboot? The only difference I see is that after the reboot the node wouldn't start running services, so I'd have to "online" the node again.
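
To make the comparison concrete, here is a rough sketch of the two procedures, assuming crmsh ("crm node standby" / "crm node online") is available on the node. The node name "node2" and the function names are placeholders for illustration only, and the update and reboot steps are left as comments because they depend on the setup. The script only prints the cluster commands; it does not run them:

```shell
#!/bin/sh
# Sketch only: each step is printed, nothing is executed.
step() {
    echo "$*"
}

# Variant A: keep the cluster stack running, but move resources away.
update_via_standby() {
    node=$1                       # placeholder node name
    step crm node standby "$node"
    # ... install updates and reboot here ...
    step crm node online "$node"  # the extra step needed after the reboot
}

# Variant B: stop the whole cluster stack on the node (as on SLES11).
update_via_stop() {
    step rcopenais stop
    # ... install updates and reboot here ...
    # Assumption: the stack is started again at boot, so no further
    # cluster command is needed for the node to rejoin.
}

update_via_standby node2
update_via_stop
```

In both variants the resources leave the node before the update; the sketch differs only in the final "online" step of variant A.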

Regards,
Ulrich

> 
> 2017-03-03 16:02 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 15:35 in
>> message
>> <CAE7pJ3BVwnbWoPRQzg8K=NnNxUzxO16dsL2KyYsMuVS3FWWbMg at mail.gmail.com>:
>>> I think it is a good idea to put your cluster in maintenance mode when
>>> you do an update.
>>
>> You should know that I stopped the cluster services on the node in order to
>> install updates (and reboot). This caused all resources to be moved away from
>> that node. I think it would be counter-productive to boot the node with
>> resources running in maintenance mode. Do you disagree?
>>
>>>
>>> 2017-03-03 15:11 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 14:22 in
>>>> message
>>>> <CAE7pJ3A=oTkWwaz9t0JFTAL1t5G7hmhxpv-ywTUG4JFD9MymAw at mail.gmail.com>:
>>>>> Was your cluster in maintenance state?
>>>>
>>>> No, it wasn't. Should it have been?
>>>>
>>>>>
>>>>> 2017-03-03 13:59 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>>> Hello!
>>>>>>
>>>>>> After updating and rebooting the second of three nodes (SLES11 SP4), I see a
>>>>>> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying"
>>>>>> message where I expected the node to join the cluster. What could be the
>>>>>> reasons for this?
>>>>>> In fact, this seems to have killed cluster communication, because I saw that
>>>>>> "DLM start" timed out. The other nodes were unable to use DLM during that
>>>>>> time (while the node could not join).
>>>>>>
>>>>>> I saw that corosync starts before the firewall in SLES11 SP4; maybe that's a
>>>>>> problem.
>>>>>>
>>>>>> I tried an "rcopenais stop" on the problem node, which in turn caused a node
>>>>>> fence (DLM stop timed out, too), and then the other nodes were able to
>>>>>> communicate again. During boot the problem node was able to join the cluster
>>>>>> as before. In the meantime I had also updated the third node without a
>>>>>> problem, so it looks like a rare race condition to me.
>>>>>> Any insights?
>>>>>>
>>>>>> Could the problem be related to one of these messages?
>>>>>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>>> corosync[3646]:  [pcmk  ] info: update_member: 0x64bc90 Node 739512325 ((null)) born on: 3352
>>>>>> stonith-ng[3652]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
>>>>>> cib[3651]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>>> cib[3651]:   notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512321] - state is now member (was (null))
>>>>>>
>>>>>> crmd:     info: crm_get_peer:     Created entry 8a7d6859-5ab1-404b-95a0-ba28064763fb/0x7a81f0 for node (null)/739512321 (2 total)
>>>>>> crmd:     info: crm_get_peer:     Cannot obtain a UUID for node 739512321/(null)
>>>>>> crmd:     info: crm_update_peer:  plugin_handle_membership: Node (null): id=739512321 state=member addr=r(0) ip(172.20.16.1) r(1) ip(10.2.2.1)  (new) votes=0 born=0 seen=3352 proc=00000000000000000000000000000000
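
A side note on the node IDs in the messages above, offered as an observation rather than a diagnosis: 739512321 is exactly what the IPv4 address 172.20.16.1 (the ring 0 address logged for that node) becomes when read as a 32-bit integer with the highest bit cleared. That would fit corosync deriving the nodeid from the ring 0 address with "clear_node_high_bit" enabled, but that setting is an assumption and is not confirmed anywhere in this thread. A small sketch of the reverse mapping, where the function name nodeid_to_ip is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: map an auto-generated nodeid back to an IPv4 address.
# Assumption: the nodeid is the ring 0 IPv4 address as a 32-bit integer
# with the highest bit cleared, so that bit is set again before decoding.
# This is only meaningful when the first octet of the address is >= 128.
nodeid_to_ip() {
    id=$(( $1 | 0x80000000 ))
    printf '%d.%d.%d.%d\n' \
        $(( (id >> 24) & 255 )) \
        $(( (id >> 16) & 255 )) \
        $(( (id >> 8) & 255 )) \
        $(( id & 255 ))
}

# The three node IDs that appear in the log messages.
for nodeid in 739512321 739512325 739512330; do
    printf '%s -> %s\n' "$nodeid" "$(nodeid_to_ip "$nodeid")"
done
```

Under that assumption the three IDs map to 172.20.16.1, 172.20.16.5 and 172.20.16.10, which may help to tell which node each "Could not obtain a node name" message refers to.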
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Ulrich
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list: Users at clusterlabs.org 
>>>>>> http://lists.clusterlabs.org/mailman/listinfo/users 
>>>>>>
>>>>>> Project Home: http://www.clusterlabs.org 
>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>>>>> Bugs: http://bugs.clusterlabs.org 
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>   .~.
>>>>>   /V\
>>>>>  //  \\
>>>>> /(   )\
>>>>> ^`~'^