[ClusterLabs] Antw: Re: Antw: Re: Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying

emmanuel segura emi2fast at gmail.com
Fri Mar 3 16:43:19 UTC 2017


Use something like standby?
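
A rough sketch of what I mean, assuming the crm shell (crmsh) is what you use on SLES11. The function names, the DRY_RUN switch and the node name "node2" are made up here for illustration; with DRY_RUN=1 (the default) the commands are only printed, not run:

```shell
#!/bin/sh
# Sketch only: drain a node via standby before patching, then bring it back.
# Assumption: crmsh is installed and provides "crm node standby" / "crm node online".
# DRY_RUN=1 (default) prints the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

standby_for_update() {
    node="$1"
    [ -n "$node" ] || { echo "usage: standby_for_update NODE" >&2; return 1; }
    run crm node standby "$node"   # resources are moved off the node
    echo "# ... install updates and reboot $node here ..."
    run crm node online "$node"    # let resources return after the node has rejoined
}

standby_for_update node2
```

Unlike maintenance mode, standby moves the resources away first, so nothing is left running unmanaged on the node while it is updated.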

2017-03-03 16:02 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 15:35 in
> message
> <CAE7pJ3BVwnbWoPRQzg8K=NnNxUzxO16dsL2KyYsMuVS3FWWbMg at mail.gmail.com>:
>> I think it is a good idea to put your cluster in maintenance mode when
>> you do an update.
>
> You should know that I stopped the cluster services on the node in order to install updates (and reboot). This caused all resources to be moved away from that node. I think it would be counter-productive to boot the node with resources running in maintenance mode. Do you disagree?
>
>>
>> 2017-03-03 15:11 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>>> emmanuel segura <emi2fast at gmail.com> wrote on 03.03.2017 at 14:22 in
>>> message
>>> <CAE7pJ3A=oTkWwaz9t0JFTAL1t5G7hmhxpv-ywTUG4JFD9MymAw at mail.gmail.com>:
>>>> Was your cluster in maintenance state?
>>>
>>> No, it wasn't. Should it have been?
>>>
>>>>
>>>> 2017-03-03 13:59 GMT+01:00 Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de>:
>>>>> Hello!
>>>>>
>>>>> After the update and reboot of the second of three nodes (SLES11 SP4), I see a
>>>>> "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying"
>>>>> message when I expected the node to join the cluster. What can be the
>>>>> reasons for this?
>>>>> In fact this seems to have killed cluster communication, because I saw that
>>>>> "DLM start" timed out. The other nodes were unable to use DLM during that
>>>>> time (while the node could not join).
>>>>>
>>>>> I saw that corosync starts before the firewall in SLES11 SP4; maybe that's a
>>>>> problem.
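
On the boot-order suspicion: a small sketch for comparing SysV start priorities, assuming the runlevel links live in /etc/init.d/rc3.d and that the services are called "openais" and "SuSEfirewall2_setup" (both names and the path are my assumptions for SLES11, please check them on your system). The helper functions are made up for illustration:

```shell
#!/bin/sh
# Sketch only: compare the SysV start order (SNN<name> links) of two services.

# Print the numeric start priority of service $2 in rc directory $1.
start_order() {
    for link in "$1"/S[0-9][0-9]"$2"; do
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}       # e.g. S05openais
        num=${name#S}          # e.g. 05openais
        num=${num%"$2"}        # e.g. 05
        echo "${num#0}"        # drop one leading zero, e.g. 5
        return 0
    done
    return 1
}

# Return 0 if $2 starts strictly before $3, 1 if not, 2 if a link is missing.
starts_before() {
    a=$(start_order "$1" "$2") || return 2
    b=$(start_order "$1" "$3") || return 2
    [ "$a" -lt "$b" ]
}

rcdir=/etc/init.d/rc3.d        # assumed location of the runlevel 3 links
rc=0
starts_before "$rcdir" openais SuSEfirewall2_setup || rc=$?
case $rc in
    0) echo "openais starts before the firewall" ;;
    1) echo "openais does not start before the firewall" ;;
    *) echo "start links not found in $rcdir" ;;
esac
```

This only reports the configured order; it does not show whether the firewall actually interfered with cluster traffic during that boot.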
>>>>>
>>>>> I tried an "rcopenais stop" of the problem node, which in turn caused a node
>>>>> fence (DLM stop timed out, too), and then the other nodes were able to
>>>>> communicate again. During boot the problem node was able to join the cluster
>>>>> as before. In the meantime I had also updated the third node without a
>>>>> problem, so it looks like a rare race condition to me.
>>>>> Any insights?
>>>>>
>>>>> Could the problem be related to one of these messages?
>>>>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>> corosync[3646]:  [pcmk  ] info: update_member: 0x64bc90 Node 739512325 ((null)) born on: 3352
>>>>> stonith-ng[3652]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>> crmd[3656]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512330
>>>>> cib[3651]:   notice: get_node_name: Could not obtain a node name for classic openais (with plugin) nodeid 739512321
>>>>> cib[3651]:   notice: crm_update_peer_state: plugin_handle_membership: Node (null)[739512321] - state is now member (was (null))
>>>>>
>>>>> crmd:     info: crm_get_peer:     Created entry 8a7d6859-5ab1-404b-95a0-ba28064763fb/0x7a81f0 for node (null)/739512321 (2 total)
>>>>> crmd:     info: crm_get_peer:     Cannot obtain a UUID for node 739512321/(null)
>>>>> crmd:     info: crm_update_peer:  plugin_handle_membership: Node (null): id=739512321 state=member addr=r(0) ip(172.20.16.1) r(1) ip(10.2.2.1)  (new) votes=0 born=0 seen=3352 proc=00000000000000000000000000000000
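
On the node IDs in those messages: they look auto-generated from the ring0 address. A minimal sketch, assuming the ID is the IPv4 address read as a 32-bit big-endian number with the top bit cleared (what corosync's clear_node_high_bit option does) and that the original address had that bit set. This matches the last log line above, where id=739512321 is listed with ip(172.20.16.1). The function name is made up for illustration:

```shell
#!/bin/sh
# Sketch only: map an auto-generated corosync nodeid back to an IPv4 address.
# Assumption: nodeid = ring0 IPv4 address as a 32-bit integer with the high bit
# cleared, and the original address had the high bit set (first octet >= 128).

nodeid_to_ip() {
    id=$(( $1 | 2147483648 ))   # restore the cleared high bit (0x80000000)
    echo "$(( (id >> 24) & 255 )).$(( (id >> 16) & 255 )).$(( (id >> 8) & 255 )).$(( id & 255 ))"
}

for n in 739512321 739512325 739512330; do
    echo "$n -> $(nodeid_to_ip "$n")"
done
```

Under these assumptions the three IDs map to 172.20.16.1, 172.20.16.5 and 172.20.16.10, which at least tells you which node each "(null)" entry refers to.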
>>>>>
>>>>>
>>>>> Regards,
>>>>> Ulrich
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list: Users at clusterlabs.org
>>>>> http://lists.clusterlabs.org/mailman/listinfo/users
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org
>>>>
>>>>
>>>>
>>>> --
>>>>   .~.
>>>>   /V\
>>>>  //  \\
>>>> /(   )\
>>>> ^`~'^
>>>
>>
>



-- 
  .~.
  /V\
 //  \\
/(   )\
^`~'^



