[ClusterLabs] Antw: Re: node name issues (Could not obtain a node name for corosync nodeid 739512332)
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Mon Aug 26 02:59:32 EDT 2019
>>> Ken Gaillot <kgaillot at redhat.com> wrote on 22.08.2019 at 17:38 in message
<9bd1f0de082ff66a2a9b14d1d80cc95d7eff4bac.camel at redhat.com>:
> On Thu, 2019-08-22 at 09:07 +0200, Ulrich Windl wrote:
>> Hi!
>>
>> When starting pacemaker (1.1.19+20181105.ccd6b5b10-3.10.1) on a node
>> that had been down for a while, I noticed some unexpected messages
>> about the node name:
>>
>> pacemakerd: notice: get_node_name: Could not obtain a node name
>> for corosync nodeid 739512332
>> pacemakerd: info: crm_get_peer: Created entry
>> a21bf687-045b-4fd7-9340-0562ef595883/0x18752f0 for node (null)/739512332
>> (1 total)
>> pacemakerd: info: crm_get_peer: Node 739512332 has uuid
>> 739512332
>>
>> It seems the UUID and the node ID are mixed up, at least in the message...
>
> "UUID" is a misnomer, for historical reasons. It was an actual UUID for
> heartbeat (originally the only supported cluster layer), but for
> corosync it's the node ID and for Pacemaker Remote nodes it's the node
> name.
>
> Ironically the string after "Created entry" is an actual UUID but
> that's not the "node UUID", just an internal hash table id.
>
> We should definitely update all those messages to reflect the current
> reality.
;-) +1 ("This cat's a dog for historical reasons")
[...]
>> cib: notice: crm_update_peer_state_iter: Node (null) state is
>> now member | nodeid=739512332 previous=unknown
>> source=crm_update_peer_proc
>> ...
>>
>> This doesn't look right in my eyes.
>
> Corosync by default provides only a corosync node ID when identifying
> nodes. The daemons have to learn the node names from cluster messages
> passed around by pacemaker. The exception is if "name:" is specified in
> corosync.conf, the daemons can learn the names at start‑up.
Can't there be a CIB event like "node name is available now", so that all
the clients needing a node name silently wait until it is available instead
of creating all that noise?
>
> As for the "now online"/"now member", there are two stages of corosync
> membership: cluster membership (i.e. participating in the corosync
> token ring) and process group (CPG) membership (which is corosync's
> node‑to‑node messaging protocol). They generally happen very close to
> each other.
I thought CPG membership was something of a subset of cluster membership.
[...]
>> I feel this mess with determining the node name is overly
>> complicated...
>>
>> Regards,
>> Ulrich
>
> Complicated, yes -- overly, depends on your point of view :)
>
> Putting "name:" in corosync.conf simplifies things.
Also see my earlier message. If adding the node name to corosync.conf is
highly recommended, I wonder why SUSE's SLES procedure does not set it...
Thanks for your insights!
Regards,
Ulrich
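
For reference, here is a rough sketch of what the "name:" setting discussed above might look like in the nodelist section of corosync.conf, together with a small Python check that reports node entries lacking it. The addresses, node IDs and names in the sample are made up for illustration, and the helper functions are hypothetical (they are not part of corosync or pacemaker). The parser is deliberately naive and assumes node blocks contain no nested braces.

```python
import re

# Hypothetical corosync.conf excerpt; addresses, IDs and names are made up.
SAMPLE_CONF = """\
nodelist {
    node {
        ring0_addr: 192.0.2.11
        nodeid: 1
        name: node1
    }
    node {
        ring0_addr: 192.0.2.12
        nodeid: 2
    }
}
"""


def nodes_in_nodelist(conf_text):
    """Return one dict of key/value settings per 'node { ... }' block."""
    nodes = []
    # "nodelist {" does not match here because "node" must be followed
    # directly by optional whitespace and an opening brace.
    for block in re.findall(r"\bnode\s*\{([^{}]*)\}", conf_text):
        settings = {}
        for line in block.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if ":" in line:
                key, value = line.split(":", 1)
                settings[key.strip()] = value.strip()
        nodes.append(settings)
    return nodes


def nodes_without_name(conf_text):
    """Return the node entries that have no 'name:' key."""
    return [n for n in nodes_in_nodelist(conf_text) if "name" not in n]


missing = nodes_without_name(SAMPLE_CONF)
for node in missing:
    print("no name: for nodeid", node.get("nodeid", "(auto)"))
```

With the sample above, only the second node entry is reported. To check a real cluster, the same function could be fed the contents of the local configuration file (usually /etc/corosync/corosync.conf).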