[ClusterLabs] Re: [EXT] Peer (slave) node deleting master's transient_attributes

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Feb 1 02:10:35 EST 2021


>>> Stuart Massey <djangoschef at gmail.com> wrote on 29.01.2021 at 18:37 in
message
<CABQ68NRHqbo3Y7yt-mbLpqsS3oCN-2yDJBFA3so=R9P4X_WH8A at mail.gmail.com>:
> Can someone help me with this?
> Background:
> 
> "node01" is failing, and has been placed in "maintenance" mode. It
> occasionally loses connectivity.
> 
> "node02" is able to run our resources
> 
> Consider the following messages from pacemaker.log on "node02", just after
> "node01" has rejoined the cluster (per "node02"):
> 
> Jan 28 14:48:03 [21933] node02.example.com        cib:     info:
> cib_perform_op:       --
> /cib/status/node_state[@id='2']/transient_attributes[@id='2']
> Jan 28 14:48:03 [21933] node02.example.com        cib:     info:
> cib_perform_op:       +  /cib:  @num_updates=309
> Jan 28 14:48:03 [21933] node02.example.com        cib:     info:
> cib_process_request:  Completed cib_delete operation for section
> //node_state[@uname='node02.example.com']/transient_attributes: OK (rc=0,
> origin=node01.example.com/crmd/3784, version=0.94.309)
> Jan 28 14:48:04 [21938] node02.example.com       crmd:     info:
> abort_transition_graph:       Transition aborted by deletion of
> transient_attributes[@id='2']: Transient attribute change | cib=0.94.309
> source=abort_unless_down:357
> path=/cib/status/node_state[@id='2']/transient_attributes[@id='2']
> complete=true
> Jan 28 14:48:05 [21937] node02.example.com    pengine:     info:
> master_color: ms_drbd_ourApp: Promoted 0 instances of a possible 1 to master
> 
> The implication, it seems to me, is that "node01" has asked "node02" to
> delete the transient-attributes for "node02". The transient-attributes
> should normally be:
>       <transient_attributes id="2">
>         <instance_attributes id="status-2">
>           <nvpair id="status-2-master-drbd_ourApp"
> name="master-drbd_ourApp" value="10000"/>
>           <nvpair id="status-2-pingd" name="pingd" value="100"/>
>         </instance_attributes>
>       </transient_attributes>
> 
> These attributes are necessary for "node02" to be Master/Primary, correct?
> 
> Why might this be happening and how do we prevent it?

Maybe show the actual state of your cluster; I frequently use "crm_mon -1Arfj" to see...
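
To check whether the master score is still present after such a deletion, a minimal sketch along the following lines might help. It assumes the node's node_state element has been saved as XML (for example from the CIB status section); the sample fragment is copied from the quoted message, and the function name master_scores is purely illustrative, not part of any Pacemaker tool.

```python
import xml.etree.ElementTree as ET

# Sample node_state fragment built from the XML in the quoted message.
# In practice this text would come from the cluster's CIB status section.
SAMPLE = """
<node_state id="2" uname="node02.example.com">
  <transient_attributes id="2">
    <instance_attributes id="status-2">
      <nvpair id="status-2-master-drbd_ourApp" name="master-drbd_ourApp" value="10000"/>
      <nvpair id="status-2-pingd" name="pingd" value="100"/>
    </instance_attributes>
  </transient_attributes>
</node_state>
"""


def master_scores(node_state_xml):
    """Return {name: value} for every 'master-*' transient attribute.

    Hypothetical helper: an empty dict means no master score is recorded,
    which would match the situation right after the cib_delete in the log.
    """
    root = ET.fromstring(node_state_xml)
    scores = {}
    for transient in root.iter("transient_attributes"):
        for nvpair in transient.iter("nvpair"):
            name = nvpair.get("name", "")
            if name.startswith("master-"):
                scores[name] = nvpair.get("value")
    return scores


if __name__ == "__main__":
    print(master_scores(SAMPLE))
```

An empty result for "node02" would be consistent with the "Promoted 0 instances" message in the log, though it does not by itself explain why "node01" issued the delete.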

Regards,
Ulrich




