[ClusterLabs] Antw: Re: Antw: [EXT] Re: Peer (slave) node deleting master's transient_attributes

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Tue Feb 9 09:08:51 EST 2021


>>> Stuart Massey <djangoschef at gmail.com> wrote on 09.02.2021 at 14:07 in
message
<CABQ68NSgYgHdozo+1soYZ81-yyzpSbpcCHue5zwNqnFo1Vmu0g at mail.gmail.com>:
> Ulrich,
> Thank you for clarifying what single-node maintenance mode entails. It is
> surprising to learn that, though resource actions do not happen, a node
> would send a cib update that deletes the transient attributes that are
> maintained by resource monitoring activities. I think "maintenance mode"
> is a distraction at this point: You are implying that this problem is
> unrelated to having one node in maintenance mode, and that matches our
> experience.
> I don't want to do anything with the cluster with one node in maintenance
> mode. I just want the currently healthy node, which holds and retains the DC
> role and is running services, not to get into this odd state where it shuts
> down services and does not restart them. I want to prevent that from happening
> again,
> as similar problems and unexplained fail-overs or service restarts seem to
> happen occasionally, whether anything is in maintenance mode or not.

Hi!

It sounds a bit like an "XY problem": you are suffering from X and think Y is the way to prevent X. But since Y does not prevent X, you end up asking what is wrong with Y, when the real question is "What is wrong with X?"

Usually you can find a lot in the logs (that is, about X). "crm_mon -1Arfj" also gives hints under "Failed Actions".
You should investigate those. If a failure was a one-time fault, consider cleaning up the error state with "crm_resource -C ...". A failed resource also increments a "fail count" that you may want to reset, either manually or automatically (for example after a day), so that subsequent failures are treated differently.
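
A minimal sketch of that workflow (the resource and node names, ms_drbd_ourApp and node02.example.com, are taken from this thread and are only placeholders; exact option spelling may differ between Pacemaker versions):

    # One-shot cluster status including node attributes, fail counts and failed actions
    crm_mon -1Arfj

    # Clear the failure history and fail count for one resource on one node;
    # this also makes the cluster re-probe that resource there
    crm_resource --cleanup --resource ms_drbd_ourApp --node node02.example.com

    # Optionally let the fail count expire automatically (here after a day)
    # by setting the failure-timeout meta attribute on the resource
    crm_resource --resource ms_drbd_ourApp --meta --set-parameter failure-timeout --parameter-value 24h
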
Once again: Get a clear image of your "X", then decide what Y should be ;-)

Regards,
Ulrich


> Regards,
> Stuart
> 
> On Tue, Feb 9, 2021 at 2:34 AM Ulrich Windl <
> Ulrich.Windl at rz.uni-regensburg.de> wrote:
> 
>> Hi!
>>
>> Maybe you just misunderstand what maintenance mode for a single node means:
>> CIB updates will still be performed, but not the resource actions. If CIB
>> updates are sent to another node, that node will perform actions.
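
For reference, single-node maintenance is normally driven by the special "maintenance" node attribute rather than the cluster-wide maintenance-mode property. A hedged sketch, using the node name from this thread (higher-level tools such as pcs or crmsh offer equivalent commands; syntax may vary by version):

    # Put node01 into maintenance mode (resource actions are suppressed on that
    # node only, while CIB replication continues), then take it out again
    crm_attribute --node node01.example.com --name maintenance --update true
    crm_attribute --node node01.example.com --name maintenance --update false
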
>>
>> Maybe just explain what you really want to do with one node in maintenance
>> mode.
>> Don't expect the cluster to behave normally in maintenance mode...
>>
>> Regards,
>> Ulrich
>>
>> >>> Stuart Massey <djangoschef at gmail.com> wrote on 08.02.2021 at 18:01 in
>> message
>> <CABQ68NRhQ+h+CgSBC-rAnwCd0zHUnA7XnayZLMFpYaNvi6gKKw at mail.gmail.com>:
>> > I'm wondering if anyone can advise us on next steps here and/or correct
>> our
>> > understanding. This seems like a race condition that causes resources to
>> be
>> > stopped unnecessarily. Is there a way to prevent a node from processing
>> cib
>> > updates from a peer while DC negotiations are underway? Our "node02" is
>> > running resources fine, and since it winds up winning the DC election,
>> > would continue to run them uninterrupted if it simply ignored or deferred the
>> > cib updates it receives in the middle of the negotiation.
>> > Very much appreciate all the help and discussion available on this board.
>> > Regards,
>> > Stuart
>> >
>> > On Mon, Feb 1, 2021 at 11:43 AM Stuart Massey <djangoschef at gmail.com>
>> wrote:
>> >
>> >> Sequence seems to be:
>> >>
>> >>    - node02 is DC and master/primary, node01 is maintenance mode and
>> >>    slave/secondary
>> >>    - comms go down
>> >>    - node01 elects itself master and deletes node02's status from its cib
>> >>    - comms come up
>> >>    - cluster starts reforming
>> >>    - node01 sends cib updates to node02
>> >>    - DC negotiations start, both nodes unset DC
>> >>    - node02 receives the cib updates and processes them, deleting its own
>> status
>> >>    - DC negotiations complete with node02 winning
>> >>    - node02, having lost its status, believes it cannot host resources
>> >>    and stops them all
>> >>    - for whatever reason, perhaps due to the completely missing
>> >>    transient_attributes, node02 never schedules a probe for itself
>> >>    - we have to "refresh" manually
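
The manual "refresh" in the last step above would typically look something like the following (names taken from this thread; option spelling differs slightly between Pacemaker 1.1 and 2.x, so treat this as a sketch):

    # Re-probe resources on node02 so Pacemaker rediscovers their actual state
    crm_resource --refresh --node node02.example.com

    # Or clean up a specific resource, which also forces a re-probe of it
    crm_resource --cleanup --resource ms_drbd_ourApp --node node02.example.com
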
>> >>
>> >>
>> >> On Mon, Feb 1, 2021 at 11:31 AM Ken Gaillot <kgaillot at redhat.com>
>> wrote:
>> >>
>> >>> On Mon, 2021-02-01 at 11:09 -0500, Stuart Massey wrote:
>> >>> > Hi Ken,
>> >>> > Thanks. In this case, transient_attributes for node02 in the cib on
>> >>> > node02 which never lost quorum seem to be deleted by a request from
>> >>> > node01 when node01 rejoins the cluster - IF I understand the
>> >>> > pacemaker.log correctly. This causes node02 to stop resources, which
>> >>> > will not be restarted until we manually refresh on node02.
>> >>>
>> >>> Good point, it depends on which node is DC. When a cluster splits, each
>> >>> side sees the other side as the one that left. When the split heals,
>> >>> whichever side has the newly elected DC is the one that clears the
>> >>> other.
>> >>>
>> >>> However the DC should schedule probes for the other side, and probes
>> >>> generally set the promotion score, so manual intervention shouldn't be
>> >>> needed. I'd make sure that probes were scheduled, then investigate how
>> >>> the agent sets the score.
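
As an illustration of that last point: a promotable (master/slave) OCF agent typically publishes its promotion score from its monitor action via crm_master, which stores it as a transient node attribute such as the master-drbd_ourApp entry shown later in this thread. A hedged sketch of the common pattern, not necessarily what the agent in use here does:

    # Inside the agent's monitor action:
    crm_master -Q -l reboot -v 10000    # node is eligible for promotion
    crm_master -Q -l reboot -D          # node is not eligible: remove the score
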
>> >>>
>> >>> > On Mon, Feb 1, 2021 at 10:59 AM Ken Gaillot <kgaillot at redhat.com>
>> >>> > wrote:
>> >>> > > On Fri, 2021-01-29 at 12:37 -0500, Stuart Massey wrote:
>> >>> > > > Can someone help me with this?
>> >>> > > > Background:
>> >>> > > > > "node01" is failing, and has been placed in "maintenance" mode.
>> >>> > > It
>> >>> > > > > occasionally loses connectivity.
>> >>> > > > > "node02" is able to run our resources
>> >>> > > >
>> >>> > > > Consider the following messages from pacemaker.log on "node02",
>> >>> > > just
>> >>> > > > after "node01" has rejoined the cluster (per "node02"):
>> >>> > > > > Jan 28 14:48:03 [21933] node02.example.com        cib:     info: cib_perform_op:       -- /cib/status/node_state[@id='2']/transient_attributes[@id='2']
>> >>> > > > > Jan 28 14:48:03 [21933] node02.example.com        cib:     info: cib_perform_op:       +  /cib:  @num_updates=309
>> >>> > > > > Jan 28 14:48:03 [21933] node02.example.com        cib:     info: cib_process_request:  Completed cib_delete operation for section //node_state[@uname='node02.example.com']/transient_attributes: OK (rc=0, origin=node01.example.com/crmd/3784, version=0.94.309)
>> >>> > > > > Jan 28 14:48:04 [21938] node02.example.com       crmd:     info: abort_transition_graph:       Transition aborted by deletion of transient_attributes[@id='2']: Transient attribute change | cib=0.94.309 source=abort_unless_down:357 path=/cib/status/node_state[@id='2']/transient_attributes[@id='2'] complete=true
>> >>> > > > > Jan 28 14:48:05 [21937] node02.example.com    pengine:     info: master_color: ms_drbd_ourApp: Promoted 0 instances of a possible 1 to master
>> >>> > > > >
>> >>> > > > The implication, it seems to me, is that "node01" has asked
>> >>> > > "node02"
>> >>> > > > to delete the transient-attributes for "node02". The transient-
>> >>> > > > attributes should normally be:
>> >>> > > >       <transient_attributes id="2">
>> >>> > > >         <instance_attributes id="status-2">
>> >>> > > >           <nvpair id="status-2-master-drbd_ourApp" name="master-drbd_ourApp" value="10000"/>
>> >>> > > >           <nvpair id="status-2-pingd" name="pingd" value="100"/>
>> >>> > > >         </instance_attributes>
>> >>> > > >       </transient_attributes>
>> >>> > > >
>> >>> > > > These attributes are necessary for "node02" to be Master/Primary,
>> >>> > > > correct?
>> >>> > > >
>> >>> > > > Why might this be happening and how do we prevent it?
>> >>> > >
>> >>> > > Transient attributes are always cleared when a node leaves the
>> >>> > > cluster
>> >>> > > (that's what makes them transient ...). It's probably coincidence
>> >>> > > it
>> >>> > > went through as the node rejoined.
>> >>> > >
>> >>> > > When the node rejoins, it will trigger another run of the
>> >>> > > scheduler,
>> >>> > > which will schedule a probe of all resources on the node. Those
>> >>> > > probes
>> >>> > > should reset the promotion score.
>> >>> --
>> >>> Ken Gaillot <kgaillot at redhat.com>
>> >>>
>> >>>
>> >>
>>
>>
>>
>>





More information about the Users mailing list