[ClusterLabs] Antw: [EXT] Re: Help understanding recover of promotable resource after a "pcs cluster stop --all"

Andrei Borzenkov arvidjaar at gmail.com
Tue May 3 04:19:12 EDT 2022


On 03.05.2022 10:40, Ulrich Windl wrote:
> Hi!
> 
> I don't use DRBD, but I can imagine:
> If DRBD does asynchronous replication, it may make sense not to promote the
> slave to master after an interrupted connection (such as when the master
> died), as this will cause some data loss.
> Probably it only wants to switch roles when both nodes are online to avoid
> that type of data loss (the master may have some newer data it wants to
> transfer first).
> 

Yes. See below.

>>>>
>>>> # sudo crm_mon -1A
>>>> ...
>>>> Node Attributes:
>>>>   * Node: server2:
>>>>     * master-DRBDData                     : 10000
>>>
>>> In the scenario you described, only server1 is up. If there is no
>>> master score for server1, it cannot be master. It's up to the resource
>>> agent to set it. I'm not familiar enough with that agent to know why it
>>> might not.
>>>
>>
>> I can trivially reproduce it. When pacemaker is stopped with a slave DRBD
>> instance, the DRBD disk state is set to "outdated". When it comes back up,
>> it will not be selected for promotion. Setting the master score does not
>> work; it just results in a failed attempt to bring up the outdated replica.
>> When the former master comes up, its disk state is "consistent", so it is
>> selected for promotion, becomes primary, and is synchronized with the
>> secondary.
>>
>> The DRBD RA has an option to force the outdated state on stop, but this
>> option is off by default as far as I can tell.
>>
>> This is probably something in the DRBD configuration, but I am not familiar
>> with it at this deep a level. Manually forcing primary on an outdated
>> replica works and is reflected at the pacemaker level (the resource goes
>> into the promoted state).

Without any agent involved, running "drbdadm down" on a secondary instance
with an active connection to the primary marks it "outdated". That is
correct: from that point on we no longer know anything about the state of
the primary. Running "drbdadm down" on a single replica without active
connections leaves it in the "consistent" state.

When the DRBD connection is active, both replicas are in the "consistent"
state, and when the cluster nodes reboot after a crash, either one can assume
the master role.

I guess it is the same operational issue as with pacemaker itself: can we
shut down both sides of DRBD while leaving them in a consistent state? But
even if we can, it would not help at all, because pacemaker itself does not
provide any means to initiate such a cluster-wide shutdown.

OTOH it is not really a big problem. A full cluster reboot is a manual
action, so the administrator will need to manually activate the remaining
replica IF THE ADMINISTRATOR IS SURE IT IS UP TO DATE. Rebooting individual
nodes sequentially should be OK.
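
As a rough sketch of that manual activation (the resource names "r0" and
"DRBDData" are placeholders, and forcing primary on an outdated replica throws
away anything the dead peer may have written, so do it only when you are sure
the surviving data is current):

    # On the surviving node, and only if you are certain its data is up to date:
    drbdadm primary --force r0   # override the "outdated" flag
    crm_mon -1A                  # pacemaker should now report the DRBDData clone
                                 # as promoted here, with a master-DRBDData score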

>>
>>>>
>>>>
>>>>
>>>> Atenciosamente/Kind regards,
>>>> Salatiel
>>>>
>>>> On Mon, May 2, 2022 at 12:26 PM Ken Gaillot <kgaillot at redhat.com>
>>>> wrote:
>>>>> On Mon, 2022-05-02 at 09:58 -0300, Salatiel Filho wrote:
>>>>>> Hi, I am trying to understand the recovery process of a
>>>>>> promotable resource after "pcs cluster stop --all" and the
>>>>>> shutdown of both nodes.
>>>>>> I have a two-node + qdevice quorum setup with a DRBD resource.
>>>>>>
>>>>>> This is a summary of the resources before my test. Everything is
>>>>>> working just fine and server2 is the master of DRBD.
>>>>>>
>>>>>>  * fence-server1    (stonith:fence_vmware_rest):     Started server2
>>>>>>  * fence-server2    (stonith:fence_vmware_rest):     Started server1
>>>>>>  * Clone Set: DRBDData-clone [DRBDData] (promotable):
>>>>>>    * Masters: [ server2 ]
>>>>>>    * Slaves: [ server1 ]
>>>>>>  * Resource Group: nfs:
>>>>>>    * drbd_fs    (ocf::heartbeat:Filesystem):     Started server2
>>>>>>
>>>>>>
>>>>>>
>>>>>> Then I issue "pcs cluster stop --all". The cluster will be
>>>>>> stopped on both nodes as expected.
>>>>>> Now I restart server1 (previously the slave) and power off
>>>>>> server2 (previously the master). When server1 restarts it will
>>>>>> fence server2, and I can see that server2 is starting in vCenter,
>>>>>> but I just pressed a key in GRUB to make sure that server2 would
>>>>>> not restart; instead it would just be "paused" at the GRUB screen.
>>>>>>
>>>>>> SSH'ing to server1 and running pcs status I get:
>>>>>>
>>>>>> Cluster name: cluster1
>>>>>> Cluster Summary:
>>>>>>   * Stack: corosync
>>>>>>   * Current DC: server1 (version 2.1.0-8.el8-7c3f660707) - partition with quorum
>>>>>>   * Last updated: Mon May  2 09:52:03 2022
>>>>>>   * Last change:  Mon May  2 09:39:22 2022 by root via cibadmin on server1
>>>>>>   * 2 nodes configured
>>>>>>   * 11 resource instances configured
>>>>>>
>>>>>> Node List:
>>>>>>   * Online: [ server1 ]
>>>>>>   * OFFLINE: [ server2 ]
>>>>>>
>>>>>> Full List of Resources:
>>>>>>   * fence-server1    (stonith:fence_vmware_rest):     Stopped
>>>>>>   * fence-server2    (stonith:fence_vmware_rest):     Started server1
>>>>>>   * Clone Set: DRBDData-clone [DRBDData] (promotable):
>>>>>>     * Slaves: [ server1 ]
>>>>>>     * Stopped: [ server2 ]
>>>>>>   * Resource Group: nfs:
>>>>>>     * drbd_fs    (ocf::heartbeat:Filesystem):     Stopped
>>>>>>
>>>>>>
>>>>>> So I can see there is quorum, but server1 is never promoted to
>>>>>> DRBD master, so the remaining resources will stay stopped until
>>>>>> server2 is back.
>>>>>> 1) What do I need to do to force the promotion and recover
>>>>>> without restarting server2?
>>>>>> 2) Why, if instead of rebooting server1 and powering off server2
>>>>>> I reboot server2 and power off server1, can the cluster recover
>>>>>> by itself?
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>
>>>>> You shouldn't need to force promotion; that is the default behavior
>>>>> in that situation. There must be something else in the configuration
>>>>> that is preventing promotion.
>>>>>
>>>>> The DRBD resource agent should set a promotion score for the node.
>>>>> You can run "crm_mon -1A" to show all node attributes; there should
>>>>> be one like "master-DRBDData" for the active node.
>>>>>
>>>>> You can also show the constraints in the cluster to see if there is
>>>>> anything relevant to the master role.