[ClusterLabs] DRBD 2-node M/S doesn't want to promote new master, Centos 8
jeneral9 at gmail.com
Sun Jan 17 13:37:49 EST 2021
I have already verified DRBD works. As always, I use the OCF resource
agent to manage DRBD; the drbd service is never started at boot. I added
auto-promote to see if that made a difference (it didn't). So I'm still
at a loss as to why it won't promote. It behaves as if DRBD gets
unloaded on the master node before it is demoted, which prevents the
secondary from promoting.
For example (I'm doing this manually, without the cluster running):
On the master node (as primary), I stop DRBD (normally you'd demote it first).
On the secondary node (as secondary):
drbdadm primary all
crm-fence-peer.9.sh: (ipc_post_disconnect) #011info:
Disconnected from controller IPC API
crm-fence-peer.9.sh: (pcmk_free_ipc_api) #011debug: Releasing
controller IPC API
crm-fence-peer.9.sh: (crm_xml_cleanup) #011info: Cleaning up
memory from libxml2
crm-fence-peer.9.sh: (crm_exit) #011info: Exiting crm_node
with status 0
crm-fence-peer.9.sh: Could not connect to the CIB: No such
device or address
crm-fence-peer.9.sh: Init failed, could not perform requested
crm-fence-peer.9.sh: WARNING DATA INTEGRITY at RISK: could not
place the fencing constraint!
kernel: drbd r0 nfs6: helper command: /sbin/drbdadm fence-peer exit code
kernel: drbd r0 nfs6: fence-peer helper broken, returned 1
All of this is EXPECTED BEHAVIOR: with DRBD unloaded PRIOR to demoting
on the primary node, the other node CANNOT promote to primary. This is
the same thing I'm experiencing when running in the cluster. It looks
like DRBD is being unloaded PRIOR to the master being demoted. WHY???
I'm pretty sure I'm using a basic configuration. It seems as though
there's a bug somewhere.
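For comparison, a manual failover that does not trip the fence-peer
handler has to demote before taking the resource down. A minimal sketch
(the resource name r0 comes from the logs above; exact status output
depends on your DRBD 9 setup):

```shell
# On the current primary: demote FIRST so the peer can be outdated
# cleanly, then take the resource down.
drbdadm secondary r0
drbdadm down r0

# On the former secondary: promotion should now succeed, because the
# peer was demoted before its DRBD module disappeared.
drbdadm primary r0
drbdadm status r0
```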
Note my versions:
On 1/16/2021 11:07 AM, Strahil Nikolov wrote:
> At 14:10 -0700 on 15.01.2021 (Fri), Brent Jensen wrote:
>> Problem: When performing "pcs node standby" on the current master,
>> this node demotes fine but the slave doesn't promote to master. It
>> keeps looping the same error including "Refusing to be Primary while
>> peer is not outdated" and "Could not connect to the CIB." At this
>> point the old master has already unloaded drbd. The only way to fix
>> it is to start drbd on the standby node (e.g. drbdadm up r0). Logs
>> contained herein are from the node trying to be master.
> In order to debug, stop the cluster and verify that drbd is running
> properly. Promote one of the nodes, then demote and promote another one...
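Strahil's suggested check could look like the following sketch (assuming
the DRBD resource is named r0 as elsewhere in the thread, and that both
nodes are reachable):

```shell
# Stop the cluster everywhere so Pacemaker is out of the picture.
pcs cluster stop --all

# Bring DRBD up by hand on both nodes and confirm it connects.
drbdadm up r0            # run on each node
drbdadm status r0        # expect Connected / UpToDate on both sides

# Promote one node, then swap roles to prove failover works
# outside of Pacemaker.
drbdadm primary r0       # on node A
drbdadm secondary r0     # on node A again, to demote
drbdadm primary r0       # on node B
```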
>> I have done this on DRBD9/CentOS7/Pacemaker1 without any problems, so
>> I don't know where the issue is (crm-fence-peer.9.sh?).
>> Another odd data point: On the slave if I do a "pcs node standby" &
>> then unstandby, DRBD is loaded again; HOWEVER, when I do this on the
>> master (which should then be slave), DRBD doesn't get loaded.
>> Stonith/Fencing doesn't seem to make a difference. Not sure if
>> auto-promote is required.
> Quote from the official documentation:
> If you are employing the DRBD OCF resource agent, it is recommended
> that you defer DRBD startup, shutdown, promotion, and demotion
> /exclusively/ to the OCF resource agent. That means that you should
> disable the DRBD init script:
> So remove auto-promote and disable the drbd service entirely.
> Best Regards, Strahil Nikolov
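Concretely, "defer exclusively to the OCF resource agent" means the drbd
service must never start at boot, and the resource is managed as a
promotable clone. A sketch in pcs 0.10 (CentOS 8) syntax; the resource
name drbd-r0 and the monitor intervals are illustrative, not from the
thread:

```shell
# Make sure DRBD is never started outside Pacemaker's control.
systemctl disable --now drbd

# Let the ocf:linbit:drbd agent own start/stop/promote/demote.
pcs resource create drbd-r0 ocf:linbit:drbd drbd_resource=r0 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave \
    promotable promoted-max=1 promoted-node-max=1 \
    clone-max=2 clone-node-max=1 notify=true
```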