[ClusterLabs] No DRBD resource promoted to master in Active/Passive setup

Ken Gaillot kgaillot at redhat.com
Tue Sep 20 16:00:23 UTC 2016


On 09/20/2016 07:15 AM, Auer, Jens wrote:
> Hi,
> 
> I did some more tests after updating DRBD to the latest version. The behavior does not change, but I found out that
> - everything works fine when I physically unplug the network cables instead of ifdown'ing the device

BTW that's a more accurate simulation of a network failure.

> - I can see in the log files that the device gets promoted after stopping the initial master node, but then gets immediately demoted. I don't understand why this happens:
> Sep 20 12:08:03 MDA1PFP-S02 crmd[2354]:  notice: Operation ACTIVE_start_0: ok (node=MDA1PFP-PCS02, call=29, rc=0, cib-update=21, confirmed=true)
> Sep 20 12:08:03 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=28, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: peer( Primary -> Secondary ) 
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: Adding inet address 192.168.120.20/32 with broadcast address 192.168.120.255 to device bond0
> Sep 20 12:08:04 MDA1PFP-S02 avahi-daemon[1084]: Registering new address record for 192.168.120.20 on bond0.IPv4.
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: Bringing device bond0 up
> Sep 20 12:08:04 MDA1PFP-S02 IPaddr2(mda-ip)[3528]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.120.20 bond0 192.168.120.20 auto not_used not_used
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation mda-ip_start_0: ok (node=MDA1PFP-PCS02, call=31, rc=0, cib-update=23, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=32, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=34, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: peer( Secondary -> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: ack_receiver terminated
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Terminating drbd_a_shared_f
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Connection closed
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: conn( TearDown -> Unconnected ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: receiver terminated
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: Restarting receiver thread
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: receiver (re)started
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: conn( Unconnected -> WFConnection ) 
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=35, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=36, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: helper command: /sbin/drbdadm fence-peer shared_fs
> Sep 20 12:08:04 MDA1PFP-S02 crm-fence-peer.sh[3779]: invoked for shared_fs
> Sep 20 12:08:04 MDA1PFP-S02 crm-fence-peer.sh[3779]: INFO peer is not reachable, my disk is UpToDate: placed constraint 'drbd-fence-by-handler-shared_fs-drbd1_sync'
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: helper command: /sbin/drbdadm fence-peer shared_fs exit code 5 (0x500)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: fence-peer helper returned 5 (peer is unreachable, assumed to be dead)
> Sep 20 12:08:04 MDA1PFP-S02 kernel: drbd shared_fs: pdsk( DUnknown -> Outdated ) 
> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: role( Secondary -> Primary ) 

From these logs, I don't see any request by Pacemaker for DRBD to be
promoted, so I'm wondering if DRBD decided to promote itself here.
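
One quick way to check (just a suggestion): when Pacemaker schedules a
promote, crmd normally logs an "Initiating action ...: promote
drbd1_promote_0 ..." line before the corresponding "Operation
drbd1_promote_0: ok" result, just like the "Initiating action 18: demote
drbd1_demote_0" line further down in your log. Something like

  grep 'Initiating action' /var/log/messages | grep -i promote

on that node should show whether such a request was ever made.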

> Sep 20 12:08:04 MDA1PFP-S02 kernel: block drbd1: new current UUID 098EF9936C4F4D27:5157BB476E60F5AA:6BC19D97CF96E5D2:6BC09D97CF96E5D2
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:   error: pcmkRegisterNode: Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_promote_0: ok (node=MDA1PFP-PCS02, call=37, rc=0, cib-update=25, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=38, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Our peer on the DC (MDA1PFP-PCS01) is dead

Here, Pacemaker lost corosync connectivity to its peer. Isn't corosync
traffic on a separate interface? Or is this a different test than before?

> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_CRMD_STATUS_CALLBACK origin=peer_update_callback ]
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED origin=election_timeout_popped ]
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: crm_update_peer_proc: Node MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Removing all MDA1PFP-PCS01 attributes for attrd_peer_change_cb
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Lost attribute writer MDA1PFP-PCS01
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Removing MDA1PFP-PCS01/1 from the membership list
> Sep 20 12:08:04 MDA1PFP-S02 attrd[2351]:  notice: Purged 1 peers with id=1 and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: crm_update_peer_proc: Node MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: Removing MDA1PFP-PCS01/1 from the membership list
> Sep 20 12:08:04 MDA1PFP-S02 stonith-ng[2349]:  notice: Purged 1 peers with id=1 and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: crm_update_peer_proc: Node MDA1PFP-PCS01[1] - state is now lost (was member)
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: Removing MDA1PFP-PCS01/1 from the membership list
> Sep 20 12:08:04 MDA1PFP-S02 cib[2348]:  notice: Purged 1 peers with id=1 and/or uname=MDA1PFP-PCS01 from the membership cache
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]: warning: FSA: Input I_ELECTION_DC from do_election_check() received in state S_INTEGRATION
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Notifications disabled
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:   error: pcmkRegisterNode: Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: On loss of CCM Quorum: Ignore
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: Demote  drbd1:0	(Master -> Slave MDA1PFP-PCS02)
> Sep 20 12:08:04 MDA1PFP-S02 pengine[2353]:  notice: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-1813.bz2
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Initiating action 55: notify drbd1_pre_notify_demote_0 on MDA1PFP-PCS02 (local)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Operation drbd1_notify_0: ok (node=MDA1PFP-PCS02, call=39, rc=0, cib-update=0, confirmed=true)
> Sep 20 12:08:04 MDA1PFP-S02 crmd[2354]:  notice: Initiating action 18: demote drbd1_demote_0 on MDA1PFP-PCS02 (local)

The demote is requested by Pacemaker.

You can get more info from the pe-input-1813.bz2 file referenced above,
e.g. "crm_simulate -Ssx /var/lib/pacemaker/pengine/pe-input-1813.bz2"
should show the scores and planned actions. It's not the easiest to read
but it has some good info.
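
Also worth checking (a rough suggestion; the constraint id below is the one
shown in your logs, so verify it against your CIB first): whether the
constraint placed by crm-fence-peer.sh is still present after the test.
It is normally only removed by the after-resync-target handler
(crm-unfence-peer.sh) once a resync has completed, which would also explain
why the behavior is "sticky" across cluster restarts. For example:

  pcs constraint --full | grep drbd-fence-by-handler

and, once you are certain the surviving node's data is up to date, it can
be cleared manually with

  pcs constraint remove drbd-fence-by-handler-shared_fs-drbd1_sync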

> 
> Best wishes,
>   Jens
> 
> --
> Jens Auer | CGI | Software-Engineer
> CGI (Germany) GmbH & Co. KG
> Rheinstraße 95 | 64295 Darmstadt | Germany
> T: +49 6151 36860 154
> jens.auer at cgi.com
> Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be found at de.cgi.com/pflichtangaben.
> 
> CONFIDENTIALITY NOTICE: Proprietary/Confidential information belonging to CGI Group Inc. and its affiliates may be contained in this message. If you are not a recipient indicated or intended in this message (or responsible for delivery of this message to such person), or you think for any reason that this message may have been addressed to you in error, you may not use or copy or deliver this message to anyone else. In such case, you should destroy this message and are asked to notify the sender by reply e-mail.
> 
> ________________________________________
> From: Ken Gaillot [kgaillot at redhat.com]
> Sent: Monday, 19 September 2016 17:27
> To: Auer, Jens; Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: AW: [ClusterLabs] No DRBD resource promoted to master in Active/Passive setup
> 
> On 09/19/2016 09:48 AM, Auer, Jens wrote:
>> Hi,
>>
>>> Is the network interface being taken down here used for corosync
>>> communication? If so, that is a node-level failure, and pacemaker will
>>> fence.
>>
>> We have different connections on each server:
>> - A bonded 10GB network card for data traffic that is accessed via a virtual IP managed by pacemaker in 192.168.120.1/24. In the cluster, nodes MDA1PFP-S01 and MDA1PFP-S02 are assigned 192.168.120.10 and 192.168.120.11.
>>
>> - A dedicated back-to-back connection for corosync heartbeats in 192.168.121.1/24. MDA1PFP-PCS01 and MDA1PFP-PCS02 are assigned 192.168.121.10 and 192.168.121.11. When the cluster is created, we use these as primary node names and use the 10GB device as a second backup connection for increased reliability: pcs cluster setup --name MDA1PFP MDA1PFP-PCS01,MDA1PFP-S01 MDA1PFP-PCS02,MDA1PFP-S02
>>
>> - A dedicated back-to-back connection for drbd in 192.168.123.0/24. Hosts MDA1PFP-DRBD01 and MDA1PFP-DRBD02 are assigned 192.168.123.10 and 192.168.123.11.
> 
> Ah, nice.
> 
>> Given that, I think it is not a node-level failure. pcs status also reports the nodes as online, so this should not trigger fencing from pacemaker.
>>
>>> When DRBD is configured with 'fencing resource-only' and 'fence-peer
>>> "/usr/lib/drbd/crm-fence-peer.sh";', and DRBD detects a network outage,
>>> it will try to add a constraint that prevents the other node from
>>> becoming master. It removes the constraint when connectivity is restored.
>>
>>> I am not familiar with all the under-the-hood details, but IIUC, if
>>> pacemaker actually fences the node, then the other node can still take
>>> over the DRBD. But if there is a network outage and no pacemaker
>>> fencing, then you'll see the behavior you describe -- DRBD prevents
>>> master takeover, to avoid stale data being used.
>>
>> This is my understanding as well, but there should be no network outage for DRBD. I can reproduce the behavior by stopping cluster nodes, which DRBD seems to interpret as a network outage since it can no longer communicate with the stopped node. Maybe I should ask on the DRBD mailing list?
> 
> OK, I think I follow you now: you're ifdown'ing the data traffic
> interface, but the interfaces for both corosync and DRBD traffic are
> still up. So, pacemaker detects the virtual IP failure on the traffic
> interface, and correctly recovers the IP on the other node, but the DRBD
> master role is not recovered.
> 
> If the behavior goes away when you remove the DRBD fencing config, then
> it sounds like DRBD is seeing it as a network outage, and is adding the
> constraint to prevent a stale master. Yes, I think that would be worth
> bringing up on the DRBD list, though there might be some DRBD users here
> who can chime in, too.
> 
>> Cheers,
>>   Jens
>>
>> ________________________________________
>> From: Ken Gaillot [kgaillot at redhat.com]
>> Sent: Monday, 19 September 2016 16:28
>> To: Auer, Jens; Cluster Labs - All topics related to open-source clustering welcomed
>> Subject: Re: [ClusterLabs] No DRBD resource promoted to master in Active/Passive setup
>>
>> On 09/19/2016 02:31 AM, Auer, Jens wrote:
>>> Hi,
>>>
>>> I am not sure that pacemaker should do any fencing here. In my setting, corosync is configured to use a back-to-back connection for heartbeats. This is a different subnet than the one used by the ping resource that checks network connectivity and detects a failure. In my test, I bring down the network device used by ping and this triggers the failover. The node status is known by pacemaker since it still receives heartbeats, so it is only a resource failure. I asked about fencing conditions a few days ago, and was basically assured that a resource failure should not trigger STONITH actions unless explicitly configured.
>>
>> Is the network interface being taken down here used for corosync
>> communication? If so, that is a node-level failure, and pacemaker will
>> fence.
>>
>> There is a bit of a distinction between DRBD fencing and pacemaker
>> fencing. The DRBD configuration is designed so that DRBD's fencing
>> method is to go through pacemaker.
>>
>> When DRBD is configured with 'fencing resource-only' and 'fence-peer
>> "/usr/lib/drbd/crm-fence-peer.sh";', and DRBD detects a network outage,
>> it will try to add a constraint that prevents the other node from
>> becoming master. It removes the constraint when connectivity is restored.
>>
>> I am not familiar with all the under-the-hood details, but IIUC, if
>> pacemaker actually fences the node, then the other node can still take
>> over the DRBD. But if there is a network outage and no pacemaker
>> fencing, then you'll see the behavior you describe -- DRBD prevents
>> master takeover, to avoid stale data being used.
>>
>>
>>> I am also wondering why this is "sticky". After a failover test the DRBD resources are not working even after I restart the cluster on all nodes.
>>>
>>> Best wishes,
>>>   Jens
>>>
>>>
>>>> -----Original Message-----
>>>> From: Ken Gaillot [mailto:kgaillot at redhat.com]
>>>> Sent: 16 September 2016 17:56
>>>> To: users at clusterlabs.org
>>>> Subject: Re: [ClusterLabs] No DRBD resource promoted to master in Active/Passive
>>>> setup
>>>>
>>>> On 09/16/2016 10:02 AM, Auer, Jens wrote:
>>>>> Hi,
>>>>>
>>>>> I have an Active/Passive configuration with a drbd mast/slave resource:
>>>>>
>>>>> MDA1PFP-S01 14:40:27 1803 0 ~ # pcs status Cluster name: MDA1PFP
>>>>> Last updated: Fri Sep 16 14:41:18 2016        Last change: Fri Sep 16
>>>>> 14:39:49 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS02 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>
>>>>> Full list of resources:
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Masters: [ MDA1PFP-PCS02 ]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>>  mda-ip    (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS02
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>  ACTIVE    (ocf::heartbeat:Dummy):    Started MDA1PFP-PCS02
>>>>>  shared_fs    (ocf::heartbeat:Filesystem):    Started MDA1PFP-PCS02
>>>>>
>>>>> PCSD Status:
>>>>>   MDA1PFP-PCS01: Online
>>>>>   MDA1PFP-PCS02: Online
>>>>>
>>>>> Daemon Status:
>>>>>   corosync: active/disabled
>>>>>   pacemaker: active/disabled
>>>>>   pcsd: active/enabled
>>>>>
>>>>> MDA1PFP-S01 14:41:19 1804 0 ~ # pcs resource --full
>>>>>  Master: drbd1_sync
>>>>>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>>> clone-node-max=1 notify=true
>>>>>   Resource: drbd1 (class=ocf provider=linbit type=drbd)
>>>>>    Attributes: drbd_resource=shared_fs
>>>>>    Operations: start interval=0s timeout=240 (drbd1-start-interval-0s)
>>>>>                promote interval=0s timeout=90 (drbd1-promote-interval-0s)
>>>>>                demote interval=0s timeout=90 (drbd1-demote-interval-0s)
>>>>>                stop interval=0s timeout=100 (drbd1-stop-interval-0s)
>>>>>                monitor interval=60s (drbd1-monitor-interval-60s)
>>>>>  Resource: mda-ip (class=ocf provider=heartbeat type=IPaddr2)
>>>>>   Attributes: ip=192.168.120.20 cidr_netmask=32 nic=bond0
>>>>>   Operations: start interval=0s timeout=20s (mda-ip-start-interval-0s)
>>>>>               stop interval=0s timeout=20s (mda-ip-stop-interval-0s)
>>>>>               monitor interval=1s (mda-ip-monitor-interval-1s)
>>>>>  Clone: ping-clone
>>>>>   Resource: ping (class=ocf provider=pacemaker type=ping)
>>>>>    Attributes: dampen=5s multiplier=1000 host_list=pf-pep-dev-1
>>>>> timeout=1 attempts=3
>>>>>    Operations: start interval=0s timeout=60 (ping-start-interval-0s)
>>>>>                stop interval=0s timeout=20 (ping-stop-interval-0s)
>>>>>                monitor interval=1 (ping-monitor-interval-1)
>>>>>  Resource: ACTIVE (class=ocf provider=heartbeat type=Dummy)
>>>>>   Operations: start interval=0s timeout=20 (ACTIVE-start-interval-0s)
>>>>>               stop interval=0s timeout=20 (ACTIVE-stop-interval-0s)
>>>>>               monitor interval=10 timeout=20
>>>>> (ACTIVE-monitor-interval-10)
>>>>>  Resource: shared_fs (class=ocf provider=heartbeat type=Filesystem)
>>>>>   Attributes: device=/dev/drbd1 directory=/shared_fs fstype=xfs
>>>>>   Operations: start interval=0s timeout=60 (shared_fs-start-interval-0s)
>>>>>               stop interval=0s timeout=60 (shared_fs-stop-interval-0s)
>>>>>               monitor interval=20 timeout=40
>>>>> (shared_fs-monitor-interval-20)
>>>>>
>>>>> MDA1PFP-S01 14:41:35 1805 0 ~ # pcs constraint --full Location
>>>>> Constraints:
>>>>>   Resource: mda-ip
>>>>>     Enabled on: MDA1PFP-PCS01 (score:50)
>>>>> (id:location-mda-ip-MDA1PFP-PCS01-50)
>>>>>     Constraint: location-mda-ip
>>>>>       Rule: score=-INFINITY boolean-op=or  (id:location-mda-ip-rule)
>>>>>         Expression: pingd lt 1  (id:location-mda-ip-rule-expr)
>>>>>         Expression: not_defined pingd
>>>>> (id:location-mda-ip-rule-expr-1) Ordering Constraints:
>>>>>   start ping-clone then start mda-ip (kind:Optional)
>>>>> (id:order-ping-clone-mda-ip-Optional)
>>>>>   promote drbd1_sync then start shared_fs (kind:Mandatory)
>>>>> (id:order-drbd1_sync-shared_fs-mandatory)
>>>>> Colocation Constraints:
>>>>>   ACTIVE with mda-ip (score:INFINITY) (id:colocation-ACTIVE-mda-ip-INFINITY)
>>>>>   drbd1_sync with mda-ip (score:INFINITY) (rsc-role:Master)
>>>>> (with-rsc-role:Started) (id:colocation-drbd1_sync-mda-ip-INFINITY)
>>>>>   shared_fs with drbd1_sync (score:INFINITY) (rsc-role:Started)
>>>>> (with-rsc-role:Master) (id:colocation-shared_fs-drbd1_sync-INFINITY)
>>>>>
>>>>> The cluster starts fine, except that resources do not start on the
>>>>> preferred host. I asked about this in a separate question to keep things separated.
>>>>> The status after starting is:
>>>>> Last updated: Fri Sep 16 14:39:57 2016          Last change: Fri Sep 16
>>>>> 14:39:49 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS02 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ]
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Masters: [ MDA1PFP-PCS02 ]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>> mda-ip  (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS02
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 MDA1PFP-PCS02 ] ACTIVE
>>>>> (ocf::heartbeat:Dummy): Started MDA1PFP-PCS02
>>>>> shared_fs    (ocf::heartbeat:Filesystem):    Started MDA1PFP-PCS02
>>>>>
>>>>> From this state, I did two tests to simulate a cluster failover:
>>>>> 1. Shutdown the cluster node with the master with pcs cluster stop 2.
>>>>> Disable the network device for the virtual ip with ifdown and wait
>>>>> until ping detects it
>>>>>
>>>>> In both cases, the failover is executed but the drbd is not promoted
>>>>> to master on the new active node:
>>>>> Last updated: Fri Sep 16 14:43:33 2016          Last change: Fri Sep 16
>>>>> 14:43:31 2016 by root via cibadmin on MDA1PFP-PCS01
>>>>> Stack: corosync
>>>>> Current DC: MDA1PFP-PCS01 (version 1.1.13-10.el7-44eb2dd) - partition
>>>>> with quorum
>>>>> 2 nodes and 7 resources configured
>>>>>
>>>>> Online: [ MDA1PFP-PCS01 ]
>>>>> OFFLINE: [ MDA1PFP-PCS02 ]
>>>>>
>>>>>  Master/Slave Set: drbd1_sync [drbd1]
>>>>>      Slaves: [ MDA1PFP-PCS01 ]
>>>>> mda-ip  (ocf::heartbeat:IPaddr2):    Started MDA1PFP-PCS01
>>>>>  Clone Set: ping-clone [ping]
>>>>>      Started: [ MDA1PFP-PCS01 ]
>>>>> ACTIVE  (ocf::heartbeat:Dummy): Started MDA1PFP-PCS01
>>>>>
>>>>> I was able to trace this to the fencing in the drbd configuration
>>>>> MDA1PFP-S01 14:41:44 1806 0 ~ # cat /etc/drbd.d/shared_fs.res resource
>>>>> shared_fs {
>>>>> disk    /dev/mapper/rhel_mdaf--pf--pep--1-drbd;
>>>>>   disk {
>>>>>     fencing resource-only;
>>>>>   }
>>>>>   handlers {
>>>>>     fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>>>>     after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>>>>   }
>>>>>     device    /dev/drbd1;
>>>>>     meta-disk internal;
>>>>>     on MDA1PFP-S01 {
>>>>>         address 192.168.123.10:7789;
>>>>>     }
>>>>>     on MDA1PFP-S02 {
>>>>>         address 192.168.123.11:7789;
>>>>>     }
>>>>> }
>>>>
>>>> This coordinates fencing between DRBD and pacemaker. You still have to configure
>>>> fencing in pacemaker. If pacemaker can't fence the unseen node, it can't be sure it's
>>>> safe to bring up master.
>>>>
>>>>> I am using drbd 8.4.7, drbd-utils 8.9.5, pacemaker 1.1.13-10.el7, corosync
>>>>> 2.3.4-7.el7 and pcs 0.9.143-15.el7 from the CentOS 7 repositories.
>>>>>
>>>>> MDA1PFP-S01 15:00:20 1841 0 ~ # drbdadm --version
>>>>> DRBDADM_BUILDTAG=GIT-hash:\ 5d50d9fb2a967d21c0f5746370ccc066d3a67f7d\ build\ by\ mockbuild@\,\ 2016-01-12\ 12:46:45
>>>>> DRBDADM_API_VERSION=1
>>>>> DRBD_KERNEL_VERSION_CODE=0x080407
>>>>> DRBDADM_VERSION_CODE=0x080905
>>>>> DRBDADM_VERSION=8.9.5
>>>>>
>>>>> If I disable the fencing scripts everything works as expected. If
>>>>> enabled, no node is promoted to master after failover. It seems to be
>>>>> a sticky modification because once a failover is simulated with fencing
>>>>> scripts activated I cannot get the cluster to work anymore. Even
>>>>> removing the setting from the DRBD configuration does not help.
>>>>>
>>>>> I captured the complete log from /var/log/messages from cluster start
>>>>> to failover if that helps:
>>>>> MDA1PFP-S01 14:48:37 1807 0 ~ # cat /var/log/messages Sep 16 14:40:16
>>>>> MDA1PFP-S01 rsyslogd: [origin software="rsyslogd"
>>>>> swVersion="7.4.7" x-pid="13857" x-info="http://www.rsyslog.com"] start
>>>>> Sep 16 14:40:16 MDA1PFP-S01 rsyslogd-2221: module 'imuxsock' already
>>>>> in this config, cannot be added  [try http://www.rsyslog.com/e/2221 ]
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Stopping System Logging Service...
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Starting System Logging Service...
>>>>> Sep 16 14:40:16 MDA1PFP-S01 systemd: Started System Logging Service.
>>>>> Sep 16 14:40:27 MDA1PFP-S01 systemd: Started Corosync Cluster Engine.
>>>>> Sep 16 14:40:27 MDA1PFP-S01 systemd: Started Pacemaker High
>>>>> Availability Cluster Manager.
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> ACTIVE_start_0: ok (node=MDA1PFP-PCS01, call=33, rc=0, cib-update=22,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=32, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 IPaddr2(mda-ip)[15321]: INFO: Adding inet
>>>>> address 192.168.120.20/32 with broadcast address 192.168.120.255 to
>>>>> device bond0 Sep 16 14:43:30 MDA1PFP-S01 avahi-daemon[912]:
>>>>> Registering new address record for 192.168.120.20 on bond0.IPv4.
>>>>> Sep 16 14:43:30 MDA1PFP-S01 IPaddr2(mda-ip)[15321]: INFO: Bringing
>>>>> device bond0 up Sep 16 14:43:30 MDA1PFP-S01 kernel: block drbd1: peer(
>>>>> Primary -> Secondary ) Sep 16 14:43:30 MDA1PFP-S01
>>>>> IPaddr2(mda-ip)[15321]: INFO:
>>>>> /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>>>>> /var/run/resource-agents/send_arp-192.168.120.20 bond0 192.168.120.20
>>>>> auto not_used not_used Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:
>>>>> notice: Operation
>>>>> mda-ip_start_0: ok (node=MDA1PFP-PCS01, call=35, rc=0, cib-update=24,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=36, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=38, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs: peer( Secondary ->
>>>>> Unknown ) conn( Connected -> TearDown ) pdsk( UpToDate -> DUnknown )
>>>>> Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs: ack_receiver
>>>>> terminated Sep 16 14:43:30 MDA1PFP-S01 kernel: drbd shared_fs:
>>>>> Terminating drbd_a_shared_f Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd
>>>>> shared_fs: Connection closed Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd
>>>>> shared_fs: conn( TearDown -> Unconnected ) Sep 16 14:43:31 MDA1PFP-S01
>>>>> kernel: drbd shared_fs: receiver terminated Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: drbd shared_fs: Restarting receiver thread Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: receiver (re)started Sep
>>>>> 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: conn( Unconnected ->
>>>>> WFConnection ) Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice:
>>>>> Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: helper command:
>>>>> /sbin/drbdadm fence-peer shared_fs
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crm-fence-peer.sh[15569]: invoked for
>>>>> shared_fs Sep 16 14:43:31 MDA1PFP-S01 crm-fence-peer.sh[15569]: INFO
>>>>> peer is not reachable, my disk is UpToDate: placed constraint
>>>>> 'drbd-fence-by-handler-shared_fs-drbd1_sync'
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: drbd shared_fs: helper command:
>>>>> /sbin/drbdadm fence-peer shared_fs exit code 5 (0x500) Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: drbd shared_fs: fence-peer helper returned 5 (peer
>>>>> is unreachable, assumed to be dead) Sep 16 14:43:31 MDA1PFP-S01
>>>>> kernel: drbd shared_fs: pdsk( DUnknown -> Outdated ) Sep 16 14:43:31
>>>>> MDA1PFP-S01 kernel: block drbd1: role( Secondary -> Primary ) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: block drbd1: new current UUID
>>>>> B1FC3E9C008711DD:C02542C7B26F9B28:BCC6102B1FD69768:BCC5102B1FD69768
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_promote_0: ok (node=MDA1PFP-PCS01, call=41, rc=0, cib-update=26,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=42, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Our peer on the DC
>>>>> (MDA1PFP-PCS02) is dead
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: State transition
>>>>> S_NOT_DC -> S_ELECTION [ input=I_ELECTION
>>>> cause=C_CRMD_STATUS_CALLBACK
>>>>> origin=peer_update_callback ] Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:
>>>>> notice: State transition S_ELECTION -> S_INTEGRATION [
>>>>> input=I_ELECTION_DC cause=C_TIMER_POPPED
>>>>> origin=election_timeout_popped ] Sep 16 14:43:31 MDA1PFP-S01
>>>>> attrd[13128]:  notice: crm_update_peer_proc:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 attrd[13128]:  notice: Removing all
>>>>> MDA1PFP-PCS02 attributes for attrd_peer_change_cb Sep 16 14:43:31
>>>>> MDA1PFP-S01 attrd[13128]:  notice: Lost attribute writer
>>>>> MDA1PFP-PCS02
>>>>> Sep 16 14:43:31 MDA1PFP-S01 attrd[13128]:  notice: Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> attrd[13128]:  notice: Purged 1 peers with
>>>>> id=2 and/or uname=MDA1PFP-PCS02 from the membership cache Sep 16
>>>>> 14:43:31 MDA1PFP-S01 stonith-ng[13125]:  notice:
>>>>> crm_update_peer_proc: Node MDA1PFP-PCS02[2] - state is now lost (was
>>>>> member) Sep 16 14:43:31 MDA1PFP-S01 stonith-ng[13125]:  notice:
>>>>> Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> stonith-ng[13125]:  notice: Purged 1 peers with id=2 and/or
>>>>> uname=MDA1PFP-PCS02 from the membership cache Sep 16 14:43:31
>>>>> MDA1PFP-S01 cib[13124]:  notice: crm_update_peer_proc:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 cib[13124]:  notice: Removing
>>>>> MDA1PFP-PCS02/2 from the membership list Sep 16 14:43:31 MDA1PFP-S01
>>>>> cib[13124]:  notice: Purged 1 peers with
>>>>> id=2 and/or uname=MDA1PFP-PCS02 from the membership cache Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]: warning: FSA: Input I_ELECTION_DC
>>>>> from do_election_check() received in state S_INTEGRATION Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Notifications disabled
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: On loss of CCM
>>>>> Quorum: Ignore
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Demote  drbd1:0
>>>>> (Master -> Slave MDA1PFP-PCS01)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Calculated
>>>>> Transition 0: /var/lib/pacemaker/pengine/pe-input-414.bz2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 55:
>>>>> notify drbd1_pre_notify_demote_0 on MDA1PFP-PCS01 (local) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=43, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 8:
>>>>> demote drbd1_demote_0 on MDA1PFP-PCS01 (local) Sep 16 14:43:31
>>>>> MDA1PFP-S01 systemd-udevd: error: /dev/drbd1: Wrong medium type Sep 16
>>>>> 14:43:31 MDA1PFP-S01 kernel: block drbd1: role( Primary -> Secondary )
>>>>> Sep 16 14:43:31 MDA1PFP-S01 kernel: block drbd1: bitmap WRITE of 0
>>>>> pages took 0 jiffies Sep 16 14:43:31 MDA1PFP-S01 kernel: block drbd1:
>>>>> 0 KB (0 bits) marked out-of-sync by on disk bit-map.
>>>>> Sep 16 14:43:31 MDA1PFP-S01 systemd-udevd: error: /dev/drbd1: Wrong
>>>>> medium type
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_demote_0: ok (node=MDA1PFP-PCS01, call=44, rc=0, cib-update=49,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 56:
>>>>> notify drbd1_post_notify_demote_0 on MDA1PFP-PCS01 (local) Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Operation
>>>>> drbd1_notify_0: ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=0,
>>>>> confirmed=true)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Initiating action 10:
>>>>> monitor drbd1_monitor_60000 on MDA1PFP-PCS01 (local) Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [TOTEM ] A new membership
>>>>> (192.168.121.10:988) was formed. Members left: 2 Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [QUORUM] Members[1]: 1 Sep 16 14:43:31
>>>>> MDA1PFP-S01 corosync[13019]: [MAIN  ] Completed service
>>>>> synchronization, ready to provide service.
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pacemakerd[13113]:  notice:
>>>>> crm_reap_unseen_nodes: Node MDA1PFP-PCS02[2] - state is now lost (was
>>>>> member)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: crm_reap_unseen_nodes:
>>>>> Node MDA1PFP-PCS02[2] - state is now lost (was member) Sep 16 14:43:31
>>>>> MDA1PFP-S01 crmd[13130]: warning: No match for shutdown action on 2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Stonith/shutdown of
>>>>> MDA1PFP-PCS02 not matched
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition aborted:
>>>>> Node failure (source=peer_update_callback:252, 0)
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:   error: pcmkRegisterNode:
>>>>> Triggered assert at xml.c:594 : node->type == XML_ELEMENT_NODE Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition 0 (Complete=10,
>>>>> Pending=0, Fired=0, Skipped=0, Incomplete=0,
>>>>> Source=/var/lib/pacemaker/pengine/pe-input-414.bz2): Complete Sep 16
>>>>> 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: On loss of CCM
>>>>> Quorum: Ignore
>>>>> Sep 16 14:43:31 MDA1PFP-S01 pengine[13129]:  notice: Calculated
>>>>> Transition 1: /var/lib/pacemaker/pengine/pe-input-415.bz2
>>>>> Sep 16 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: Transition 1
>>>>> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>>>>> Source=/var/lib/pacemaker/pengine/pe-input-415.bz2): Complete Sep 16
>>>>> 14:43:31 MDA1PFP-S01 crmd[13130]:  notice: State transition
>>>>> S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
>>>>> cause=C_FSA_INTERNAL origin=notify_crmd ] Sep 16 14:48:48 MDA1PFP-S01
>>>>> chronyd[909]: Source 62.116.162.126 replaced with 46.182.19.75
>>>>>
>>>>> Any help appreciated,
>>>>>   Jens
>>>>>
>>>>>



