[ClusterLabs] Filesystem Resource Move Fails Because Underlying DRBD Resource Won't Move

Sun Feb 28 08:32:39 EST 2021

We see in the log on 001db02a...

Feb 28 07:33:50 [61707] 001db02a.ccnva.local    pengine:     info: master_color:        ms_drbd1: Promoted 1 instances of a possible 1 to master

...and then...

Feb 28 07:33:50 [61707] 001db02a.ccnva.local    pengine:   notice: LogAction:    * Move       p_fs_clust04          (     001db02b -> 001db02a )

...and then...

Feb 28 07:34:03 [61708] 001db02a.ccnva.local       crmd:   notice: te_rsc_command:      Initiating stop operation p_fs_clust04_stop_0 on 001db02b | action 69

...and then...

Feb 28 07:34:04 [61708] 001db02a.ccnva.local       crmd:     info: match_graph_event:   Action p_fs_clust04_stop_0 (69) confirmed on 001db02b (rc=0)

Feb 28 07:34:04 [61708] 001db02a.ccnva.local       crmd:   notice: te_rsc_command:      Initiating start operation p_fs_clust04_start_0 locally on 001db02a | action 70

...and finally...

Feb 28 07:34:04  Filesystem(p_fs_clust04)[15357]:    INFO: Running start for /dev/drbd1 on /ha02_mysql

Feb 28 07:34:09  Filesystem(p_fs_clust04)[15357]:    ERROR: Couldn't mount filesystem /dev/drbd1 on /ha02_mysql

Resource ms_drbd1 is not becoming master, so the filesystem won't mount.

Am I reading that right?
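
One way to check this reading is to list which operations the cluster actually initiated for a given resource. The sketch below is only a filter over log text in the format quoted above; the function name initiated_ops is made up for illustration, and the sample input is limited to the lines quoted in this message.

```shell
# List "<operation> <node>" for every operation the transition engine
# initiated on one resource. Reads Pacemaker log text on stdin.
# $1 is the resource name, e.g. p_fs_clust04.
# (initiated_ops is a hypothetical helper name, not a Pacemaker tool.)
initiated_ops() {
  sed -nE "s/.*Initiating ([a-z]+) operation ${1}_[a-z]+_[0-9]+ (locally )?on ([^ ]+).*/\1 \3/p"
}

# Sample input: three lines copied from the log excerpts above.
log='Feb 28 07:34:03 [61708] 001db02a.ccnva.local       crmd:   notice: te_rsc_command:      Initiating stop operation p_fs_clust04_stop_0 on 001db02b | action 69
Feb 28 07:34:04 [61708] 001db02a.ccnva.local       crmd:     info: match_graph_event:   Action p_fs_clust04_stop_0 (69) confirmed on 001db02b (rc=0)
Feb 28 07:34:04 [61708] 001db02a.ccnva.local       crmd:   notice: te_rsc_command:      Initiating start operation p_fs_clust04_start_0 locally on 001db02a | action 70'

printf '%s\n' "$log" | initiated_ops p_fs_clust04
# prints:
#   stop 001db02b
#   start 001db02a
```

Running the same filter with p_drbd1 against the full log would show whether any demote or promote was ever initiated for the DRBD resource; on the three sample lines it prints nothing.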

-Eric

From: Users <users-bounces at clusterlabs.org> On Behalf Of Eric Robinson
Sent: Sunday, February 28, 2021 6:56 AM
To: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
Subject: Re: [ClusterLabs] Filesystem Resource Move Fails Because Underlying DRBD Resource Won't Move

Oops, sorry, here are links to the text logs.

Node 001db02a: https://www.dropbox.com/s/ymbatz91x3y84wp/001db02a_log.txt?dl=0

Node 001db02b: https://www.dropbox.com/s/etq6mn460imdega/001db02b_log.txt?dl=0

-Eric

From: Users <users-bounces at clusterlabs.org> On Behalf Of Eric Robinson
Sent: Sunday, February 28, 2021 6:46 AM
To: Cluster Labs - All topics related to open-source clustering welcomed <users at clusterlabs.org>
Subject: [ClusterLabs] Filesystem Resource Move Fails Because Underlying DRBD Resource Won't Move

Beginning with this cluster status...

Cluster name: 001db02ab
Stack: corosync
Current DC: 001db02a (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sun Feb 28 07:24:31 2021
Last change: Sun Feb 28 07:19:51 2021 by hacluster via crmd on 001db02a

2 nodes configured
14 resources configured

Online: [ 001db02a 001db02b ]

Full list of resources:

Master/Slave Set: ms_drbd0 [p_drbd0]
     Masters: [ 001db02a ]
     Slaves: [ 001db02b ]
Master/Slave Set: ms_drbd1 [p_drbd1]
     Masters: [ 001db02b ]
     Slaves: [ 001db02a ]
p_fs_clust03   (ocf::heartbeat:Filesystem):    Started 001db02a
p_fs_clust04   (ocf::heartbeat:Filesystem):    Started 001db02b
p_mysql_009    (lsb:mysql_009):        Started 001db02a
p_mysql_010    (lsb:mysql_010):        Started 001db02a
p_mysql_011    (lsb:mysql_011):        Started 001db02a
p_mysql_012    (lsb:mysql_012):        Started 001db02a
p_mysql_014    (lsb:mysql_014):        Started 001db02b
p_mysql_015    (lsb:mysql_015):        Started 001db02b
p_mysql_016    (lsb:mysql_016):        Started 001db02b
stonith-001db02ab      (stonith:fence_azure_arm):      Started 001db02a

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

...and with these constraints...

Location Constraints:
Ordering Constraints:
  promote ms_drbd0 then start p_fs_clust03 (kind:Mandatory) (id:order-ms_drbd0-p_fs_clust03-mandatory)
  promote ms_drbd1 then start p_fs_clust04 (kind:Mandatory) (id:order-ms_drbd1-p_fs_clust04-mandatory)
  start p_fs_clust03 then start p_mysql_009 (kind:Mandatory) (id:order-p_fs_clust03-p_mysql_009-mandatory)
  start p_fs_clust03 then start p_mysql_010 (kind:Mandatory) (id:order-p_fs_clust03-p_mysql_010-mandatory)
  start p_fs_clust03 then start p_mysql_011 (kind:Mandatory) (id:order-p_fs_clust03-p_mysql_011-mandatory)
  start p_fs_clust03 then start p_mysql_012 (kind:Mandatory) (id:order-p_fs_clust03-p_mysql_012-mandatory)
  start p_fs_clust04 then start p_mysql_014 (kind:Mandatory) (id:order-p_fs_clust04-p_mysql_014-mandatory)
  start p_fs_clust04 then start p_mysql_015 (kind:Mandatory) (id:order-p_fs_clust04-p_mysql_015-mandatory)
  start p_fs_clust04 then start p_mysql_016 (kind:Mandatory) (id:order-p_fs_clust04-p_mysql_016-mandatory)
Colocation Constraints:
  p_fs_clust03 with ms_drbd0 (score:INFINITY) (id:colocation-p_fs_clust03-ms_drbd0-INFINITY)
  p_fs_clust04 with ms_drbd1 (score:INFINITY) (id:colocation-p_fs_clust04-ms_drbd1-INFINITY)
  p_mysql_009 with p_fs_clust03 (score:INFINITY) (id:colocation-p_mysql_009-p_fs_clust03-INFINITY)
  p_mysql_010 with p_fs_clust03 (score:INFINITY) (id:colocation-p_mysql_010-p_fs_clust03-INFINITY)
  p_mysql_011 with p_fs_clust03 (score:INFINITY) (id:colocation-p_mysql_011-p_fs_clust03-INFINITY)
  p_mysql_012 with p_fs_clust03 (score:INFINITY) (id:colocation-p_mysql_012-p_fs_clust03-INFINITY)
  p_mysql_014 with p_fs_clust04 (score:INFINITY) (id:colocation-p_mysql_014-p_fs_clust04-INFINITY)
  p_mysql_015 with p_fs_clust04 (score:INFINITY) (id:colocation-p_mysql_015-p_fs_clust04-INFINITY)
  p_mysql_016 with p_fs_clust04 (score:INFINITY) (id:colocation-p_mysql_016-p_fs_clust04-INFINITY)
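
As a possible diagnostic, the sketch below picks out colocation lines that point at a master/slave set without any role qualifier. It rests on two assumptions that are not confirmed in this thread: that pcs prints a qualifier containing "rsc-role:" (for example "(with-rsc-role:Master)") when a colocation is restricted to a role, and that master/slave sets follow this cluster's "ms_" naming convention. The function name is hypothetical.

```shell
# Print colocation lines that target an ms_* resource but carry no
# role qualifier. Reads pcs constraint text on stdin.
# Assumption: a role-restricted colocation contains the text "rsc-role:".
# (unqualified_ms_colocations is a hypothetical helper name.)
unqualified_ms_colocations() {
  grep -E ' with ms_[^ ]+ ' | grep -v 'rsc-role:'
}

# Sample input: three of the colocation lines listed above.
constraints='  p_fs_clust03 with ms_drbd0 (score:INFINITY) (id:colocation-p_fs_clust03-ms_drbd0-INFINITY)
  p_fs_clust04 with ms_drbd1 (score:INFINITY) (id:colocation-p_fs_clust04-ms_drbd1-INFINITY)
  p_mysql_009 with p_fs_clust03 (score:INFINITY) (id:colocation-p_mysql_009-p_fs_clust03-INFINITY)'

printf '%s\n' "$constraints" | unqualified_ms_colocations
```

On the sample this prints the two filesystem-with-DRBD lines. If the assumption about the qualifier holds, those are the constraints that colocate the filesystem with any instance of the DRBD set rather than specifically with the master.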

...and this drbd status on node 001db02a...

ha01_mysql role:Primary
  disk:UpToDate
  001db02b role:Secondary
    peer-disk:UpToDate

ha02_mysql role:Secondary
  disk:UpToDate
  001db02b role:Primary
    peer-disk:UpToDate
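
To compare the DRBD role before and after the move, the local role can be pulled out of this status text. The sketch below assumes the status came from `drbdadm status` (the command is not shown in this message) and that each resource starts on an unindented "<resource> role:<Role>" line, as above. The function name drbd_local_role is made up for illustration.

```shell
# Print the local role of one DRBD resource from status text on stdin.
# Only unindented lines are considered, so indented peer lines are skipped.
# (drbd_local_role is a hypothetical helper name, not a DRBD tool.)
drbd_local_role() {
  awk -v res="$1" '/^[^ \t]/ && $1 == res && $2 ~ /^role:/ {
    sub(/^role:/, "", $2); print $2
  }'
}

# Sample input: the status from node 001db02a shown above.
status='ha01_mysql role:Primary
  disk:UpToDate
  001db02b role:Secondary
    peer-disk:UpToDate

ha02_mysql role:Secondary
  disk:UpToDate
  001db02b role:Primary
    peer-disk:UpToDate'

printf '%s\n' "$status" | drbd_local_role ha02_mysql
# prints: Secondary
```

On a live node the same function could be fed from the status command instead of the sample text, for example before and after the move, to see whether ha02_mysql ever becomes Primary on 001db02a.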

...we issue the command...

# pcs resource move p_fs_clust04

...we get this result...

Full list of resources:

Master/Slave Set: ms_drbd0 [p_drbd0]
     Masters: [ 001db02a ]
     Slaves: [ 001db02b ]
Master/Slave Set: ms_drbd1 [p_drbd1]
     Masters: [ 001db02b ]
     Slaves: [ 001db02a ]
p_fs_clust03   (ocf::heartbeat:Filesystem):    Started 001db02a
p_fs_clust04   (ocf::heartbeat:Filesystem):    Stopped
p_mysql_009    (lsb:mysql_009):        Started 001db02a
p_mysql_010    (lsb:mysql_010):        Started 001db02a
p_mysql_011    (lsb:mysql_011):        Started 001db02a
p_mysql_012    (lsb:mysql_012):        Started 001db02a
p_mysql_014    (lsb:mysql_014):        Stopped
p_mysql_015    (lsb:mysql_015):        Stopped
p_mysql_016    (lsb:mysql_016):        Stopped
stonith-001db02ab      (stonith:fence_azure_arm):      Started 001db02a

Failed Actions:
* p_fs_clust04_start_0 on 001db02a 'unknown error' (1): call=126, status=complete, exitreason='Couldn't mount filesystem /dev/drbd1 on /ha02_mysql',
    last-rc-change='Sun Feb 28 07:34:04 2021', queued=0ms, exec=5251ms

Here is the log from node 001db02a: https://www.dropbox.com/s/vq3ytcsuvvmqwe5/001db02a_log?dl=0

Here is the log from node 001db02b: https://www.dropbox.com/s/g0el6ft0jmvzqsi/001db02b_log?dl=0

From reading the logs, it seems that the filesystem p_fs_clust04 is getting successfully unmounted on node 001db02b, but the drbd resource never stops. On node 001db02a, it tries to mount the filesystem but fails because the drbd volume is not master.

Why isn't drbd transitioning?

Disclaimer : This email and any files transmitted with it are confidential and intended solely for intended recipients. If you are not the named addressee you should not disseminate, distribute, copy or alter this email. Any views or opinions presented in this email are solely those of the author and might not represent those of Physician Select Management. Warning: Although Physician Select Management has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.