<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<tt>Hello again,<br>
<br>
It has been a while since I last showed up... I was finishing up details of
the Ubuntu 20.04 HA packages (along with lots of other stuff), so sorry for
not being active until now (that's about to change). While preparing my
regression lab, which I mentioned at the latest HA conference, I ran into a
situation I'd like some input on, if anyone has any...<br>
<br>
I'm clearing up the fence_mpath/fence_iscsi setup needed for all
Ubuntu versions:<br>
<br>
<a class="moz-txt-link-freetext" href="https://bugs.launchpad.net/ubuntu/+source/fence-agents/+bug/1864404">https://bugs.launchpad.net/ubuntu/+source/fence-agents/+bug/1864404</a><br>
<br>
and I just ran into the following:<br>
<br>
- 3 x node cluster setup<br>
- 3 x nodes share 4 paths to /dev/mapper/volume{00..10}<br>
- Using /dev/mapper/volume01 for fencing tests<br>
- softdog configured for /dev/watchdog<br>
- fence_mpath_check installed in /etc/watchdog.d/ (hooked up roughly as sketched below)<br>
<br>
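(For reference, the softdog + watchdog part of the setup amounts to roughly
the following; the path the check script ships in comes from the fence-agents
package and may differ per release, so treat this as a sketch:)<br>
<br>
# load softdog at boot and right now (assuming no hardware watchdog owns /dev/watchdog)<br>
echo softdog | sudo tee /etc/modules-load.d/softdog.conf<br>
sudo modprobe softdog<br>
# let watchdog(8) run the fence_mpath health check shipped by fence-agents<br>
sudo cp /usr/share/cluster/fence_mpath_check /etc/watchdog.d/<br>
sudo systemctl enable --now watchdog<br>
<br>
The idea being that fence_mpath_check notices the local key is gone from the
device and lets watchdog reboot the node.<br>
<br>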
----<br>
<br>
(k)rafaeldtinoco@clusterg01:~$ crm configure show<br>
node 1: clusterg01<br>
node 2: clusterg02<br>
node 3: clusterg03<br>
primitive fence-mpath-clusterg01 stonith:fence_mpath \<br>
params pcmk_on_timeout=70 pcmk_off_timeout=70
pcmk_host_list=clusterg01 pcmk_monitor_action=metadata
pcmk_reboot_action=off key=59450000 devices="/dev/mapper/volume01"
power_wait=65 \<br>
meta provides=unfencing target-role=Started<br>
primitive fence-mpath-clusterg02 stonith:fence_mpath \<br>
params pcmk_on_timeout=70 pcmk_off_timeout=70
pcmk_host_list=clusterg02 pcmk_monitor_action=metadata
pcmk_reboot_action=off key=59450001 devices="/dev/mapper/volume01"
power_wait=65 \<br>
meta provides=unfencing target-role=Started<br>
primitive fence-mpath-clusterg03 stonith:fence_mpath \<br>
params pcmk_on_timeout=70 pcmk_off_timeout=70
pcmk_host_list=clusterg03 pcmk_monitor_action=metadata
pcmk_reboot_action=off key=59450002 devices="/dev/mapper/volume01"
power_wait=65 \<br>
meta provides=unfencing target-role=Started<br>
property cib-bootstrap-options: \<br>
have-watchdog=false \<br>
dc-version=2.0.3-4b1f869f0f \<br>
cluster-infrastructure=corosync \<br>
cluster-name=clusterg \<br>
stonith-enabled=true \<br>
no-quorum-policy=stop \<br>
last-lrm-refresh=1590773755<br>
<br>
----<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ crm status<br>
Cluster Summary:<br>
* Stack: corosync<br>
* Current DC: clusterg02 (version 2.0.3-4b1f869f0f) - partition
with quorum<br>
* Last updated: Mon Jun 1 12:55:13 2020<br>
* Last change: Mon Jun 1 04:35:07 2020 by root via cibadmin on
clusterg03<br>
* 3 nodes configured<br>
* 3 resource instances configured<br>
<br>
Node List:<br>
* Online: [ clusterg01 clusterg02 clusterg03 ]<br>
<br>
Full List of Resources:<br>
* fence-mpath-clusterg01 (stonith:fence_mpath): Started
clusterg02<br>
* fence-mpath-clusterg02 (stonith:fence_mpath): Started
clusterg03<br>
* fence-mpath-clusterg03 (stonith:fence_mpath): Started
clusterg01<br>
<br>
----<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
/dev/mapper/volume01<br>
PR generation=0x2d, Reservation follows:<br>
Key = 0x59450001<br>
scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
/dev/mapper/volume01<br>
PR generation=0x2d, 12 registered reservation keys follow:<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450000<br>
0x59450000<br>
0x59450000<br>
0x59450000<br>
<br>
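(A quick way I use to eyeball that each key shows up once per path; just a
convenience one-liner on my side, not something the agent does:)<br>
<br>
# count registrations per key; with 4 paths each of the 3 keys should appear 4 times<br>
sudo mpathpersist --in -k /dev/mapper/volume01 | grep 0x5945 | sort | uniq -c<br>
<br>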
----<br>
<br>
You can see that everything looks fine: each of the three node keys is
registered once per path, 12 registrations in total. The two corosync
interconnects I have show up as the two ring addresses per node:<br>
<br>
(k)rafaeldtinoco@clusterg01:~$ sudo corosync-quorumtool -a<br>
Quorum information<br>
------------------<br>
Date: Mon Jun 1 12:56:00 2020<br>
Quorum provider: corosync_votequorum<br>
Nodes: 3<br>
Node ID: 1<br>
Ring ID: 1.120<br>
Quorate: Yes<br>
<br>
Votequorum information<br>
----------------------<br>
Expected votes: 3<br>
Highest expected: 3<br>
Total votes: 3<br>
Quorum: 2 <br>
Flags: Quorate <br>
<br>
Membership information<br>
----------------------<br>
Nodeid Votes Name<br>
1 1 clusterg01, clusterg01bkp (local)<br>
2 1 clusterg02, clusterg02bkp<br>
3 1 clusterg03, clusterg03bkp<br>
<br>
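(Disabling both interconnects on a node, for the test below, amounts to
something like this; the interface names are just placeholders for whatever
the two rings run on:)<br>
<br>
# take both corosync rings down on the node under test (placeholder NIC names)<br>
sudo ip link set dev eth1 down<br>
sudo ip link set dev eth2 down<br>
<br>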
If I disable both corosync interconnects for clusterg01, the node is fenced correctly:<br>
<br>
Pending Fencing Actions:<br>
* reboot of clusterg01 pending: client=pacemaker-controld.906,
origin=clusterg02<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
/dev/mapper/volume01<br>
PR generation=0x2e, Reservation follows:<br>
Key = 0x59450001<br>
scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
/dev/mapper/volume01<br>
PR generation=0x2e, 8 registered reservation keys follow:<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
<br>
and watchdog reboots it. But it turns out the node comes back with its
reservation key registered on just 1 path (instead of 4). I was wondering
whether that is due to the asynchronous nature of the systemd + open-iscsi +
multipath-tools + pacemaker startup sequence.<br>
<br>
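(What I plan to use to check that ordering theory on the rebooted node; the
unit names are the Ubuntu ones, the interpretation of the timestamps is on
me:)<br>
<br>
# relative startup times of the pieces involved on the node that just came back<br>
sudo journalctl -b -o short-monotonic -u open-iscsi.service -u multipath-tools.service -u pacemaker.service<br>
# and how many paths the multipath map actually has at this point<br>
sudo multipath -ll volume01<br>
<br>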
Here is the actual state right after the reboot:<br>
<br>
(k)rafaeldtinoco@clusterg01:~$ uptime<br>
12:58:22 up 0 min, 0 users, load average: 0.31, 0.09, 0.03<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
/dev/mapper/volume01<br>
PR generation=0x2f, Reservation follows:<br>
Key = 0x59450001<br>
scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
/dev/mapper/volume01<br>
PR generation=0x2f, 9 registered reservation keys follow:<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450000<br>
<br>
After this ^ I have to run:<br>
<br>
(k)rafaeldtinoco@clusterg01:~$ sudo mpathpersist --out --register
--param-rk=0x59450000 /dev/mapper/volume01<br>
persistent reserve out: scsi status: Reservation Conflict<br>
PR out: command failed<br>
<br>
(k)rafaeldtinoco@clusterg01:~$ sudo fence_mpath -v -d
/dev/mapper/volume01 -n 59450000 -o on<br>
2020-06-01 12:59:46,388 INFO: Executing: /usr/sbin/mpathpersist -i
-k -d /dev/mapper/volume01<br>
<br>
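(My reading of SPC here, so take it as an assumption: a plain REGISTER has to
carry the key the target already holds for that I_T nexus, so it conflicts on
the paths that lost the registration, while REGISTER AND IGNORE EXISTING KEY
does not care, i.e. something like the following works on all paths, and I
assume that is more or less what the agent's "on" action boils down to:)<br>
<br>
# re-register the local key on every path, ignoring whatever is currently registered there<br>
sudo mpathpersist --out --register-ignore --param-sark=0x59450000 /dev/mapper/volume01<br>
<br>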
With that done, all registrations are correctly in place again after the
fence was done:<br>
<br>
(k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
/dev/mapper/volume01<br>
PR generation=0x33, 12 registered reservation keys follow:<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450001<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450002<br>
0x59450000<br>
0x59450000<br>
0x59450000<br>
0x59450000<br>
<br>
I was wondering whether having "RequiredBy=resource-agents-deps.target" in
the [Install] section of open-iscsi.service and multipath-tools.service,
together with "Before=resource-agents-deps.target" in their [Unit] sections,
would be enough, but in this case it seems it is not.<br>
<br>
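(What I mean is something along these lines, whether as drop-ins or edited
into the units directly; the drop-in file name is mine, and I'm assuming
systemctl picks up an [Install] section from a drop-in here:)<br>
<br>
sudo mkdir -p /etc/systemd/system/open-iscsi.service.d<br>
sudo tee /etc/systemd/system/open-iscsi.service.d/10-ha.conf &lt;&lt;'EOF'<br>
[Unit]<br>
Before=resource-agents-deps.target<br>
<br>
[Install]<br>
RequiredBy=resource-agents-deps.target<br>
EOF<br>
# same drop-in for multipath-tools.service, then:<br>
sudo systemctl daemon-reload<br>
sudo systemctl reenable open-iscsi.service multipath-tools.service<br>
<br>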
Any idea why this happens? Did the agent start while only a single path to
the disk was available, i.e. while the iSCSI sessions were still being
established and multipath-tools had scanned just one path? I tend to think
that, if that were the case, I would sometimes end up with 1 path registered,
sometimes 2, and so on, and not always exactly one registered path (3
registrations missing).<br>
<br>
Or there is something else about PERSISTENT RESERVATIONS that I am missing
from SPC-3/4.<br>
<br>
Any thoughts?<br>
</tt>
</body>
</html>