<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <tt>Hello again,<br>
      <br>
      It's been a while since I last showed up... I was finishing up
      details of the Ubuntu 20.04 HA packages (along with lots of other
      stuff), so sorry for not being active until now (that is about to
      change). While preparing my regression lab, as I mentioned at the
      latest HA conference, I'm facing a situation I'd like some input
      on, if anyone has any...<br>
      <br>
      I'm sorting out the fence_mpath/fence_iscsi setup needed for all
      Ubuntu versions:<br>
      <br>
<a class="moz-txt-link-freetext" href="https://bugs.launchpad.net/ubuntu/+source/fence-agents/+bug/1864404">https://bugs.launchpad.net/ubuntu/+source/fence-agents/+bug/1864404</a><br>
      <br>
      and I just faced this:<br>
      <br>
      - 3-node cluster setup<br>
      - all 3 nodes share 4 paths to each of /dev/mapper/volume{00..10}<br>
      - using /dev/mapper/volume01 for the fencing tests<br>
      - softdog configured for /dev/watchdog<br>
      - fence_mpath_check installed in /etc/watchdog.d/ (sketched below)<br>
      <br>
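      For reference, the watchdog side of that setup looks roughly like
      this (a sketch: the source path of fence_mpath_check and the unit
      name may differ per distro/release):<br>
      <br>
      # load softdog now and on every boot (no hardware watchdog in the lab)<br>
      $ echo softdog | sudo tee /etc/modules-load.d/softdog.conf<br>
      $ sudo modprobe softdog<br>
      # watchdog(8) runs the test scripts it finds in /etc/watchdog.d/;<br>
      # fence_mpath_check exits non-zero once the node's key is gone from<br>
      # the device, which makes the watchdog daemon reboot the node<br>
      $ sudo install -m 0755 /usr/share/cluster/fence_mpath_check /etc/watchdog.d/<br>
      $ sudo systemctl enable --now watchdog<br>
      <br>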
      ----<br>
      <br>
      (k)rafaeldtinoco@clusterg01:~$ crm configure show<br>
      node 1: clusterg01<br>
      node 2: clusterg02<br>
      node 3: clusterg03<br>
      primitive fence-mpath-clusterg01 stonith:fence_mpath \<br>
          params pcmk_on_timeout=70 pcmk_off_timeout=70
      pcmk_host_list=clusterg01 pcmk_monitor_action=metadata
      pcmk_reboot_action=off key=59450000 devices="/dev/mapper/volume01"
      power_wait=65 \<br>
          meta provides=unfencing target-role=Started<br>
      primitive fence-mpath-clusterg02 stonith:fence_mpath \<br>
          params pcmk_on_timeout=70 pcmk_off_timeout=70
      pcmk_host_list=clusterg02 pcmk_monitor_action=metadata
      pcmk_reboot_action=off key=59450001 devices="/dev/mapper/volume01"
      power_wait=65 \<br>
          meta provides=unfencing target-role=Started<br>
      primitive fence-mpath-clusterg03 stonith:fence_mpath \<br>
          params pcmk_on_timeout=70 pcmk_off_timeout=70
      pcmk_host_list=clusterg03 pcmk_monitor_action=metadata
      pcmk_reboot_action=off key=59450002 devices="/dev/mapper/volume01"
      power_wait=65 \<br>
          meta provides=unfencing target-role=Started<br>
      property cib-bootstrap-options: \<br>
          have-watchdog=false \<br>
          dc-version=2.0.3-4b1f869f0f \<br>
          cluster-infrastructure=corosync \<br>
          cluster-name=clusterg \<br>
          stonith-enabled=true \<br>
          no-quorum-policy=stop \<br>
          last-lrm-refresh=1590773755<br>
      <br>
      ----<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ crm status<br>
      Cluster Summary:<br>
        * Stack: corosync<br>
        * Current DC: clusterg02 (version 2.0.3-4b1f869f0f) - partition
      with quorum<br>
        * Last updated: Mon Jun  1 12:55:13 2020<br>
        * Last change:  Mon Jun  1 04:35:07 2020 by root via cibadmin on
      clusterg03<br>
        * 3 nodes configured<br>
        * 3 resource instances configured<br>
      <br>
      Node List:<br>
        * Online: [ clusterg01 clusterg02 clusterg03 ]<br>
      <br>
      Full List of Resources:<br>
        * fence-mpath-clusterg01    (stonith:fence_mpath):     Started
      clusterg02<br>
        * fence-mpath-clusterg02    (stonith:fence_mpath):     Started
      clusterg03<br>
        * fence-mpath-clusterg03    (stonith:fence_mpath):     Started
      clusterg01<br>
      <br>
      ----<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
      /dev/mapper/volume01<br>
        PR generation=0x2d, Reservation follows:<br>
         Key = 0x59450001<br>
        scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
      /dev/mapper/volume01<br>
        PR generation=0x2d,     12 registered reservation keys follow:<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450000<br>
          0x59450000<br>
          0x59450000<br>
          0x59450000<br>
      <br>
      ----<br>
      <br>
      You can see that everything looks fine: 12 registered keys, i.e.
      each of the 3 nodes has its key registered on all 4 of its paths.
      If I then disable the 2 corosync interconnects:<br>
      <br>
      (k)rafaeldtinoco@clusterg01:~$ sudo corosync-quorumtool -a<br>
      Quorum information<br>
      ------------------<br>
      Date:             Mon Jun  1 12:56:00 2020<br>
      Quorum provider:  corosync_votequorum<br>
      Nodes:            3<br>
      Node ID:          1<br>
      Ring ID:          1.120<br>
      Quorate:          Yes<br>
      <br>
      Votequorum information<br>
      ----------------------<br>
      Expected votes:   3<br>
      Highest expected: 3<br>
      Total votes:      3<br>
      Quorum:           2  <br>
      Flags:            Quorate <br>
      <br>
      Membership information<br>
      ----------------------<br>
          Nodeid      Votes Name<br>
               1          1 clusterg01, clusterg01bkp (local)<br>
               2          1 clusterg02, clusterg02bkp<br>
               3          1 clusterg03, clusterg03bkp<br>
      <br>
      for node clusterg01, it gets fenced correctly:<br>
      <br>
      Pending Fencing Actions:<br>
        * reboot of clusterg01 pending: client=pacemaker-controld.906,
      origin=clusterg02<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
      /dev/mapper/volume01<br>
        PR generation=0x2e, Reservation follows:<br>
         Key = 0x59450001<br>
        scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
      /dev/mapper/volume01<br>
        PR generation=0x2e,     8 registered reservation keys follow:<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
      <br>
      and the watchdog reboots it... but it turns out that the node
      comes back with its key registered on just 1 path (instead of 4).
      I was wondering whether that is due to the asynchronous nature of
      the systemd + open-iscsi + multipath-tools + pacemaker service
      startup.<br>
      <br>
      Check:<br>
      <br>
      (k)rafaeldtinoco@clusterg01:~$ uptime<br>
       12:58:22 up 0 min,  0 users,  load average: 0.31, 0.09, 0.03<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -r
      /dev/mapper/volume01<br>
        PR generation=0x2f, Reservation follows:<br>
         Key = 0x59450001<br>
        scope = LU_SCOPE, type = Write Exclusive, registrants only<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
      /dev/mapper/volume01<br>
        PR generation=0x2f,     9 registered reservation keys follow:<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450000<br>
      <br>
      After this ^ I have to run:<br>
      <br>
      (k)rafaeldtinoco@clusterg01:~$ sudo mpathpersist --out --register
      --param-rk=0x59450000 /dev/mapper/volume01<br>
      persistent reserve out: scsi status: Reservation Conflict<br>
      PR out: command failed<br>
      <br>
      (k)rafaeldtinoco@clusterg01:~$ sudo fence_mpath -v -d
      /dev/mapper/volume01 -n 59450000 -o on<br>
      2020-06-01 12:59:46,388 INFO: Executing: /usr/sbin/mpathpersist -i
      -k -d /dev/mapper/volume01<br>
      <br>
      That is what it takes to get all the registrations correctly back
      in place after the fence:<br>
      <br>
      (k)rafaeldtinoco@clusterg03:~$ sudo mpathpersist --in -k
      /dev/mapper/volume01<br>
        PR generation=0x33,     12 registered reservation keys follow:<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450001<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450002<br>
          0x59450000<br>
          0x59450000<br>
          0x59450000<br>
          0x59450000<br>
      <br>
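      (As an aside, I think the plain --register above is expected to
      conflict per SPC: on the paths whose registration was lost, the
      key passed via --param-rk does not match any registered key, so
      REGISTER gets a RESERVATION CONFLICT. A REGISTER AND IGNORE
      EXISTING KEY does not perform that check, so the manual
      equivalent would be something like the sketch below; I have not
      verified whether that is exactly what the agent's "on" action
      runs internally.)<br>
      <br>
      $ sudo mpathpersist --out --register-ignore --param-sark=0x59450000 /dev/mapper/volume01<br>
      <br>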
      I was wondering whether having
      "RequiredBy=resource-agents-deps.target" in the [Install] section
      of open-iscsi.service and multipath-tools.service, together with
      "Before=resource-agents-deps.target" in their [Unit] sections,
      would be enough, but in this case it seems it is not.<br>
      <br>
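      In other words, shipping something like this fragment in both
      units (just a sketch of what I mean; whether this ordering alone
      is sufficient is exactly what I'm questioning):<br>
      <br>
      # in open-iscsi.service and multipath-tools.service<br>
      [Unit]<br>
      Before=resource-agents-deps.target<br>
      [Install]<br>
      RequiredBy=resource-agents-deps.target<br>
      <br>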
      Any idea why this happens? Did the agent start while only a
      single path to the disk was available, i.e. while the iSCSI
      sessions were still being established and multipath-tools had
      scanned just one path? I tend to think that, if that were the
      case, I would sometimes see 1 path registered, sometimes 2, and
      so on, rather than a single registered path every time (with 3
      registrations missing).<br>
      <br>
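      One way I plan to cross-check that theory (a sketch: it assumes
      sg3_utils is installed, and the unit names and the grep over the
      multipath topology are just quick hacks): read the registrations
      on each individual path right after the node comes back, and
      compare the service start times, e.g.:<br>
      <br>
      # which keys are registered on each underlying path of volume01?<br>
      $ for p in $(sudo multipath -ll volume01 | grep -o 'sd[a-z]*' | sort -u); do<br>
            echo "== /dev/$p"; sudo sg_persist --in -k "/dev/$p";<br>
        done<br>
      <br>
      # did pacemaker (and thus the agent's unfencing "on") start before<br>
      # open-iscsi/multipathd had brought up all 4 paths?<br>
      $ journalctl -b -u iscsid -u multipathd -u pacemaker -o short-monotonic | head -50<br>
      <br>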
      OR there is something else about PERSISTENT RESERVATIONS that I'm
      missing from SPC-3/4.<br>
      <br>
      Any thoughts?<br>
    </tt>
  </body>
</html>