[ClusterLabs] SLES cluster join fails with TLS handshake error

Reynolds, John F - San Mateo, CA - Contractor John.F.Reynolds2 at usps.gov
Tue Dec 31 16:22:47 EST 2019


I have reworked csync2's SSL keys, and I was able to use ha-cluster-join to add the second node to the cluster.   Thank you for the guidance!


However, not all the resources are happy with this.  

eagnmnmeqfc1:/var/lib/pacemaker/cib # crm status
Stack: corosync
Current DC: eagnmnmeqfc0 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Tue Dec 31 15:16:14 2019
Last change: Tue Dec 31 15:01:34 2019 by hacluster via crmd on eagnmnmeqfc0

2 nodes configured
16 resources configured

Online: [ eagnmnmeqfc0 eagnmnmeqfc1 ]

Full list of resources:

 Resource Group: grp_ncoa
     ncoa_dg_mqm        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_dg_a01        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_dg_a02        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_dg_a03        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_dg_a04        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_dg_a05        (ocf::heartbeat:LVM):   Started eagnmnmeqfc0
     ncoa_mqm   (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     ncoa_a01shared     (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     ncoa_a02shared     (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     ncoa_a03shared     (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     ncoa_a04shared     (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     ncoa_a05shared     (ocf::heartbeat:Filesystem):    Started eagnmnmeqfc0
     IP_56.76.161.36    (ocf::heartbeat:IPaddr2):       Started eagnmnmeqfc0
     ncoa_apache        (systemd:apache2):      Started eagnmnmeqfc0
     ncoa_dg_a00        (ocf::heartbeat:LVM):   FAILED[ eagnmnmeqfc0 eagnmnmeqfc1 ]
     ncoa_a00shared     (ocf::heartbeat:Filesystem):    FAILED eagnmnmeqfc0 (blocked)

Failed Actions:
* ncoa_a00shared_stop_0 on eagnmnmeqfc0 'unknown error' (1): call=206, status=complete, exitreason='Couldn't unmount /ncoa/qncoa/a00shared, giving up!',
    last-rc-change='Tue Dec 31 15:01:35 2019', queued=0ms, exec=7478ms
* ncoa_dg_a00_monitor_0 on eagnmnmeqfc1 'unknown error' (1): call=141, status=complete, exitreason='WARNING: vg_qncoa_noncloned-a00 is active without the cluster tag, "pacemaker"',
    last-rc-change='Tue Dec 31 15:01:34 2019', queued=0ms, exec=287ms

eagnmnmeqfc1:/var/lib
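
For what it's worth, the tag state the LVM agent is complaining about can be
checked by hand on each node.  These are stock LVM2 commands; the "pacemaker"
tag value is just what the error message above names, and the addtag/deltag
lines are only meant to illustrate what the agent does for exclusive
activation:

  # show the tags currently set on the volume group
  vgs -o vg_name,vg_tags vg_qncoa_noncloned-a00

  # the agent normally manages the tag itself, roughly like this:
  #   vgchange --addtag pacemaker vg_qncoa_noncloned-a00
  #   vgchange --deltag pacemaker vg_qncoa_noncloned-a00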

The PV and VG are present on both servers.  The resource is defined in cib.xml as:

        <primitive id="ncoa_dg_a00" class="ocf" provider="heartbeat" type="LVM">
          <instance_attributes id="ncoa_dg_a00-instance_attributes">
            <nvpair name="volgrpname" value="vg_qncoa_noncloned-a00" id="ncoa_dg_a00-instance_attributes-volgrpname"/>
            <nvpair name="exclusive" value="true" id="ncoa_dg_a00-instance_attributes-exclusive"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a00-monitor-60">
              <instance_attributes id="ncoa_dg_a00-monitor-60-instance_attributes">
                <nvpair name="is_managed" value="true" id="ncoa_dg_a00-monitor-60-instance_attributes-is_managed"/>
              </instance_attributes>
            </op>
          </operations>
        </primitive>
        <primitive id="ncoa_a00shared" class="ocf" provider="heartbeat" type="Filesystem">
          <instance_attributes id="ncoa_a00shared-instance_attributes">
            <nvpair name="device" value="/dev/vg_qncoa_noncloned-a00/lv_a00shared" id="ncoa_a00shared-instance_attributes-device"/>
            <nvpair name="directory" value="/ncoa/qncoa/a00shared" id="ncoa_a00shared-instance_attributes-directory"/>
            <nvpair name="fstype" value="xfs" id="ncoa_a00shared-instance_attributes-fstype"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="60" timeout="60" id="ncoa_a00shared-monitor-60"/>
          </operations>
        </primitive>

As far as I can see, this is the same as one of the resources that is working:

    <primitive id="ncoa_dg_a01" class="ocf" provider="heartbeat" type="LVM">
          <instance_attributes id="ncoa_dg_a01-instance_attributes">
            <nvpair name="volgrpname" value="vg_qncoa_noncloned-a01" id="ncoa_dg_a01-instance_attributes-volgrpname"/>
            <nvpair name="exclusive" value="true" id="ncoa_dg_a01-instance_attributes-exclusive"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a01-monitor-60">
              <instance_attributes id="ncoa_dg_a01-monitor-60-instance_attributes">
                <nvpair name="is_managed" value="true" id="ncoa_dg_a01-monitor-60-instance_attributes-is_managed"/>
              </instance_attributes>
            </op>
          </operations>
        </primitive>
        <primitive id="ncoa_dg_a02" class="ocf" provider="heartbeat" type="LVM">
          <instance_attributes id="ncoa_dg_a02-instance_attributes">
            <nvpair name="volgrpname" value="vg_ncoa_cloned-a02" id="ncoa_dg_a02-instance_attributes-volgrpname"/>
            <nvpair name="exclusive" value="true" id="ncoa_dg_a02-instance_attributes-exclusive"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a02-monitor-60">
              <instance_attributes id="ncoa_dg_a02-monitor-60-instance_attributes">
                <nvpair name="is_managed" value="true" id="ncoa_dg_a02-monitor-60-instance_attributes-is_managed"/>
              </instance_attributes>
            </op>
          </operations>
        </primitive>
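
(To convince myself the two definitions really only differ in the IDs and the
VG name, I would dump and diff them; "crm configure show <id>" is standard
crmsh, the temp file names are just illustrative:)

  crm configure show ncoa_dg_a00 > /tmp/ncoa_dg_a00.crm
  crm configure show ncoa_dg_a01 > /tmp/ncoa_dg_a01.crm
  diff /tmp/ncoa_dg_a00.crm /tmp/ncoa_dg_a01.crm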

"crm resource cleanup' doesn’t fix the problem.


Now, the /ncoa/qncoa/a00shared filesystem can't be unmounted because there are open files on it.  Could the problem simply be that the cluster join wanted to unmount and remount all the disk resources and, since it couldn't, flagged that as an error?
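
(To see what is holding the filesystem open, the usual tools work; fuser and
lsof are stock, and the path is taken from the stop failure above:)

  # processes with open files under the mount point
  fuser -vm /ncoa/qncoa/a00shared

  # or, with lsof
  lsof /ncoa/qncoa/a00shared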


Thank you.

John Reynolds

