<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 09/05/2016 03:02 PM, Gabriele Bulfon
wrote:<br>
</div>
<blockquote
cite="mid:13621319.53.1473080520636.JavaMail.sonicle@www"
type="cite">
<div style="font-family: Arial; font-size: 13;">I read docs, looks
like sbd fencing is more about iscsi/fc exposed storage
resources.<br>
Here I have real shared disks (seen from solaris with the format
utility as normal sas disks, but on both nodes).<br>
They are all jbod disks, that ZFS organizes in raidz/mirror
pools, so I have 5 disks on one pool in one node, and the other
5 disks on another pool in one node.<br>
How can sbd work in this situation? Has it already been
used/tested on a Solaris env with ZFS ?<br>
</div>
</blockquote>
<br>
You wouldn't need disks at all for sbd: you can use it just to have<br>
pacemaker supervised by a hardware watchdog.<br>
But if you do want to add disks, it shouldn't really matter how they are<br>
accessed, as long as both nodes can concurrently read/write the block devices.<br>
The caching configuration in the controllers might need attention as well.<br>
As an example, I'm currently testing with a simple KVM setup using the<br>
following virsh config for the shared block device:<br>
<br>
<disk type='file' device='disk'><br>
<driver name='qemu' type='raw' cache='none'/><br>
<source file='SHARED_IMAGE_FILE'/><br>
<target dev='vdb' bus='virtio'/><br>
<shareable/><br>
<address type='pci' domain='0x0000' bus='0x00' slot='0x15'
function='0x0'/><br>
</disk><br>
<br>
I don't know about test coverage for sbd on Solaris. It should be<br>
independent of which file system you are using, though, since sbd uses a<br>
raw partition without any filesystem on it anyway.<br>
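<br>
For illustration only, a minimal disk-based setup could look roughly like<br>
this - the device path is a placeholder, and the exact sysconfig location<br>
and agent names should be checked against what ships on your platform:<br>
<br>
# dedicate a small raw slice (a few MB, no filesystem) and initialize it once:<br>
sbd -d /dev/rdsk/cXtYdZs0 create<br>
# point the sbd daemon at it on both nodes (/etc/sysconfig/sbd or equivalent):<br>
SBD_DEVICE="/dev/rdsk/cXtYdZs0"<br>
SBD_WATCHDOG_DEV="/dev/watchdog"<br>
<br>
With that in place pacemaker still has to be told it may rely on sbd for<br>
fencing - via a suitable stonith resource or the stonith-watchdog-timeout<br>
property, depending on the sbd version.<br>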
<br>
<blockquote
cite="mid:13621319.53.1473080520636.JavaMail.sonicle@www"
type="cite">
<div style="font-family: Arial; font-size: 13;"><br>
BTW, is there any other possibility besides sbd?<br>
<br>
</div>
</blockquote>
<br>
Probably - see Ken's suggestions.<br>
Excuse me for thinking a little one-dimensionally at the moment, as I'm<br>
working on an sbd issue ;-)<br>
But when you don't have a proper fencing device, a watchdog is the last<br>
resort for getting something that works reliably. And Pacemaker's way of<br>
using a watchdog is sbd...<br>
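<br>
And just to sketch the watchdog-only (diskless) variant - assuming a recent<br>
enough sbd build and a watchdog device exposed by the platform - the<br>
essentials would be roughly:<br>
<br>
# /etc/sysconfig/sbd (or equivalent) - no SBD_DEVICE, watchdog only:<br>
SBD_WATCHDOG_DEV="/dev/watchdog"<br>
SBD_WATCHDOG_TIMEOUT="5"<br>
# and tell pacemaker it may assume watchdog-based self-fencing:<br>
crm configure property stonith-watchdog-timeout=10s<br>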
<br>
<blockquote
cite="mid:13621319.53.1473080520636.JavaMail.sonicle@www"
type="cite">
<div style="font-family: Arial; font-size: 13;">Last but not
least, is there any way to let ssh-fencing be considered good?<br>
At the moment, with ssh-fencing, if I shut down the second node,
I get all second resources in UNCLEAN state, not taken by the
first one.<br>
If I reboot the second , I only get the node on again, but
resources remain stopped.<br>
</div>
</blockquote>
<br>
Strange... What do the logs say - was the fencing action reported as<br>
successful or not?<br>
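<br>
If it is unclear where to look, something along these lines usually helps<br>
(the log location will differ depending on how corosync/pacemaker were<br>
built on your platform):<br>
<br>
# fencing history as pacemaker recorded it:<br>
stonith_admin --history xstorage2<br>
# and the stonith-related messages on the surviving node:<br>
grep -iE "stonith|fence" /path/to/your/pacemaker-or-corosync.log<br>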
<br>
<blockquote
cite="mid:13621319.53.1473080520636.JavaMail.sonicle@www"
type="cite">
<div style="font-family: Arial; font-size: 13;"><br>
I remember that my tests with heartbeat reacted differently (a halt would
move everything to node1 and bring everything back on restart)<br>
<br>
Gabriele<br>
<br>
<div id="wt-mailcard">
<div style="font-family: Arial;">----------------------------------------------------------------------------------------<br>
</div>
<div style="font-family: Arial;"><b>Sonicle S.r.l. </b>: <a
moz-do-not-send="true" href="http://www.sonicle.com/"
target="_new"><a class="moz-txt-link-freetext" href="http://www.sonicle.com">http://www.sonicle.com</a></a></div>
<div style="font-family: Arial;"><b>Music: </b><a
moz-do-not-send="true"
href="http://www.gabrielebulfon.com/" target="_new"><a class="moz-txt-link-freetext" href="http://www.gabrielebulfon.com">http://www.gabrielebulfon.com</a></a></div>
<div style="font-family: Arial;"><b>Quantum Mechanics : </b><a
moz-do-not-send="true"
href="http://www.cdbaby.com/cd/gabrielebulfon"
target="_new"><a class="moz-txt-link-freetext" href="http://www.cdbaby.com/cd/gabrielebulfon">http://www.cdbaby.com/cd/gabrielebulfon</a></a></div>
</div>
<tt><br>
<br>
<br>
----------------------------------------------------------------------------------<br>
<br>
Da: Klaus Wenninger <a class="moz-txt-link-rfc2396E" href="mailto:kwenning@redhat.com"><kwenning@redhat.com></a><br>
A: <a class="moz-txt-link-abbreviated" href="mailto:users@clusterlabs.org">users@clusterlabs.org</a> <br>
Data: 5 settembre 2016 12.21.25 CEST<br>
Oggetto: Re: [ClusterLabs] ip clustering strange behaviour<br>
<br>
</tt>
<blockquote style="BORDER-LEFT: #000080 2px solid; MARGIN-LEFT:
5px; PADDING-LEFT: 5px"><tt>On 09/05/2016 11:20 AM, Gabriele
Bulfon wrote:<br>
> The dual machine is equipped with a syncro controller
LSI 3008 MPT SAS3.<br>
> Both nodes can see the same jbod disks (10 at the
moment, up to 24).<br>
> Systems are XStreamOS / illumos, with ZFS.<br>
> Each system has one ZFS pool of 5 disks, with different
pool names<br>
> (data1, data2).<br>
> When in active / active, the two machines run different
zones and<br>
> services on their pools, on their networks.<br>
> I have custom resource agents (tested on
pacemaker/heartbeat, now<br>
> porting to pacemaker/corosync) for ZFS pools and zones
migration.<br>
> When I was testing pacemaker/heartbeat, when
ssh-fencing discovered<br>
> the other node to be down (cleanly or abrupt halt), it
was<br>
> automatically using IPaddr and our ZFS agents to take
control of<br>
> everything, mounting the other pool and running any
configured zone in it.<br>
> I would like to do the same with pacemaker/corosync.<br>
> The two nodes of the dual machine have an internal LAN connecting them,<br>
> a 100Mb ethernet: maybe this is reliable enough to trust ssh-fencing?<br>
> Or is there anything I can do to ensure at the
controller level that<br>
> the pool is not in use on the other node?<br>
<br>
It is not just the reliability of the networking connection that makes<br>
ssh-fencing suboptimal. Something in the IP-stack config (dynamic due to<br>
moving resources) might have gone wrong, or resources might be hanging so<br>
that the node can't be brought down gracefully. Hence my suggestion to add<br>
a watchdog (where available) via sbd.<br>
<br>
><br>
> Gabriele<br>
><br>
><br>
><br>
><br>
>
----------------------------------------------------------------------------------<br>
><br>
> Da: Ken Gaillot <a class="moz-txt-link-rfc2396E" href="mailto:kgaillot@redhat.com"><kgaillot@redhat.com></a><br>
> A: <a class="moz-txt-link-abbreviated" href="mailto:gbulfon@sonicle.com">gbulfon@sonicle.com</a> Cluster Labs - All topics
related to<br>
> open-source clustering welcomed
<a class="moz-txt-link-rfc2396E" href="mailto:users@clusterlabs.org"><users@clusterlabs.org></a><br>
> Data: 1 settembre 2016 15.49.04 CEST<br>
> Oggetto: Re: [ClusterLabs] ip clustering strange
behaviour<br>
><br>
> On 08/31/2016 11:50 PM, Gabriele Bulfon wrote:<br>
> > Thanks, got it.<br>
> > So, is it better to use "two_node: 1" or, as
suggested else<br>
> where, or<br>
> > "no-quorum-policy=stop"?<br>
><br>
> I'd prefer "two_node: 1" and letting pacemaker's
options default. But<br>
> see the votequorum(5) man page for what two_node
implies -- most<br>
> importantly, both nodes have to be available when the
cluster starts<br>
> before it will start any resources. Node failure is
handled fine once<br>
> the cluster has started, but at start time, both nodes
must be up.<br>
><br>
> > About fencing, the machine I'm going to implement
the 2-nodes<br>
> cluster is<br>
> > a dual machine with shared disks backend.<br>
> > Each node has two 10Gb ethernets dedicated to the
public ip and the<br>
> > admin console.<br>
> > Then there is a third 100Mb ethernet connecting the two machines<br>
> internally.<br>
> > I was going to use this last one as fencing via
ssh, but looks<br>
> like this<br>
> > way I'm not gonna have ip/pool/zone movements if
one of the nodes<br>
> > freezes or halts without shutting down pacemaker
clean.<br>
> > What should I use instead?<br>
><br>
> I'm guessing as a dual machine, they share a power
supply, so that<br>
> rules<br>
> out a power switch. If the box has IPMI that can
individually power<br>
> cycle each host, you can use fence_ipmilan. If the
disks are<br>
> shared via<br>
> iSCSI, you could use fence_scsi. If the box has a
hardware watchdog<br>
> device that can individually target the hosts, you
could use sbd. If<br>
> none of those is an option, probably the best you could
do is run the<br>
> cluster nodes as VMs on each host, and use fence_xvm.<br>
><br>
> > Thanks for your help,<br>
> > Gabriele<br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
>
----------------------------------------------------------------------------------<br>
> ><br>
> > Da: Ken Gaillot <a class="moz-txt-link-rfc2396E" href="mailto:kgaillot@redhat.com"><kgaillot@redhat.com></a><br>
> > A: <a class="moz-txt-link-abbreviated" href="mailto:users@clusterlabs.org">users@clusterlabs.org</a><br>
> > Data: 31 agosto 2016 17.25.05 CEST<br>
> > Oggetto: Re: [ClusterLabs] ip clustering strange
behaviour<br>
> ><br>
> > On 08/30/2016 01:52 AM, Gabriele Bulfon wrote:<br>
> > > Sorry for reiterating, but my main question
was:<br>
> > ><br>
> > > why does node 1 remove its own IP if I shut down node 2 abruptly?<br>
> > > I understand that it does not take the node 2
IP (because the<br>
> > > ssh-fencing has no clue about what happened
on the 2nd node),<br>
> but I<br>
> > > wouldn't expect it to shut down its own
IP...this would kill any<br>
> > service<br>
> > > on both nodes...where am I going wrong?<br>
> ><br>
> > Assuming you're using corosync 2, be sure you have
"two_node: 1" in<br>
> > corosync.conf. That will tell corosync to pretend
there is always<br>
> > quorum, so pacemaker doesn't need any special
quorum settings.<br>
> See the<br>
> > votequorum(5) man page for details. Of course, you
need fencing<br>
> in this<br>
> > setup, to handle when communication between the
nodes is broken<br>
> but both<br>
> > are still up.<br>
> ><br>
> > ><br>
> ><br>
> > ><br>
> > ><br>
> ><br>
>
------------------------------------------------------------------------<br>
> > ><br>
> > ><br>
> > > *Da:* Gabriele Bulfon
<a class="moz-txt-link-rfc2396E" href="mailto:gbulfon@sonicle.com"><gbulfon@sonicle.com></a><br>
> > > *A:* <a class="moz-txt-link-abbreviated" href="mailto:kwenning@redhat.com">kwenning@redhat.com</a> Cluster Labs - All
topics related to<br>
> > > open-source clustering welcomed
<a class="moz-txt-link-rfc2396E" href="mailto:users@clusterlabs.org"><users@clusterlabs.org></a><br>
> > > *Data:* 29 agosto 2016 17.37.36 CEST<br>
> > > *Oggetto:* Re: [ClusterLabs] ip clustering
strange behaviour<br>
> > ><br>
> > ><br>
> > > Ok, got it, I hadn't gracefully shut
pacemaker on node2.<br>
> > > Now I restarted, everything was up, stopped
pacemaker service on<br>
> > > host2 and I got host1 with both IPs
configured. ;)<br>
> > ><br>
> > > But, though I understand that if I halt host2 without a graceful<br>
> > > shutdown of pacemaker, it will not move IP2 to host1, I don't expect<br>
> > > host1 to lose its own IP! Why?<br>
> > ><br>
> > > Gabriele<br>
> > ><br>
> > ><br>
> ><br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> ><br>
>
----------------------------------------------------------------------------------<br>
> > ><br>
> > > Da: Klaus Wenninger
<a class="moz-txt-link-rfc2396E" href="mailto:kwenning@redhat.com"><kwenning@redhat.com></a><br>
> > > A: <a class="moz-txt-link-abbreviated" href="mailto:users@clusterlabs.org">users@clusterlabs.org</a><br>
> > > Data: 29 agosto 2016 17.26.49 CEST<br>
> > > Oggetto: Re: [ClusterLabs] ip clustering
strange behaviour<br>
> > ><br>
> > > On 08/29/2016 05:18 PM, Gabriele Bulfon
wrote:<br>
> > > > Hi,<br>
> > > ><br>
> > > > now that I have IPaddr work, I have a
strange behaviour on<br>
> my test<br>
> > > > setup of 2 nodes, here is my
configuration:<br>
> > > ><br>
> > > > ===STONITH/FENCING===<br>
> > > ><br>
> > > > primitive xstorage1-stonith
stonith:external/ssh-sonicle op<br>
> > > monitor<br>
> > > > interval="25" timeout="25"
start-delay="25" params<br>
> > > hostlist="xstorage1"<br>
> > > ><br>
> > > > primitive xstorage2-stonith
stonith:external/ssh-sonicle op<br>
> > > monitor<br>
> > > > interval="25" timeout="25"
start-delay="25" params<br>
> > > hostlist="xstorage2"<br>
> > > ><br>
> > > > location xstorage1-stonith-pref
xstorage1-stonith -inf:<br>
> xstorage1<br>
> > > > location xstorage2-stonith-pref
xstorage2-stonith -inf:<br>
> xstorage2<br>
> > > ><br>
> > > > property stonith-action=poweroff<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > > ===IP RESOURCES===<br>
> > > ><br>
> > > ><br>
> > > > primitive xstorage1_wan1_IP
ocf:heartbeat:IPaddr params<br>
> > > ip="1.2.3.4"<br>
> > > > cidr_netmask="255.255.255.0"
nic="e1000g1"<br>
> > > > primitive xstorage2_wan2_IP
ocf:heartbeat:IPaddr params<br>
> > > ip="1.2.3.5"<br>
> > > > cidr_netmask="255.255.255.0"
nic="e1000g1"<br>
> > > ><br>
> > > > location xstorage1_wan1_IP_pref
xstorage1_wan1_IP 100: xstorage1<br>
> > > > location xstorage2_wan2_IP_pref
xstorage2_wan2_IP 100: xstorage2<br>
> > > ><br>
> > > > ===================<br>
> > > ><br>
> > > > So I plumbed e1000g1 with unconfigured
IP on both machines and<br>
> > > started<br>
> > > > corosync/pacemaker, and after some time
I got all nodes<br>
> online and<br>
> > > > started, with IP configured as virtual
interfaces (e1000g1:1 and<br>
> > > > e1000g1:2) one in host1 and one in
host2.<br>
> > > ><br>
> > > > Then I halted host2, and I expected to
have host1 started with<br>
> > > both<br>
> > > > IPs configured on host1.<br>
> > > > Instead, host1 came up with its IP stopped and removed (only<br>
> > > > e1000g1 left unconfigured), while host2, though stopped, was still<br>
> > > > reported as having its IP started (!?).<br>
> > > > Not exactly what I expected...<br>
> > > > What's wrong?<br>
> > ><br>
> > > How did you stop host2? A graceful shutdown of pacemaker? If not ...<br>
> > > Anyway, ssh-fencing only works if the machine is still running ...<br>
> > > So it will stay unclean, and thus pacemaker thinks that the IP might<br>
> > > still be running on it. So this is actually the expected behavior.<br>
> > > You might add a watchdog via sbd if you don't have other fencing<br>
> > > hardware at hand ...<br>
> > > ><br>
> > > > Here is the crm status after I stopped
host 2:<br>
> > > ><br>
> > > > 2 nodes and 4 resources configured<br>
> > > ><br>
> > > > Node xstorage2: UNCLEAN (offline)<br>
> > > > Online: [ xstorage1 ]<br>
> > > ><br>
> > > > Full list of resources:<br>
> > > ><br>
> > > > xstorage1-stonith
(stonith:external/ssh-sonicle): Started<br>
> > > xstorage2<br>
> > > > (UNCLEAN)<br>
> > > > xstorage2-stonith
(stonith:external/ssh-sonicle): Stopped<br>
> > > > xstorage1_wan1_IP
(ocf::heartbeat:IPaddr): Stopped<br>
> > > > xstorage2_wan2_IP
(ocf::heartbeat:IPaddr): Started xstorage2<br>
> > > (UNCLEAN)<br>
> > > ><br>
> > > ><br>
> > > > Gabriele<br>
> > > ><br>
> > > ><br>
> > ><br>
> ><br>
><br>
><br>
><br>
><br>
><br>
<br>
<br>
_______________________________________________<br>
Users mailing list: <a class="moz-txt-link-abbreviated" href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>
<a class="moz-txt-link-freetext" href="http://clusterlabs.org/mailman/listinfo/users">http://clusterlabs.org/mailman/listinfo/users</a><br>
<br>
Project Home: <a class="moz-txt-link-freetext" href="http://www.clusterlabs.org">http://www.clusterlabs.org</a><br>
Getting started:
<a class="moz-txt-link-freetext" href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a class="moz-txt-link-freetext" href="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</a><br>
<br>
<br>
</tt></blockquote>
</div>
</blockquote>
<br>
</body>
</html>