[ClusterLabs] Interface confusion

Adam Budziński budzinski.adam at gmail.com
Tue Mar 19 10:55:42 EDT 2019


Hello Ken,

Thank you.

But if I have a two-node cluster and a working fencing mechanism, wouldn't
it be enough to disable the corosync and pacemaker services on both nodes,
so that when a node is fenced it won't come back up on its own?
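
What I have in mind is roughly the following, run on both nodes:

    systemctl disable corosync
    systemctl disable pacemaker

so that after a node is fenced (rebooted), the cluster stack does not
start again automatically until someone has looked at it.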

Thank you

On Mon, 18 Mar 2019 at 16:19, Ken Gaillot <kgaillot at redhat.com> wrote:

> On Sat, 2019-03-16 at 11:10 +0100, Adam Budziński wrote:
> > Hello Andrei,
> >
> > OK, I see your point. So per my understanding, if the resource (in
> > this case fence_vmware) is started successfully, it will be monitored
> > indefinitely, but as you said it will monitor the currently active
> > node. So how does the fence agent become aware of problems with the slave? I
>
> The fence agent doesn't monitor the active node, or any node -- it
> monitors the fence device.
>
> The cluster layer (i.e. corosync) monitors all nodes, and reports any
> issues to pacemaker, which will initiate fencing if necessary.
>
> Pacemaker also monitors each resource and fence device, via any
> recurring monitors that have been configured.
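>
> For example (a rough sketch only; the vCenter address, credentials and
> VM names below are placeholders, not your actual configuration), a
> VMware fence device with a recurring monitor could be created like:
>
>   pcs stonith create vmware-fence fence_vmware_soap \
>       ipaddr=vcenter.example.com login=fenceuser passwd=secret \
>       ssl=1 ssl_insecure=1 \
>       pcmk_host_map="srv1cr1:VM_SRV1;srv2cr1:VM_SRV2" \
>       op monitor interval=60s
>
> That monitor checks the fence device (the vCenter API here), not the
> other node.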
>
> > mean, if in a two-node cluster the cluster splits into two partitions,
> > will each of them fence the other, or does that happen because both
> > will assume they are the only survivor and thus need to fence the
> > other node, which is in an unknown state, so to speak?
>
> If both nodes are functional but can't see each other, they will each
> want to initiate fencing. If one of them is quicker than the other to
> determine this, the other one will get shot before it has a chance to
> do anything itself.
>
> However, there is the possibility that both nodes will shoot at about
> the same time, resulting in both nodes getting shot (a "stonith death
> match"). This is only a problem in 2-node clusters. There are a few
> ways around this (rough pcs sketches follow the list):
>
> 1. Configure two separate fence devices, each targeting one of the
> nodes, and put a delay on one of them (or a random delay on both). This
> makes it highly unlikely that they will shoot at the same time.
>
> 2. Configure a fencing topology with a fence heuristics device plus
> your real device. A fence heuristics device runs some test, and refuses
> to shoot the other node if the test fails. For example,
> fence_heuristics_ping tries to ping an IP address you give it; the idea
> is that if a node can't ping that IP, you don't want it to survive.
> This ensures that only a node that passes the test can shoot (which
> means there still might be some cases where the nodes can both shoot
> each other, and cases where the cluster will freeze because neither
> node can see the IP).
>
> 3. Configure corosync with qdevice to provide true quorum via a third
> host (which doesn't participate in the cluster otherwise).
>
> 4. Use sbd with a hardware watchdog and a shared storage device as the
> fencing device. This is not a reliable option with VMWare, but I'm
> listing it for the general case.
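>
> Very rough pcs sketches of 1-3 (device names, the ping target and the
> qnetd host are placeholders, and the fence_heuristics_ping parameter
> name is from memory; check 'pcs stonith describe fence_heuristics_ping'
> before relying on it):
>
>   # 1. random delay on the existing vmware fence device
>   pcs stonith update vmware-fence pcmk_delay_max=15
>
>   # 2. heuristics device plus topology: heuristics first, real device second
>   pcs stonith create ping-fence fence_heuristics_ping ping_targets=10.116.63.1
>   pcs stonith level add 1 srv1cr1 ping-fence,vmware-fence
>   pcs stonith level add 1 srv2cr1 ping-fence,vmware-fence
>
>   # 3. qdevice via a third host that already runs corosync-qnetd
>   pcs quorum device add model net host=qnetd.example.com algorithm=ffsplit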
>
>
> >
> > Thank you and Best Regards,
> > Adam
> >
> > On Sat, 16 Mar 2019 at 07:17, Andrei Borzenkov
> > <arvidjaar at gmail.com> wrote:
> > > On 16.03.2019 9:01, Adam Budziński wrote:
> > > > Thank you Andrei. The problem is that I can see with 'pcs status'
> > > > that resources are running on srv2cr1, but at the same time it is
> > > > telling me that fence_vmware_soap is running on srv1cr1. That's
> > > > somewhat confusing. Could you possibly explain this?
> > > >
> > >
> > > Two points.
> > >
> > > It is actually logical to have the stonith agent running on a
> > > different node than the node with active resources - because it is
> > > the *other* node that will initiate fencing when the node with
> > > active resources fails.
> > >
> > > But even considering the above, the active (running) state of a
> > > fence (or stonith) agent just determines on which node the recurring
> > > monitor operation will be started. The actual result of this monitor
> > > operation has no impact on subsequent stonith attempts and serves
> > > just as a warning to the administrator. When a stonith request
> > > comes, the agent may be used by any node where the stonith agent is
> > > not prohibited from running by (co-)location rules. My understanding
> > > is that this node is selected by the DC in the partition.
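> > >
> > > (As an illustration only, with a made-up device name: a constraint
> > > like
> > >
> > >     pcs constraint location vmware-fence avoids srv1cr1
> > >
> > > is the kind of rule that would prohibit that stonith device from
> > > running on srv1cr1.)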
> > >
> > > > Thank you!
> > > >
> > > > On Sat, 16 Mar 2019 at 05:37, Andrei Borzenkov
> > > > <arvidjaar at gmail.com> wrote:
> > > >
> > > >> On 16.03.2019 1:16, Adam Budziński wrote:
> > > >>> Hi Tomas,
> > > >>>
> > > >>> OK, but how then does pacemaker or the fence agent know which
> > > >>> route to take to reach the vCenter?
> > > >>
> > > >> They do not know or care at all. It is up to your underlying
> > > >> operating system and its routing tables.
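> > > >>
> > > >> (If you want to see which route the OS would pick, something like
> > > >> "ip route get <vCenter IP>" on each node shows it; substitute your
> > > >> actual vCenter address.)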
> > > >>
> > > >>> Btw, do I have to add the stonith resource on each of the nodes,
> > > >>> or is it enough to add it on just one, as for other resources?
> > > >>
> > > >> If your fencing agent can (and should) be able to run on any
> > > >> node, it should be enough to define it just once, as long as it
> > > >> can properly determine the "port" to use on the fencing "device"
> > > >> for a given node. There are cases where you may want to restrict
> > > >> the fencing agent to only a subset of nodes, or where you are
> > > >> forced to set a unique parameter for each node (consider an IPMI
> > > >> IP address); in those cases you need a separate instance of the
> > > >> agent for each node.
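> > > >>
> > > >> A sketch of the per-node case, with made-up IPMI addresses and
> > > >> credentials, just to contrast it with a single shared
> > > >> fence_vmware_soap instance:
> > > >>
> > > >>     pcs stonith create fence-srv1 fence_ipmilan ipaddr=192.0.2.11 \
> > > >>         login=admin passwd=secret pcmk_host_list=srv1cr1
> > > >>     pcs stonith create fence-srv2 fence_ipmilan ipaddr=192.0.2.12 \
> > > >>         login=admin passwd=secret pcmk_host_list=srv2cr1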
> > > >>
> > > >>> Thank you!
> > > >>>
> > > >>> On Fri, 15 Mar 2019 at 15:48, Tomas Jelinek
> > > >>> <tojeline at redhat.com> wrote:
> > > >>>
> > > >>>> On 15. 03. 19 at 15:09, Adam Budziński wrote:
> > > >>>>> Hello Tomas,
> > > >>>>>
> > > >>>>> Thank you! So far I have to say how great this community is; I
> > > >>>>> would never have expected such positive vibes! A big thank you,
> > > >>>>> you're doing a great job!
> > > >>>>>
> > > >>>>> Now let's talk business :)
> > > >>>>>
> > > >>>>> So if pcsd is using ring0 and it fails, will ring1 not be used
> > > >>>>> at all?
> > > >>>>
> > > >>>> Pcs and pcsd never use ring1, but they are just tools for
> > > >>>> managing clusters. You can have a perfectly functioning cluster
> > > >>>> without pcs and pcsd running or even installed; it would just be
> > > >>>> more complicated to set it up and manage it.
> > > >>>>
> > > >>>> Even if ring0 fails, you will be able to use pcs (in a somewhat
> > > >>>> limited manner), as most of its commands don't go through the
> > > >>>> network anyway.
> > > >>>>
> > > >>>> Corosync, which is the actual cluster messaging layer, will of
> > > >>>> course use ring1 in case of ring0 failure.
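> > > >>>>
> > > >>>> (You can check this on a node with "corosync-cfgtool -s"; it
> > > >>>> lists each ring and whether it is active or marked faulty.)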
> > > >>>>
> > > >>>>>
> > > >>>>> So with regard to VMware, that would mean that the interface
> > > >>>>> should be configured on a network that can access the vCenter
> > > >>>>> in order to fence, right? But wouldn't it then use only ring0,
> > > >>>>> so if that fails it wouldn't switch to ring1?
> > > >>>>
> > > >>>> If you are talking about pcmk_host_map, that does not really
> > > >>>> have anything to do with the network interfaces of cluster
> > > >>>> nodes. It maps node names (the parts before :) to "ports" of a
> > > >>>> fence device (the parts after :). Pcs-0.9.x does not support
> > > >>>> defining custom node names, therefore node names are the same as
> > > >>>> ring0 addresses.
> > > >>>>
> > > >>>> I am not an expert on fence agents / devices, but I'm sure
> > > >>>> someone else on this list will be able to help you with
> > > >>>> configuring fencing for your cluster.
> > > >>>>
> > > >>>>
> > > >>>> Tomas
> > > >>>>
> > > >>>>>
> > > >>>>> Thank you!
> > > >>>>>
> > > >>>>> On Fri, 15 Mar 2019 at 13:14, Tomas Jelinek
> > > >>>>> <tojeline at redhat.com> wrote:
> > > >>>>>
> > > >>>>>     On 15. 03. 19 at 12:32, Adam Budziński wrote:
> > > >>>>>      > Hello Folks,
> > > >>>>>      >
> > > >>>>>      > Two node active/passive VMware VM cluster.
> > > >>>>>      >
> > > >>>>>      > /etc/hosts
> > > >>>>>      >
> > > >>>>>      > 10.116.63.83    srv1
> > > >>>>>      > 10.116.63.84    srv2
> > > >>>>>      > 172.16.21.12    srv2cr1
> > > >>>>>      > 172.16.22.12    srv2cr2
> > > >>>>>      > 172.16.21.11    srv1cr1
> > > >>>>>      > 172.16.22.11    srv1cr2
> > > >>>>>      >
> > > >>>>>      > I have 3 NICs on each VM:
> > > >>>>>      >
> > > >>>>>      > 10.116.63.83 srv1 and 10.116.63.84 srv2 are networks used
> > > >>>>>      > to access the VMs via SSH or any resource directly, if not
> > > >>>>>      > via a VIP.
> > > >>>>>      >
> > > >>>>>      > Everything with cr in its name is used for corosync
> > > >>>>>      > communication, so basically I have two rings (these are
> > > >>>>>      > two non-routable networks just for that).
> > > >>>>>      >
> > > >>>>>      > My questions are:
> > > >>>>>      >
> > > >>>>>      > 1. With 'pcs cluster auth', which interface / interfaces
> > > >>>>>      > should I use?
> > > >>>>>
> > > >>>>>     Hi Adam,
> > > >>>>>
> > > >>>>>     I can see you are using pcs-0.9.x. In that case you should do:
> > > >>>>>     pcs cluster auth srv1cr1 srv2cr1
> > > >>>>>
> > > >>>>>     In other words, use the first address of each node.
> > > >>>>>     Authenticating all the other addresses should not cause any
> > > >>>>>     issues. It is pointless, though, as pcs only communicates
> > > >>>>>     via ring0 addresses.
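> > > >>>>>
> > > >>>>>     (For completeness, the full form is something like
> > > >>>>>     "pcs cluster auth srv1cr1 srv2cr1 -u hacluster -p <password>";
> > > >>>>>     without -u/-p it will simply prompt for the credentials.)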
> > > >>>>>
> > > >>>>>      >
> > > >>>>>      > 2. With 'pcs cluster setup --name' I would use the
> > > >>>>>      > corosync interfaces, e.g. 'pcs cluster setup --name
> > > >>>>>      > MyCluster srv1cr1,srv1cr2 srv2cr1,srv2cr2', right?
> > > >>>>>
> > > >>>>>     Yes, that is correct.
> > > >>>>>
> > > >>>>>      >
> > > >>>>>      > 3. With fence_vmware_soap, in
> > > >>>>>      > pcmk_host_map="X:VM_C;X:VM:OTRS_D", which interface
> > > >>>>>      > should replace X?
> > > >>>>>
> > > >>>>>     X should be replaced by node names as seen by pacemaker.
> > > >>>>>     Once you set up and start your cluster, run 'pcs status' to
> > > >>>>>     get (amongst other info) the node names. In your
> > > >>>>>     configuration, they should be srv1cr1 and srv2cr1.
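> > > >>>>>
> > > >>>>>     (So the map would end up looking something like
> > > >>>>>     pcmk_host_map="srv1cr1:<VM name of srv1>;srv2cr1:<VM name of srv2>",
> > > >>>>>     with the VM names exactly as vCenter knows them; the
> > > >>>>>     placeholders here are not your real VM names.)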
> > > >>>>>
> > > >>>>>
> > > >>>>>     Regards,
> > > >>>>>     Tomas
> > > >>>>>
> > > >>>>>      >
> > > >>>>>      > Thank you!
> > > >>>>>      >
> > > >>>>>      >
> --
> Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>