[Pacemaker] Fencing of bare-metal remote nodes

David Vossel dvossel at redhat.com
Wed Nov 26 16:26:22 EST 2014



----- Original Message -----
> 26.11.2014 18:36, David Vossel wrote:
> > 
> > 
> > ----- Original Message -----
> >> 25.11.2014 23:41, David Vossel wrote:
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> Hi!
> >>>>
> >>>> Is subj implemented?
> >>>>
> >>>> I tried echo c > /proc/sysrq-trigger on remote nodes and no fencing
> >>>> occurs.
> >>>
> >>> Yes, fencing remote-nodes works. Are you certain your fencing devices can
> >>> handle fencing the remote-node? Fencing a remote-node requires a cluster
> >>> node to invoke the agent that actually performs the fencing action on the
> >>> remote-node.
> >>
> >> Yes, if I invoke fencing action manually ('crm node fence <rnode>' in
> >> crmsh syntax), node is fenced. So the issue seems to be related to the
> >> detection of a "need fencing".
> >>
> >> Comments in related git commits are a little bit terse in this area. So
> >> could you please explain what exactly needs to happen on a remote node
> >> to initiate fencing?
> >>
> >> I tried so far:
> >> * kill pacemaker_remoted when no resources are running. systemd restarted
> >> it and crmd reconnected after some time.

This should definitely cause the remote-node to be fenced. I tested this
earlier today after reading you were having problems and my setup fenced
the remote-node correctly.

> >> * crash kernel when no resources are running

If a remote-node connection is lost and pacemaker was able to verify that the
node was clean before the connection was lost, pacemaker will attempt to
reconnect to the remote-node without issuing a fencing request.

I can see why either behavior, fencing or not fencing, could be desired in
this situation. Maybe I should make it an option.
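
To make the rule above concrete, here is a rough sketch of the decision in
Python. This is not pacemaker's actual code: the names Action and
decide_action are made up for illustration, and the "otherwise fence" branch
is an assumption based on the cases discussed in this thread.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"            # connection is fine, nothing to do
    RECONNECT = "reconnect"  # try to re-establish the remote connection
    FENCE = "fence"          # issue a fencing request for the remote-node


def decide_action(connection_lost: bool, verified_clean: bool) -> Action:
    """Hypothetical model of the behavior described in this mail.

    connection_lost: the cluster lost its connection to the remote-node.
    verified_clean:  the node was verified clean before the connection
                     was lost.
    """
    if not connection_lost:
        return Action.NONE
    if verified_clean:
        # Clean node: reconnect without issuing a fencing request.
        return Action.RECONNECT
    # Assumption: a lost connection to a node not known to be clean is fenced.
    return Action.FENCE


if __name__ == "__main__":
    for lost, clean in [(True, True), (True, False), (False, True)]:
        print(lost, clean, decide_action(lost, clean).value)
```

An option like the one mentioned above would, in this sketch, amount to
letting the verified-clean branch return FENCE instead of RECONNECT.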

> >> * crash kernel during massive start of resources

This should definitely cause the remote node to be fenced.

> > 
> > This last one should definitely cause fencing. What version of pacemaker
> > are you using? I've made changes in this area recently. Can you provide a
> > crm_report?
> 
> It's c191bf3.
> crm_report is ready, but I am still waiting for approval from a customer to
> send it.

Great. I really need to see what you all are doing. Outside of my own setup, I have
not seen many setups where pacemaker remote is deployed on bare-metal nodes. It is possible
something in your configuration exposes an edge case I haven't encountered yet.

There's a US holiday Thursday and Friday, so I won't be able to look at this until next
week.

-- Vossel

> 
> > 
> > -- David
> > 
> >>
> >> No fencing happened. In the last case the start actions 'hung' and
> >> failed by timeout (it is rather long); the node was not even listed as
> >> failed. My customer asked me to stop crashing nodes because one of them
> >> does not boot anymore (I "like" that modern UEFI hardware very much),
> >> so it is hard for me to play more with that.
> >>
> >> Best,
> >> Vladislav
> >>
> >>
> >>>
> >>> -- Vossel
> >>>
> >>>>
> >>>> Best,
> >>>> Vladislav
> >>>>
> >>>> _______________________________________________
> >>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>>>
> >>>> Project Home: http://www.clusterlabs.org
> >>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>>> Bugs: http://bugs.clusterlabs.org
> >>>>
> >>>
> >>
> >>
> > 
> 
> 



