[ClusterLabs] Fencing questions.

Mon Oct 19 10:51:28 EDT 2015

On 19/10/15 06:53 AM, Arjun Pandey wrote:
> Hi
> 
> I am running a 2 node cluster with this config on centos 6.5/6.6  where

It's important to keep both nodes on the same minor version,
particularly in this case. Please either upgrade the CentOS 6.5 node to
6.6, or upgrade both nodes to 6.7.
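
A quick way to confirm the two nodes match is to compare their release
strings. This is only a rough sketch: it assumes the string looks like
"CentOS release 6.5 (Final)", as /etc/redhat-release does on CentOS 6,
and the function names are made up for illustration.

```shell
# Extract "major.minor" from a release string such as
# "CentOS release 6.5 (Final)". Prints nothing if no version is found.
release_version() {
    printf '%s\n' "$1" |
        sed -n 's/^.* release \([0-9][0-9]*\.[0-9][0-9]*\).*$/\1/p'
}

# Succeed only if both release strings carry the same major.minor version.
same_minor() {
    a=$(release_version "$1")
    b=$(release_version "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

You could feed it the output of "cat /etc/redhat-release" collected from
each node (over ssh, for example) and treat a non-zero exit as a
mismatch.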

> i have a multi-state resource foo being run in master/slave mode and  a
> bunch of floating IP addresses configured. Additionally i have
> a collocation constraint for the IP addr to be collocated with the master.
> 
> Please find the following files attached 
> cluster.conf
> CIB

It's preferable on a mailing list to copy the text into the body of the
message. It's easier to read that way.

> Issues that i have :-
> 1. Daemons required for fencing
> Earlier we were invoking cman start quorum from pacemaker script which
> ensured that fenced / gfs and other daemons are not started. This was ok
> since fencing wasn't being handled earlier.

The cman fencing is simply a pass-through to pacemaker. When pacemaker
tells cman that fencing succeeded, cman informs DLM and begins cleanup.

> For fencing purpose do we only need the fenced to be started ?  We don't
> have any gfs partitions that we want to monitor via pacemaker. My
> concern here is that if i use the unmodified script then pacemaker start
> time increases significantly. I see a difference of 60 sec from the
> earlier startup before service pacemaker status shows up as started.

Don't start fenced manually; just start pacemaker and let it handle
everything. Ideally, use the pcs command (and the pcsd daemon on the
nodes) to start/stop the cluster, though that requires upgrading to 6.7.
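
As a rough sketch of what that looks like once pcs and pcsd are in
place (the wrapper function name is just for illustration, and it
assumes the nodes are already authenticated to each other in pcs):

```shell
# Start the cluster stack on every node, then print the cluster status.
# The status call only runs if the start command itself succeeded.
start_cluster() {
    pcs cluster start --all && pcs status
}
```

Stopping is the mirror image: "pcs cluster stop --all".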

> 2. Fencing test cases.
>  Based on the internet queries i could find , apart from plugging out
> the dedicated cable. The only other case suggested is killing corosync
> process on one of the nodes.
> Are there any other basic cases that i should look at ?
> What about bring up interface down manually ? I understand that this is
> an unlikely scenario but i am just looking for more ways to test this out.

Running 'echo c > /proc/sysrq-trigger' causes a kernel panic. It's my
preferred test. Also, killing the power to the node will cause IPMI
fencing to fail, because the BMC loses power along with the host. That
tests your backup fence method, if you have one, or confirms that the
cluster locks up if you don't (better to hang than to risk corruption).
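
For the kernel panic test, something along these lines works as root on
the node you want to kill. It's a sketch only: the SYSRQ_TRIGGER
override is my own addition so the function can be tried against a
scratch file first, and is not anything the kernel or cluster knows
about.

```shell
# Crash the local node on purpose so the peer has to fence it.
# SYSRQ_TRIGGER (hypothetical override) defaults to the real trigger file.
crash_node() {
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    sync                   # flush dirty pages; the crash is immediate
    echo c > "$trigger"    # 'c' asks the kernel to crash itself
}
```

Watch the logs on the surviving node while you do it; you should see
the fence request go out and the victim get rebooted.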

> 3. Testing whether fencing is working or not.
> Previously i have been using fence_ilo4 from the shell to test whether
> the command is working. I was assuming that similar invocation would be
> done by stonith when actual fencing needs to be done. 
> 
> However based on other threads i could find people also use fence_tool
> <node-name> to try this out. According to me this tests out whether
> fencing when invoked by fenced for a particular node succeeds or not. Is
> that valid ? 

fence_tool is just a command to control the cluster's fencing. The
fence_X agents do the actual work.
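
If you want to test each layer separately, a sketch like the following
covers the agent on its own, pacemaker's stonith, and the cman path.
The function names are made up, and the address, login and node name
you pass in are yours to fill in. Note the last two really do fence the
target node.

```shell
# Layer 1: call the agent directly; 'status' only queries power state.
# Arguments: management-board address, login, password.
test_agent() {
    fence_ilo4 -a "$1" -l "$2" -p "$3" -o status
}

# Layer 2: ask pacemaker's stonith to reboot the named node.
test_stonith() {
    stonith_admin --reboot "$1"
}

# Layer 3: ask cman to fence the named node, which goes through
# whatever fence device cluster.conf lists for it.
test_fenced() {
    fence_node "$1"
}
```

Start with test_agent; if the agent can't even report status, nothing
above it will work either.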

> Since we are configuring fence_pcmk as the fence device the flow of
> things is 
> fenced -> fence_pcmk -> stonith -> fence agent.

Basically correct.

> 4. Fencing agent to be used (fence_ipmilan vs fence_ilo4)
> Also for ILO fencing i see fence_ilo4 and fence_ipmilan both available.
> I had been using fence_ilo4 till now. 

Whichever works is fine. I believe a lot of the fence_X out-of-band
agents are actually just links to fence_ipmilan, but I might be wrong.
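
You can check this on your own nodes. A small sketch (the function name
is made up, and I'm assuming the agents live in /usr/sbin, which may
differ on your install):

```shell
# Report whether a fence agent path is a symlink and where it points.
agent_target() {
    if [ -L "$1" ]; then
        echo "$1 -> $(readlink "$1")"
    elif [ -e "$1" ]; then
        echo "$1 is not a symlink"
    else
        echo "$1 not found"
    fi
}
```

For example: agent_target /usr/sbin/fence_ilo4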

> I think this mail has multiple questions and i will probably send out
> another mail for a few issues i see after fencing takes place. 
> 
> Thanks in advance
> Arjun

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?