<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.16.3">
</HEAD>
<BODY>
Digimer,<BR>
<BR>
Ok, sounds reasonable and I will investigate this further on Jan 2. WRT DRBD ... geeee, I don't recall multiple fencings. I'll check that also on Jan 2.<BR>
<BR>
Emmanuel,<BR>
<BR>
I have not seen pending fencing operations with "dlm_tool ls" ... but I have seen the word "pending" elsewhere (crm_mon?) without considering that it might be fencing that is pending. Interesting.<BR>
<BR>
Thanks &amp; my best wishes for a healthy new year.<BR>
Bob Haxo<BR>
<BR>
<BR>
On Wed, 2014-01-01 at 00:19 -0500, Digimer wrote:
<BLOCKQUOTE TYPE=CITE>
<PRE>
<FONT COLOR="#000000">This is probably because cman (which is its own cluster stack and used </FONT>
<FONT COLOR="#000000">to provide DLM and quorum to pacemaker on EL6) detected the node failed </FONT>
<FONT COLOR="#000000">after the initial fence and called its own fence. You see a similar </FONT>
<FONT COLOR="#000000">behaviour when using DRBD. It will also call a fence when the peer dies </FONT>
<FONT COLOR="#000000">(even when it died because of a controlled fence call). In theory, </FONT>
<FONT COLOR="#000000">pacemaker using cman's dlm with DRBD would trigger three fences per </FONT>
<FONT COLOR="#000000">failure. :)</FONT>
<FONT COLOR="#000000">digimer</FONT>
<FONT COLOR="#000000">On 01/01/14 12:04 AM, emmanuel segura wrote:</FONT>
<FONT COLOR="#000000">> Maybe you are missing the log from when you fenced the node? I think</FONT>
<FONT COLOR="#000000">> clvmd hung because your node is in an unclean state. Use dlm_tool ls to</FONT>
<FONT COLOR="#000000">> see if you have any pending fencing operations.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> 2014/1/1 Bob Haxo &lt;<A HREF="mailto:bhaxo@sgi.com">bhaxo@sgi.com</A>&gt;</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Greetings ... Happy New Year!</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> I am testing a configuration that is created from the example in</FONT>
<FONT COLOR="#000000">> "Chapter 6. Configuring a GFS2 File System in a Cluster" of the "Red</FONT>
<FONT COLOR="#000000">> Hat Enterprise Linux 7.0 Beta Global File System 2" document. The only</FONT>
<FONT COLOR="#000000">> addition is stonith:fence_ipmilan. After encountering this issue</FONT>
<FONT COLOR="#000000">> when I configured with "crm", I re-configured using "pcs". I've</FONT>
<FONT COLOR="#000000">> included the configuration below.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> I'm thinking that, in a 2-node cluster, if I run "stonith_admin -F</FONT>
<FONT COLOR="#000000">> <peer-node>", then <peer-node> should reboot and cleanly rejoin the</FONT>
<FONT COLOR="#000000">> cluster. This is not happening.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> What ultimately happens is that after the initially fenced node</FONT>
<FONT COLOR="#000000">> reboots, the system from which the stonith_admin -F command was run</FONT>
<FONT COLOR="#000000">> is fenced and reboots. The fencing stops there, leaving the cluster</FONT>
<FONT COLOR="#000000">> in an appropriate state.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> The issue seems to reside with clvmd/lvm. With the reboot of the</FONT>
<FONT COLOR="#000000">> initially fenced node, the clvmd resource fails on the surviving</FONT>
<FONT COLOR="#000000">> node, with a maximum of errors. I hypothesize there is an issue</FONT>
<FONT COLOR="#000000">> with locks, but have insufficient knowledge of clvmd/lvm locks to</FONT>
<FONT COLOR="#000000">> prove or disprove this hypothesis.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Have I missed something ...</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> 1) Is this expected behavior, and does the reboot of the fencing</FONT>
<FONT COLOR="#000000">> node always happen?</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> 2) Or, maybe I didn't correctly duplicate the Chapter 6 example?</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> 3) Or, perhaps something is wrong or omitted from the Chapter 6 example?</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Suggestions will be much appreciated.</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Thanks,</FONT>
<FONT COLOR="#000000">> Bob Haxo</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> RHEL6.5</FONT>
<FONT COLOR="#000000">> pacemaker-cli-1.1.10-14.el6_5.1.x86_64</FONT>
<FONT COLOR="#000000">> crmsh-1.2.5-55.1sgi709r3.rhel6.x86_64</FONT>
<FONT COLOR="#000000">> pacemaker-libs-1.1.10-14.el6_5.1.x86_64</FONT>
<FONT COLOR="#000000">> cman-3.0.12.1-59.el6_5.1.x86_64</FONT>
<FONT COLOR="#000000">> pacemaker-1.1.10-14.el6_5.1.x86_64</FONT>
<FONT COLOR="#000000">> corosynclib-1.4.1-17.el6.x86_64</FONT>
<FONT COLOR="#000000">> corosync-1.4.1-17.el6.x86_64</FONT>
<FONT COLOR="#000000">> pacemaker-cluster-libs-1.1.10-14.el6_5.1.x86_64</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Cluster Name: mici</FONT>
<FONT COLOR="#000000">> Corosync Nodes:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Pacemaker Nodes:</FONT>
<FONT COLOR="#000000">> mici-admin mici-admin2</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Resources:</FONT>
<FONT COLOR="#000000">> Clone: clusterfs-clone</FONT>
<FONT COLOR="#000000">> Meta Attrs: interleave=true target-role=Started</FONT>
<FONT COLOR="#000000">> Resource: clusterfs (class=ocf provider=heartbeat type=Filesystem)</FONT>
<FONT COLOR="#000000">> Attributes: device=/dev/vgha2/lv_clust2 directory=/images</FONT>
<FONT COLOR="#000000">> fstype=gfs2 options=defaults,noatime,nodiratime</FONT>
<FONT COLOR="#000000">> Operations: monitor on-fail=fence interval=30s</FONT>
<FONT COLOR="#000000">> (clusterfs-monitor-interval-30s)</FONT>
<FONT COLOR="#000000">> Clone: clvmd-clone</FONT>
<FONT COLOR="#000000">> Meta Attrs: interleave=true ordered=true target-role=Started</FONT>
<FONT COLOR="#000000">> Resource: clvmd (class=lsb type=clvmd)</FONT>
<FONT COLOR="#000000">> Operations: monitor on-fail=fence interval=30s</FONT>
<FONT COLOR="#000000">> (clvmd-monitor-interval-30s)</FONT>
<FONT COLOR="#000000">> Clone: dlm-clone</FONT>
<FONT COLOR="#000000">> Meta Attrs: interleave=true ordered=true</FONT>
<FONT COLOR="#000000">> Resource: dlm (class=ocf provider=pacemaker type=controld)</FONT>
<FONT COLOR="#000000">> Operations: monitor on-fail=fence interval=30s</FONT>
<FONT COLOR="#000000">> (dlm-monitor-interval-30s)</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Stonith Devices:</FONT>
<FONT COLOR="#000000">> Resource: p_ipmi_fencing_1 (class=stonith type=fence_ipmilan)</FONT>
<FONT COLOR="#000000">> Attributes: ipaddr=128.##.##.78 login=XXXXX passwd=XXXXX</FONT>
<FONT COLOR="#000000">> lanplus=1 action=reboot pcmk_host_check=static-list</FONT>
<FONT COLOR="#000000">> pcmk_host_list=mici-admin</FONT>
<FONT COLOR="#000000">> Meta Attrs: target-role=Started</FONT>
<FONT COLOR="#000000">> Operations: monitor start-delay=30 interval=60s timeout=30</FONT>
<FONT COLOR="#000000">> (p_ipmi_fencing_1-monitor-60s)</FONT>
<FONT COLOR="#000000">> Resource: p_ipmi_fencing_2 (class=stonith type=fence_ipmilan)</FONT>
<FONT COLOR="#000000">> Attributes: ipaddr=128.##.##.220 login=XXXXX passwd=XXXXX</FONT>
<FONT COLOR="#000000">> lanplus=1 action=reboot pcmk_host_check=static-list</FONT>
<FONT COLOR="#000000">> pcmk_host_list=mici-admin2</FONT>
<FONT COLOR="#000000">> Meta Attrs: target-role=Started</FONT>
<FONT COLOR="#000000">> Operations: monitor start-delay=30 interval=60s timeout=30</FONT>
<FONT COLOR="#000000">> (p_ipmi_fencing_2-monitor-60s)</FONT>
<FONT COLOR="#000000">> Fencing Levels:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Location Constraints:</FONT>
<FONT COLOR="#000000">> Resource: p_ipmi_fencing_1</FONT>
<FONT COLOR="#000000">> Disabled on: mici-admin (score:-INFINITY)</FONT>
<FONT COLOR="#000000">> (id:location-p_ipmi_fencing_1-mici-admin--INFINITY)</FONT>
<FONT COLOR="#000000">> Resource: p_ipmi_fencing_2</FONT>
<FONT COLOR="#000000">> Disabled on: mici-admin2 (score:-INFINITY)</FONT>
<FONT COLOR="#000000">> (id:location-p_ipmi_fencing_2-mici-admin2--INFINITY)</FONT>
<FONT COLOR="#000000">> Ordering Constraints:</FONT>
<FONT COLOR="#000000">> start dlm-clone then start clvmd-clone (Mandatory)</FONT>
<FONT COLOR="#000000">> (id:order-dlm-clone-clvmd-clone-mandatory)</FONT>
<FONT COLOR="#000000">> start clvmd-clone then start clusterfs-clone (Mandatory)</FONT>
<FONT COLOR="#000000">> (id:order-clvmd-clone-clusterfs-clone-mandatory)</FONT>
<FONT COLOR="#000000">> Colocation Constraints:</FONT>
<FONT COLOR="#000000">> clusterfs-clone with clvmd-clone (INFINITY)</FONT>
<FONT COLOR="#000000">> (id:colocation-clusterfs-clone-clvmd-clone-INFINITY)</FONT>
<FONT COLOR="#000000">> clvmd-clone with dlm-clone (INFINITY)</FONT>
<FONT COLOR="#000000">> (id:colocation-clvmd-clone-dlm-clone-INFINITY)</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Cluster Properties:</FONT>
<FONT COLOR="#000000">> cluster-infrastructure: cman</FONT>
<FONT COLOR="#000000">> dc-version: 1.1.10-14.el6_5.1-368c726</FONT>
<FONT COLOR="#000000">> last-lrm-refresh: 1388530552</FONT>
<FONT COLOR="#000000">> no-quorum-policy: ignore</FONT>
<FONT COLOR="#000000">> stonith-enabled: true</FONT>
<FONT COLOR="#000000">> Node Attributes:</FONT>
<FONT COLOR="#000000">> mici-admin: standby=off</FONT>
<FONT COLOR="#000000">> mici-admin2: standby=off</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Last updated: Tue Dec 31 17:15:55 2013</FONT>
<FONT COLOR="#000000">> Last change: Tue Dec 31 16:57:37 2013 via cibadmin on mici-admin</FONT>
<FONT COLOR="#000000">> Stack: cman</FONT>
<FONT COLOR="#000000">> Current DC: mici-admin2 - partition with quorum</FONT>
<FONT COLOR="#000000">> Version: 1.1.10-14.el6_5.1-368c726</FONT>
<FONT COLOR="#000000">> 2 Nodes configured</FONT>
<FONT COLOR="#000000">> 8 Resources configured</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Online: [ mici-admin mici-admin2 ]</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Full list of resources:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> p_ipmi_fencing_1 (stonith:fence_ipmilan): Started</FONT>
<FONT COLOR="#000000">> mici-admin2</FONT>
<FONT COLOR="#000000">> p_ipmi_fencing_2 (stonith:fence_ipmilan): Started</FONT>
<FONT COLOR="#000000">> mici-admin</FONT>
<FONT COLOR="#000000">> Clone Set: clusterfs-clone [clusterfs]</FONT>
<FONT COLOR="#000000">> Started: [ mici-admin mici-admin2 ]</FONT>
<FONT COLOR="#000000">> Clone Set: clvmd-clone [clvmd]</FONT>
<FONT COLOR="#000000">> Started: [ mici-admin mici-admin2 ]</FONT>
<FONT COLOR="#000000">> Clone Set: dlm-clone [dlm]</FONT>
<FONT COLOR="#000000">> Started: [ mici-admin mici-admin2 ]</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Migration summary:</FONT>
<FONT COLOR="#000000">> * Node mici-admin:</FONT>
<FONT COLOR="#000000">> * Node mici-admin2:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> =====================================================</FONT>
<FONT COLOR="#000000">> crm_mon after the fenced node reboots. Shows the failure of clvmd</FONT>
<FONT COLOR="#000000">> that then occurs, which in turn triggers a fencing of that node</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Last updated: Tue Dec 31 17:06:55 2013</FONT>
<FONT COLOR="#000000">> Last change: Tue Dec 31 16:57:37 2013 via cibadmin on mici-admin</FONT>
<FONT COLOR="#000000">> Stack: cman</FONT>
<FONT COLOR="#000000">> Current DC: mici-admin - partition with quorum</FONT>
<FONT COLOR="#000000">> Version: 1.1.10-14.el6_5.1-368c726</FONT>
<FONT COLOR="#000000">> 2 Nodes configured</FONT>
<FONT COLOR="#000000">> 8 Resources configured</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Node mici-admin: UNCLEAN (online)</FONT>
<FONT COLOR="#000000">> Online: [ mici-admin2 ]</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Full list of resources:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> p_ipmi_fencing_1 (stonith:fence_ipmilan): Stopped</FONT>
<FONT COLOR="#000000">> p_ipmi_fencing_2 (stonith:fence_ipmilan): Started</FONT>
<FONT COLOR="#000000">> mici-admin</FONT>
<FONT COLOR="#000000">> Clone Set: clusterfs-clone [clusterfs]</FONT>
<FONT COLOR="#000000">> Started: [ mici-admin ]</FONT>
<FONT COLOR="#000000">> Stopped: [ mici-admin2 ]</FONT>
<FONT COLOR="#000000">> Clone Set: clvmd-clone [clvmd]</FONT>
<FONT COLOR="#000000">> clvmd (lsb:clvmd): FAILED mici-admin</FONT>
<FONT COLOR="#000000">> Stopped: [ mici-admin2 ]</FONT>
<FONT COLOR="#000000">> Clone Set: dlm-clone [dlm]</FONT>
<FONT COLOR="#000000">> Started: [ mici-admin mici-admin2 ]</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Migration summary:</FONT>
<FONT COLOR="#000000">> * Node mici-admin:</FONT>
<FONT COLOR="#000000">> clvmd: migration-threshold=1000000 fail-count=1</FONT>
<FONT COLOR="#000000">> last-failure='Tue Dec 31 17:04:29 2013'</FONT>
<FONT COLOR="#000000">> * Node mici-admin2:</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Failed actions:</FONT>
<FONT COLOR="#000000">> clvmd_monitor_30000 on mici-admin 'unknown error' (1): call=60,</FONT>
<FONT COLOR="#000000">> status=Timed Out, last-rc-change='Tue Dec 31 17:04:29 2013', queued=0ms, exec=0ms</FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> _______________________________________________</FONT>
<FONT COLOR="#000000">> Pacemaker mailing list: <A HREF="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</A></FONT>
<FONT COLOR="#000000">> <<A HREF="mailto:Pacemaker@oss.clusterlabs.org">mailto:Pacemaker@oss.clusterlabs.org</A>></FONT>
<FONT COLOR="#000000">> <A HREF="http://oss.clusterlabs.org/mailman/listinfo/pacemaker">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</A></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> Project Home: <A HREF="http://www.clusterlabs.org">http://www.clusterlabs.org</A></FONT>
<FONT COLOR="#000000">> Getting started: <A HREF="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</A></FONT>
<FONT COLOR="#000000">> Bugs: <A HREF="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</A></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">></FONT>
<FONT COLOR="#000000">> --</FONT>
<FONT COLOR="#000000">> this is my life and I live it for as long as God wills</FONT>
<FONT COLOR="#000000">></FONT>
</PRE>
</BLOCKQUOTE>
</BODY>
</HTML>