<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hello.</p>
<p><br>
</p>
After many years of using Pacemaker, it still never stops fooling me...<br>
<br>
<br>
This time, I have faced a trivial problem. In my new setup, the
cluster consists of several identical nodes. A clone resource
(vg.sanlock) is started on every node, ensuring it has access to SAN
storage. Almost all other resources are colocated and ordered after
vg.sanlock.<br>
<br>
<br>
Today I started a node, and vg.sanlock failed to start.
The cluster then decided to stop all the clone instances "due to
node availability", taking down all other resources through their
dependencies. This seems illogical to me. When one clone instance
fails, I would prefer to see it stopped on that node only. How do I
do this properly?<br>
<br>
<br>
I've tried this config with Pacemaker 2.0.3 and 1.1.16; the
behaviour is the same in both. <br>
<p><br>
</p>
<p>Reduced test config here:</p>
<p><br>
</p>
<p>
</p>
<div class="moz-text-html" lang="x-unicode">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">pcs cluster auth
test-pcmk0 test-pcmk1 &lt;&gt;/dev/tty</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs cluster setup --name
test-pcmk test-pcmk0 test-pcmk1 --transport udpu \</span></p>
<p class="MsoNormal"><span lang="EN-US"> --auto_tie_breaker 1</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs cluster start --all
--wait=60</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs cluster cib
tmp-cib.xml</span></p>
<p class="MsoNormal"><span lang="EN-US">cp tmp-cib.xml
tmp-cib.xml.deltasrc</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs -f tmp-cib.xml
property set stonith-enabled=false</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs -f tmp-cib.xml
resource defaults resource-stickiness=100</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs -f tmp-cib.xml
resource create vg.sanlock ocf:pacemaker:Dummy \</span></p>
<p class="MsoNormal"><span lang="EN-US"> op monitor interval=10
timeout=20 start interval=0s stop interval=0s \</span></p>
<p class="MsoNormal"><span lang="EN-US"> timeout=20</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs -f tmp-cib.xml
resource clone vg.sanlock interleave=true</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs cluster cib-push
tmp-cib.xml diff-against=tmp-cib.xml.deltasrc</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">And here is the cluster's
reaction to the failure:</span></p>
<p class="MsoNormal"><span lang="EN-US"> <br>
</span></p>
<p class="MsoNormal"><span lang="EN-US"># crm_simulate -x
state4.xml -S</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Current cluster status:</span></p>
<p class="MsoNormal"><span lang="EN-US">Online: [ test-pcmk0
test-pcmk1 ]</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Clone Set:
vg.sanlock-clone [vg.sanlock]</span></p>
<p class="MsoNormal"><span lang="EN-US"> vg.sanlock
(ocf::pacemaker:Dummy): FAILED test-pcmk0</span></p>
<p class="MsoNormal"><span lang="EN-US"> Started: [
test-pcmk1 ]</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Transition Summary:</span></p>
<p class="MsoNormal"><span lang="EN-US">* Stop
vg.sanlock:0 ( test-pcmk1 ) due to node availability</span></p>
<p class="MsoNormal"><span lang="EN-US">* Stop
vg.sanlock:1 ( test-pcmk0 ) due to node availability</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Executing cluster
transition:</span></p>
<p class="MsoNormal"><span lang="EN-US">* Pseudo action:
vg.sanlock-clone_stop_0</span></p>
<p class="MsoNormal"><span lang="EN-US">* Resource action:
vg.sanlock stop on test-pcmk1</span></p>
<p class="MsoNormal"><span lang="EN-US">* Resource action:
vg.sanlock stop on test-pcmk0</span></p>
<p class="MsoNormal"><span lang="EN-US">* Pseudo action:
vg.sanlock-clone_stopped_0</span></p>
<p class="MsoNormal"><span lang="EN-US">* Pseudo action:
all_stopped</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Revised cluster status:</span></p>
<p class="MsoNormal"><span lang="EN-US">Online: [ test-pcmk0
test-pcmk1 ]</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">Clone Set:
vg.sanlock-clone [vg.sanlock]</span></p>
<p class="MsoNormal"><span lang="EN-US"> Stopped: [
test-pcmk0 test-pcmk1 ]</span></p>
<p class="MsoNormal"><span lang="EN-US"><br>
</span></p>
<p class="MsoNormal"><span lang="EN-US">As a side note, if I make
the clone globally-unique, it seems to behave properly.
But I have not found a reference to this solution anywhere. In
general, globally-unique clones are mentioned only where
resource agents distinguish between clone instances,
which is not the case here.<br>
</span></p>
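<p class="MsoNormal"><span lang="EN-US">A minimal sketch of that
variant, assuming the clone keeps the ID vg.sanlock-clone shown
in the status output and that the option is set in the same
offline CIB file before the push (the exact command is an
illustration, not copied from the setup above):</span></p>
<p class="MsoNormal"><span lang="EN-US">pcs -f tmp-cib.xml
resource meta vg.sanlock-clone globally-unique=true</span></p>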
<p class="MsoNormal"><span lang="EN-US"><br>
</span></p>
<p class="MsoNormal"><span lang="EN-US">--<br>
</span></p>
<p class="MsoNormal"><span lang="EN-US">Thanks,</span></p>
<p class="MsoNormal"><span lang="EN-US">Pavel<br>
</span></p>
<p class="MsoNormal"><span lang="EN-US"><br>
</span></p>
<p class="MsoNormal"><span lang="EN-US"><br>
</span></p>
</div>
</div>
</body>
</html>