<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
I hope I'll be able to explain the problem clearly and correctly.<br>
<br>
My setup (simplified): I have two cloned resources, a filesystem
mount and a process that writes to that filesystem. The filesystem
is Gluster, so it's OK to clone it. I also have a mandatory ordering
constraint "start gluster-mount-clone then start
writer-process-clone". I don't have a STONITH device, so I've
disabled STONITH by setting stonith-enabled=false.<br>
<br>
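For reference, the relevant configuration looks roughly like this
(pcs syntax; the agents, paths, and parameter values are
simplified/illustrative, only the resource names, the ordering
constraint, and stonith-enabled=false match the actual setup):<br>
<tt>pcs resource create gluster-mount ocf:heartbeat:Filesystem \<br>
&nbsp;&nbsp;&nbsp;&nbsp;device="node1.example.org:/volume" directory="/mnt/gluster" \<br>
&nbsp;&nbsp;&nbsp;&nbsp;fstype="glusterfs" clone<br>
pcs resource create writer-process ocf:heartbeat:anything \<br>
&nbsp;&nbsp;&nbsp;&nbsp;binfile="/usr/local/bin/writer" clone<br>
pcs constraint order start gluster-mount-clone then start writer-process-clone<br>
pcs property set stonith-enabled=false<br>
</tt><br>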
The problem: Sometimes Gluster freezes for a while, which causes
the gluster-mount resource's monitor with OCF_CHECK_LEVEL=20 to
time out (it is unable to write the status file). When this happens,
the cluster tries to recover by restarting the writer-process
resource. But the writer-process is writing to the frozen filesystem,
which makes it uninterruptible; not even SIGKILL works. Then the
stop operation times out, and with STONITH disabled, on-fail defaults
to block (don't perform any further operations on the resource):<br>
<tt>warning: Forcing writer-process-clone away from
node1.example.org after 1000000 failures (max=1000000)<br>
</tt>After that, the cluster continues the recovery by
restarting the gluster-mount resource on that node, which usually
succeeds. As a consequence of the remount, the uninterruptible
system call in the writer process fails, signals are finally
delivered, and the writer-process is terminated. But the cluster
doesn't know about that!<br>
<br>
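For context, the depth-20 monitor on the mount is configured along
these lines (the interval and timeout values here are illustrative,
not the exact ones):<br>
<tt>pcs resource op add gluster-mount monitor interval=30s timeout=20s OCF_CHECK_LEVEL=20<br>
</tt><br>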
I thought I could solve this by setting the failure-timeout meta
attribute on the writer-process resource, but it only made things
worse. The documentation states: "Stop failures are slightly
different and crucial. ... If a resource fails to stop and STONITH
is not enabled, then the cluster has no way to continue and will not
try to start the resource elsewhere, but will try to stop it again
after the failure timeout.", but I'm seeing something different.
When the policy engine runs at the next
cluster-recheck-interval, the following lines are written to the syslog:<br>
<tt>crmd[11852]: notice: State transition S_IDLE ->
S_POLICY_ENGINE</tt><tt><br>
</tt><tt>pengine[11851]: notice: Clearing expired failcount for
writer-process:1 on node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: notice: Clearing expired failcount for
writer-process:1 on node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: notice: Ignoring expired calculated
failure writer-process_stop_0 (rc=1,
magic=2:1;64:557:0:2169780b-ca1f-483e-ad42-118b7c7c1a7d) on
node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: notice: Clearing expired failcount for
writer-process:1 on node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: notice: Ignoring expired calculated
failure writer-process_stop_0 (rc=1,
magic=2:1;64:557:0:2169780b-ca1f-483e-ad42-118b7c7c1a7d) on
node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: warning: Processing failed op monitor for
gluster-mount:1 on node1.example.org: unknown error (1)</tt><tt><br>
</tt><tt>pengine[11851]: notice: Calculated transition 564, saving
inputs in /var/lib/pacemaker/pengine/pe-input-362.bz2</tt><tt><br>
</tt><tt>crmd[11852]: notice: Transition 564 (Complete=2,
Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-362.bz2): Complete</tt><tt><br>
</tt><tt>crmd[11852]: notice: State transition S_TRANSITION_ENGINE
-> S_IDLE</tt><tt><br>
</tt><tt>crmd[11852]: notice: State transition S_IDLE ->
S_POLICY_ENGINE</tt><tt><br>
</tt><tt>crmd[11852]: warning: No reason to expect node 3 to be down</tt><tt><br>
</tt><tt>crmd[11852]: warning: No reason to expect node 1 to be down</tt><tt><br>
</tt><tt>crmd[11852]: warning: No reason to expect node 1 to be down</tt><tt><br>
</tt><tt>crmd[11852]: warning: No reason to expect node 3 to be down</tt><tt><br>
</tt><tt>pengine[11851]: warning: Processing failed op stop for
writer-process:1 on node1.example.org: unknown error (1)</tt><tt><br>
</tt><tt>pengine[11851]: warning: Processing failed op monitor for
gluster-mount:1 on node1.example.org: unknown error (1)</tt><tt><br>
</tt><tt>pengine[11851]: warning: Forcing writer-process-clone away
from node1.example.org after 1000000 failures (max=1000000)</tt><tt><br>
</tt><tt>pengine[11851]: warning: Forcing writer-process-clone away
from node1.example.org after 1000000 failures (max=1000000)</tt><tt><br>
</tt><tt>pengine[11851]: warning: Forcing writer-process-clone away
from node1.example.org after 1000000 failures (max=1000000)</tt><tt><br>
</tt><tt>pengine[11851]: notice: Calculated transition 565, saving
inputs in /var/lib/pacemaker/pengine/pe-input-363.bz2</tt><tt><br>
</tt><tt>pengine[11851]: notice: Ignoring expired calculated
failure writer-process_stop_0 (rc=1,
magic=2:1;64:557:0:2169780b-ca1f-483e-ad42-118b7c7c1a7d) on
node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: warning: Processing failed op monitor for
gluster-mount:1 on node1.example.org: unknown error (1)</tt><tt><br>
</tt><tt>crmd[11852]: notice: Transition 566 (Complete=0,
Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-364.bz2): Complete</tt><tt><br>
</tt><tt>crmd[11852]: notice: State transition S_TRANSITION_ENGINE
-> S_IDLE</tt><tt><br>
</tt><tt>pengine[11851]: notice: Calculated transition 566, saving
inputs in /var/lib/pacemaker/pengine/pe-input-364.bz2</tt><br>
<br>
Then, after each subsequent cluster-recheck-interval:<br>
<tt>crmd[11852]: notice: State transition S_IDLE ->
S_POLICY_ENGINE</tt><tt><br>
</tt><tt>pengine[11851]: notice: Ignoring expired calculated
failure writer-process_stop_0 (rc=1,
magic=2:1;64:557:0:2169780b-ca1f-483e-ad42-118b7c7c1a7d) on
node1.example.org</tt><tt><br>
</tt><tt>pengine[11851]: warning: Processing failed op monitor for
gluster-mount:1 on node1.example.org: unknown error (1)</tt><tt><br>
</tt><tt>crmd[11852]: notice: Transition 567 (Complete=0,
Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-364.bz2): Complete</tt><tt><br>
</tt><tt>crmd[11852]: notice: State transition S_TRANSITION_ENGINE
-> S_IDLE</tt><br>
<br>
And crm_mon happily shows the writer-process as Started,
although it is not running. This is very confusing. Could anyone
please explain what is going on here?<br>
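<br>
P.S. The failure-timeout mentioned above was set with something like
the following (the exact value is not important here):<br>
<tt>pcs resource meta writer-process-clone failure-timeout=10min<br>
</tt>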
</body>
</html>