<div dir="ltr">Hi,<br>Sorry for replying late.<br><div class="gmail_extra"><br><div class="gmail_quote">2016-01-15 21:19 GMT+09:00 Dejan Muhamedagic <span dir="ltr">&lt;<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<div><div class="h5"><br>
On Fri, Jan 15, 2016 at 04:54:37PM +0900, yuta takeshita wrote:<br>
&gt; Hi,<br>
&gt;<br>
&gt; Thanks for responding and making a patch.<br>
&gt;<br>
&gt; 2016-01-14 19:16 GMT+09:00 Dejan Muhamedagic &lt;<a href="mailto:dejanmm@fastmail.fm">dejanmm@fastmail.fm</a>&gt;:<br>
&gt;<br>
&gt; &gt; On Thu, Jan 14, 2016 at 11:04:09AM +0100, Dejan Muhamedagic wrote:<br>
&gt; &gt; &gt; Hi,<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; On Thu, Jan 14, 2016 at 04:20:19PM +0900, yuta takeshita wrote:<br>
&gt; &gt; &gt; &gt; Hello.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; I have a problem with the nfsserver RA on RHEL 7.1 with systemd.<br>
&gt; &gt; &gt; &gt; When the nfsd process is lost due to an unexpected failure, nfsserver_monitor()<br>
&gt; &gt; &gt; &gt; doesn&#39;t detect it and doesn&#39;t trigger a failover.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; I use the RA below (but the problem may also occur with the latest nfsserver RA):<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; <a href="https://github.com/ClusterLabs/resource-agents/blob/v3.9.6/heartbeat/nfsserver" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents/blob/v3.9.6/heartbeat/nfsserver</a><br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; The cause is as follows.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; 1. After executing &quot;pkill -9 nfsd&quot;, &quot;systemctl status nfs-server.service&quot;<br>
&gt; &gt; &gt; &gt; still returns 0.<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; I think that it should be systemctl is-active. Already had a<br>
&gt; &gt; &gt; problem with systemctl status, well, not being what one would<br>
&gt; &gt; &gt; assume status would be. Can you please test that and then open<br>
&gt; &gt; &gt; either a pull request or issue at<br>
&gt; &gt; &gt; <a href="https://github.com/ClusterLabs/resource-agents" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents</a><br>
&gt; &gt;<br>
&gt; &gt; I already made a pull request:<br>
&gt; &gt;<br>
&gt; &gt; <a href="https://github.com/ClusterLabs/resource-agents/pull/741" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents/pull/741</a><br>
&gt; &gt;<br>
&gt; &gt; Please test if you find time.<br>
&gt; &gt;<br>
&gt; I tested the code, but the problem remains.<br>
&gt; systemctl is-active returns &quot;active&quot; and its return code is 0, the same as<br>
&gt; systemctl status.<br>
&gt; Perhaps it is inappropriate to use systemctl for monitoring kernel<br>
&gt; processes.<br>
<br>
</div></div>OK. My patch was too naive and didn&#39;t take into account the<br>
systemd/kernel intricacies.<br>
<span class=""><br>
&gt; Kay Sievers, a systemd developer, said in the following thread that systemd<br>
&gt; doesn&#39;t monitor kernel processes:<br>
&gt; <a href="http://comments.gmane.org/gmane.comp.sysutils.systemd.devel/34367" rel="noreferrer" target="_blank">http://comments.gmane.org/gmane.comp.sysutils.systemd.devel/34367</a><br>
<br>
</span>Thanks for the reference. One interesting thing could also be<br>
reading /proc/fs/nfsd/threads instead of checking the process<br>
existence. Furthermore, we could do some RPC based monitor, but<br>
that would be, I guess, better suited for another monitor depth.<br>
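<br>
A minimal sketch of the /proc/fs/nfsd/threads idea above, written as a POSIX shell function. It assumes the file holds the number of running nfsd threads as a single integer. The function name nfsd_threads_running and its optional path argument are hypothetical and are not taken from the resource agent or from any pull request.<br>
<br>
```shell
#!/bin/sh
# Hypothetical helper (illustration only, not the resource agent's code).
# Succeeds (returns 0) only if the thread-count file reports at least one
# running nfsd thread. The path is a parameter so the check can be tried
# against an ordinary file; it defaults to the real procfs entry.
nfsd_threads_running() {
    threads_file="${1:-/proc/fs/nfsd/threads}"

    # A missing or unreadable file is treated as "not running".
    [ -r "$threads_file" ] || return 1

    count=$(cat "$threads_file") || return 1

    # Anything other than a plain non-negative integer is treated as failure.
    case "$count" in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Zero threads means the NFS server is not serving requests.
    [ "$count" -gt 0 ]
}
```
<br>
In a monitor action one might call nfsd_threads_running with no argument and treat a non-zero return as &quot;not running&quot;. Unlike the systemctl exit status, this check reflects the state of the kernel threads themselves.<br>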
<br></blockquote><div>OK. I investigated and tested /proc/fs/nfsd/threads.<br>It seems to work well on my cluster.<br>I have made a patch and opened a pull request.<br><a href="https://github.com/ClusterLabs/resource-agents/pull/746">https://github.com/ClusterLabs/resource-agents/pull/746</a><br><br></div><div>Please check it if you have time.<br><br></div><div>Regards,<br></div><div>Yuta<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Cheers,<br>
<br>
Dejan<br>
<div class=""><div class="h5"><br>
&gt; I reply to your pull request.<br>
&gt;<br>
&gt; Regards,<br>
&gt; Yuta Takeshita<br>
&gt;<br>
&gt; &gt;<br>
&gt; &gt; Thanks for reporting!<br>
&gt; &gt;<br>
&gt; &gt; Dejan<br>
&gt; &gt;<br>
&gt; &gt; &gt; Thanks,<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; Dejan<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; 2. nfsserver_monitor() decides the resource state from the return value of &quot;systemctl status<br>
&gt; &gt; &gt; &gt; nfs-server.service&quot;.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; ----------------------------------------------------------------------<br>
&gt; &gt; &gt; &gt; # ps ax | grep nfsd<br>
&gt; &gt; &gt; &gt; 25193 ?        S&lt;     0:00 [nfsd4]<br>
&gt; &gt; &gt; &gt; 25194 ?        S&lt;     0:00 [nfsd4_callbacks]<br>
&gt; &gt; &gt; &gt; 25197 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25198 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25199 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25200 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25201 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25202 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25203 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25204 ?        S      0:00 [nfsd]<br>
&gt; &gt; &gt; &gt; 25238 pts/0    S+     0:00 grep --color=auto nfsd<br>
&gt; &gt; &gt; &gt; #<br>
&gt; &gt; &gt; &gt; # pkill -9 nfsd<br>
&gt; &gt; &gt; &gt; #<br>
&gt; &gt; &gt; &gt; # systemctl status nfs-server.service<br>
&gt; &gt; &gt; &gt; ● nfs-server.service - NFS server and services<br>
&gt; &gt; &gt; &gt;    Loaded: loaded (/etc/systemd/system/nfs-server.service; disabled; vendor preset: disabled)<br>
&gt; &gt; &gt; &gt;    Active: active (exited) since Thu 2016-01-14 11:35:39 JST; 1min 3s ago<br>
&gt; &gt; &gt; &gt;   Process: 25184 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)<br>
&gt; &gt; &gt; &gt;   Process: 25182 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)<br>
&gt; &gt; &gt; &gt;  Main PID: 25184 (code=exited, status=0/SUCCESS)<br>
&gt; &gt; &gt; &gt;    CGroup: /system.slice/nfs-server.service<br>
&gt; &gt; &gt; &gt; (snip)<br>
&gt; &gt; &gt; &gt; #<br>
&gt; &gt; &gt; &gt; # echo $?<br>
&gt; &gt; &gt; &gt; 0<br>
&gt; &gt; &gt; &gt; #<br>
&gt; &gt; &gt; &gt; # ps ax | grep nfsd<br>
&gt; &gt; &gt; &gt; 25256 pts/0    S+     0:00 grep --color=auto nfsd<br>
&gt; &gt; &gt; &gt; ----------------------------------------------------------------------<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; This is because the nfsd process is a kernel process, and systemd does not<br>
&gt; &gt; &gt; &gt; monitor whether kernel processes are running.<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Is there a good way to handle this?<br>
&gt; &gt; &gt; &gt; (When I use &quot;pidof&quot; instead of &quot;systemctl status&quot;, the failover is<br>
&gt; &gt; &gt; &gt; successful.)<br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Regards,<br>
&gt; &gt; &gt; &gt; Yuta Takeshita<br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; _______________________________________________<br>
&gt; &gt; &gt; &gt; Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>
&gt; &gt; &gt; &gt; <a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>
&gt; &gt; &gt; &gt;<br>
&gt; &gt; &gt; &gt; Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
&gt; &gt; &gt; &gt; Getting started:<br>
&gt; &gt; <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
&gt; &gt; &gt; &gt; Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
&gt; &gt; &gt;<br>
&gt; &gt; &gt;<br>
&gt; &gt;<br>
&gt; &gt;<br>
<br>
<br>
<br>
</div></div></blockquote></div><br></div></div>