<div dir="ltr">Hi,<br>Sorry for replying late.<br><div class="gmail_extra"><br><div class="gmail_quote">2016-01-15 21:19 GMT+09:00 Dejan Muhamedagic <span dir="ltr"><<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>></span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>
<div><div class="h5"><br>
On Fri, Jan 15, 2016 at 04:54:37PM +0900, yuta takeshita wrote:<br>
> Hi,<br>
><br>
> Thanks for responding and making a patch.<br>
><br>
> 2016-01-14 19:16 GMT+09:00 Dejan Muhamedagic <<a href="mailto:dejanmm@fastmail.fm">dejanmm@fastmail.fm</a>>:<br>
><br>
> > On Thu, Jan 14, 2016 at 11:04:09AM +0100, Dejan Muhamedagic wrote:<br>
> > > Hi,<br>
> > ><br>
> > > On Thu, Jan 14, 2016 at 04:20:19PM +0900, yuta takeshita wrote:<br>
> > > > Hello.<br>
> > > ><br>
> > > > I have been having a problem with the nfsserver RA on RHEL 7.1 and systemd.<br>
> > > > When the nfsd process dies due to an unexpected failure,<br>
> > > > nfsserver_monitor() doesn't detect it and doesn't execute a failover.<br>
> > > ><br>
> > > > I use the RA below (but this problem may occur with the latest<br>
> > nfsserver RA as well):<br>
> > > ><br>
> > <a href="https://github.com/ClusterLabs/resource-agents/blob/v3.9.6/heartbeat/nfsserver" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents/blob/v3.9.6/heartbeat/nfsserver</a><br>
> > > ><br>
> > > > The cause is following.<br>
> > > ><br>
> > > > 1. After executing "pkill -9 nfsd", "systemctl status nfs-server.service"<br>
> > > > returns 0.<br>
> > ><br>
> > > I think that it should be "systemctl is-active". We already had a<br>
> > > problem with systemctl status not being what one would<br>
> > > assume "status" would be. Can you please test that and then open<br>
> > > either a pull request or issue at<br>
> > > <a href="https://github.com/ClusterLabs/resource-agents" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents</a><br>
> ><br>
> > I already made a pull request:<br>
> ><br>
> > <a href="https://github.com/ClusterLabs/resource-agents/pull/741" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/resource-agents/pull/741</a><br>
> ><br>
> > Please test if you find time.<br>
> ><br>
> I tested the code, but the problem remains.<br>
> "systemctl is-active" returns "active" with exit code 0, just like<br>
> "systemctl status".<br>
> Perhaps it is inappropriate to use systemctl for monitoring kernel<br>
> processes.<br>
<br>
</div></div>OK. My patch was too naive and didn't take into account the<br>
systemd/kernel intricacies.<br>
<span class=""><br>
> Kay Sievers, a systemd developer, said in the following thread that<br>
> systemd doesn't monitor kernel processes:<br>
> <a href="http://comments.gmane.org/gmane.comp.sysutils.systemd.devel/34367" rel="noreferrer" target="_blank">http://comments.gmane.org/gmane.comp.sysutils.systemd.devel/34367</a><br>
<br>
</span>Thanks for the reference. One interesting thing could also be<br>
reading /proc/fs/nfsd/threads instead of checking the process<br>
existence. Furthermore, we could do some RPC based monitor, but<br>
that would be, I guess, better suited for another monitor depth.<br>
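[Editor's note: a monitor check along those lines might read the thread count directly. This is a hedged sketch only; the helper name is made up, and the agent's real OCF plumbing is omitted.]<br>

```shell
# Sketch: /proc/fs/nfsd/threads holds the current number of nfsd kernel
# threads; it reads 0 once the threads are gone, even while systemd
# still reports the nfs-server unit as active (exited).
nfsd_threads_running() {
    threads_file="${1:-/proc/fs/nfsd/threads}"
    # The file is absent if nfsd was never started on this host.
    [ -r "$threads_file" ] || return 1
    count=$(cat "$threads_file" 2>/dev/null)
    # Succeed only when at least one nfsd thread is still running.
    [ "$count" -gt 0 ] 2>/dev/null
}
```

A monitor could then fail over when the check fails, e.g. `nfsd_threads_running || return $OCF_NOT_RUNNING`.<br>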
<br></blockquote><div>OK. I surveyed and tested /proc/fs/nfsd/threads.<br>It seems to work well on my cluster.<br>I made a patch and opened a pull request:<br><a href="https://github.com/ClusterLabs/resource-agents/pull/746">https://github.com/ClusterLabs/resource-agents/pull/746</a><br><br></div><div>Please check it if you have time.<br><br></div><div>Regards,<br></div><div>Yuta<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Cheers,<br>
<br>
Dejan<br>
<div class=""><div class="h5"><br>
> I reply to your pull request.<br>
><br>
> Regards,<br>
> Yuta Takeshita<br>
><br>
> ><br>
> > Thanks for reporting!<br>
> ><br>
> > Dejan<br>
> ><br>
> > > Thanks,<br>
> > ><br>
> > > Dejan<br>
> > ><br>
> > > > 2. nfsserver_monitor() judges based on the exit code of "systemctl status<br>
> > > > nfs-server.service".<br>
> > > ><br>
> > > > ----------------------------------------------------------------------<br>
> > > > # ps ax | grep nfsd<br>
> > > > 25193 ? S< 0:00 [nfsd4]<br>
> > > > 25194 ? S< 0:00 [nfsd4_callbacks]<br>
> > > > 25197 ? S 0:00 [nfsd]<br>
> > > > 25198 ? S 0:00 [nfsd]<br>
> > > > 25199 ? S 0:00 [nfsd]<br>
> > > > 25200 ? S 0:00 [nfsd]<br>
> > > > 25201 ? S 0:00 [nfsd]<br>
> > > > 25202 ? S 0:00 [nfsd]<br>
> > > > 25203 ? S 0:00 [nfsd]<br>
> > > > 25204 ? S 0:00 [nfsd]<br>
> > > > 25238 pts/0 S+ 0:00 grep --color=auto nfsd<br>
> > > > #<br>
> > > > # pkill -9 nfsd<br>
> > > > #<br>
> > > > # systemctl status nfs-server.service<br>
> > > > ● nfs-server.service - NFS server and services<br>
> > > > Loaded: loaded (/etc/systemd/system/nfs-server.service; disabled;<br>
> > vendor<br>
> > > > preset: disabled)<br>
> > > > Active: active (exited) since 木 2016-01-14 11:35:39 JST; 1min 3s ago<br>
> > > > Process: 25184 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS<br>
> > (code=exited,<br>
> > > > status=0/SUCCESS)<br>
> > > > Process: 25182 ExecStartPre=/usr/sbin/exportfs -r (code=exited,<br>
> > > > status=0/SUCCESS)<br>
> > > > Main PID: 25184 (code=exited, status=0/SUCCESS)<br>
> > > > CGroup: /system.slice/nfs-server.service<br>
> > > > (snip)<br>
> > > > #<br>
> > > > # echo $?<br>
> > > > 0<br>
> > > > #<br>
> > > > # ps ax | grep nfsd<br>
> > > > 25256 pts/0 S+ 0:00 grep --color=auto nfsd<br>
> > > > ----------------------------------------------------------------------<br>
> > > ><br>
> > > > This is because nfsd is a kernel process, and systemd does not<br>
> > > > monitor the state of running kernel processes.<br>
> > > ><br>
> > > > Is there a good way to handle this?<br>
> > > > (When I use "pidof" instead of "systemctl status", the failover is<br>
> > > > successful.)<br>
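[Editor's note: the pidof behaviour can be sketched without systemd at all. Kernel threads such as [nfsd] do appear under /proc, which is why a pidof-style lookup detects the loss where "systemctl status" does not. The helper name below is illustrative, not from the RA.]<br>

```shell
# Sketch: look a process up by name via /proc, roughly what "pidof"
# does. Unlike "systemctl status", this sees kernel threads like nfsd.
proc_alive() {
    name="$1"
    for comm in /proc/[0-9]*/comm; do
        # Compare each process's command name; ignore races where a
        # process exits between the glob and the read.
        [ "$(cat "$comm" 2>/dev/null)" = "$name" ] && return 0
    done
    return 1
}
```

So `proc_alive nfsd` would fail after "pkill -9 nfsd", while the systemd unit still reports active.<br>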
> > > ><br>
> > > > Regards,<br>
> > > > Yuta Takeshita<br>
> > ><br>
> > > > _______________________________________________<br>
> > > > Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>
> > > > <a href="http://clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://clusterlabs.org/mailman/listinfo/users</a><br>
> > > ><br>
> > > > Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
> > > > Getting started:<br>
> > <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
> > > > Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
> > ><br>
> > ><br>
> ><br>
<br>
<br>
<br>
</div></div></blockquote></div><br></div></div>