[ClusterLabs] proftpd resource agent - fix for a start/monitor race condition
Matthias Ferdinand
mf at 14v.de
Wed Mar 25 10:40:32 UTC 2015
Hello,
the proftpd resource agent sometimes hits a race condition between start
and monitor: if the proftpd binary is slow to start, the Pacemaker monitor
operation that immediately follows the start operation may not yet find
proftpd's PID file, and so it reports a failure. Subsequent retries of the
start operation then keep failing because the TCP sockets are still held
by the initial proftpd (which was never stopped).
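Fix (copied from the apache resource agent): after invoking the proftpd
binary, do not return to the caller while the monitor operation (called
from within the RA itself) still reports "not running". If monitor then
reports anything other than success, proftpd is stopped again and that
status is returned. Handling startup timeouts is left to the cluster manager.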
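
For readers who want to try the idea outside the agent, here is a minimal standalone sketch of the polling loop. It is not the patch itself: fake_monitor and wait_until_up are hypothetical names, and fake_monitor only imitates a slow daemon by reporting "not running" on its first two calls. The two status values are the usual OCF codes (0 for success, 7 for not running), set by hand here because the OCF shell functions are not sourced.

```shell
#!/bin/sh
# Sketch only: fake_monitor and wait_until_up are hypothetical stand-ins,
# not functions from the proftpd resource agent.

OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Stand-in for proftpd_monitor: reports "not running" on the first two
# calls and "success" afterwards, imitating a slow daemon startup.
calls=0
fake_monitor() {
    calls=$((calls + 1))
    if [ "$calls" -lt 3 ]; then
        return $OCF_NOT_RUNNING
    fi
    return $OCF_SUCCESS
}

# Poll the monitor until it reports anything other than "not running".
# No timeout is enforced here: the cluster manager's start timeout is
# expected to abort the operation if the daemon never comes up.
wait_until_up() {
    while :
    do
        fake_monitor
        ec=$?
        if [ "$ec" -eq "$OCF_NOT_RUNNING" ]; then
            sleep 1
        else
            break
        fi
    done
    return "$ec"
}

wait_until_up
rc=$?
echo "monitor calls: $calls, final status: $rc"
```

In the real agent the loop would call proftpd_monitor instead, and a non-zero final status would be followed by proftpd_stop, as in the attached patch.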
Regards
Matthias Ferdinand
--
one4vision GmbH Fon +49 681 96727 - 60
Residenz am Schlossgarten Fax +49 681 96727 - 69
Talstraße 34-42 info at one4vision.de
D-66119 Saarbrücken http://www.one4vision.de
HRB 11751 verantwortl. Geschäftsführer:
Amtsgericht Saarbrücken Christof Allmann, Christoph Harth
-------------- next part --------------
--- 20150226_usr_lib_ocf_resource.d_heartbeat_proftpd 2015-02-26 17:39:19.956590821 +0100
+++ patched_proftpd 2015-02-26 17:51:06.027695989 +0100
@@ -163,7 +163,25 @@
exit $OCF_ERR_GENERIC
fi
- exit $OCF_SUCCESS
+ tries=0
+ while : # wait until the user set timeout
+ do
+ proftpd_monitor
+ ec=$?
+ if [ $ec -eq $OCF_NOT_RUNNING ]
+ then
+ tries=`expr $tries + 1`
+ ocf_log info "waiting for proftpd ${OCF_RESKEY_conffile} to come up"
+ sleep 1
+ else
+ break
+ fi
+ done
+
+ if [ $ec -ne 0 ]; then
+ proftpd_stop
+ fi
+ return $ec
}
@@ -264,6 +282,7 @@
case $1 in
start) proftpd_validate_all
proftpd_start
+ exit $?
;;
stop) proftpd_stop
@@ -298,4 +317,3 @@
exit $OCF_ERR_UNIMPLEMENTED
;;
esac
-