[Pacemaker] limiting number of simultaneous actions for particular resource type
Dejan Muhamedagic
dejanmm at fastmail.fm
Wed Nov 25 09:45:03 EST 2009
Hi,
On Wed, Nov 25, 2009 at 02:37:23PM +0100, Nikola Ciprich wrote:
> Hello everybody,
> I'm trying to solve the following issue:
> I've got a specific resource type (a virtual machine, in particular)
> which takes quite long to start/stop, and those actions cause
> considerable load on the hosting system. On my cluster we're
> running tens of instances of VM resources, and shutting down
> pacemaker on a node makes it try to stop many of
> those resources in parallel, which heavily overloads the
> machine. Then operations start timing out and the whole cluster
> goes nuts. Is it possible to set some kind of constraint so
> that no more than, e.g., 2 actions are executed in parallel
> for VM class resources? I can't group them using a group resource,
> because some of them can have target-role set to stopped if
> they're not needed... Or how can I at least set some global
> limit on the number of simultaneous actions in general? If
> possible, I'd like to limit even the monitor actions so that
> they run serially...
Somebody else (Dominik, I think) had a similar issue, but I can't
recall the outcome now. At any rate, it's possible to set a
global limit on parallel actions per node in lrmd. The setting is
included in /etc/init.d/openais, but probably not in
/etc/init.d/heartbeat. This is how it's set:
    # lrmadmin -p max-children $LRMD_MAX_CHILDREN
The default is 4. A child of the lrmd is actually an RA (resource
agent) process running some action (monitor, start, etc.).
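As a rough sketch of applying the limit by hand: the value 2 below is
just the figure from the question, and is_positive_int is a
hypothetical helper (not part of lrmd or any init script) that guards
against passing an empty or non-numeric value to lrmadmin. The call
is skipped when lrmadmin is not installed.

```shell
# Hypothetical helper: accept only positive integers for max-children.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -gt 0 ]
}

# Assumed value: at most 2 RA operations in parallel on this node.
LRMD_MAX_CHILDREN=2

if is_positive_int "$LRMD_MAX_CHILDREN" && command -v lrmadmin >/dev/null 2>&1; then
    lrmadmin -p max-children "$LRMD_MAX_CHILDREN"
fi
```

Note that this is a per-node limit on all lrmd children, so it also
throttles monitor operations, not only start/stop of the VM resources.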
It's a bit more complicated in the init script since we have to
make sure that lrmd is ready to serve requests. This is the
relevant part:
    wait_for_lrmd() {
        local maxwait=30
        local i=0
        while [ $i -lt $maxwait ]; do
            test -S /var/run/heartbeat/lrm_cmd_sock >/dev/null 2>&1 &&
                break
            sleep 1
            i=$(($i+1))
        done
        if [ $i -lt $maxwait ]; then
            return 0
        else
            echo "lrmd apparently didn't start"
            return 1
        fi
    }

    set_lrmd_options() {
        if [ -n "$LRMD_MAX_CHILDREN" ]; then
            wait_for_lrmd || return
            $LRMADMIN -p max-children $LRMD_MAX_CHILDREN
        fi
    }
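For trying the same polling pattern outside the init script, a
parametrized variant might look like the sketch below. The function
name wait_for_socket and its two arguments (socket path, timeout in
seconds) are illustrative and do not appear in the openais script.

```shell
# Sketch: poll once per second until PATH is a Unix socket,
# giving up after MAXWAIT seconds (default 30).
# Usage: wait_for_socket PATH [MAXWAIT]
wait_for_socket() {
    local sock=$1
    local maxwait=${2:-30}
    local i=0
    while [ "$i" -lt "$maxwait" ]; do
        [ -S "$sock" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    echo "timed out waiting for $sock" >&2
    return 1
}
```

It could then be used as
"wait_for_socket /var/run/heartbeat/lrm_cmd_sock 30 && lrmadmin -p max-children 2",
so that lrmadmin is only called once lrmd is accepting requests.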
I'll have that bit added to heartbeat for the next release.
Thanks,
Dejan
> Thanks a lot in advance!
> with best regards
> nik
>
> --
> -------------------------------------
> Nikola CIPRICH
> LinuxBox.cz, s.r.o.
> 28. rijna 168, 709 01 Ostrava
>
> tel.: +420 596 603 142
> fax: +420 596 621 273
> mobil: +420 777 093 799
> www.linuxbox.cz
>
> mobil servis: +420 737 238 656
> email servis: servis at linuxbox.cz
> -------------------------------------
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker