<div class="gmail_quote">On Tue, Apr 7, 2009 at 6:49 AM, btinsley <span dir="ltr"><<a href="mailto:btinsley@gmail.com">btinsley@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="h5"><div class="gmail_quote">On Tue, Apr 7, 2009 at 6:13 AM, btinsley <span dir="ltr"><<a href="mailto:btinsley@gmail.com" target="_blank">btinsley@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div><div class="gmail_quote">On Tue, Apr 7, 2009 at 4:44 AM, Andrew Beekhof <span dir="ltr"><<a href="mailto:beekhof@gmail.com" target="_blank">beekhof@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>On Tue, Apr 7, 2009 at 11:39, Dejan Muhamedagic <<a href="mailto:dejanmm@fastmail.fm" target="_blank">dejanmm@fastmail.fm</a>> wrote:<br>
> On Tue, Apr 07, 2009 at 08:07:27AM +0200, Andrew Beekhof wrote:<br>
>> Nothing in pacemaker - but the lrmd could be doing something that may<br>
>> be involved<br>
><br>
> AFAIK, it doesn't.<br>
<br>
</div>Ok, I was just thinking that it's the one that ultimately spawns the<br>
process... so anything it did would also be inherited by the client.<br>
<br>
btinsley: Perhaps look at /proc/&lt;pid&gt;/sched for the lrmd and aisexec<br>
processes. That may give you an idea of where the settings are coming<br>
from.<br>
<div><div></div><div><br>
><br>
>><br>
>> On Tue, Apr 7, 2009 at 02:21, btinsley <<a href="mailto:btinsley@gmail.com" target="_blank">btinsley@gmail.com</a>> wrote:<br>
>> > This is a little early in investigation on my end, but is there anything<br>
>> > that Pacemaker... or potentially OpenAIS, does that would restrict a cluster<br>
>> > resource from setting the scheduler type and/or priority? I have been<br>
>> > tinkering with KVM instances and there is a 100% repeatable difference<br>
>> > between how it behaves when Pacemaker starts the resource and when I start<br>
>> > it via the same OCF script on the command line (as the root user). When it<br>
>> > is started via Pacemaker, it seems to be unable to change its scheduler and<br>
>> > priority and eventually the whole system is brought to its knees and all<br>
>> > monitors get stuck in the "waiting on I/O" (D) state, which freaks Pacemaker<br>
>> > out...rightfully so ;-)<br>
>> ><br>
>> > Looking at /proc/&lt;pid&gt;/sched shows:<br>
>> ><br>
>> > ...<br>
>> > policy                             :                    0<br>
>> > prio                               :                    0<br>
>> > ...<br>
>> ><br>
>> > And invoking the resource from the command line shows:<br>
>> ><br>
>> > ...<br>
>> > policy                             :                    2<br>
>> > prio                               :                    120<br>
>> > ...<br>
>> ><br>
>> > Invoking ulimit with the -e and -r parameters with a value of "unlimited" in<br>
>> > the OCF script does no good. Thoughts?<br>
>> ><br>
>> ><br>
</div></div></blockquote></div><br><br></div></div>Thanks! I will do that this morning and go from there.<br><br><br>
</blockquote></div><br></div></div>OK, aisexec and all Pacemaker processes have policy 2 (SCHED_RR) and priority 0. All cluster resources have the same values (I think I inverted the policy lines in the paste above). I also looked at other non-clustered processes on the system (syslog-ng, sshd, etc.) and they all have policy 0 (SCHED_OTHER) and priority 120, which is what the KVM process appears to need to function properly. I will post to the AIS list and see what they say, since it looks like tweaking the scheduler parameters begins there.<br>
<br>
</blockquote></div><br><br>AIS guys said to upgrade to the latest Whitetank :-) I did and the behavior is the same, but it's not necessarily incorrect. The aisexec process sets itself to the realtime scheduling class, which does the same for all of the Pacemaker processes when they are spawned. This is probably how you want the cluster daemons to run. However, when lrmd spawns resource scripts *everything* the script does also inherits the realtime scheduling class. I'm not sure this is how you want all your clustered applications running (or all the other stuff a resource script may do). Thoughts here?<br>
<br>As a workaround, I added calls to the chrt program in each resource script to "downgrade" the scheduler to SCHED_OTHER and set the priority to zero, which is the system default.<br><br>
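For anyone who wants to see the same downgrade without shelling out to chrt, here is a minimal sketch in Python (Linux only). It is an illustration, not what the resource scripts actually contain: the helper names policy_name and drop_to_sched_other are made up for this example, and it assumes the standard os.sched_getscheduler / os.sched_setscheduler wrappers are available. In a shell-based OCF script the equivalent would presumably be something like "chrt -o -p 0 $$".<br><br>

```python
import os

# Map the numeric policy values shown in /proc/PID/sched to names (Linux only).
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def policy_name(pid=0):
    """Return the scheduling policy name of pid (0 means the calling process)."""
    policy = os.sched_getscheduler(pid)
    return POLICY_NAMES.get(policy, "policy %d" % policy)


def drop_to_sched_other(pid=0):
    """Switch pid to SCHED_OTHER; that policy only accepts static priority 0."""
    os.sched_setscheduler(pid, os.SCHED_OTHER, os.sched_param(0))


if __name__ == "__main__":
    # Hypothetical usage: run this first thing in whatever lrmd spawns,
    # before the real workload (e.g. the KVM process) is started.
    print("before:", policy_name())
    drop_to_sched_other()
    print("after:", policy_name())
```

<br>Because the scheduling policy is inherited across fork and exec, doing this once at the top of the resource script should be enough for everything the script starts afterwards. Leaving a realtime class needs no special privileges, so this works regardless of which user the script ends up running as.<br><br>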