[Pacemaker] The larger cluster is tested.
yusuke iida
yusk.iida at gmail.com
Mon Nov 11 23:25:00 EST 2013
Hi, Andrew
I'm sorry.
The earlier report was taken when two cores were assigned to the virtual machine.
https://drive.google.com/file/d/0BwMFJItoO-fVdlIwTVdFOGRkQ0U/edit?usp=sharing
I apologize for the confusion.
This is the report taken with one core.
https://drive.google.com/file/d/0BwMFJItoO-fVSlo0dE0xMzNORGc/edit?usp=sharing
LRMD_MAX_CHILDREN is not defined on any node.
load-threshold is still at its default.
cib_max_cpu is set to 0.4 by the following code:
if(cores == 1) {
    cib_max_cpu = 0.4;
}
Since the Extreme check is load > 1.5 * cib_max_cpu, a CIB load above 60% puts the node in the Extreme state:
Nov 08 11:08:31 [2390] vm01 crmd: ( throttle.c:441 ) notice:
throttle_mode: Extreme CIB load detected: 0.670000
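To make the arithmetic behind that log line explicit, here is a minimal sketch of the threshold check, based only on the snippet quoted later in this mail. The function names and the default_max parameter are hypothetical and are not taken from throttle.c; the real default for multi-core hosts is not stated in this thread, so the caller supplies it.

```c
#include <stdbool.h>

/* Ceiling on CIB CPU usage, following the snippet quoted in this thread.
 * default_max is a caller-supplied ceiling for multi-core hosts
 * (assumed parameter; the thread does not state the real default). */
double effective_cib_max_cpu(int cores, double default_max,
                             double throttle_load_target)
{
    double cib_max_cpu = default_max;

    if (cores == 1) {
        cib_max_cpu = 0.4;
    }
    /* A positive load-threshold below the ceiling overrides it. */
    if (throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
        cib_max_cpu = throttle_load_target;
    }
    return cib_max_cpu;
}

/* True when the load exceeds 1.5 times the ceiling (the Extreme check). */
bool is_extreme_cib_load(double load, double cib_max_cpu)
{
    return load > 1.5 * cib_max_cpu;
}
```

With one core and load-threshold unset (passed as 0.0), the ceiling is 0.4 and the Extreme threshold is about 0.6, so the logged load of 0.67 is classified as Extreme.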
Shortly afterwards, the DC detects that vm01 is in the Extreme state:
Nov 08 11:08:32 [2387] vm13 crmd: ( throttle.c:701 ) debug:
throttle_update: Host vm01 supports a maximum of 2 jobs and
throttle mode 1000. New job limit is 1
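As a rough illustration of how a throttle mode could map to a per-node job limit, here is a sketch that is consistent with the two log lines in this thread (mode 1000 with a maximum of 2 jobs, and mode 0100 with a maximum of 4 jobs, both giving a limit of 1). It assumes the mode is printed as a hexadecimal flag. The enum names, the function name, and the divisors for the medium and low modes are assumptions for illustration and are not confirmed by anything in this thread.

```c
/* Throttle modes as they appear in the logs ("1000", "0100").
 * The MED, LOW and NONE values are assumed for illustration. */
enum throttle_mode {
    THROTTLE_NONE    = 0x0000,
    THROTTLE_LOW     = 0x0001,
    THROTTLE_MED     = 0x0010,
    THROTTLE_HIGH    = 0x0100,
    THROTTLE_EXTREME = 0x1000
};

/* Hypothetical helper: derive a node's job limit from its throttle mode. */
int job_limit_for(enum throttle_mode mode, int max_jobs)
{
    int jobs;

    switch (mode) {
    case THROTTLE_EXTREME:
    case THROTTLE_HIGH:
        jobs = 1;            /* matches both log lines in this thread */
        break;
    case THROTTLE_MED:
        jobs = max_jobs / 4; /* assumed divisor */
        break;
    case THROTTLE_LOW:
        jobs = max_jobs / 2; /* assumed divisor */
        break;
    default:
        jobs = max_jobs;     /* no throttling */
        break;
    }
    /* Always allow at least one job so the node can make progress. */
    return jobs < 1 ? 1 : jobs;
}
```

Under these assumptions, job_limit_for(THROTTLE_EXTREME, 2) returns 1, which corresponds to "New job limit is 1" for vm01.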
From the following log, the dynamic change of batch-limit also seems to
be working as expected.
# grep "throttle_get_total_job_limit" pacemaker.log
(snip)
Nov 08 11:08:31 [2387] vm13 crmd: ( throttle.c:629 ) trace:
throttle_get_total_job_limit: No change to batch-limit=0
Nov 08 11:08:32 [2387] vm13 crmd: ( throttle.c:632 ) trace:
throttle_get_total_job_limit: Using batch-limit=8
(snip)
Nov 08 11:10:32 [2387] vm13 crmd: ( throttle.c:632 ) trace:
throttle_get_total_job_limit: Using batch-limit=16
The above shows that the problem is not solved even when the total
number of jobs is restricted by batch-limit.
Are there any other ways to reduce the number of CIB synchronization messages?
There are not that many internal IPC messages.
Would it be possible to process at least a few of them while the
synchronization messages are being handled?
Regards,
Yusuke
2013/11/12 Andrew Beekhof <andrew at beekhof.net>:
>
> On 11 Nov 2013, at 11:48 pm, yusuke iida <yusk.iida at gmail.com> wrote:
>
>> I also checked the execution of the graph.
>> Since the pending count is limited to 16 from the middle onward, I
>> judge that batch-limit is effective.
>> Looking at this, even when jobs are restricted by batch-limit, two or
>> more jobs are always fired within 1 second.
>> When these jobs return their results, CIB synchronization messages
>> are generated.
>> A node that keeps receiving synchronization messages processes them
>> preferentially and postpones internal IPC messages.
>> I think that caused the timeout.
>
> What load-threshold were you running this with?
>
> I see this in the logs:
> "Host vm10 supports a maximum of 4 jobs and throttle mode 0100. New job limit is 1"
>
> Have you set LRMD_MAX_CHILDREN=4 on these nodes?
> I wouldn't recommend that for a single core VM. I'd let the default of 2*cores be used.
>
>
> Also, I'm not seeing "Extreme CIB load detected". Are these still single core machines?
> If so it would suggest that something about:
>
> if(cores == 1) {
>     cib_max_cpu = 0.4;
> }
> if(throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
>     cib_max_cpu = throttle_load_target;
> }
>
> if(load > 1.5 * cib_max_cpu) {
>     /* Can only happen on machines with a low number of cores */
>     crm_notice("Extreme %s detected: %f", desc, load);
>     mode |= throttle_extreme;
>
> is wrong.
>
> What was load-threshold configured as?
--
----------------------------------------
METRO SYSTEMS CO., LTD
Yusuke Iida
Mail: yusk.iida at gmail.com
----------------------------------------