[Pacemaker] The larger cluster is tested.
    Andrew Beekhof 
    andrew at beekhof.net
       
    Mon Nov 11 23:03:20 UTC 2013
    
    
  
On 11 Nov 2013, at 11:48 pm, yusuke iida <yusk.iida at gmail.com> wrote:
> I also checked the execution of the graph.
> Partway through, the number of pending actions is capped at 16, so
> I conclude that batch-limit is taking effect.
> However, even while jobs are restricted by batch-limit, at least two
> jobs are still fired every second.
> Each of these jobs returns a result, which generates CIB
> synchronization messages.
> A node that keeps receiving synchronization messages processes them
> first and postpones internal IPC messages.
> I think that is what caused the timeout.
What load-threshold were you running this with?
I see this in the logs:
"Host vm10 supports a maximum of 4 jobs and throttle mode 0100.  New job limit is 1"
Have you set LRMD_MAX_CHILDREN=4 on these nodes?
I wouldn't recommend that for a single-core VM.  I'd let the default of 2*cores be used.
Also, I'm not seeing "Extreme CIB load detected".  Are these still single-core machines?
If so, it would suggest that something about:
        if(cores == 1) {
            cib_max_cpu = 0.4;
        }
        if(throttle_load_target > 0.0 && throttle_load_target < cib_max_cpu) {
            cib_max_cpu = throttle_load_target;
        }
        if(load > 1.5 * cib_max_cpu) {
            /* Can only happen on machines with a low number of cores */
            crm_notice("Extreme %s detected: %f", desc, load);
            mode |= throttle_extreme;
is wrong.
What was load-threshold configured as?
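
To make the question concrete, here is a minimal sketch of the threshold arithmetic in the excerpt above. The helper names (effective_cib_max_cpu, is_extreme) are hypothetical and not part of Pacemaker, and the starting value of cib_max_cpu on multi-core machines is passed in as a parameter because the excerpt does not show it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not Pacemaker's API: reproduces the two
 * adjustments from the excerpt. "default_max" stands in for the
 * initial cib_max_cpu, which the excerpt does not show (assumption).
 * "load_target" plays the role of throttle_load_target, expressed
 * as a fraction (0.0 means "not configured").
 */
static double effective_cib_max_cpu(int cores, double load_target,
                                    double default_max)
{
    double cib_max_cpu = default_max;

    assert(cores >= 1);

    /* A single-core machine gets a much lower CIB CPU budget. */
    if (cores == 1) {
        cib_max_cpu = 0.4;
    }

    /* A configured target can only lower the limit, never raise it. */
    if (load_target > 0.0 && load_target < cib_max_cpu) {
        cib_max_cpu = load_target;
    }
    return cib_max_cpu;
}

/* True when the "Extreme ... detected" branch of the excerpt is taken. */
static bool is_extreme(double load, double cib_max_cpu)
{
    return load > 1.5 * cib_max_cpu;
}
```

Under these assumptions, a single-core node with no target configured ends up with a limit of 0.4, so the notice should only appear once the measured CIB load exceeds roughly 0.6. A configured target below 0.4 lowers that point further (a target of 0.2 gives roughly 0.3), which is why the configured load-threshold matters here.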