[ClusterLabs Developers] libqb: Re: 633f262 logging: Remove linker 'magic' and just use statics for logging callsites (#322)

Christine Caulfield ccaulfie at redhat.com
Tue Jan 22 05:41:35 EST 2019


On 18/01/2019 19:25, Ken Gaillot wrote:
> On Fri, 2019-01-18 at 18:53 +0100, Lars Ellenberg wrote:
>> On Thu, Jan 17, 2019 at 09:09:11AM +1100, Andrew Beekhof wrote:
>>>
>>>
>>>> On 17 Jan 2019, at 2:59 am, Ken Gaillot <kgaillot at redhat.com>
>>>> wrote:
>>>>
>>>> I'm not familiar with the reasoning for the current setup, but
>>>> pacemaker's crm_crit(), crm_error(), etc. use qb_logt(), while
>>>> crm_debug() and crm_trace() (which won't be used in ordinary
>>>> runs) do
>>>> something similar to what you propose.
>>>>
>>>> Pacemaker has about 1,700 logging calls that would be affected
>>>> (not
>>>> counting another 2,000 debug/trace). Presumably that means
>>>> Pacemaker
>>>> currently has about +16KB of memory overhead and binary size for
>>>> debug/trace logging static pointers, and that would almost double
>>>> using
>>>> them for all logs. Not a big deal today? Or meaningful in an
>>>> embedded
>>>> context?
>>>>
>>>> Not sure if that overhead vs runtime trade-off is the original
>>>> motivation or not, but that's the first thing that comes to mind.
>>>
>>> I believe my interest was the ability to turn them on dynamically
>>> in a
>>> running program (yes, i used it plenty back in the day) and have
>>> the
>>> overhead be minimal for the normal case when they weren't in use.
>>
>> Also, with libqb before the commit mentioned in the subject (633f262)
>> and that is what pacemaker is using right now,
>> you'd get one huge static array of "struct callsites" (in a special
>> linker section; that's the linker magic that patch removes).
>>
>> Note: the whole struct was statically allocated,
>> it is an array of structs, not just an array of pointers.
>>
>> sizeof(struct qb_log_callsite) is 40
> 
> OK, so prior to the change, pacemaker used ~66KB of static struct
> allocation in the linker section for <=info logs, and ~16KB of normal
> static pointer allocation for debug/trace.
> 
> After the change, ~16KB is used statically, and an additional fraction
> of 66KB is dynamically allocated according to what messages are hit.
> 
> With your suggestion, an additional ~13KB would be used statically, and
> some fraction of that would be saved from dynamic allocation (besides
> the CPU efficiency). Could be worth it.
> 
>> Now, those structs get dynamically allocated,
>> and put in some lineno based lookup hash.
>> (so already at least additional 16 bytes),
>> not counting malloc overhead for all the tiny objects.
> 
> true, but now only messages that are potentially logged get
> (dynamically) allocated, whereas before, even messages in functions and
> code blocks that will never be used in a given run were (statically)
> allocated
> 
>> The additional 8 byte static pointer
>> is certainly not "doubling" that overhead.
> 
> right, a better estimate is above
> 
>> But can be used to skip the additional lookup,
>> sprintf, memcpy and whatnot, and even the function call,
>> if the callsite at hand is currently disabled,
>> which is probably the case for most >= trace
>> callsites most of the time.
> 
> pacemaker already implements >=debug similar to what you propose, so it
> likely wouldn't help much. other users of libqb might benefit though
> 
>> Any volunteers to benchmark the cpu usage?
>> I think we'd need
>> (trace logging: {enabled, disabled})
> 
> maybe info logging for pacemaker; debug and trace won't be affected.
> some other libqb app might be more significant. can also compare binary
> sizes.
> 
> I'm definitely not opposed to your suggestion. I think it probably
> makes sense.
> 
>> x ({before 633f262,
>>     after 633f262,
>>     after 633f262 + lars patch})
>>
>> BTW,
>> I think without the "linker magic"
>> (static array of structs),
>> the _log_filter_apply() becomes a no-op?
>> That's qb_log_filter_ctl2() at runtime.
>> It would have to iterate over all the collision lists in all the
>> buckets
>> of the dynamically allocated callsites, instead of iterating the
>> (now non-existing) static array of callsites.
>>
>> One side-effect of not using a static pointer,
>> but *always* doing the lookup (qb_log_calsite_get()) again,
>> is that a potentially set _custom_filter_fn() would be called
>> and that filter applied to the callsite, at each invocation.
>>
>> But I don't think that that is intentional?
>>

(sorry for late reply, I've not been well lately)

It looks like a good optimisation to me. As Pacemaker is the biggest
used of libqb I'll take Ken's opinion very seriously :)

Chrissie



More information about the Developers mailing list