[ClusterLabs Developers] New challenges with corosync 3/kronosnet + pacemaker

Jan Pokorný jpokorny at redhat.com
Mon Feb 19 11:08:03 UTC 2018


On 09/02/18 17:55 -0600, Ken Gaillot wrote:
> On Fri, 2018-02-09 at 18:54 -0500, Digimer wrote:
>> On 2018-02-09 06:51 PM, Ken Gaillot wrote:
>>> On Fri, 2018-02-09 at 12:52 -0500, Digimer wrote:
>>>> On 2018-02-09 03:27 AM, Jan Pokorný wrote:
>>>>> there is certainly a whole can of these worms, but the first that
>>>>> crosses my mind: performing double (de)compression on two levels
>>>>> of abstraction in the inter-node communication is not very
>>>>> clever, to put it mildly.
>>>>> 
>>>>> So far, just pacemaker was doing that for itself under certain
>>>>> conditions, now corosync 3 will have its iron in this fire
>>>>> through kronosnet, too.  Perhaps something to keep in mind to
>>>>> avoid exercises in futility.
>>>> 
>>>> Can pacemaker be told to not do compression? If not, can that be
>>>> added in pacemaker v2?
>>> 
>>> Or better yet, is there some corosync API call we can use to
>>> determine whether corosync/knet is using compression?
>>> 
>>> There's currently no way to turn compression off in Pacemaker,
>>> however it is only used for IPC messages that pass a fairly high
>>> size threshold, so many clusters would be unaffected even without
>>> changes.
>> 
>> Can you "turn off compression" by just changing that threshold to
>> some silly high number?
> 
> It's hardcoded, so you'd have to edit the source and recompile.

FTR, since half a year ago I've had some resources noted for further
investigation into this topic of pacemaker-level compression -- since
it compresses XML, the input has some specifics that suggest more
effective processing is possible.

Indeed, there's a large, rigorously maintained benchmark of non-binary
file compression that coincidentally also covers XML files (even if
these are presumably more text-oriented than structure-oriented):

  http://mattmahoney.net/dc/text.html

Basically, I can see two (three) categories of possible optimizations:

0. pre-fill the compression algorithm's dictionary with sequences
   that are statistically (consistently) the most frequent
   (a priori known tag names?)

1. preprocessing the XML to allow for more efficient generic
   compression (such as bzip2, which is currently utilized), e.g.

   * XMill
     - https://homes.cs.washington.edu/~suciu/XMILL/

   * XWRT (XML-WRT)
     - https://github.com/inikep/XWRT

2. more efficient algorithms as such for non-binary payloads
   (the benchmark above can help with selecting candidates)
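To make idea 0 concrete, here is a minimal sketch of dictionary
pre-seeding using Python's zlib bindings (deflate exposes this via
deflateSetDictionary(), surfaced as the `zdict` argument).  The XML
fragment and the preset dictionary below are made up for illustration;
any real gain would have to be measured on representative pacemaker
payloads, and the decompressor must be seeded with the same dictionary:

```python
import zlib

# Toy CIB-like XML fragment; real pacemaker payloads are much larger.
xml = (b'<cib epoch="5" num_updates="0"><configuration><nodes>'
       b'<node id="1" uname="node1"/><node id="2" uname="node2"/>'
       b'</nodes><resources/><constraints/></configuration>'
       b'<status/></cib>')

# Hypothetical preset dictionary of a-priori frequent tag/attribute
# names; deflate can back-reference into it from the very first byte,
# which is where a fresh compressor normally has nothing to match.
preset = (b'<cib epoch= num_updates=<configuration><nodes><node id='
          b' uname=<resources/><constraints/><status/>')

def deflate(data, zdict=None):
    # zlib.compressobj() only accepts zdict when one is actually given
    c = (zlib.compressobj(level=9, zdict=zdict) if zdict
         else zlib.compressobj(level=9))
    return c.compress(data) + c.flush()

plain = deflate(xml)
seeded = deflate(xml, preset)
print(len(xml), len(plain), len(seeded))
```

The seeded stream sets the FDICT flag, so decompression likewise needs
`zlib.decompressobj(zdict=preset)`; the dictionary effectively becomes
shared protocol state between the peers.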

* * *

That being said, there are legitimate reasons to want only the
high-level messaging layer to be involved with compression: it is
the only layer intimate with the respective application-specific
data and hence can provide optimal compression methods beyond
the reach of the generic ones.
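As a crude illustration of what such application-level knowledge
could buy, a preprocessor in the spirit of XMill/XWRT could substitute
single-byte tokens for a-priori frequent CIB names before handing the
stream to the generic compressor.  Everything below is a hypothetical
toy: the token table is invented, and the scheme naively assumes the
escape bytes never occur in the input (a real implementation would
need proper escaping):

```python
import bz2

# Hypothetical token table for names frequent in pacemaker CIB traffic
TOKENS = {
    b'<lrm_rsc_op ': b'\x01',
    b'<nvpair ': b'\x02',
    b'operation="': b'\x03',
    b'transition-key="': b'\x04',
}

def pre(data):
    # longest names first, so no name is clobbered by a shorter match
    for name, tok in sorted(TOKENS.items(), key=lambda kv: -len(kv[0])):
        data = data.replace(name, tok)
    return data

def post(data):
    for name, tok in TOKENS.items():
        data = data.replace(tok, name)
    return data

# Made-up status-section-like payload, repeated to mimic bulk traffic
xml = (b'<lrm_rsc_op id="r0_monitor_10000" operation="monitor" '
       b'transition-key="3:7:0:abcd"/>'
       b'<nvpair id="n1" name="target-role" value="Started"/>') * 8

raw = bz2.compress(xml, 9)
cooked = bz2.compress(pre(xml), 9)
print(len(raw), len(cooked))
```

Whether the tokenized stream actually compresses better depends on the
payload and the backend compressor, which is exactly why this belongs
in the layer that knows the data.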

-- 
Poki

