[ClusterLabs] Antw: RRP works "in totally different way then most of people expects"

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Jun 8 06:57:21 EDT 2015


Answering my own questions:

Those ringid files are created by corosync_ring_id_store() in main.c, and their content is the ring sequence number (binary). The routine is called (indirectly) from memb_state_commit_enter() (totemsrp.c), which in turn is called by memb_join_process() and message_handler_memb_commit_token().
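
For anyone who wants to look at the number in such a file, here is a minimal sketch of a reader. It assumes the file holds exactly one 64-bit unsigned integer in host byte order; I have not verified that width against every corosync version. The name ringid_read_seq() is made up for illustration and is not part of corosync.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Read a ring sequence number from a ringid file.
 * ASSUMPTION: the file holds one 64-bit unsigned integer in host byte order.
 * Hypothetical helper, not a corosync function.
 * Returns 0 on success, -1 on error or short read.
 */
int ringid_read_seq(const char *path, uint64_t *seq)
{
    assert(path != NULL && seq != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return -1;
    }

    uint64_t value = 0;
    ssize_t n = read(fd, &value, sizeof(value));
    close(fd);

    /* Anything other than a full 8-byte read is treated as an error. */
    if (n != (ssize_t)sizeof(value)) {
        return -1;
    }
    *seq = value;
    return 0;
}
```

Calling ringid_read_seq("/var/lib/corosync/ringid_127.0.0.1", &seq) and printing seq in hex should then show the stored sequence number, provided the size assumption holds.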

An obvious question arises:

As the sequence number is only stored for the active ring, what happens if rings are switched (rrp_mode == passive)? Likewise, corosync seems to use the IP address of the first available ring as the representative on startup. Shouldn't the representative (and identifier) be stable (as per slide 4 of [AMMA00] Amir, Y.; Moser, L. E.; Melliar-Smith, P. M.; Agarwal, D. A.; Ciarfella, P.: The Totem Single-Ring Ordering and Membership Protocol)? I doubt that representative and ringid are persistent just for each ring individually...

The "rwx" permissions are the natural consequence of this code in corosync_ring_id_store(), which passes mode 0777 when it creates the file:
        fd = open (filename, O_WRONLY, 0777);
        if (fd == -1) {
                fd = open (filename, O_CREAT|O_RDWR, 0777);
        }
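
To illustrate the kind of change I have in mind, here is a minimal sketch of a store routine that creates the file without execute bits. It is not the attached patch and not corosync's code; the name ringid_store_seq(), the 0600 mode, and the 64-bit width of the sequence number are my assumptions for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Store a ring sequence number in a file that only the owner may read
 * and write. Hypothetical helper, not corosync's function and not the
 * attached patch.
 * ASSUMPTION: the sequence number is a 64-bit unsigned integer.
 * Returns 0 on success, -1 on error.
 */
int ringid_store_seq(const char *path, uint64_t seq)
{
    assert(path != NULL);

    /* One open() call covers both cases: O_CREAT only takes effect if the
     * file is missing, and the mode argument is used only on creation. */
    int fd = open(path, O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        return -1;
    }

    ssize_t n = write(fd, &seq, sizeof(seq));
    int rc = close(fd);

    return (n == (ssize_t)sizeof(seq) && rc == 0) ? 0 : -1;
}
```

The mode is still filtered through the process umask, so the resulting file can end up with fewer permission bits than 0600 but never more. An existing file keeps whatever mode it already has.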

So I'm also attaching my first patch proposal.

Regards,
Ulrich

>>> "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de> wrote on 02.06.2015 at
16:07 in message <556DD4B1020000A10001AA01 at gwsmtp1.uni-regensburg.de>:
>>>> Jan Friesse <jfriesse at redhat.com> wrote on 02.06.2015 at 12:57 in message
> <556D8C08.5060001 at redhat.com>:
> [...]
>> Last but not least is RRP. RRP itself works very well sadly it works in
>> totally different way then most of people expects.
> [...]
> 
> Probably because the documentation leaves too many questions unanswered...
> You can't tell what's wrong as long as you don't know how it should work.
> 
> Example: What's the purpose of /var/lib/corosync/ringid_* files, and why do 
> some hosts have /var/lib/corosync/ringid_127.0.0.1? The file content seems to 
> be some binary number, but the file is created with rwx permissions...
> 
> On systems with two rings (not on 127.*) I see all combinations from one to 
> three ringid files.
> 
> Regards,
> Ulrich
> 
> 
> 




-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0001-exec-main.c-Improve-corosync_ring_id_store.patch
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20150608/4c0498e7/attachment-0003.ksh>

