[ClusterLabs] [corosync] Virtual Synchrony Property guarantees in case of network partition

Jan Friesse jfriesse at redhat.com
Mon Jun 6 11:10:39 UTC 2016


satish kumar wrote:
> Hello Honza, thanks for the response!
>
> With state sync, I simply mean that messages up to m(k-1) were delivered to
> N1, N2 and N3, and they have applied these messages to change their program
> state:
> N1.state = apply(m(k-1));
> N2.state = apply(m(k-1));
> N3.state = apply(m(k-1));
>
> The document you shared cleared many doubts. However, I still need one
> clarification.
>
> According to the document:
> "The configuration change messages warn the application that a membership
> change has occurred, so that the application program can take appropriate
> action based on the membership change. Extended virtual synchrony
> guarantees a consistent order of message delivery across a partition,
> which is essential if the application programs are to be able to reconcile
> their states following repair of a failed processor or remerging of the
> partitioned network."
>
> I just want to confirm that this property is not something related to
> CPG_TYPE_SAFE, which is still not implemented.
> Please consider this scenario:
> 0. N1, N2 and N3 have received the message m(k-1).
> 1. N1 mcasts message m(k) with CPG_TYPE_AGREED.
> 2. As it is not CPG_TYPE_SAFE, m(k) is delivered to N1 but has not yet
> been delivered to N2 and N3.
> 3. A network partition separates N1 from N2 and N3. N2 and N3 can never
> see m(k).
> 4. A configuration change message is now delivered to N1, N2 and N3.
>
> Here, N1 will change its state to N1.state = apply(m(k)), thinking that
> everyone in the current configuration has received the message.
>
> According to your reply, it looks like N1 will not receive m(k). So this is
> what each node will see:
> N1 will see: m(k-1) -> C1 (config change)
> N2 will see: m(k-1) -> C1 (config change)
> N3 will see: m(k-1) -> C1 (config change)

For N2 and N3, it's not the same C1, so let's call it C2: C1 for N1
is (N2 and N3 left), while C2 for N2 and N3 is (N1 left).
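
A minimal sketch of how this difference shows up through the CPG API
(the printouts are illustrative only): each partition's confchg
callback carries its own view, so on N1 the left_list holds N2 and N3
(C1), while on N2 and N3 it holds only N1 (C2).

    #include <stdio.h>
    #include <corosync/cpg.h>

    /* Invoked once per configuration change, ordered relative to
     * message delivery. */
    static void confchg_cb(cpg_handle_t handle,
                           const struct cpg_name *group_name,
                           const struct cpg_address *member_list,
                           size_t member_list_entries,
                           const struct cpg_address *left_list,
                           size_t left_list_entries,
                           const struct cpg_address *joined_list,
                           size_t joined_list_entries)
    {
        size_t i;

        /* On N1 this prints N2 and N3; on N2 and N3 it prints N1. */
        for (i = 0; i < left_list_entries; i++) {
            printf("node %u left the configuration\n",
                   left_list[i].nodeid);
        }
        for (i = 0; i < member_list_entries; i++) {
            printf("node %u is still a member\n",
                   member_list[i].nodeid);
        }
    }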


>
> Message m(k) will be discarded and will not be delivered to N1, even though
> it was sent by N1 before the network partition.

No. m(k) will be delivered to the app running on N1, so N1 will see
m(k-1), C1, m(k). The application therefore knows exactly which nodes
got message m(k).
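
A sketch of how an application can rely on this ordering (the apply()
call and the membership snapshot are hypothetical, for illustration):
cpg_dispatch() invokes deliver and confchg callbacks in the order
corosync delivered them, so a message that arrives after C1, like m(k)
on N1, is known to have reached exactly the members listed in C1.

    #include <stdio.h>
    #include <corosync/cpg.h>

    /* Membership as of the most recent configuration change. */
    static uint32_t members[CPG_MEMBERS_MAX];
    static size_t   member_count;

    static void confchg_cb(cpg_handle_t handle,
                           const struct cpg_name *group_name,
                           const struct cpg_address *member_list,
                           size_t member_list_entries,
                           const struct cpg_address *left_list,
                           size_t left_list_entries,
                           const struct cpg_address *joined_list,
                           size_t joined_list_entries)
    {
        size_t i;

        member_count = member_list_entries;
        for (i = 0; i < member_list_entries; i++) {
            members[i] = member_list[i].nodeid;
        }
    }

    static void deliver_cb(cpg_handle_t handle,
                           const struct cpg_name *group_name,
                           uint32_t nodeid, uint32_t pid,
                           void *msg, size_t msg_len)
    {
        /* Runs after any preceding confchg, so "members" is exactly
         * the set of nodes this message was delivered to; on N1,
         * m(k) arrives after C1 with member_count == 1. */
        printf("message from node %u delivered to %zu member(s)\n",
               nodeid, member_count);
        /* apply(msg, msg_len);  -- hypothetical state transition */
    }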

Regards,
   Honza

>
> Is this the expected behavior with CPG_TYPE_AGREED?
>
> Regards,
> Satish
>
>
> On Mon, Jun 6, 2016 at 4:15 PM, Jan Friesse <jfriesse at redhat.com> wrote:
>
>> Hi,
>>
>>> Hello,
>>>
>>> Virtual Synchrony Property - messages are delivered in agreed order, and
>>> configuration changes are delivered in agreed order relative to messages.
>>>
>>> What happens to this property when a network partition splits the cluster
>>> into two? Consider the following scenario (which I took from one of the
>>> previous queries by Andrei Elkin):
>>>
>>> * N1, N2 and N3 are in state sync, with messages up to m(k-1) delivered.
>>>
>>
>> What exactly do you mean by "state sync"?
>>
>>> * N1 sends m(k), and just then a network partition separates node N1
>>> from N2 and N3.
>>>
>>> Does CPG_TYPE_AGREED guarantee that virtual synchrony holds?
>>>
>>
>> Yes it does (actually a higher level of VS, called EVS).
>>
>>
>>> When the property holds, configuration change message C1 is guaranteed to
>>> be delivered to N1 before m(k).
>>> N1 will see: m(k-1) C1 m(k)
>>> N2 and N3 will see: m(k-1) C1
>>>
>>> But if this property is violated:
>>> N1 will see: m(k-1) m(k) C1
>>> N2 and N3 will see: m(k-1) C1
>>>
>>> A violation would break any user application running on the cluster.
>>>
>>> Could someone please explain what the behavior of Corosync is in this
>>> scenario with CPG_TYPE_AGREED ordering?
>>>
>>
>> For a description of how exactly totem synchronization works, take a look
>> at http://corosync.github.com/corosync/doc/DAAgarwal.thesis.ps.gz
>>
>> After totem is synchronized, there is another level of synchronization of
>> services (not described in the above doc). All services synchronize in a
>> very similar way, so you can take CPG as an example. Basically, the only
>> state held by CPG is the list of connected clients, so every node sends
>> its connected-clients list to every other node. If sync is aborted (by a
>> change of membership), it is restarted. These sync messages have priority
>> over user messages (actually, it's not possible to send messages during
>> sync). A user app can be sure that a message was delivered only after it
>> gets its own message back. The app also gets the configuration change
>> message, so it knows who got the message.
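
A minimal sketch of the send side under these rules (the group name
"demo" and the busy retry on flow control are illustrative assumptions,
not from this thread): the sender treats a message as delivered only
once its own copy comes back.

    #include <stdio.h>
    #include <string.h>
    #include <sys/uio.h>
    #include <corosync/cpg.h>

    static unsigned int local_nodeid;
    static int          got_own_message;

    static void deliver_cb(cpg_handle_t handle,
                           const struct cpg_name *group_name,
                           uint32_t nodeid, uint32_t pid,
                           void *msg, size_t msg_len)
    {
        /* Self-delivery is the only confirmation that the message was
         * ordered and delivered. A real app would compare the pid as
         * well, since several processes may share a node. */
        if (nodeid == local_nodeid) {
            got_own_message = 1;
        }
    }

    static cpg_callbacks_t callbacks = {
        .cpg_deliver_fn = deliver_cb,
        .cpg_confchg_fn = NULL,   /* a real app tracks membership here */
    };

    int main(void)
    {
        cpg_handle_t handle;
        struct cpg_name group;
        struct iovec iov;
        const char *payload = "m(k)";

        strcpy(group.value, "demo");
        group.length = strlen(group.value);

        if (cpg_initialize(&handle, &callbacks) != CS_OK ||
            cpg_join(handle, &group) != CS_OK) {
            return 1;
        }
        cpg_local_get(handle, &local_nodeid);

        iov.iov_base = (void *)payload;
        iov.iov_len = strlen(payload) + 1;

        /* CPG_TYPE_AGREED: total order, consistent with configuration
         * changes (EVS); retry while flow control pushes back. */
        while (cpg_mcast_joined(handle, CPG_TYPE_AGREED, &iov, 1) ==
               CS_ERR_TRY_AGAIN)
            ;

        /* Dispatch callbacks until our own copy comes back. */
        while (!got_own_message) {
            cpg_dispatch(handle, CS_DISPATCH_ONE);
        }

        cpg_finalize(handle);
        return 0;
    }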
>>
>> Regards,
>>    Honza
>>
>>
>>> Regards,
>>> Satish
>>>




