<div dir="ltr">Thank you. Now I am aware of it.</div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr">Thank you,<div>Kostya</div></div></div></div>
<br><div class="gmail_quote">On Wed, Jan 14, 2015 at 12:59 PM, Jan Friesse <span dir="ltr"><<a href="mailto:jfriesse@redhat.com" target="_blank">jfriesse@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Kostiantyn,<br>
<span class=""><br>
> Honza,<br>
><br>
> Thank you for helping me.<br>
> So, there is no defined behavior in case one of the interfaces is not in<br>
> the system?<br>
<br>
</span>You are right. There is no defined behavior.<br>
<div class="HOEnZb"><div class="h5"><br>
Regards,<br>
Honza<br>
<br>
<br>
><br>
><br>
> Thank you,<br>
> Kostya<br>
><br>
> On Tue, Jan 13, 2015 at 12:01 PM, Jan Friesse <<a href="mailto:jfriesse@redhat.com">jfriesse@redhat.com</a>> wrote:<br>
><br>
>> Kostiantyn,<br>
>><br>
>><br>
>>> According to <a href="https://access.redhat.com/solutions/638843" target="_blank">https://access.redhat.com/solutions/638843</a>, an<br>
>>> interface that is defined in corosync.conf must be present in the<br>
>>> system (see the "ROOT CAUSE" section at the bottom of the article).<br>
>>> To confirm that, I ran a couple of tests.<br>
>>><br>
>>> Here is the relevant part of the corosync.conf file, in free form (the<br>
>>> original config file is also attached):<br>
>>> ===============================<br>
>>> rrp_mode: passive<br>
>>> ring0_addr is defined in corosync.conf<br>
>>> ring1_addr is defined in corosync.conf<br>
>>> ===============================<br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> Two-node cluster<br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> Test #1:<br>
>>> --------------------------------------------------<br>
>>> IP for ring0 is not defined in the system:<br>
>>> --------------------------------------------------<br>
>>> Start Corosync simultaneously on both nodes.<br>
>>> Corosync fails to start.<br>
>>> From the logs:<br>
>>> Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] parse error in<br>
>>> config: No interfaces defined<br>
>>> Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] Corosync Cluster<br>
>>> Engine exiting with status 8 at main.c:1343.<br>
>>> Result: Corosync and Pacemaker are not running.<br>
>>><br>
>>> Test #2:<br>
>>> --------------------------------------------------<br>
>>> IP for ring1 is not defined in the system:<br>
>>> --------------------------------------------------<br>
>>> Start Corosync simultaneously on both nodes.<br>
>>> Corosync starts.<br>
>>> Start Pacemaker simultaneously on both nodes.<br>
>>> Pacemaker fails to start.<br>
>>> From the logs, the last entries from corosync:<br>
>>> Jan 8 16:31:29 daemon.err<27> corosync[3728]: [TOTEM ] Marking ringid 0<br>
>>> interface 169.254.1.3 FAULTY<br>
>>> Jan 8 16:31:30 daemon.notice<29> corosync[3728]: [TOTEM ] Automatically<br>
>>> recovered ring 0<br>
>>> Result: Corosync and Pacemaker are not running.<br>
>>><br>
>>><br>
>>> Test #3:<br>
>>><br>
>>> "rrp_mode: active" leads to the same result, except that the Corosync and<br>
>>> Pacemaker init scripts return status "running".<br>
>>> But /var/log/cluster/corosync.log still shows a lot of errors like:<br>
>>> Jan 08 16:30:47 [4067] A6-402-1 cib: error: pcmk_cpg_dispatch: Connection<br>
>>> to the CPG API failed: Library error (2)<br>
>>><br>
>>> Result: Corosync and Pacemaker report their status as "running", but<br>
>>> "crm_mon" cannot connect to the cluster database, and half of<br>
>>> Pacemaker's services are not running (including the Cluster Information<br>
>>> Base (CIB)).<br>
>>><br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> For a single node mode<br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> IP for ring0 is not defined in the system:<br>
>>><br>
>>> Corosync fails to start.<br>
>>><br>
>>> IP for ring1 is not defined in the system:<br>
>>><br>
>>> Corosync and Pacemaker are started.<br>
>>><br>
>>> It is possible that configuration will be applied successfully (50%),<br>
>>><br>
>>> and it is possible that the cluster is not running any resources,<br>
>>><br>
>>> and it is possible that the node cannot be put in a standby mode (shows:<br>
>>> communication error),<br>
>>><br>
>>> and it is possible that the cluster is running all resources, but applied<br>
>>> configuration is not guaranteed to be fully loaded (some rules can be<br>
>>> missed).<br>
>>><br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> Conclusions:<br>
>>><br>
>>> -------------------------------<br>
>>><br>
>>> It is possible that in some rare cases (see the comments on the bug) the<br>
>>> cluster will work, but in that case its working state is unstable and the<br>
>>> cluster can stop working at any moment.<br>
>>><br>
>>><br>
>>> So, is this correct? Do my assumptions make any sense? I didn't find any other<br>
>>> explanation online.<br>
>><br>
>> Corosync needs all interfaces during startup and at runtime. This doesn't<br>
>> mean they must be connected (that would make corosync unusable after a<br>
>> physical NIC, switch, or cable failure), but they must be up and have the<br>
>> correct IP address.<br>
>><br>
>> When this is not the case, corosync rebinds to localhost and weird<br>
>> things happen. Removing this rebinding is a long-standing TODO, but there<br>
>> are still more important bugs (especially because the rebind can be avoided).<br>
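<br>
The requirement above (every ring needs an address that is actually configured on the node) could be checked before corosync is started. What follows is only a minimal sketch in Python and is not part of corosync. The function names are hypothetical. It assumes the ring addresses appear as ringN_addr entries in a nodelist, and it treats "a UDP socket can bind to the address" as a stand-in for "the IP is configured on this node". It does not verify that the interface is up, and the bind test means nothing if non-local binds are enabled on the host (for example net.ipv4.ip_nonlocal_bind on Linux).<br>

```python
import re
import socket

# Matches lines such as "ring0_addr: 169.254.1.3" in corosync.conf text.
RING_ADDR = re.compile(r"^\s*ring(\d+)_addr:\s*(\S+)\s*$", re.MULTILINE)


def ring_addresses(conf_text):
    """Return (ring_number, address) pairs in the order they appear."""
    return [(int(ring), addr) for ring, addr in RING_ADDR.findall(conf_text)]


def is_assigned_locally(addr):
    """Heuristic: True if a UDP socket can bind to addr on this host.

    Any bind failure (address not assigned, name not resolvable, ...)
    is reported as False.
    """
    family = socket.AF_INET6 if ":" in addr else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((addr, 0))  # port 0: let the kernel pick a free port
        except OSError:
            return False
    return True


def rings_without_local_address(conf_text, is_local=is_assigned_locally):
    """Return the ring numbers for which no listed address is local.

    The nodelist contains the addresses of all nodes, so a ring is
    considered fine as soon as one of its addresses belongs to this host.
    """
    rings = {}
    for ring, addr in ring_addresses(conf_text):
        rings.setdefault(ring, []).append(addr)
    return sorted(
        ring
        for ring, addrs in rings.items()
        if not any(is_local(addr) for addr in addrs)
    )
```

On a node, calling rings_without_local_address() with the contents of /etc/corosync/corosync.conf would return the rings that have no locally configured address. An empty list only says that the addresses exist. It does not prove that the interfaces are up.<br>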
>><br>
>> Regards,<br>
>> Honza<br>
>><br>
>>><br>
>>><br>
>>><br>
>>> Thank you,<br>
>>> Kostya<br>
>>><br>
>>> On Fri, Jan 9, 2015 at 11:10 AM, Kostiantyn Ponomarenko <<br>
>>> <a href="mailto:konstantin.ponomarenko@gmail.com">konstantin.ponomarenko@gmail.com</a>> wrote:<br>
>>><br>
>>>> Hi guys,<br>
>>>><br>
>>>> Corosync fails to start if there is no such network interface configured<br>
>>>> in the system.<br>
>>>> Even with "rrp_mode: passive" the problem is the same when at least one<br>
>>>> network interface is not configured in the system.<br>
>>>><br>
>>>> Is this the expected behavior?<br>
>>>> I thought that when you use redundant rings, it is enough to have at least<br>
>>>> one NIC configured in the system. Am I wrong?<br>
>>>><br>
>>>> Thank you,<br>
>>>> Kostya<br>
>>>><br>
>>><br>
>>><br>
>>><br>
>>> _______________________________________________<br>
>>> Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
>>> <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
>>><br>
>>> Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
>>> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
>>> Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
>>><br>
>><br>
><br>
</div></div></blockquote></div><br></div>