<div dir="ltr"><div>According to <a href="https://access.redhat.com/solutions/638843">https://access.redhat.com/solutions/638843</a>, an interface that is defined in corosync.conf must be present in the system (see the "ROOT CAUSE" section at the bottom of the article).</div><div>To confirm that, I ran a couple of tests.</div><div><br></div><div>Here is the relevant part of the corosync.conf file, in free form (the original config file is also attached):</div><div>===============================</div><div>rrp_mode: passive</div><div>ring0_addr is defined in corosync.conf</div><div>ring1_addr is defined in corosync.conf</div><div>===============================</div><div><br></div><div><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">-------------------------------<br></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">Two-node cluster</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">-------------------------------</span></font></p></div><div><br></div><div>Test #1:</div><div>--------------------------------------------------</div><div>IP for ring0 is not defined in the system:</div><div>--------------------------------------------------</div><div>Start Corosync simultaneously on both nodes.</div><div>Corosync fails to start. 
</div><div>From the logs:</div><div>Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] parse error in config: No interfaces defined</div><div>Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1343.</div><div>Result: Corosync and Pacemaker are not running.</div><div><br></div><div>Test #2:</div><div>--------------------------------------------------</div><div>IP for ring1 is not defined in the system:</div><div>--------------------------------------------------</div><div>Start Corosync simultaneously on both nodes.</div><div>Corosync starts.</div><div>Start Pacemaker simultaneously on both nodes.</div><div>Pacemaker fails to start.</div><div>From the logs, the last entries from corosync:</div><div>Jan 8 16:31:29 daemon.err<27> corosync[3728]: [TOTEM ] Marking ringid 0 interface 169.254.1.3 FAULTY</div><div>Jan 8 16:31:30 daemon.notice<29> corosync[3728]: [TOTEM ] Automatically recovered ring 0</div><div>Result: Corosync and Pacemaker are not running.</div><div><br></div><div><br></div><div><p style="margin:0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">Test #3:</p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">"rrp_mode: active" leads to the same result, except that the Corosync and Pacemaker init scripts return status "running".<br>But /var/log/cluster/corosync.log still shows a lot of errors like:<br>Jan 08 16:30:47 <span class="">[4067]</span> A6-402-1 cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)</p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">Result: Corosync and Pacemaker show their statuses as "running", but "crm_mon" cannot connect to the cluster database. 
And half of Pacemaker's services are not running (including the Cluster Information Base (CIB)).</p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px"><br></p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">-------------------------------<br></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">Single-node mode</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">-------------------------------</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">IP for ring0 is not defined in the system:</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">Corosync fails to start.</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">IP for ring1 is not defined in the system:</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">Corosync and Pacemaker are started.</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">It is possible that the configuration will be applied successfully (50%),</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">and it is possible that the cluster is not running any resources,</span></font></p><p style="margin:10px 0px 
0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">and it is possible that the node cannot be put into standby mode (it shows: communication error),</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">and it is possible that the cluster is running all resources, but the applied configuration is not guaranteed to be fully loaded (some rules can be missing).</span></font></p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px"><br></p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">-------------------------------<br></p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">Conclusions:</p><p style="margin:10px 0px 0px;padding:0px;color:rgb(51,51,51);font-family:Arial,sans-serif;font-size:14px;line-height:20px">-------------------------------<br></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">It is possible that in some rare cases (see the comments on the bug) the cluster will work, but in that case its working state is unstable and the cluster can stop working at any moment.</span></font><br></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px"><br></span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px">So, is this correct? Do my assumptions make any sense? I didn't find any other explanation on the net ... 
.</span></font></p><p style="margin:10px 0px 0px;padding:0px"><font color="#333333" face="Arial, sans-serif"><span style="font-size:14px;line-height:20px"><br></span></font></p></div></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr">Thank you,<div>Kostya</div></div></div></div>
<br><div class="gmail_quote">On Fri, Jan 9, 2015 at 11:10 AM, Kostiantyn Ponomarenko <span dir="ltr"><<a href="mailto:konstantin.ponomarenko@gmail.com" target="_blank">konstantin.ponomarenko@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi guys,</div><div><br></div><div>Corosync fails to start if there is no such network interface configured in the system.</div><div>Even with "rrp_mode: passive" the problem is the same when at least one network interface is not configured in the system.</div><div><br></div><div>Is this the expected behavior?</div><div>I thought that when you use redundant rings, it is enough to have at least one NIC configured in the system. Am I wrong?</div><br clear="all"><div><div><div dir="ltr">Thank you,<div>Kostya</div></div></div></div>
</div>
</blockquote></div><br></div>