<div dir="ltr">Hello David,<div><br></div><div>I believe I am using the latest version available from Ubuntu, which is 1.1.10.</div><div>Do you think it has a bug?</div><div>Should I compile from source?</div><div><br></div><div>Best Regards,</div><div><br></div><div><br></div><div>Ariee</div><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Dec 19, 2014 at 8:27 PM, <span dir="ltr"><<a href="mailto:pacemaker-request@oss.clusterlabs.org" target="_blank">pacemaker-request@oss.clusterlabs.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Message: 2<br>
Date: Fri, 19 Dec 2014 14:21:59 -0500 (EST)<br>
From: David Vossel <<a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a>><br>
To: The Pacemaker cluster resource manager<br>
<<a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a>><br>
Subject: Re: [Pacemaker] pacemaker error after a couple week or month<br>
Message-ID:<br>
<<a href="mailto:102420175.739708.1419016919246.JavaMail.zimbra@redhat.com">102420175.739708.1419016919246.JavaMail.zimbra@redhat.com</a>><br>
Content-Type: text/plain; charset=utf-8<br>
<br>
<br>
<br>
----- Original Message -----<br>
> Hello,<br>
><br>
> I have two active-passive failover systems using corosync and DRBD.<br>
> One system uses two Debian servers and the other uses two Ubuntu servers.<br>
> The Debian servers provide web server failover and the Ubuntu servers<br>
> provide database server failover.<br>
><br>
> I applied the same Pacemaker configuration to both systems. Everything works fine:<br>
> failover and file system synchronization both work as expected, but<br>
> on the Ubuntu servers an error always appears after a couple of weeks or months.<br>
> Pacemaker on ubuntu1 reported a different status from ubuntu2: ubuntu1 assumed<br>
> that ubuntu2 was down, while ubuntu2 assumed that something had happened to<br>
> ubuntu1 but that it was still alive, and took over the resources. As a result, the DRBD<br>
> resource could not be taken over, so no failover happened and we had to<br>
> manually restart the server because restarting pacemaker and corosync didn't<br>
> help. I have changed the Pacemaker configuration a couple of times, but the<br>
> problem still exists.<br>
><br>
> Has anyone experienced this? I use Ubuntu 14.04.1 LTS.<br>
><br>
> I got this error in apport.log:<br>
><br>
> ERROR: apport (pid 20361) Fri Dec 19 02:43:52 2014: executable:<br>
> /usr/lib/pacemaker/lrmd (command line "/usr/lib/pacemaker/lrmd")<br>
<br>
Wow, it looks like the lrmd is crashing on you. I haven't seen this occur<br>
in the wild before. Without a backtrace it will be nearly impossible to determine<br>
what is happening.<br>
<br>
Do you have the ability to upgrade pacemaker to a newer version?<br>
<br>
-- Vossel<br>
</blockquote></div><br></div></div></div>