[Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

kamal kishi kamal.kishi at gmail.com
Wed Oct 29 04:41:16 EDT 2014


Thanks for the info. I was trying to configure IPMI on the servers.
Could you please suggest a procedure for enabling and configuring
IPMI (the kind you were referring to)?
The sites I came across were hard to follow.
The servers I'm using are Dell PowerEdge R320s.

On Tue, Oct 28, 2014 at 7:55 PM, Digimer <lists at alteeve.ca> wrote:

> On 28/10/14 02:24 AM, kamal kishi wrote:
>
>> Hi,
>>
>>   I know that having no fencing configured creates issues.
>> But is the current problem really due to fencing?
>>
>
> Maybe, maybe not. I can say that *not* having it will make solving the
> problem much more difficult. Please get it working, it's pretty easy and it
> will make your life a lot easier.
>
>> The syslog isn't revealing much about the problem.
>> I would love to configure fencing, but I currently need some way to
>> get past this issue. If you say fencing is the only solution,
>> then I might have to set it up remotely.
>>
>
> It is critical, yes. Please add it, test it and then hook DRBD into it.
>
>> OS -> Ubuntu 12.04 (64-bit)
>> DRBD -> 8.3.11
>>
>
> That is quite old. Can you update to 8.3.16? Also, what versions of
> pacemaker and corosync are you running?
>
>> Thanks for the quick reply.
>>
>> On Tue, Oct 28, 2014 at 11:19 AM, Digimer <lists at alteeve.ca> wrote:
>>
>>     On 28/10/14 01:39 AM, kamal kishi wrote:
>>
>>         Hi all,
>>
>>         I'm facing a strange issue that I'm not able to resolve. I'm
>>         not sure what is going wrong or where, as the logs are not
>>         revealing much as far as I can tell.
>>
>>         Issue -
>>         I have configured a 2-node cluster and have attached the
>>         configuration file (New CRM conf of BIC.txt).
>>
>>         If Server2, which is the primary, is shut down (forcefully, by
>>         turning off the switch), Server1 restarts within a few seconds
>>         and then starts the resources.
>>         Even though Server1 does come back up and start the resources,
>>         the recovery takes too long to be acceptable to the clients,
>>         and I feel the current behaviour is wrong.
>>
>>         I have attached the syslog to this mail (syslog).
>>
>>         Please go through it and let me know how to resolve this, as
>>         the setup is at the client's site.
>>
>>         --
>>         Regards,
>>         Kamal Kishore B V
>>
>>
>>     You really need fencing, first and foremost. This will cause the
>>     survivor to put the lost node into a known state and then safely
>>     begin taking over lost services. Do your nodes have IPMI (or iRMC,
>>     iLO, DRAC, etc)? If so, setting up stonith is easy.
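>>
>>     As a rough illustration only, an IPMI-based stonith setup in the
>>     crm shell might look something like the sketch below. It assumes
>>     the 'external/ipmi' plugin from cluster-glue is installed; the
>>     node names, BMC addresses and credentials are placeholders, not
>>     values from this thread, and some BMCs need interface="lanplus"
>>     instead of "lan".
>>
>>     ```
>>     # One fence device per node (all values are placeholders)
>>     primitive st-server1 stonith:external/ipmi \
>>         params hostname="server1" ipaddr="192.0.2.11" \
>>                userid="admin" passwd="secret" interface="lan" \
>>         op monitor interval="60s"
>>     primitive st-server2 stonith:external/ipmi \
>>         params hostname="server2" ipaddr="192.0.2.12" \
>>                userid="admin" passwd="secret" interface="lan" \
>>         op monitor interval="60s"
>>     # A node should not run the device that fences itself
>>     location loc-st-server1 st-server1 -inf: server1
>>     location loc-st-server2 st-server2 -inf: server2
>>     property stonith-enabled="true"
>>     ```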
>>
>>     Once it is set up, configure DRBD to use the fence-handler
>>     'crm-fence-peer.sh' and change the fencing policy to
>>     'resource-and-stonith'. Without this, you will get split-brains and
>>     fail-over will be unpredictable.
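>>
>>     A minimal sketch of the relevant DRBD 8.3 settings, assuming a
>>     resource named 'r0' (a placeholder) and the handler scripts in
>>     their usual /usr/lib/drbd/ location; check the paths on your
>>     system before using it:
>>
>>     ```
>>     resource r0 {
>>         disk {
>>             # Block I/O and call the fence-peer handler when the peer is lost
>>             fencing resource-and-stonith;
>>         }
>>         handlers {
>>             # Adds a pacemaker constraint so the outdated peer cannot be promoted
>>             fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>             # Removes that constraint once the peer has resynced
>>             after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>         }
>>         # ... existing device, disk and address settings unchanged ...
>>     }
>>     ```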
>>
>>     Once stonith is configured and tested in pacemaker and you've hooked
>>     DRBD's fencing into pacemaker, see if your problem remains. If it
>>     does, on both nodes, run: 'tail -f -n 0 /var/log/messages', kill a
>>     node and wait for things to settle down. Share the log output here.
>>
>>     Please also tell us your OS, pacemaker, drbd and corosync versions.
>>
>>     --
>>     Digimer
>>     Papers and Projects: https://alteeve.ca/w/
>>     What if the cure for cancer is trapped in the mind of a person
>>     without access to education?
>>
>>     _________________________________________________
>>     Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>     http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>>     Project Home: http://www.clusterlabs.org
>>     Getting started:
>>     http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>     Bugs: http://bugs.clusterlabs.org
>>
>>
>>
>>
>> --
>> Regards,
>> Kamal Kishore B V
>>
>>
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
>



-- 
Regards,
Kamal Kishore B V