[ClusterLabs] 2 Nodes Pacemaker for Nginx can only failover 1 time

jun huang huangjun.job at gmail.com
Sun Aug 9 04:57:55 UTC 2015


Hi Noel,

Thanks for the quick reply, I really appreciate it. I found that after I
kill nginx on node1 and run the command pcs status, I get the output
below.

[root@node2 ~]# pcs status
Cluster name: cluster_web
Last updated: Sun Aug  9 12:49:20 2015
Last change: Sun Aug  9 09:24:37 2015 via cibadmin on node1
Stack: corosync
Current DC: node2 (2) - partition with quorum
Version: 1.1.10-29.el7-368c726
2 Nodes configured
2 Resources configured


Online: [ node1 node2 ]

Full list of resources:

 Resource Group: test
     nginx      (ocf::heartbeat:nginx): Started node2
     virtual_ip (ocf::heartbeat:IPaddr2):       Started node2

Failed actions:
nginx_monitor_60000 on node1 'not running' (7): call=11, status=complete, last-rc-change='Sun Aug 9 12:34:47 2015', queued=0ms, exec=0ms

It looks like the nginx monitor is failing on node1 and causing the issue.
After I restart cluster node 1, it takes back the VIP and nginx resources
again, because it has a higher score than node2. But is it possible to
make node1 recover from the nginx monitor failure on its own? Thanks again
for your time!

Thanks,
Jacob
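
A minimal sketch of one way to clear the failed state on node1, assuming the
recorded nginx_monitor_60000 failure is what keeps the resource from moving
back. The resource name nginx comes from the pcs status output above; the
function name recover_resource, the PCS override variable, and the 120s
timeout are illustrative choices, not values from this cluster. The pcs
subcommands are those of the pcs 0.9 series shipped with CentOS 7.

```shell
#!/bin/sh
# Sketch only: clear a resource's recorded failures so its node becomes
# eligible to run it again. Names and values below are assumptions.
PCS="${PCS:-pcs}"              # set PCS=echo for a dry run without a cluster
RESOURCE="${RESOURCE:-nginx}"  # resource id as shown in pcs status

recover_resource() {
    # Show the fail count Pacemaker has recorded for the resource.
    $PCS resource failcount show "$RESOURCE" || return 1
    # Clear the failed action history and reset the fail count.
    $PCS resource cleanup "$RESOURCE" || return 1
    # Let future failures expire on their own after the given time
    # (120s is an arbitrary example value).
    $PCS resource meta "$RESOURCE" failure-timeout=120s
}
```

Calling recover_resource on either node would run the three commands in
order. Note that with failure-timeout set, Pacemaker only notices an expired
failure at its next periodic recheck (cluster-recheck-interval, 15 minutes by
default), so the node may not become eligible again immediately.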

Noel Kuntze <noel at familie-kuntze.de> wrote on Sun, Aug 9, 2015 at 11:07 AM:

> Hello Jacob,
>
> Look at the journal. It will tell you. Also, it's hard to debug without
> any information from the daemons.
>
> Regards,
> Noel Kuntze
>
> On 9 August 2015 05:02:15 CEST, jun huang <huangjun.job at gmail.com>
> wrote:
>
>> Hello Everyone,
>>
>> I set up a cluster with two nodes with Pacemaker 1.1.10 on CentOS 7. Then
>> I downloaded a resource agent for nginx from GitHub
>> <https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/nginx>
>>
>> I tested my setup like this:
>>
>>    1. Node1 is started with nginx and the VIP; everything is OK
>>    2. Kill nginx on Node1, wait for a few seconds
>>    3. See that nginx and the VIP are moved to node2, failover succeeded, and
>>    Node1 doesn't have any resources active
>>    4. I kill nginx on node2, but nginx and the VIP don't come back to Node1
>>
>> I set no-quorum-policy="ignore" and stonith-enabled="false".
>>
>> Why won't pacemaker let the resource come back to Node1? What did I miss
>> here?
>>
>>
>> I guess node1 is still in some failed state. How can I recover the
>> node? Can anyone shed some light on my questions? Thank you in advance.
>>
>> Thanks,
>>
>> Jacob
>>
>> ------------------------------
>>
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
> --
> This message was sent from my Android mobile phone with K-9 Mail.
>