[Pacemaker] Error: cluster is not currently running on this node

Miha miha at softnet.si
Tue Aug 19 05:17:11 EDT 2014


hi,

what do you mean by "by had of powweroff sp1"? do power off server sip1?

One thing also bothers me: why is the cluster service not running on sip2 if 
the virtual IP and everything else are still running properly on it?

tnx
miha


On 8/19/2014 9:08 AM, emmanuel segura wrote:
> Your config looks ok. Have you tried to use fence_bladecenter_snmp by
> hand to power off sip1?
>
> http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/
>
> 2014-08-19 8:05 GMT+02:00 Miha <miha at softnet.si>:
>> sorry, here it is:
>>
>> <cluster config_version="9" name="sipproxy">
>>    <fence_daemon/>
>>    <clusternodes>
>>      <clusternode name="sip1" nodeid="1">
>>        <fence>
>>          <method name="pcmk-method">
>>            <device name="pcmk-redirect" port="sip1"/>
>>          </method>
>>        </fence>
>>      </clusternode>
>>      <clusternode name="sip2" nodeid="2">
>>        <fence>
>>          <method name="pcmk-method">
>>            <device name="pcmk-redirect" port="sip2"/>
>>          </method>
>>        </fence>
>>      </clusternode>
>>    </clusternodes>
>>    <cman expected_votes="1" two_node="1"/>
>>    <fencedevices>
>>      <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
>>    </fencedevices>
>>    <rm>
>>      <failoverdomains/>
>>      <resources/>
>>    </rm>
>> </cluster>
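>>
>> For what it is worth, I believe this file can still be sanity-checked with
>> the cman tooling, roughly like this (I have not re-run it recently):
>>
>> ccs_config_validate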
>>
>>
>> br
>> miha
>>
>> On 8/18/2014 11:33 AM, emmanuel segura wrote:
>>> Can you post your cman /etc/cluster/cluster.conf?
>>>
>>> 2014-08-18 7:08 GMT+02:00 Miha <miha at softnet.si>:
>>>> Hi Emmanuel,
>>>>
>>>> this is my config:
>>>>
>>>>
>>>> Pacemaker Nodes:
>>>>    sip1 sip2
>>>>
>>>> Resources:
>>>>    Master: ms_drbd_mysql
>>>>     Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>> clone-node-max=1
>>>> notify=true
>>>>     Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>>>>      Attributes: drbd_resource=clusterdb_res
>>>>      Operations: monitor interval=29s role=Master
>>>> (p_drbd_mysql-monitor-29s)
>>>>                  monitor interval=31s role=Slave
>>>> (p_drbd_mysql-monitor-31s)
>>>>    Group: g_mysql
>>>>     Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>>>>      Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd
>>>> fstype=ext4
>>>>      Meta Attrs: target-role=Started
>>>>     Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>>>>      Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>>>>     Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>>>>      Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>>>> config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>>>> socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>>>> additional_parameters="--bind-address=212.13.249.55 --user=root"
>>>>      Meta Attrs: target-role=Started
>>>>      Operations: start interval=0 timeout=120s (p_mysql-start-0)
>>>>                  stop interval=0 timeout=120s (p_mysql-stop-0)
>>>>                  monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>>>>    Clone: cl_ping
>>>>     Meta Attrs: interleave=true
>>>>     Resource: p_ping (class=ocf provider=pacemaker type=ping)
>>>>      Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>>>>      Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>>>>                  start interval=0s timeout=60s (p_ping-start-0s)
>>>>                  stop interval=0s (p_ping-stop-0s)
>>>>    Resource: opensips (class=lsb type=opensips)
>>>>     Meta Attrs: target-role=Started
>>>>     Operations: start interval=0 timeout=120 (opensips-start-0)
>>>>                 stop interval=0 timeout=120 (opensips-stop-0)
>>>>
>>>> Stonith Devices:
>>>>    Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>>>>     Attributes: action=off ipaddr=172.30.0.2 port=8 community=test
>>>> login=snmp8
>>>> passwd=soft1234
>>>>     Meta Attrs: target-role=Started
>>>>    Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>>>>     Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1
>>>> login=snmp8 passwd=soft1234
>>>>     Meta Attrs: target-role=Started
>>>> Fencing Levels:
>>>>
>>>> Location Constraints:
>>>>     Resource: ms_drbd_mysql
>>>>       Constraint: l_drbd_master_on_ping
>>>>         Rule: score=-INFINITY role=Master boolean-op=or
>>>> (id:l_drbd_master_on_ping-rule)
>>>>           Expression: not_defined ping
>>>> (id:l_drbd_master_on_ping-expression)
>>>>           Expression: ping lte 0 type=number
>>>> (id:l_drbd_master_on_ping-expression-0)
>>>> Ordering Constraints:
>>>>     promote ms_drbd_mysql then start g_mysql (INFINITY)
>>>> (id:o_drbd_before_mysql)
>>>>     g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>>>> Colocation Constraints:
>>>>     g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master)
>>>> (id:c_mysql_on_drbd)
>>>>     opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>>>
>>>> Cluster Properties:
>>>>    cluster-infrastructure: cman
>>>>    dc-version: 1.1.10-14.el6-368c726
>>>>    no-quorum-policy: ignore
>>>>    stonith-enabled: true
>>>> Node Attributes:
>>>>    sip1: standby=off
>>>>    sip2: standby=off
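>>>>
>>>> If it would help, I assume I could also trigger fencing through pacemaker
>>>> itself rather than by hand, something like the following (untested on my side):
>>>>
>>>> stonith_admin --reboot sip1 --verbose    # ask stonith-ng directly to fence sip1
>>>> pcs stonith fence sip1                   # or the same thing via pcs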
>>>>
>>>>
>>>> br
>>>> miha
>>>>
>>>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>>>
>>>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2): Stopped
>>>>> Jul 03 14:10:51 [2701] sip2       crmd:   notice:
>>>>> too_many_st_failures:         No devices found in cluster to fence
>>>>> sip1, giving up
>>>>>
>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:
>>>>>     Processed st_query reply from sip2: OK (0)
>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:    error: remote_op_done:
>>>>>     Operation reboot of sip1 by sip2 for
>>>>> stonith_admin.cman.28299 at sip2.94474607: No such device
>>>>>
>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:
>>>>>     Processed st_notify reply from sip2: OK (0)
>>>>> Jul 03 14:10:54 [2701] sip2       crmd:   notice:
>>>>> tengine_stonith_notify:       Peer sip1 was not terminated (reboot) by
>>>>> sip2 for sip2: No such device
>>>>> (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
>>>>> stonith_admin.cman.28299
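>>>>>
>>>>> The "No such device" above normally means stonith-ng found no fence device
>>>>> it could use against sip1. Something like this (untested here) should show
>>>>> what stonith-ng actually has registered on the node:
>>>>>
>>>>> stonith_admin --list-registered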
>>>>>
>>>>>
>>>>>
>>>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>>>
>>>>> Sorry for the short answer. Have you tested your cluster fencing? Can
>>>>> you show your cluster.conf XML?
>>>>>
>>>>> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>>>>>> emmanuel,
>>>>>>
>>>>>> tnx. But how can I find out why fencing stopped working?
>>>>>>
>>>>>> br
>>>>>> miha
>>>>>>
>>>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>>>
>>>>>>> "Node sip2: UNCLEAN (offline)" means the node is unclean because the
>>>>>>> cluster fencing failed to complete the operation.
>>>>>>>
>>>>>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>> hi.
>>>>>>>>
>>>>>>>> another thing.
>>>>>>>>
>>>>>>>> On node 1 (sip1), pcs is running:
>>>>>>>> [root at sip1 ~]# pcs status
>>>>>>>> Cluster name: sipproxy
>>>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>>>> Last change: Sat Feb  1 20:10:48 2014 via crm_attribute on sip1
>>>>>>>> Stack: cman
>>>>>>>> Current DC: sip1 - partition with quorum
>>>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>>>> 2 Nodes configured
>>>>>>>> 10 Resources configured
>>>>>>>>
>>>>>>>>
>>>>>>>> Node sip2: UNCLEAN (offline)
>>>>>>>> Online: [ sip1 ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>>      Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>>>>          Masters: [ sip2 ]
>>>>>>>>          Slaves: [ sip1 ]
>>>>>>>>      Resource Group: g_mysql
>>>>>>>>          p_fs_mysql (ocf::heartbeat:Filesystem):    Started sip2
>>>>>>>>          p_ip_mysql (ocf::heartbeat:IPaddr2):       Started sip2
>>>>>>>>          p_mysql    (ocf::heartbeat:mysql): Started sip2
>>>>>>>>      Clone Set: cl_ping [p_ping]
>>>>>>>>          Started: [ sip1 sip2 ]
>>>>>>>>      opensips       (lsb:opensips): Stopped
>>>>>>>>      fence_sip1     (stonith:fence_bladecenter_snmp):       Started
>>>>>>>> sip2
>>>>>>>>      fence_sip2     (stonith:fence_bladecenter_snmp):       Started
>>>>>>>> sip2
>>>>>>>>
>>>>>>>>
>>>>>>>> [root at sip1 ~]#
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>>>
>>>>>>>>> Hi emmanuel,
>>>>>>>>>
>>>>>>>>> I think so. What is the best way to check?
>>>>>>>>>
>>>>>>>>> Sorry for my noob question; I configured this 6 months ago and
>>>>>>>>> everything was working fine till now. Now I need to find out what
>>>>>>>>> really happened before I do something stupid.
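>>>>>>>>>
>>>>>>>>> The only quick checks I can think of myself, before touching anything,
>>>>>>>>> would be something like this, but I am not sure that is enough:
>>>>>>>>>
>>>>>>>>> crm_mon -1       # see whether the stonith resources are started anywhere
>>>>>>>>> crm_verify -LV   # check the live CIB for configuration errors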
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> tnx
>>>>>>>>>
>>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>>>> are you sure your cluster fencing is working?
>>>>>>>>>>
>>>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I noticed today that I am having some problems with the cluster. The
>>>>>>>>>>> master server shows as offline, but the virtual IP is still assigned
>>>>>>>>>>> to it and all services are running properly (in production).
>>>>>>>>>>>
>>>>>>>>>>> When I check on sip2 (the master), this is what I get:
>>>>>>>>>>>
>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>> [root at sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>>>> corosync dead but pid file exists
>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>> [root at sip2 cluster]# tailf fenced.log
>>>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The main question is what to do now. Should I just run "pcs cluster start"
>>>>>>>>>>> and hope for the best, or what?
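>>>>>>>>>>>
>>>>>>>>>>> Unless someone tells me otherwise, my rough plan would be the following;
>>>>>>>>>>> I am assuming "pcs cluster start" brings cman and pacemaker back up on
>>>>>>>>>>> this node only:
>>>>>>>>>>>
>>>>>>>>>>> pcs cluster start    # on sip2 only
>>>>>>>>>>> pcs status           # check that sip2 rejoins and resources stay where they are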
>>>>>>>>>>>
>>>>>>>>>>> I have pasted log in pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>>>
>>>>>>>>>>> tnx!
>>>>>>>>>>>
>>>>>>>>>>> miha
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>>
>
>




