[Pacemaker] Error: cluster is not currently running on this node

emmanuel segura emi2fast at gmail.com
Tue Aug 19 05:53:25 EDT 2014


Sorry,

That was a typo. It should have read: "try to power off sip1 by hand, using
fence_bladecenter_snmp in your shell".
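
For reference, a minimal sketch of such a manual test, using the stdin
key=value interface that fence agents accept and the attribute values from
the fence_sip1 stonith resource quoted further down (check the exact
parameter names against the agent's man page linked below):

# non-destructive check of the blade's power state first
printf '%s\n' ipaddr=172.30.0.2 community=test login=snmp8 \
    passwd=soft1234 port=8 action=status | fence_bladecenter_snmp
# the same call with action=off is the real power-off test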

2014-08-19 11:17 GMT+02:00 Miha <miha at softnet.si>:
> hi,
>
> What do you mean by "by had for poweroff sp1"? Do you mean to power off
> server sip1?
>
> One other thing bothers me: why is the cluster service not running on sip2
> when the virtual IP and the other resources are all still running properly?
>
> tnx
> miha
>
>
> On 8/19/2014 9:08 AM, emmanuel segura wrote:
>
>> Your config looks ok, have you tried to use fence_bladecenter_snmp by
>> had for poweroff sp1?
>>
>> http://www.linuxcertif.com/man/8/fence_bladecenter_snmp/
>>
>> 2014-08-19 8:05 GMT+02:00 Miha <miha at softnet.si>:
>>>
>>> Sorry, here it is:
>>>
>>> <cluster config_version="9" name="sipproxy">
>>>    <fence_daemon/>
>>>    <clusternodes>
>>>      <clusternode name="sip1" nodeid="1">
>>>        <fence>
>>>          <method name="pcmk-method">
>>>            <device name="pcmk-redirect" port="sip1"/>
>>>          </method>
>>>        </fence>
>>>      </clusternode>
>>>      <clusternode name="sip2" nodeid="2">
>>>        <fence>
>>>          <method name="pcmk-method">
>>>            <device name="pcmk-redirect" port="sip2"/>
>>>          </method>
>>>        </fence>
>>>      </clusternode>
>>>    </clusternodes>
>>>    <cman expected_votes="1" two_node="1"/>
>>>    <fencedevices>
>>>      <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
>>>    </fencedevices>
>>>    <rm>
>>>      <failoverdomains/>
>>>      <resources/>
>>>    </rm>
>>> </cluster>
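>>>
>>> For reference: with this fence_pcmk redirect the real fencing devices live
>>> in the Pacemaker CIB, so a rough way to check the whole path (a sketch,
>>> assuming the stonith resources shown further down are running) is:
>>>
>>> stonith_admin -L        # stonith devices registered with Pacemaker
>>> stonith_admin -l sip1   # devices that claim to be able to fence sip1
>>> fence_node sip2         # ask cman to fence sip2 through the redirect
>>>
>>> Note that fence_node really powers the target off, so treat it as a
>>> deliberate failover test, not a status check.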
>>>
>>>
>>> br
>>> miha
>>>
>>> On 8/18/2014 11:33 AM, emmanuel segura wrote:
>>>>
>>>> Can you show your cman /etc/cluster/cluster.conf?
>>>>
>>>> 2014-08-18 7:08 GMT+02:00 Miha <miha at softnet.si>:
>>>>>
>>>>> Hi Emmanuel,
>>>>>
>>>>> this is my config:
>>>>>
>>>>>
>>>>> Pacemaker Nodes:
>>>>>    sip1 sip2
>>>>>
>>>>> Resources:
>>>>>    Master: ms_drbd_mysql
>>>>>     Meta Attrs: master-max=1 master-node-max=1 clone-max=2
>>>>> clone-node-max=1
>>>>> notify=true
>>>>>     Resource: p_drbd_mysql (class=ocf provider=linbit type=drbd)
>>>>>      Attributes: drbd_resource=clusterdb_res
>>>>>      Operations: monitor interval=29s role=Master
>>>>> (p_drbd_mysql-monitor-29s)
>>>>>                  monitor interval=31s role=Slave
>>>>> (p_drbd_mysql-monitor-31s)
>>>>>    Group: g_mysql
>>>>>     Resource: p_fs_mysql (class=ocf provider=heartbeat type=Filesystem)
>>>>>      Attributes: device=/dev/drbd0 directory=/var/lib/mysql_drbd
>>>>> fstype=ext4
>>>>>      Meta Attrs: target-role=Started
>>>>>     Resource: p_ip_mysql (class=ocf provider=heartbeat type=IPaddr2)
>>>>>      Attributes: ip=XXX.XXX.XXX.XXX cidr_netmask=24 nic=eth2
>>>>>     Resource: p_mysql (class=ocf provider=heartbeat type=mysql)
>>>>>      Attributes: datadir=/var/lib/mysql_drbd/data/ user=root group=root
>>>>> config=/var/lib/mysql_drbd/my.cnf pid=/var/run/mysqld/mysqld.pid
>>>>> socket=/var/lib/mysql/mysql.sock binary=/usr/bin/mysqld_safe
>>>>> additional_parameters="--bind-address=212.13.249.55 --user=root"
>>>>>      Meta Attrs: target-role=Started
>>>>>      Operations: start interval=0 timeout=120s (p_mysql-start-0)
>>>>>                  stop interval=0 timeout=120s (p_mysql-stop-0)
>>>>>                  monitor interval=20s timeout=30s (p_mysql-monitor-20s)
>>>>>    Clone: cl_ping
>>>>>     Meta Attrs: interleave=true
>>>>>     Resource: p_ping (class=ocf provider=pacemaker type=ping)
>>>>>      Attributes: name=ping multiplier=1000 host_list=XXX.XXX.XXX.XXXX
>>>>>      Operations: monitor interval=15s timeout=60s (p_ping-monitor-15s)
>>>>>                  start interval=0s timeout=60s (p_ping-start-0s)
>>>>>                  stop interval=0s (p_ping-stop-0s)
>>>>>    Resource: opensips (class=lsb type=opensips)
>>>>>     Meta Attrs: target-role=Started
>>>>>     Operations: start interval=0 timeout=120 (opensips-start-0)
>>>>>                 stop interval=0 timeout=120 (opensips-stop-0)
>>>>>
>>>>> Stonith Devices:
>>>>>    Resource: fence_sip1 (class=stonith type=fence_bladecenter_snmp)
>>>>>     Attributes: action=off ipaddr=172.30.0.2 port=8 community=test
>>>>> login=snmp8
>>>>> passwd=soft1234
>>>>>     Meta Attrs: target-role=Started
>>>>>    Resource: fence_sip2 (class=stonith type=fence_bladecenter_snmp)
>>>>>     Attributes: action=off ipaddr=172.30.0.2 port=9 community=test1
>>>>> login=snmp8 passwd=soft1234
>>>>>     Meta Attrs: target-role=Started
>>>>> Fencing Levels:
>>>>>
>>>>> Location Constraints:
>>>>>     Resource: ms_drbd_mysql
>>>>>       Constraint: l_drbd_master_on_ping
>>>>>         Rule: score=-INFINITY role=Master boolean-op=or
>>>>> (id:l_drbd_master_on_ping-rule)
>>>>>           Expression: not_defined ping
>>>>> (id:l_drbd_master_on_ping-expression)
>>>>>           Expression: ping lte 0 type=number
>>>>> (id:l_drbd_master_on_ping-expression-0)
>>>>> Ordering Constraints:
>>>>>     promote ms_drbd_mysql then start g_mysql (INFINITY)
>>>>> (id:o_drbd_before_mysql)
>>>>>     g_mysql then start opensips (INFINITY) (id:opensips_after_mysql)
>>>>> Colocation Constraints:
>>>>>     g_mysql with ms_drbd_mysql (INFINITY) (with-rsc-role:Master)
>>>>> (id:c_mysql_on_drbd)
>>>>>     opensips with g_mysql (INFINITY) (id:c_opensips_on_mysql)
>>>>>
>>>>> Cluster Properties:
>>>>>    cluster-infrastructure: cman
>>>>>    dc-version: 1.1.10-14.el6-368c726
>>>>>    no-quorum-policy: ignore
>>>>>    stonith-enabled: true
>>>>> Node Attributes:
>>>>>    sip1: standby=off
>>>>>    sip2: standby=off
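>>>>>
>>>>> For completeness, a sketch of how a device like fence_sip1 above could be
>>>>> created with pcs, reusing the same attribute values; pcmk_host_list is an
>>>>> extra assumption here, added only to illustrate tying the device to the
>>>>> node it is meant to fence:
>>>>>
>>>>> pcs stonith create fence_sip1 fence_bladecenter_snmp \
>>>>>     action=off ipaddr=172.30.0.2 port=8 community=test \
>>>>>     login=snmp8 passwd=soft1234 pcmk_host_list=sip1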
>>>>>
>>>>>
>>>>> br
>>>>> miha
>>>>>
>>>>> On 8/14/2014 3:05 PM, emmanuel segura wrote:
>>>>>
>>>>>> ncomplete=10, Source=/var/lib/pacemaker/pengine/pe-warn-7.bz2):
>>>>>> Stopped
>>>>>> Jul 03 14:10:51 [2701] sip2       crmd:   notice:
>>>>>> too_many_st_failures:         No devices found in cluster to fence
>>>>>> sip1, giving up
>>>>>>
>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:
>>>>>>     Processed st_query reply from sip2: OK (0)
>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:    error: remote_op_done:
>>>>>>     Operation reboot of sip1 by sip2 for
>>>>>> stonith_admin.cman.28299 at sip2.94474607: No such device
>>>>>>
>>>>>> Jul 03 14:10:54 [2697] sip2 stonith-ng:     info: stonith_command:
>>>>>>     Processed st_notify reply from sip2: OK (0)
>>>>>> Jul 03 14:10:54 [2701] sip2       crmd:   notice:
>>>>>> tengine_stonith_notify:       Peer sip1 was not terminated (reboot) by
>>>>>> sip2 for sip2: No such device
>>>>>> (ref=94474607-8cd2-410d-bbf7-5bc7df614a50) by client
>>>>>> stonith_admin.cman.28299
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
>>>>>>
>>>>>> Sorry for the short answer. Have you tested your cluster fencing? Can
>>>>>> you show your cluster.conf XML?
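>>>>>>
>>>>>> For reference, a sketch of how to review the fencing operations that
>>>>>> stonith-ng has recorded (run on either node; '*' asks about every
>>>>>> target):
>>>>>>
>>>>>> stonith_admin --history sip1   # fencing operations recorded against sip1
>>>>>> stonith_admin --history '*'    # operations recorded against any node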
>>>>>>
>>>>>> 2014-08-14 14:44 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>
>>>>>>> emmanuel,
>>>>>>>
>>>>>>> Thanks. But how can I find out why fencing stopped working?
>>>>>>>
>>>>>>> br
>>>>>>> miha
>>>>>>>
>>>>>>> On 8/14/2014 2:35 PM, emmanuel segura wrote:
>>>>>>>
>>>>>>>> "Node sip2: UNCLEAN (offline)" means the node is unclean because the
>>>>>>>> cluster fencing failed to complete the operation.
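>>>>>>>>
>>>>>>>> For reference, the membership and the fence domain state on the cman
>>>>>>>> side (including whether a node is still waiting to be fenced) can be
>>>>>>>> checked with something like:
>>>>>>>>
>>>>>>>> cman_tool nodes   # cluster membership as cman sees it
>>>>>>>> fence_tool ls     # fence domain state, including the victim count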
>>>>>>>>
>>>>>>>> 2014-08-14 14:13 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Another thing.
>>>>>>>>>
>>>>>>>>> On node sip1, pcs is running:
>>>>>>>>> [root at sip1 ~]# pcs status
>>>>>>>>> Cluster name: sipproxy
>>>>>>>>> Last updated: Thu Aug 14 14:13:37 2014
>>>>>>>>> Last change: Sat Feb  1 20:10:48 2014 via crm_attribute on sip1
>>>>>>>>> Stack: cman
>>>>>>>>> Current DC: sip1 - partition with quorum
>>>>>>>>> Version: 1.1.10-14.el6-368c726
>>>>>>>>> 2 Nodes configured
>>>>>>>>> 10 Resources configured
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Node sip2: UNCLEAN (offline)
>>>>>>>>> Online: [ sip1 ]
>>>>>>>>>
>>>>>>>>> Full list of resources:
>>>>>>>>>
>>>>>>>>>      Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
>>>>>>>>>          Masters: [ sip2 ]
>>>>>>>>>          Slaves: [ sip1 ]
>>>>>>>>>      Resource Group: g_mysql
>>>>>>>>>          p_fs_mysql (ocf::heartbeat:Filesystem):    Started sip2
>>>>>>>>>          p_ip_mysql (ocf::heartbeat:IPaddr2):       Started sip2
>>>>>>>>>          p_mysql    (ocf::heartbeat:mysql): Started sip2
>>>>>>>>>      Clone Set: cl_ping [p_ping]
>>>>>>>>>          Started: [ sip1 sip2 ]
>>>>>>>>>      opensips       (lsb:opensips): Stopped
>>>>>>>>>      fence_sip1     (stonith:fence_bladecenter_snmp):       Started
>>>>>>>>> sip2
>>>>>>>>>      fence_sip2     (stonith:fence_bladecenter_snmp):       Started
>>>>>>>>> sip2
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [root at sip1 ~]#
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 8/14/2014 2:12 PM, Miha wrote:
>>>>>>>>>
>>>>>>>>>> Hi emmanuel,
>>>>>>>>>>
>>>>>>>>>> I think so; what is the best way to check?
>>>>>>>>>>
>>>>>>>>>> Sorry for my noob question. I configured this 6 months ago and
>>>>>>>>>> everything was working fine till now. Now I need to find out what
>>>>>>>>>> really happened before I do something stupid.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> tnx
>>>>>>>>>>
>>>>>>>>>> On 8/14/2014 1:58 PM, emmanuel segura wrote:
>>>>>>>>>>>
>>>>>>>>>>> are you sure your cluster fencing is working?
>>>>>>>>>>>
>>>>>>>>>>> 2014-08-14 13:40 GMT+02:00 Miha <miha at softnet.si>:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I noticed today that I am having a problem with the cluster. The
>>>>>>>>>>>> master server shows as offline, but the virtual IP is still
>>>>>>>>>>>> assigned to it and all services are running properly (in
>>>>>>>>>>>> production).
>>>>>>>>>>>>
>>>>>>>>>>>> When I check, I get this:
>>>>>>>>>>>>
>>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>> [root at sip2 cluster]# /etc/init.d/corosync status
>>>>>>>>>>>> corosync dead but pid file exists
>>>>>>>>>>>> [root at sip2 cluster]# pcs status
>>>>>>>>>>>> Error: cluster is not currently running on this node
>>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>>> [root at sip2 cluster]#
>>>>>>>>>>>> [root at sip2 cluster]# tailf fenced.log
>>>>>>>>>>>> Aug 14 13:34:25 fenced cman_get_cluster error -1 112
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The main question is what to do now. Should I run "pcs cluster
>>>>>>>>>>>> start" and hope for the best, or what?
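>>>>>>>>>>>>
>>>>>>>>>>>> For reference, a sketch of how to see which cluster daemons are
>>>>>>>>>>>> actually up and how the stack is normally started, assuming the
>>>>>>>>>>>> standard RHEL 6 cman/pacemaker init scripts:
>>>>>>>>>>>>
>>>>>>>>>>>> service cman status
>>>>>>>>>>>> service pacemaker status
>>>>>>>>>>>> pcs cluster start   # roughly: start cman, then start pacemaker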
>>>>>>>>>>>>
>>>>>>>>>>>> I have pasted the log on pastebin: http://pastebin.com/SUp2GcmN
>>>>>>>>>>>>
>>>>>>>>>>>> tnx!
>>>>>>>>>>>>
>>>>>>>>>>>> miha
>>>>>>>>>>>>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org



-- 
this is my life and I live it as long as God wills



