[ClusterLabs] pacemaker remote node offline after reboot

Ignazio Cassano ignaziocassano at gmail.com
Fri May 12 07:25:46 EDT 2017


Hello, there are no constraints for node compute-1.
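(For example, any constraints touching compute-1 can be listed with the
command below; exact syntax may vary with the pcs version in use:)

    pcs constraint --full | grep -i compute-1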

The following is the corosync.log on the cluster node:

May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Forwarding cib_delete operation for section
//node_state[@uname='compute-0']//lrm_resource[@id='compute-1'] to all
(origin=local/crmd/2856)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='compute-0']//lrm_resource[@id='compute-1']: OK (rc=0,
origin=tst-controller-01/crmd/2856, version=0.555.6)
May 12 13:14:47 [7286] tst-controller-01       crmd:     info:
delete_resource:Removing resource compute-1 for
328a6a8b-e4f1-4b48-9dc5-e418ba0e2850 (root) on tst-controller-01
May 12 13:14:47 [7286] tst-controller-01       crmd:     info:
notify_deleted:    Notifying 328a6a8b-e4f1-4b48-9dc5-e418ba0e2850 on
tst-controller-01 that compute-1 was deleted
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Forwarding cib_delete operation for section
//node_state[@uname='compute-0']//lrm_resource[@id='compute-1'] to all
(origin=local/crmd/2857)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='compute-0']//lrm_resource[@id='compute-1']: OK (rc=0,
origin=tst-controller-01/crmd/2857, version=0.555.6)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Forwarding cib_delete operation for section
//node_state[@uname='tst-controller-01']//lrm_resource[@id='compute-1'] to
all (origin=local/crmd/2860)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-01']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-01/crmd/2860, version=0.556.0)
May 12 13:14:47 [7286] tst-controller-01       crmd:     info:
delete_resource:Removing resource compute-1 for
328a6a8b-e4f1-4b48-9dc5-e418ba0e2850 (root) on tst-controller-01
May 12 13:14:47 [7286] tst-controller-01       crmd:     info:
notify_deleted:    Notifying 328a6a8b-e4f1-4b48-9dc5-e418ba0e2850 on
tst-controller-01 that compute-1 was deleted
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-03']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-03/crmd/2021, version=0.556.0)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Forwarding cib_delete operation for section
//node_state[@uname='tst-controller-01']//lrm_resource[@id='compute-1'] to
all (origin=local/crmd/2861)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-03']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-03/crmd/2022, version=0.556.0)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-01']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-01/crmd/2861, version=0.556.0)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-02']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-02/crmd/4840, version=0.556.0)
May 12 13:14:47 [7281] tst-controller-01        cib:     info:
cib_process_request:    Completed cib_delete operation for section
//node_state[@uname='tst-controller-02']//lrm_resource[@id='compute-1']: OK
(rc=0, origin=tst-controller-02/crmd/4841, version=0.556.0)
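
For reference, the CIB section that the cib_delete operations above remove can
be queried directly with cibadmin, using the same XPath that appears in the
log lines:

    cibadmin --query --xpath \
        "//node_state[@uname='compute-0']//lrm_resource[@id='compute-1']"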



On 05/12/2017 12:32 PM, Ignazio Cassano wrote:
> Hello, some updates.
> Now I am not able enable compute-1 like yesterday: removing and
> readding it.
> Must If I remove it and add in the /etc/hosts of the cluster nodes an
> alias like compute1 , removing compute-1 and addiing compute1, it goes
> online .
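>
> For reference, the remove/re-add cycle described above looks roughly like
> this (the server address and operation values are illustrative; adjust them
> to the actual setup):
>
>     pcs resource delete compute-1
>     pcs resource create compute1 ocf:pacemaker:remote server=compute1 \
>         reconnect_interval=60 op monitor interval=20
>     pcs resource cleanup compute1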
>
>
> 2017-05-12 12:08 GMT+02:00 Ignazio Cassano <ignaziocassano at gmail.com
> <mailto:ignaziocassano at gmail.com>>:
>
>     Hello, I do not know if this is the correct way to reply on this
>     mailing list.
>     In any case, whether I shut down the remote node or fence it with IPMI,
>     it does not come back online.
>     The pacemaker-remote service is enabled and restarts at reboot.
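>     (A quick check on the remote side that pacemaker_remote is actually
>     running and reachable could look like this; TCP 3121 is the default
>     pacemaker_remote port:)
>
>         systemctl status pacemaker_remote
>         ss -tnlp | grep 3121
>         firewall-cmd --list-ports
>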
>     But I continue to have the following on my cluster:
>
>     Online: [ tst-controller-01 tst-controller-02 tst-controller-03 ]
>     RemoteOnline: [ compute-0 ]
>     RemoteOFFLINE: [ compute-1 ]
>
>     Full list of resources:
>
>      Resource Group: vip
>          vipmanagement    (ocf::heartbeat:IPaddr2):    Started
>     tst-controller-03
>          vipinternalpi    (ocf::heartbeat:IPaddr2):    Started
>     tst-controller-03
>          lb-haproxy    (systemd:haproxy):    Started tst-controller-03
>      Clone Set: httpd-clone [httpd]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-glance-api-clone [openstack-glance-api]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-glance-registry-clone
>     [openstack-glance-registry]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-api-clone [openstack-nova-api]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-consoleauth-clone
>     [openstack-nova-consoleauth]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-scheduler-clone [openstack-nova-scheduler]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-conductor-clone [openstack-nova-conductor]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-novncproxy-clone
>     [openstack-nova-novncproxy]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: neutron-server-clone [neutron-server]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: neutron-openvswitch-agent-clone
>     [neutron-openvswitch-agent]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: neutron-dhcp-agent-clone [neutron-dhcp-agent]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: neutron-metadata-agent-clone [neutron-metadata-agent]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: neutron-l3-agent-clone [neutron-l3-agent]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-cinder-api-clone [openstack-cinder-api]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-cinder-scheduler-clone
>     [openstack-cinder-scheduler]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      openstack-cinder-volume    (systemd:openstack-cinder-volume):
>     Started tst-controller-02
>      Clone Set: openstack-cinder-backup-clone [openstack-cinder-backup]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: cinder-perm-check-clone [cinder-perm-check]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: lbaasv2-clone [lbaasv2]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      lbaas-check    (systemd:lbaas-check):    Started tst-controller-01
>      Clone Set: clean-haproxy-clone [clean-haproxy]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-heat-api-clone [openstack-heat-api]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-heat-api-cfn-clone [openstack-heat-api-cfn]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-heat-engine-clone [openstack-heat-engine]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-ceilometer-notification-clone
>     [openstack-ceilometer-notification]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-ceilometer-central-clone
>     [openstack-ceilometer-central]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-ceilometer-collector-clone
>     [openstack-ceilometer-collector]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-aodh-evaluator-clone [openstack-aodh-evaluator]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-aodh-notifier-clone [openstack-aodh-notifier]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-aodh-listener-clone [openstack-aodh-listener]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-swift-proxy-clone [openstack-swift-proxy]
>          Started: [ tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>          Stopped: [ compute-0 compute-1 ]
>      Clone Set: openstack-nova-compute-clone [openstack-nova-compute]
>          Started: [ compute-0 ]
>          Stopped: [ compute-1 tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>      Clone Set: openstack-ceilometer-compute-clone
>     [openstack-ceilometer-compute]
>          Started: [ compute-0 ]
>          Stopped: [ compute-1 tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>      Clone Set: neutron-openvswitch-agent-compute-clone
>     [neutron-openvswitch-agent-compute]
>          Started: [ compute-0 ]
>          Stopped: [ compute-1 tst-controller-01 tst-controller-02
>     tst-controller-03 ]
>      compute-0    (ocf::pacemaker:remote):    Started tst-controller-01
>      evacuate    (systemd:evacuate):    Started tst-controller-02
>      clean-resources    (systemd:clean_resources):    Started
>     tst-controller-02
>      compute-1    (ocf::pacemaker:remote):    Stopped
>

What we definitely see here is that the remote connection resource is stopped
(the part that communicates with pacemaker_remoted on the remote node).
Do you see any signs in the logs as to why it is stopped? Is it the
target-role? Is there a constraint forbidding it to start? Does it fail to
start?
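
For example, the target-role and any recorded start failures can be checked
with commands like these (syntax may differ slightly between pcs/Pacemaker
versions, and the log path may vary by distribution):

    crm_resource --resource compute-1 --meta --get-parameter target-role
    pcs resource failcount show compute-1
    grep -i 'compute-1' /var/log/cluster/corosync.log | grep -iE 'error|fail'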

Regards,
Klaus