[ClusterLabs] All IP resources deleted once a fenced node rejoins

Ken Gaillot kgaillot at redhat.com
Fri Jan 15 12:08:31 EST 2016


On 01/15/2016 05:02 AM, Arjun Pandey wrote:
> Based on the corosync logs from orana (the node that did the actual
> fencing and is the current master node):
> 
> I also tried looking at the pengine outputs using crm_simulate. Up until
> the fenced node rejoins, things look good.
> 
> [root@ucc1 orana]# crm_simulate -S --xml-file
> ./pengine/pe-input-1450.bz2  -u kamet
> Current cluster status:
> Node kamet: pending
> Online: [ orana ]

Above, "pending" means that the node has started to join the cluster,
but has not yet fully joined.


> Jan 13 19:32:53 [4295] orana    pengine:     info: probe_resources:
> Action probe_complete-kamet on kamet is unrunnable (pending)

Any action on kamet is unrunnable until it finishes joining the cluster.


> Jan 13 19:32:59 [4292] orana stonith-ng:     info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

The pacemaker daemons on orana each report when they see kamet come up
at the corosync level. Here, stonith-ng sees it.


> Jan 13 19:32:59 [4291] orana        cib:     info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

Now, the cib sees it.


> Jan 13 19:33:00 [4296] orana       crmd:     info:
> crm_update_peer_proc: pcmk_cpg_membership: Node kamet[2] -
> corosync-cpg is now online

Now, crmd sees it.
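
To line up the three sightings above, something like the following could
pull them out of a log file. This is only a rough sketch: the function name
peer_sightings and the regular expression are my own invention, not part of
pacemaker, and it assumes each log entry sits on a single line (the quoted
lines in this mail are wrapped).

```python
import re

# Matches an unwrapped log entry of the form
#   "<date> [pid] <host> <daemon>: <level>: crm_update_peer_proc: ...
#    Node <name>[<id>] - corosync-cpg is now online"
PEER_ONLINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\[\d+\]\s+(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+):\s+\w+:\s+crm_update_peer_proc:.*"
    r"Node (?P<node>\S+)\[\d+\] - corosync-cpg is now online"
)


def peer_sightings(lines, node):
    """Return (timestamp, daemon) pairs, in log order, for every daemon
    that reported `node` as online at the corosync level."""
    sightings = []
    for line in lines:
        match = PEER_ONLINE.search(line)
        if match and match.group("node") == node:
            sightings.append((match.group("stamp"), match.group("daemon")))
    return sightings
```

Fed the three entries quoted above (unwrapped), it should list stonith-ng,
cib and crmd in that order.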


>>>> [Arjun] Why does pengine declare that the following monitor actions are now unrunnable?
> 
> Jan 13 19:33:00 [4295] orana    pengine:  warning: custom_action:
> Action foo:0_monitor_0 on kamet is unrunnable (pending)

At this point, pengine still hasn't seen kamet join yet, so actions on
it are still unrunnable.


> Jan 13 19:33:00 [4296] orana       crmd:     info: join_make_offer:
> join-2: Sending offer to kamet

Having seen kamet at the corosync level, crmd now offers cluster-level
membership to kamet.


> Jan 13 19:33:00 [4291] orana        cib:     info:
> cib_process_replace: Replacement 0.4.0 from kamet not applied to
> 0.74.1: current epoch is greater than the replacement
> Jan 13 19:33:00 [4291] orana        cib:  warning:
> cib_process_request: Completed cib_replace operation for section
> 'all': Update was older than existing configuration (rc=-205,
> origin=kamet/cibadmin/2, version=0.74.1)
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op:
> Diff: --- 0.74.1 2
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op:
> Diff: +++ 0.75.0 (null)
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/nodes/node[@id='kamet']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/nodes/node[@id='orana']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='fence-uc-orana']/meta_attributes[@id='fence-uc-orana-meta_attributes']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='fence-uc-kamet']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-3']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-FLT']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='C-FLT2']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='E-3']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='MGMT-FLT']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='M-FLT']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='M-FLT2']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='S-FLT']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/resources/primitive[@id='S-FLT2']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-3-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-3-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-C-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-C-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-E-3-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-E-3-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-MGMT-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-MGMT-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-M-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-M-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-M-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-M-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-S-FLT-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-S-FLT-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-S-FLT2-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-S-FLT2-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-fence-uc-orana-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_colocation[@id='colocation-fence-uc-kamet-foo-master-INFINITY']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-fence-uc-kamet-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: --
> /cib/configuration/constraints/rsc_order[@id='order-fence-uc-orana-foo-master-mandatory']
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: +
> /cib:  @epoch=75, @num_updates=0
> Jan 13 19:33:00 [4291] orana        cib:     info: cib_perform_op: +
> /cib/configuration/resources/primitive[@id='fence-uc-orana']/instance_attributes[@id='fence-uc-orana-instance_attributes']/nvpair[@id='fence-uc-orana-instance_attributes-delay']:
>  @value=0
> Jan 13 19:33:00 [4291] orana        cib:     info:
> cib_process_request: Completed cib_replace operation for section
> configuration: OK (rc=0, origin=kamet/cibadmin/2, version=0.75.0)

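The above is the problem. You can see all the resources being deleted
from the CIB ("--" marks entries being removed, and "+" marks entries
being added or changed). For some reason, the cluster used a much older
CIB on kamet to replace the current one used by the cluster. Note that
the first replace attempt from kamet, at version 0.4.0, was rejected as
older than 0.74.1, but the cib_replace of the configuration section that
followed (also origin=kamet/cibadmin/2) completed with rc=0.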
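
If you want to see at a glance how much of the configuration a replace
like this removed, the cib_perform_op lines can be tallied with a few
lines of Python. This is a sketch under assumptions: summarize_cib_diff is
a made-up helper name, it expects unwrapped log lines, and it only knows
the "--" and "+" markers that appear in the log above.

```python
import re

# "--" marks a removed element; one or more "+" marks an added or changed one.
# The "Diff: --- ..." and "Diff: +++ ..." header lines do not match and are skipped.
CIB_OP = re.compile(r"cib_perform_op:\s+(?P<op>--|\++)\s+(?P<path>/cib\S*)")


def summarize_cib_diff(lines):
    """Group the XPaths touched by a logged CIB diff into removed
    and added-or-changed lists."""
    changes = {"removed": [], "added_or_changed": []}
    for line in lines:
        match = CIB_OP.search(line)
        if not match:
            continue
        key = "removed" if match.group("op") == "--" else "added_or_changed"
        # Changed entries are logged as "<xpath>:  @attr=value"; drop the colon.
        changes[key].append(match.group("path").rstrip(":"))
    return changes
```

A long "removed" list covering every primitive and constraint, as here, is
the signature of the whole configuration section being replaced.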

I'm not sure why this happened; it may be a bug.
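
For reference, the "Replacement 0.4.0 from kamet not applied to 0.74.1"
message comes from comparing CIB versions, which have the form
admin_epoch.epoch.num_updates and are compared field by field in that
order. A minimal illustration of that ordering follows; the function names
are hypothetical and this is not pacemaker's own code.

```python
def parse_cib_version(text):
    """Split 'admin_epoch.epoch.num_updates' into a tuple of integers."""
    admin_epoch, epoch, num_updates = (int(part) for part in text.split("."))
    return (admin_epoch, epoch, num_updates)


def is_older(candidate, current):
    """True if `candidate` is an older CIB version than `current`.

    Tuples compare element by element, so admin_epoch decides first,
    then epoch, then num_updates.
    """
    return parse_cib_version(candidate) < parse_cib_version(current)
```

With the versions from the log, is_older("0.4.0", "0.74.1") is true, which
is why that first replacement was refused. Comparing the fields as
integers matters: as plain strings, "0.9.0" would sort after "0.74.1".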

What version of pacemaker are you using?

Check the permissions on /var/lib/pacemaker/cib and the files in it on
both nodes. I'd expect everything to be owned and writeable by the
hacluster user.
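
If it helps, a check along these lines could be run on each node. Treat it
as a sketch: ownership_problems is a name I made up, it only looks at the
directory and the entries directly inside it, and it assumes a POSIX
system where the pwd module is available.

```python
import os
import pwd
import stat


def ownership_problems(directory, user="hacluster"):
    """Return (path, reason) pairs for the directory and each entry directly
    inside it that is not owned by `user` or not writable by its owner."""
    problems = []
    paths = [directory] + [
        os.path.join(directory, name) for name in sorted(os.listdir(directory))
    ]
    for path in paths:
        info = os.stat(path)
        try:
            owner = pwd.getpwuid(info.st_uid).pw_name
        except KeyError:
            owner = str(info.st_uid)  # uid with no passwd entry
        if owner != user:
            problems.append((path, "owned by " + owner))
        elif not info.st_mode & stat.S_IWUSR:
            problems.append((path, "not writable by owner"))
    return problems
```

Calling ownership_problems("/var/lib/pacemaker/cib") on both nodes should
return an empty list if the ownership and permissions are as expected.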

>>>>>> [Arjun] What do the following logs signify?
> Jan 13 19:33:00 [4292] orana stonith-ng:     info:
> stonith_device_remove: Device 'C-3' not found (2 active devices)

These are not important in themselves, but are follow-up effects from
the resources being removed from the CIB above. Whenever the CIB
changes, stonith-ng will re-check what fencing devices are available.






