[ClusterLabs] PCMK_node_start_state=standby sometimes does not work

井上 和徳 inouekazu at intellilink.co.jp
Tue Dec 5 08:56:08 UTC 2017


Hi Ken,

Thank you for your comment. ("cibadmin --empty" is interesting.)
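
For anyone who wants to script the workaround from Ken's reply quoted below, here is a minimal sketch. It is only an illustration: the function name seed_empty_cib is made up, the default path is the one used in this thread, and it assumes cibadmin is in PATH and that this runs as root before pacemaker is started. It writes the skeleton only when the CIB directory is empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pacemaker): seed a skeleton CIB before
# the first start, as suggested in this thread.
# $1: CIB directory (assumed default: /var/lib/pacemaker/cib)
seed_empty_cib() {
    cib_dir="${1:-/var/lib/pacemaker/cib}"
    cib_file="$cib_dir/cib.xml"

    # Leave an existing CIB alone: only act when the directory is empty.
    if [ -n "$(ls -A "$cib_dir" 2>/dev/null)" ]; then
        echo "CIB directory is not empty; leaving it alone" >&2
        return 0
    fi

    # Write to a temporary file first so that a failed cibadmin call
    # does not leave a partial cib.xml behind.
    tmp="$cib_file.tmp.$$"
    if cibadmin --empty > "$tmp"; then
        mv "$tmp" "$cib_file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

The idea would be to call seed_empty_cib on each node before "systemctl start pacemaker". Whether the owner and permissions of the new file need adjusting for the cluster user is not covered here and should be checked.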

I have filed this in CLBZ:
https://bugs.clusterlabs.org/show_bug.cgi?id=5331

Best Regards

> -----Original Message-----
> From: Ken Gaillot [mailto:kgaillot at redhat.com]
> Sent: Saturday, December 02, 2017 8:02 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] PCMK_node_start_state=standby sometimes does not work
> 
> On Tue, 2017-11-28 at 09:36 +0000, 井上 和徳 wrote:
> > Hi,
> >
> > Sometimes a node with 'PCMK_node_start_state=standby' will start up
> > Online.
> >
> > [ reproduction scenario ]
> >  * Set 'PCMK_node_start_state=standby' in /etc/sysconfig/pacemaker.
> >  * Delete the CIB (/var/lib/pacemaker/cib/*).
> >  * Start pacemaker at the same time on 2 nodes.
> >   # for i in rhel74-1 rhel74-3 ; do ssh -f $i systemctl start pacemaker ; done
> >
> > [ actual result ]
> >  * crm_mon
> >   Stack: corosync
> >   Current DC: rhel74-3 (version 1.1.18-2b07d5c) - partition with quorum
> >   Last change: Wed Nov 22 06:22:50 2017 by hacluster via crmd on rhel74-3
> >
> >   2 nodes configured
> >   0 resources configured
> >
> >   Node rhel74-3: standby
> >   Online: [ rhel74-1 ]
> >
> >  * cib.xml
> >   <nodes>
> >     <node id="3232261507" uname="rhel74-1"/>
> >     <node id="3232261509" uname="rhel74-3">
> >       <instance_attributes id="nodes-3232261509">
> >         <nvpair id="nodes-3232261509-standby" name="standby" value="on"/>
> >       </instance_attributes>
> >     </node>
> >   </nodes>
> >
> >  * pacemaker.log
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: (cib_native.c:462 ) warning: cib_native_perform_op_delegate:	Call failed: No such device or address
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: ( cib_attrs.c:320 )    info: update_attr_delegate:	Update   <node id="3232261507">
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: ( cib_attrs.c:320 )    info: update_attr_delegate:	Update     <instance_attributes id="nodes-3232261507">
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: ( cib_attrs.c:320 )    info: update_attr_delegate:	Update       <nvpair id="nodes-3232261507-standby" name="standby" value="on"/>
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: ( cib_attrs.c:320 )    info: update_attr_delegate:	Update     </instance_attributes>
> >   Nov 22 06:22:50 [20755] rhel74-1   crmd: ( cib_attrs.c:320 )    info: update_attr_delegate:	Update   </node>
> >
> >  * I uploaded the crm_report to GitHub (it is too big to attach to this
> > email), so please take a look.
> >    https://github.com/inouekazu/pcmk_report/blob/master/pcmk-Wed-22-Nov-2017.tar.bz2
> >
> >
> > I think the cause is the timing at which <node id="3232261507"> (*1) and
> > <instance_attributes id="nodes-3232261507"> (*2) are added.
> > *1 <node id="3232261507" uname="rhel74-1"/>
> > *2 <instance_attributes id="nodes-3232261507">
> >      <nvpair id="nodes-3232261507-standby" name="standby" value="on"/>
> >    </instance_attributes>
> >
> > I hope this can be fixed, but if that is difficult, I have two questions.
> > 1) Does this only occur if there is no cib.xml (in other words, there
> > is no <node> element)?
> 
> I believe so. I think this is the key message:
> 
> Nov 22 06:22:50 [20750] rhel74-1        cib: ( callbacks.c:1101  ) warning: cib_process_request:        Completed cib_modify operation for section nodes: No such device or address (rc=-6, origin=rhel74-1/crmd/12, version=0.3.0)
> 
> PCMK_node_start_state works by setting the "standby" node attribute in
> the CIB. However, it does this via a "modify" command that assumes the
> <nodes> tag already exists.
> 
> If there is no CIB, pacemaker will quickly create one -- but in this
> case, the node tries to set the attribute before that's happened.
> 
> Hopefully we can come up with a fix. If you want, you can file a bug
> report at bugs.clusterlabs.org, to track the progress.
> 
> > 2) Is there any workaround other than "Do not start at the same
> > time"?
> >
> > Best Regards
> 
> Before starting pacemaker, if /var/lib/pacemaker/cib is empty, you can
> create a skeleton CIB with:
> 
>  cibadmin --empty > /var/lib/pacemaker/cib/cib.xml
> 
> That will include an empty <nodes/> tag, and the modify command should
> work when pacemaker starts.
> --
> Ken Gaillot <kgaillot at redhat.com>

