[Pacemaker] strange monitor behaviour

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Jun 8 06:38:24 EDT 2011


On Wed, Jun 08, 2011 at 12:22:44PM +0400, ruslan usifov wrote:
> 2011/6/8 Dejan Muhamedagic <dejanmm at fastmail.fm>
> 
> > On Tue, Jun 07, 2011 at 11:19:25AM -0600, Serge Dubrouski wrote:
> > > On Tue, Jun 7, 2011 at 9:55 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > >
> > > > On Tue, Jun 07, 2011 at 09:47:17AM -0600, Serge Dubrouski wrote:
> > > > > On Tue, Jun 7, 2011 at 9:39 AM, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Tue, Jun 07, 2011 at 08:45:07AM -0600, Serge Dubrouski wrote:
> > > > > > > > No, the RA acts like it should. It can't find the necessary
> > > > > > > > software and returns OCF_NOT_CONFIGURED; all RAs act this way.
> > > > > > > > You have to install all software used in your cluster on all
> > > > > > > > nodes, even if you are not actually planning to run that
> > > > > > > > software on some of them.
> > > > > >
> > > > > > > This is not so. A resource agent should be able to figure out if
> > > > > > there's no software installed and then return NOT_INSTALLED or
> > > > > > NOT_RUNNING.
> > > > > >
> > > > >
> > > > > > That changes the exit code but doesn't change the requirement to
> > > > > > have that software installed and able to report that it's down.
> > > >
> > > > If it's not installed, then it cannot run. The difference between
> > > > NOT_INSTALLED and NOT_CONFIGURED is that in the former case
> > > > pacemaker won't try to start the resource on that node whereas in
> > > > the latter it will give up on the resource completely.
> > > >
> > >
> > > Thanks for the clarification. That better explains Ruslan's case. DRBD
> > > isn't installed, so it returns NOT_INSTALLED, and Pacemaker treats that
> > > as definitely DOWN when it runs status/monitor operations. iSCSI, in its
> > > turn, is installed but not configured, and that's treated as state
> > > UNKNOWN. Am I correct?
> >
> > No idea, he didn't provide enough information (logs) :)
> >
> > Thanks,
> >
> > Dejan
> >
> > > >
> > > > Cheers,
> > > >
> > > > Dejan
> > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Dejan
> > > > > >
> > > > > > > > On Tue, Jun 7, 2011 at 8:02 AM, ruslan usifov <ruslan.usifov at gmail.com> wrote:
> > > > > > >
> > > > > > > > > Thanks for the reply, I already understand this point. Now I
> > > > > > > > > think that this is a problem with the ocf:heartbeat:iSCSITarget
> > > > > > > > > RA, which returns an improper return code when no iSCSI target
> > > > > > > > > software is installed.
> > > > > > > >
> > > > > > > > 2011/6/7 Serge Dubrouski <sergeyfd at gmail.com>
> > > > > > > >
> > > > > > > > >> This question pops up over and over again. Pacemaker has to
> > > > > > > > >> make sure that your resources aren't up anywhere in the
> > > > > > > > >> cluster before starting them on the designated nodes. That
> > > > > > > > >> means that it has to be able to run status/monitor operations
> > > > > > > > >> for all configured resources on all configured nodes. You
> > > > > > > > >> can't just add a 3rd quorum node to the cluster; you have to
> > > > > > > > >> make sure that all RAs that you use can run properly on that
> > > > > > > > >> 3rd node.
> > > > > > > >>
> > > > > > > > >> On Tue, Jun 7, 2011 at 1:58 AM, ruslan usifov <ruslan.usifov at gmail.com> wrote:
> > > > > > > >>
> > > > > > > >>> Hello
> > > > > > > >>>
> > > > > > > > >>> I have a 3-node cluster (in the future we will add one more
> > > > > > > > >>> node) with the following configuration:
> > > > > > > >>>
> > > > > > > >>> crm(live)configure# show
> > > > > > > >>> node drbd1
> > > > > > > >>> node drbd2
> > > > > > > >>> node drbd3
> > > > > > > >>> primitive drbd_web ocf:linbit:drbd \
> > > > > > > >>>         params drbd_resource="web" \
> > > > > > > >>>         op monitor interval="10s" timeout="60s"
> > > > > > > >>> primitive drbd_web-U ocf:linbit:drbd \
> > > > > > > >>>         params drbd_resource="web-U" \
> > > > > > > >>>         op monitor interval="10s" timeout="60s"
> > > > > > > >>> primitive iscsi_ip_web ocf:heartbeat:IPaddr2 \
> > > > > > > >>>         params ip="192.168.19.91" nic="eth1:1"
> > cidr_netmask="24"
> > > > > > > >>> primitive iscsi_web_target ocf:heartbeat:iSCSITarget \
> > > > > > > >>>         params iqn="iqn.2010-06.playrix.local:san.web" \
> > > > > > > >>>         op monitor interval="10s" timeout="30s"
> > > > > > > >>> primitive iscsi_web_target_lun0
> > ocf:heartbeat:iSCSILogicalUnit \
> > > > > > > >>>         params lun="0" path="/dev/drbd10"
> > > > > > > >>> target_iqn="iqn.2010-06.playrix.local:san.web"
> > > > > > > >>> group iscsi_web iscsi_ip_web iscsi_web_target
> > > > iscsi_web_target_lun0
> > > > > > > >>> ms ms_drbd_web drbd_web \
> > > > > > > >>>         meta master-max="1" master-node-max="1" clone-max="2"
> > > > > > > >>> clone-node-max="1" notify="true" globally-unique="false"
> > > > > > > >>> target-role="Started" is-managed="true"
> > > > > > > >>> ms ms_drbd_web-U drbd_web-U \
> > > > > > > >>>         meta master-max="1" master-node-max="1" clone-max="1"
> > > > > > > >>> clone-node-max="1" notify="true" is-managed="true"
> > > > > > globally-unique="false"
> > > > > > > >>> location ms_drbd_web-U_on_drbd1_or_drbd2 ms_drbd_web-U \
> > > > > > > >>>         rule $id="ms_drbd_web-U_on_drbd1_or_drbd2-rule" -inf:
> > > > #uname
> > > > > > ne
> > > > > > > >>> drbd1 and #uname ne drbd2
> > > > > > > >>> location ms_drbd_web_on_drbd1_or_drbd2 ms_drbd_web \
> > > > > > > >>>         rule $id="ms_drbd_web_on_drbd1_or_drbd2-rule" -inf:
> > > > #uname ne
> > > > > > > >>> drbd1 and #uname ne drbd2
> > > > > > > >>> colocation drbd_web-U_on_drbd_web inf: ms_drbd_web-U:Master
> > > > > > > >>> ms_drbd_web:Master
> > > > > > > >>> colocation iscsi_ip_web_on_drbd_web inf: iscsi_ip_web
> > > > > > ms_drbd_web:Master
> > > > > > > >>> colocation iscsi_web_on_drbd_web-U inf: iscsi_web
> > > > > > ms_drbd_web-U:Master
> > > > > > > >>> order iscsi_web_after_ms_drbd_web-U inf: ms_drbd_web-U:start
> > > > > > iscsi_web
> > > > > > > >>> order ms_drbd_web-U_after_iscsi_ip_web inf:
> > iscsi_ip_web:start
> > > > > > > >>> ms_drbd_web-U:start
> > > > > > > >>> order ms_drbd_web-U_before_ms_drbd_web inf:
> > ms_drbd_web:promote
> > > > > > > >>> iscsi_ip_web:start
> > > > > > > >>> property $id="cib-bootstrap-options" \
> > > > > > > >>>
> > > > dc-version="1.0.11-db98485d06ed3fe0fe236509f023e1bd4a5566f1"
> > > > > > \
> > > > > > > >>>         cluster-infrastructure="openais" \
> > > > > > > >>>         expected-quorum-votes="3" \
> > > > > > > >>>         stonith-enabled="false" \
> > > > > > > >>>         last-lrm-refresh="1307432239" \
> > > > > > > >>>         symmetric-cluster="true"
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > > >>> In this configuration I want all resources to run only on the
> > > > > > > > >>> drbd1 and drbd2 nodes. As I understand it, the location
> > > > > > > > >>> constraints should achieve this, and all resources must run
> > > > > > > > >>> on the drbd1 and drbd2 nodes. But I got the following error:
> > > > > > > >>>
> > > > > > > >>> Failed actions:
> > > > > > > >>>     iscsi_web_target_monitor_0 (node=drbd3, call=5, rc=6,
> > > > > > > >>> status=complete): not configured
> > > > > > > >>>     iscsi_web_target_lun0_monitor_0 (node=drbd3, call=6,
> > rc=6,
> > > > > > > >>> status=complete): not configured
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > > >>> I am confused: why drbd3? Nothing should run or be monitored
> > > > > > > > >>> there. Please explain this behavior, if possible.
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>> _______________________________________________
> > > > > > > >>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > > > > > >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > > > > > >>>
> > > > > > > >>> Project Home: http://www.clusterlabs.org
> > > > > > > > >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > > > > > >>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> > > > > > > >>>
> > > > > > > >>>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> Serge Dubrouski.
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Serge Dubrouski.
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Serge Dubrouski.
> > > >
> > >
> > >
> > >
> > > --
> > > Serge Dubrouski.
> >
> 
> 
> Here is what was in my logs on drbd3 (where iscsitarget isn't installed):
> 
> 
> Jun 07 11:33:10 drbd3 lrmd: [2391]: info: RA output: (iscsi_web_target:probe:stderr) 2011/06/07_11:33:10 ERROR: Unsupported iSCSI target implementation ""!
> Jun 07 11:33:11 drbd3 lrmd: [2391]: info: RA output: (iscsi_web_target_lun0:probe:stderr) 2011/06/07_11:33:11 ERROR: Missing resource parameter "implementation"!

What happens is that the agent looks for one of the three iSCSI
target implementations and finds none, and none is configured
either. It's a borderline case, but arguably the agent could
return OCF_ERR_INSTALLED here. The worst that can happen is that
all nodes return the same exit code. Florian?
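
To make the idea concrete, here is a rough sketch of such a check. It
is not the actual iSCSITarget code: the function name
check_implementation, its two arguments, and the implementation names
iet, tgt and lio are assumptions for illustration only. The exit code
values are the standard OCF ones.

```shell
#!/bin/sh
# Standard OCF exit codes.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5
OCF_ERR_CONFIGURED=6

# Hypothetical helper, not taken from the real agent.
#   $1: value of the "implementation" parameter (may be empty)
#   $2: space-separated list of implementations found on this node
#       (may be empty)
check_implementation() {
    configured=$1
    installed=$2

    # Nothing installed and nothing configured: this node simply cannot
    # run the resource, so report "not installed" instead of
    # "not configured".
    if [ -z "$installed" ] && [ -z "$configured" ]; then
        return $OCF_ERR_INSTALLED
    fi

    # Fall back to the first detected implementation if none is set.
    impl=${configured:-${installed%% *}}

    # An implementation name the agent does not know is a real
    # configuration error.
    case "$impl" in
        iet|tgt|lio) ;;
        *) return $OCF_ERR_CONFIGURED ;;
    esac

    # A known implementation that is missing on this node is again
    # "not installed".
    case " $installed " in
        *" $impl "*) return $OCF_SUCCESS ;;
        *)           return $OCF_ERR_INSTALLED ;;
    esac
}
```

With something like this, a probe on a node without any target
software would end with rc=5, so pacemaker would only avoid that node
rather than give up on the resource everywhere.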

Thanks,

Dejan



