<div dir="ltr">Sorry for the delayed response, but I was out last week. I've applied this patch to 1.1.10-rc5 and have been testing:<div><br></div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div>
<div><font face="courier new, monospace"># crm_attribute --type status --node "db02" --name "service_postgresql" --update "true"</font></div></div><div><div><font face="courier new, monospace"># crm_attribute --type status --node "db02" --name "service_postgresql"</font></div>
</div><div><div><font face="courier new, monospace">scope=status name=service_postgresql value=true</font></div></div><div><div><font face="courier new, monospace"># crm resource stop vm-db02</font></div></div><div><div>
<font face="courier new, monospace"># crm resource start vm-db02</font></div></div><div><div><font face="courier new, monospace">### Wait a bit</font></div></div><div><div><font face="courier new, monospace"># crm_attribute --type status --node "db02" --name "service_postgresql" </font></div>
</div><div><div><font face="courier new, monospace">scope=status name=service_postgresql value=(null)</font></div></div><div><div><font face="courier new, monospace">Error performing operation: No such device or address</font></div>
</div><div><div><font face="courier new, monospace"># crm_attribute --type status --node "db02" --name "service_postgresql" --update "true"</font></div></div><div><div><font face="courier new, monospace"># crm_attribute --type status --node "db02" --name "service_postgresql" </font></div>
</div><div><div><font face="courier new, monospace">scope=status name=service_postgresql value=true</font></div></div></blockquote><div><br></div><div>Good so far. But now look at this (every node was clean and all services were running before we started):</div>
<blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><div><font face="courier new, monospace"># crm status</font></div></div><div><div><font face="courier new, monospace">Last updated: Tue Jul 2 16:15:14 2013</font></div>
</div><div><div><font face="courier new, monospace">Last change: Tue Jul 2 16:15:12 2013 via crmd on cvmh02</font></div></div><div><div><font face="courier new, monospace">Stack: cman</font></div></div><div><div><font face="courier new, monospace">Current DC: cvmh02 - partition with quorum</font></div>
</div><div><div><font face="courier new, monospace">Version: 1.1.10rc5-1.el6.ccni-2718638</font></div></div><div><div><font face="courier new, monospace">9 Nodes configured, unknown expected votes</font></div></div><div><div>
<font face="courier new, monospace">59 Resources configured.</font></div></div><div><div><font face="courier new, monospace"><br></font></div></div><div><div><font face="courier new, monospace"><br></font></div></div><div>
<div><font face="courier new, monospace">Node db02: UNCLEAN (offline)</font></div></div><div><div><font face="courier new, monospace">Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01 ldap02:vm-ldap02 ]</font></div>
</div><div><div><font face="courier new, monospace">OFFLINE: [ swbuildsl6:vm-swbuildsl6 ]</font></div></div><div><div><font face="courier new, monospace"><br></font></div></div><div><div><font face="courier new, monospace">Full list of resources:</font></div>
</div><div><div><font face="courier new, monospace"><br></font></div></div><div><div><font face="courier new, monospace"> fence-cvmh01 (stonith:fence_ipmilan): Started cvmh04 </font></div></div><div><div><font face="courier new, monospace"> fence-cvmh02 (stonith:fence_ipmilan): Started cvmh04 </font></div>
</div><div><div><font face="courier new, monospace"> fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04 </font></div></div><div><div><font face="courier new, monospace"> fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01 </font></div>
</div><div><div><font face="courier new, monospace"> Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]</font></div></div><div><div><font face="courier new, monospace"> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</font></div>
</div><div><div><font face="courier new, monospace"> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</font></div></div><div><div><font face="courier new, monospace"> Clone Set: c-p-libvirtd [p-libvirtd]</font></div></div><div>
<div><font face="courier new, monospace"> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</font></div></div><div><div><font face="courier new, monospace"> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</font></div></div><div>
<div><font face="courier new, monospace"> Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]</font></div></div><div><div><font face="courier new, monospace"> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</font></div>
</div><div><div><font face="courier new, monospace"> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</font></div></div><div><div><font face="courier new, monospace"> Clone Set: c-watch-ib0 [p-watch-ib0]</font></div></div><div>
<div><font face="courier new, monospace"> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</font></div></div><div><div><font face="courier new, monospace"> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</font></div></div><div>
<div><font face="courier new, monospace"> Clone Set: c-fs-gpfs [p-fs-gpfs]</font></div></div><div><div><font face="courier new, monospace"> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</font></div></div><div><div><font face="courier new, monospace"> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</font></div>
</div><div><div><font face="courier new, monospace"> vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh03 </font></div></div><div><div><font face="courier new, monospace"> vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Stopped </font></div>
</div><div><div><font face="courier new, monospace"> vm-db02 (ocf::ccni:xcatVirtualDomain): Started cvmh02 </font></div></div><div><div><font face="courier new, monospace"> vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03 </font></div>
</div><div><div><font face="courier new, monospace"> vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04 </font></div></div><div><div><font face="courier new, monospace"> DummyOnVM (ocf::pacemaker:Dummy): Started cvmh01 </font></div>
</div></blockquote><div><br></div><div>Not so good, and I'm not sure how to clean this up. I can't seem to stop vm-db02 any more, even after I've entered:</div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div><div><font face="courier new, monospace"># crm_node -R db02 --force</font></div></div><div><font face="courier new, monospace"># crm resource start vm-db02<br></font></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px">
<div><font face="courier new, monospace">### Wait a bit</font></div></blockquote><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><font face="courier new, monospace"># crm status</font></div><div><font face="courier new, monospace"><div>
Last updated: Tue Jul 2 16:32:38 2013</div><div>Last change: Tue Jul 2 16:27:28 2013 via cibadmin on cvmh01</div><div>Stack: cman</div><div>Current DC: cvmh02 - partition with quorum</div><div>Version: 1.1.10rc5-1.el6.ccni-2718638</div>
<div>8 Nodes configured, unknown expected votes</div><div>54 Resources configured.</div><div><br></div><div><br></div><div>Online: [ cvmh01 cvmh02 cvmh03 cvmh04 ldap01:vm-ldap01 ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]</div>
<div>OFFLINE: [ db02:vm-db02 ]</div><div><br></div><div> fence-cvmh01 (stonith:fence_ipmilan): Started cvmh03 </div><div> fence-cvmh02 (stonith:fence_ipmilan): Started cvmh03 </div><div> fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04 </div>
<div> fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01 </div><div> Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div>
<div> Clone Set: c-p-libvirtd [p-libvirtd]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]</div>
<div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-watch-ib0 [p-watch-ib0]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div>
Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-fs-gpfs [p-fs-gpfs]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh02 </div>
<div> vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Started cvmh01 </div></div><div><div> vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03 </div><div> vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04 </div>
<div> DummyOnVM (ocf::pacemaker:Dummy): Started cvmh01</div></div><div><br></div></font></div></blockquote><div>My only recourse has been to reboot the cluster. So let's do that and try setting a location constraint on DummyOnVM, to force it on db02...</div>
<div><br></div><div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><div><font face="courier new, monospace"># crm status</font></div></div><div><div><font face="courier new, monospace">Last updated: Tue Jul 2 16:43:46 2013</font></div>
<div><font face="courier new, monospace">Last change: Tue Jul 2 16:27:28 2013 via cibadmin on cvmh01</font></div><div><font face="courier new, monospace">Stack: cman</font></div><div><font face="courier new, monospace">Current DC: cvmh02 - partition with quorum</font></div>
<div><font face="courier new, monospace">Version: 1.1.10rc5-1.el6.ccni-2718638</font></div><div><font face="courier new, monospace">8 Nodes configured, unknown expected votes</font></div><div><font face="courier new, monospace">54 Resources configured.</font></div>
<div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace">Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01 ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]</font></div>
</div><div><font face="courier new, monospace"><div><br></div><div> fence-cvmh01 (stonith:fence_ipmilan): Started cvmh04 </div><div> fence-cvmh02 (stonith:fence_ipmilan): Started cvmh03 </div><div> fence-cvmh03 (stonith:fence_ipmilan): Started cvmh04 </div>
<div> fence-cvmh04 (stonith:fence_ipmilan): Started cvmh01 </div><div> Clone Set: c-fs-libvirt-VM-xcm [fs-libvirt-VM-xcm]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div>
<div> Clone Set: c-p-libvirtd [p-libvirtd]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-fs-bind-libvirt-VM-cvmh [fs-bind-libvirt-VM-cvmh]</div>
<div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-watch-ib0 [p-watch-ib0]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div>
Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> Clone Set: c-fs-gpfs [p-fs-gpfs]</div><div> Started: [ cvmh01 cvmh02 cvmh03 cvmh04 ]</div><div> Stopped: [ db02 ldap01 ldap02 swbuildsl6 ]</div><div> vm-compute-test (ocf::ccni:xcatVirtualDomain): Started cvmh01 </div>
<div> vm-swbuildsl6 (ocf::ccni:xcatVirtualDomain): Started cvmh01 </div><div> vm-db02 (ocf::ccni:xcatVirtualDomain): Started cvmh02 </div><div> vm-ldap01 (ocf::ccni:xcatVirtualDomain): Started cvmh03 </div>
<div> vm-ldap02 (ocf::ccni:xcatVirtualDomain): Started cvmh04 </div><div> DummyOnVM (ocf::pacemaker:Dummy): Started cvmh03</div><div><div><br></div><div># pcs constraint location DummyOnVM prefers db02</div><div>
# crm status</div><div>...</div><div>Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01 ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]</div><div>...</div><div> DummyOnVM (ocf::pacemaker:Dummy): Started db02</div>
</div><div><br></div></font></div></blockquote></div><div><br></div><div>That's what we want to see. Now let's stop vm-db02, taking node db02 offline; I expect DummyOnVM to stop as well.</div><blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<div><div><font face="courier new, monospace"># crm resource stop vm-db02</font></div></div><div><div><font face="courier new, monospace"># crm status</font></div></div><div><div><font face="courier new, monospace">...</font></div>
</div><div><div><font face="courier new, monospace">Online: [ cvmh01 cvmh02 cvmh03 cvmh04 ldap01:vm-ldap01 ldap02:vm-ldap02 ]</font></div></div><div><div><font face="courier new, monospace">OFFLINE: [ db02:vm-db02 swbuildsl6:vm-swbuildsl6 ]</font></div>
</div><div><div><font face="courier new, monospace">...</font></div></div><div><div><font face="courier new, monospace"> DummyOnVM (ocf::pacemaker:Dummy): Started cvmh02 </font></div></div><div><div><font face="courier new, monospace"><br>
</font></div></div><div><div><font face="courier new, monospace">Failed actions:</font></div></div><div><div><font face="courier new, monospace"> vm-compute-test_migrate_from_0 (node=cvmh02, call=147, rc=1, status=Timed Out, last-rc-change=Tue Jul 2 16:48:17 2013</font></div>
</div><div><div><font face="courier new, monospace">, queued=20003ms, exec=0ms</font></div></div><div><div><font face="courier new, monospace">): unknown error</font></div></div></blockquote><div><br></div><div>Well, that is odd: instead of stopping, DummyOnVM is now running on cvmh02. (As background, vm-swbuildsl6 has an order dependency on vm-compute-test, left over from my experiments with how migrations interact with order dependencies -- not very well, it turns out. Once vm-compute-test recovers, vm-swbuildsl6 does come back up.) This isn't good -- if I am running services in VMs or other containers, I need them to run only in that container!</div>
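<div><br></div><div>Presumably, to get true "only in that container" behavior, the prefers constraint would need to be paired with -INFINITY scores on the physical hosts -- an untested sketch:</div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><font face="courier new, monospace"># pcs constraint location DummyOnVM prefers db02=INFINITY</font></div><div><font face="courier new, monospace"># pcs constraint location DummyOnVM avoids cvmh01 cvmh02 cvmh03 cvmh04</font></div></blockquote>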
<div><br></div><div>If I start vm-db02 back up, DummyOnVM is stopped on cvmh02 and moved back to db02.</div><div><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Jun 20, 2013 at 4:16 PM, David Vossel <span dir="ltr"><<a href="mailto:dvossel@redhat.com" target="_blank">dvossel@redhat.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">----- Original Message -----<br>
> From: "David Vossel" <<a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a>><br>
> To: "The Pacemaker cluster resource manager" <<a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a>><br>
</div><div><div class="h5">> Sent: Thursday, June 20, 2013 1:35:44 PM<br>
> Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
><br>
> ----- Original Message -----<br>
> > From: "David Vossel" <<a href="mailto:dvossel@redhat.com">dvossel@redhat.com</a>><br>
> > To: "The Pacemaker cluster resource manager"<br>
> > <<a href="mailto:pacemaker@oss.clusterlabs.org">pacemaker@oss.clusterlabs.org</a>><br>
> > Sent: Wednesday, June 19, 2013 4:47:58 PM<br>
> > Subject: Re: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
> ><br>
> > ----- Original Message -----<br>
> > > From: "Lindsay Todd" <<a href="mailto:rltodd.ml1@gmail.com">rltodd.ml1@gmail.com</a>><br>
> > > To: "The Pacemaker cluster resource manager"<br>
> > > <<a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a>><br>
> > > Sent: Wednesday, June 19, 2013 4:11:58 PM<br>
> > > Subject: [Pacemaker] Pacemaker remote nodes, naming, and attributes<br>
> > ><br>
> > > I built a set of rpms for pacemaker 1.1.10-rc4 and updated my test cluster<br>
> > > (hopefully won't be a "test" cluster forever), as well as my VMs running<br>
> > > pacemaker-remote. The OS everywhere is Scientific Linux 6.4. I am wanting<br>
> > > to<br>
> > > set some attributes on remote nodes, which I can use to control where<br>
> > > services run.<br>
> > ><br>
> > > The first deviation I note from the documentation is the naming of the<br>
> > > remote<br>
> > > nodes. I see:<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > Last updated: Wed Jun 19 16:50:39 2013<br>
> > > Last change: Wed Jun 19 16:19:53 2013 via cibadmin on cvmh04<br>
> > > Stack: cman<br>
> > > Current DC: cvmh02 - partition with quorum<br>
> > > Version: 1.1.10rc4-1.el6.ccni-d19719c<br>
> > > 8 Nodes configured, unknown expected votes<br>
> > > 49 Resources configured.<br>
> > ><br>
> > ><br>
> > > Online: [ cvmh01 cvmh02 cvmh03 cvmh04 db02:vm-db02 ldap01:vm-ldap01<br>
> > > ldap02:vm-ldap02 swbuildsl6:vm-swbuildsl6 ]<br>
> > ><br>
> > > Full list of resources:<br>
> > ><br>
> > > and so forth. The "remote-node" names are simply the hostname, so the<br>
> > > vm-db02<br>
> > > VirtualDomain resource has a remote-node name of db02. The "Pacemaker<br>
> > > Remote" manual suggests this should be displayed as "db02", not<br>
> > > "db02:vm-db02", although I can see how the latter format would be useful.<br>
> ><br>
> > Yep, this got changed since the documentation was published. We wanted<br>
> > people to be able to recognize which remote-node went with which resource<br>
> > easily.<br>
> ><br>
> > ><br>
> > > So now let's set an attribute on this remote node. What name do I use?<br>
> > > How<br>
> > > about:<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > # crm_attribute --node "db02:vm-db02" \<br>
> > > --name "service_postgresql" \<br>
> > > --update "true"<br>
> > > Could not map name=db02:vm-db02 to a UUID<br>
> > > Please choose from one of the matches above and suppy the 'id' with<br>
> > > --attr-id<br>
> > ><br>
> > > Perhaps not the most informative output, but obviously it fails. Let's<br>
> > > try<br>
> > > the unqualified name:<br>
> > ><br>
> > ><br>
> > ><br>
> > ><br>
> > > # crm_attribute --node "db02" \<br>
> > > --name "service_postgresql" \<br>
> > > --update "true"<br>
> > > Remote-nodes do not maintain permanent attributes,<br>
> > > 'service_postgresql=true'<br>
> > > will be removed after db02 reboots.<br>
> > > Error setting service_postgresql=true (section=status, set=status-db02):<br>
> > > No<br>
> > > such device or address<br>
> > > Error performing operation: No such device or address<br>
><br>
> I just tested this and ran into the same errors you did. Turns out this<br>
> happens when the remote-node's status section is empty. If you start a<br>
> resource on the node and then set the attribute it will work... obviously<br>
> this is a bug. I'm working on a fix.<br>
<br>
</div></div>This should help with the attributes bit.<br>
<br>
<a href="https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204" target="_blank">https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204</a><br>
<span class="HOEnZb"><font color="#888888"><br>
-- Vossel<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
_______________________________________________<br>
Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>
</div></div></blockquote></div><br></div></div>