[Pacemaker] Wiki example problems

Fri May 29 03:50:44 EDT 2009

Try this:
   http://clusterlabs.org/wiki/FAQ#I_Killed_a_Node_but_the_Cluster_Didn.27t_Recover
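
For context, a minimal sketch of the settings that FAQ entry usually
leads to. It assumes a two-node test cluster with no fencing device
configured; both are assumptions about this setup and are not confirmed
by the output quoted below. With two nodes, the survivor loses quorum
when its peer dies, and a node shown as UNCLEAN has not been fenced, so
either condition can keep the resource from being recovered:

   # let the surviving node keep running resources without quorum
   crm configure property no-quorum-policy=ignore
   # test clusters without STONITH hardware only: disables fencing
   crm configure property stonith-enabled=false

Disabling STONITH is not safe where shared data is involved; on a
production cluster, configuring a real fencing device is the better fix.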

On Thu, May 28, 2009 at 10:35 PM, Ryan Steele <ryans at aweber.com> wrote:
> After following the wiki example for sharing an IP address
> (http://clusterlabs.org/wiki/Example_configurations), I'm able to manually
> fail over the resource with crm using the following statement (my nodes are
> ha1 and ha2):
>
>
>        crm resource migrate failover-ip ha2
>
>
> However, if I halt the box which currently owns the floating IP, or
> otherwise abruptly kill networking on it, the failover never happens
> automatically.
>
> I followed the example exactly, and the resource was initially
> created with:
>
>
>        primitive failover-ip ocf:heartbeat:IPaddr params ip=192.168.7.250 op monitor interval=10
>
>
> ...so I'm not quite sure what the issue is.  The messaging layer seems to
> work since crm status shows the node as being down, but the resource
> allocation layer seems to be failing, probably somewhere in the CRM...?
>
>
> I have no firewall between these nodes, so I haven't run tcpdump to check
> whether the messages are getting through, and I can't imagine that's the
> issue here.  This is what things look like after the simulated failure:
>
>
> root@ha1:~# crm status
>
>
> ============
> Last updated: Thu May 28 16:31:20 2009
> Current DC: ha1 (ha1)
> Version: 1.0.2-c02b459053bfa44d509a2a0e0247b291d93662b7
> 2 Nodes configured.
> 1 Resources configured.
> ============
>
> Node: ha1 (ha1): online
> Node: ha2 (ha2): UNCLEAN (offline)
>
> root@ha1:~# ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:0c:29:cd:78:4e
>          inet addr:192.168.7.134  Bcast:192.168.7.255  Mask:255.255.255.0
>          inet6 addr: fe80::20c:29ff:fecd:784e/64 Scope:Link
>          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>          RX packets:7212 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:12373 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:1000
>          RX bytes:919781 (898.2 KB)  TX bytes:1489819 (1.4 MB)
>          Base address:0x2000 Memory:d8920000-d8940000
>
> lo        Link encap:Local Loopback
>          inet addr:127.0.0.1  Mask:255.0.0.0
>          inet6 addr: ::1/128 Scope:Host
>          UP LOOPBACK RUNNING  MTU:16436  Metric:1
>          RX packets:624 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:624 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:61572 (60.1 KB)  TX bytes:61572 (60.1 KB)
>
>
> root@ha1:~# crm_resource -L
> failover-ip     (ocf::heartbeat:IPaddr) Started
>
>
> As you can see, the floating IP has not moved to ha1.  Hopefully someone
> can spot my mistake.  Thanks in advance for any help.
>
>
> -Ryan