[ClusterLabs] Why Do All The Services Go Down When Just One Fails?

Andrei Borzenkov arvidjaar at gmail.com
Sat Feb 16 21:33:45 UTC 2019


17.02.2019 0:03, Eric Robinson wrote:
> Here are the relevant corosync logs.
> 
> It appears that the stop action for resource p_mysql_002 failed, and that caused a cascading series of service changes. However, I don't understand why, since no other resources are dependent on p_mysql_002.
> 

You have mandatory colocation constraints for each SQL resource with
the VIP. This means that to move one SQL resource to another node,
Pacemaker must also move the VIP to that node, which in turn means it
must move all the other resources that depend on the VIP as well.
...
> Feb 16 14:06:39 [3912] 001db01a    pengine:  warning: check_migration_threshold:        Forcing p_mysql_002 away from 001db01a after 1000000 failures (max=1000000)
...
> Feb 16 14:06:39 [3912] 001db01a    pengine:   notice: LogAction:         * Stop       p_vip_clust01     (                   001db01a )   blocked
...
> Feb 16 14:06:39 [3912] 001db01a    pengine:   notice: LogAction:         * Stop       p_mysql_001       (                   001db01a )   due to colocation with p_vip_clust01
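
To illustrate the cascade, here is a rough toy model in Python. It is
not Pacemaker's scheduler and ignores scores, fencing and blocked
actions; it only assumes that every mandatory colocation forces both
resources onto the same node. The function name and the p_unrelated
resource are hypothetical, made up for the example.

```python
def affected_by_move(colocated_with, failed):
    """Return the set of resources that must leave the node together.

    colocated_with maps a dependent resource to the resource it is
    mandatorily colocated with (toy model: both must share a node).
    """
    moved = {failed}
    changed = True
    while changed:
        changed = False
        for dependent, anchor in colocated_with.items():
            # If exactly one side of a mandatory colocation has to move,
            # the other side has to follow it.
            if (dependent in moved) != (anchor in moved):
                moved.update((dependent, anchor))
                changed = True
    return moved


# Constraints as described above: each SQL resource with the VIP.
constraints = {
    "p_mysql_001": "p_vip_clust01",
    "p_mysql_002": "p_vip_clust01",
}

# Forcing p_mysql_002 away drags the VIP and p_mysql_001 along.
print(sorted(affected_by_move(constraints, "p_mysql_002")))

# A resource with no colocation constraint (hypothetical) affects only itself.
print(sorted(affected_by_move(constraints, "p_unrelated")))
```

So even though nothing depends on p_mysql_002 directly, it shares the
VIP with the other SQL resources, and in this model that is enough to
pull all of them along.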


