[ClusterLabs] cluster loses state (randomly) every few minutes.
lejeczek
peljasz at yahoo.co.uk
Sat Jan 16 14:49:31 EST 2021
Hi guys,
I have a very basic two-node cluster, with not even a single
resource on it, but it is very troublesome - it keeps breaking.
The journal for 'pacemaker' constantly shows (on both nodes):
...
warning: Input I_DC_TIMEOUT received in state S_PENDING from
crm_timer_popped
notice: State transition S_ELECTION -> S_PENDING
notice: State transition S_PENDING -> S_NOT_DC
notice: Lost attribute writer swir
notice: Node swir state is now lost
notice: Our peer on the DC (swir) is dead
notice: Purged 1 peer with id=2 and/or uname=swir from the
membership cache
notice: Node swir state is now lost
notice: State transition S_NOT_DC -> S_ELECTION
notice: Removing all swir attributes for peer loss
notice: Purged 1 peer with id=2 and/or uname=swir from the
membership cache
notice: Node swir state is now lost
notice: Node swir state is now lost
notice: Recorded local node as attribute writer (was unset)
notice: Purged 1 peer with id=2 and/or uname=swir from the
membership cache
notice: State transition S_ELECTION -> S_INTEGRATION
warning: Blind faith: not fencing unseen nodes
notice: Delaying fencing operations until there are
resources to manage
notice: Calculated transition 0, saving inputs in
/var/lib/pacemaker/pengine/pe-input-627.bz2
notice: Transition 0 (Complete=0, Pending=0, Fired=0,
Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-627.bz2): Complete
notice: State transition S_TRANSITION_ENGINE -> S_IDLE
notice: Node swir state is now member
notice: Node swir state is now member
notice: Node swir state is now member
notice: Node swir state is now member
notice: State transition S_IDLE -> S_INTEGRATION
warning: Another DC detected: swir (op=noop)
notice: Detected another attribute writer (swir), starting
new election
notice: Setting #attrd-protocol[swir]: (unset) -> 2
notice: State transition S_ELECTION -> S_RELEASE_DC
notice: State transition S_PENDING -> S_NOT_DC
notice: Recorded local node as attribute writer (was unset)
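
To make the flapping in an excerpt like the one above easier to see, here is a minimal sketch that counts the state transitions and the membership changes per node. It assumes the message wording matches the lines quoted here; the function name `summarise` and the `SAMPLE` text are only illustrative and are not part of Pacemaker.

```python
import re
from collections import Counter

# Patterns assume the message wording seen in the excerpt above.
TRANSITION = re.compile(r"State transition (S_\w+) -> (S_\w+)")
MEMBERSHIP = re.compile(r"Node (\S+) state is now (\w+)")


def summarise(lines):
    """Count state transitions and per-node membership changes."""
    transitions = Counter()
    membership = Counter()
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            transitions[(match.group(1), match.group(2))] += 1
            continue
        match = MEMBERSHIP.search(line)
        if match:
            membership[(match.group(1), match.group(2))] += 1
    return transitions, membership


# A few lines copied from the excerpt, used only as illustrative input.
SAMPLE = """\
notice: State transition S_ELECTION -> S_PENDING
notice: State transition S_PENDING -> S_NOT_DC
notice: Node swir state is now lost
notice: State transition S_NOT_DC -> S_ELECTION
notice: Node swir state is now lost
notice: State transition S_ELECTION -> S_INTEGRATION
notice: Node swir state is now member
notice: State transition S_PENDING -> S_NOT_DC
"""

transitions, membership = summarise(SAMPLE.splitlines())
for (old, new), count in sorted(transitions.items()):
    print(f"{old} -> {new}: {count}")
for (node, state), count in sorted(membership.items()):
    print(f"{node} is now {state}: {count}")
```

The same function can be fed the lines of a saved journal dump instead of `SAMPLE`; a high count of "lost" and "member" pairs for the same node over a short period would match the behaviour described here.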
It's the same hardware on which "this same" cluster ran okay
until, only a couple of days ago, I upgraded CentOS on
these two boxes to "Stream".
I'm hoping it's something trivial I'm missing with the new
version(s) of software that came with the upgrade, perhaps some
(new) settings for a two-node cluster which I missed.
Any suggestions greatly appreciated.
many thanks, L.