[ClusterLabs] single node fails to start the ocfs2 resource

Valentin Vidic Valentin.Vidic at CARNet.hr
Mon Mar 12 14:22:43 EDT 2018


On Mon, Mar 12, 2018 at 04:31:46PM +0100, Klaus Wenninger wrote:
> Nope. Whenever the cluster is completely down...
> Otherwise nodes that don't see each other would happily come up
> with both starting all services, because they don't know what had
> already been running on the other node. Technically it wouldn't
> even be possible to remember that they've seen each other once,
> as Corosync doesn't have non-volatile storage apart from the
> config file.
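
If I read votequorum(5) correctly, wait_for_all is meant to cover
exactly that case: a node starting on its own stays inquorate until
all nodes have been seen at least once, so neither side would start
services alone. A minimal sketch of such a quorum section (just an
illustration, not the config from my cluster) would be:

quorum {
        provider: corosync_votequorum
        wait_for_all: 1
}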

Interesting, I have the following config in a test cluster:

nodelist {
        node {
                ring0_addr: sid1
                nodeid: 1
        }

        node {
                ring0_addr: sid2
                nodeid: 2
        }
}

quorum {

        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 1
        two_node: 1
}
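
As far as I understand votequorum(5), two_node: 1 should also enable
wait_for_all by default, so I would have expected a lone node to wait
for its peer. One way to check which flags are actually in effect on
a running node is:

corosync-quorumtool -s

The Flags line in its output should list 2Node and, when active,
WaitForAll.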

And with both nodes down, starting one of them seems to go like this:

1. One node up
2. Fence other node
3. Start services

Mar 12 18:15:01 sid1 crmd[555]:   notice: Connecting to cluster infrastructure: corosync
Mar 12 18:15:01 sid1 crmd[555]:   notice: Quorum acquired
Mar 12 18:15:01 sid1 crmd[555]:   notice: Node sid1 state is now member
Mar 12 18:15:01 sid1 crmd[555]:   notice: State transition S_STARTING -> S_PENDING
Mar 12 18:15:23 sid1 crmd[555]:  warning: Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
Mar 12 18:15:23 sid1 crmd[555]:   notice: State transition S_ELECTION -> S_INTEGRATION
Mar 12 18:15:23 sid1 crmd[555]:  warning: Input I_ELECTION_DC received in state S_INTEGRATION from do_election_check
Mar 12 18:15:23 sid1 crmd[555]:   notice: Result of probe operation for stonith-sbd on sid1: 7 (not running)
Mar 12 18:15:23 sid1 crmd[555]:   notice: Result of probe operation for dlm on sid1: 7 (not running)
Mar 12 18:15:23 sid1 crmd[555]:   notice: Result of probe operation for admin-ip on sid1: 7 (not running)
Mar 12 18:15:23 sid1 crmd[555]:   notice: Result of probe operation for clusterfs on sid1: 7 (not running)
Mar 12 18:15:57 sid1 stonith-ng[551]:   notice: Operation 'reboot' [1454] (call 2 from crmd.555) for host 'sid2' with device 'stonith-sbd' returned: 0 (OK)
Mar 12 18:15:57 sid1 stonith-ng[551]:   notice: Operation reboot of sid2 by sid1 for crmd.555 at sid1.ece4f9c5: OK
Mar 12 18:15:57 sid1 crmd[555]:   notice: Node sid2 state is now lost
Mar 12 18:15:58 sid1 crmd[555]:   notice: Result of start operation for dlm on sid1: 0 (ok)
Mar 12 18:15:58 sid1 crmd[555]:   notice: Result of start operation for admin-ip on sid1: 0 (ok)
Mar 12 18:15:58 sid1 crmd[555]:   notice: Result of start operation for stonith-sbd on sid1: 0 (ok)
Mar 12 18:15:58 sid1 crmd[555]:   notice: Result of start operation for clusterfs on sid1: 0 (ok)
Mar 12 18:15:58 sid1 crmd[555]:   notice: Transition 0 (Complete=18, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-warn-32.bz2): Complete
Mar 12 18:15:58 sid1 crmd[555]:   notice: State transition S_TRANSITION_ENGINE -> S_IDLE
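
In case it is useful, the transition from the last log line can be
replayed offline from the saved pe-warn file to see why the fencing
and the starts were scheduled:

crm_simulate -S -x /var/lib/pacemaker/pengine/pe-warn-32.bz2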

-- 
Valentin


