[ClusterLabs] info: mcp_cpg_deliver: Ignoring process list sent by peer for local node

lejeczek peljasz at yahoo.co.uk
Thu Jun 6 07:20:15 EDT 2019


On 30/05/2019 14:06, Jan Pokorný wrote:
> On 30/05/19 11:01 +0100, lejeczek wrote:
>> On 29/05/2019 21:04, Ken Gaillot wrote:
>>> On Wed, 2019-05-29 at 17:28 +0100, lejeczek wrote:
>>>> and:
>>>> $ systemctl status -l pacemaker.service 
>>>> ● pacemaker.service - Pacemaker High Availability Cluster Manager
>>>>    Loaded: loaded (/usr/lib/systemd/system/pacemaker.service;
>>>> disabled; vendor preset: disabled)
>>>>    Active: active (running) since Wed 2019-05-29 17:21:45 BST; 7s ago
>>>>      Docs: man:pacemakerd
>>>>            
>>>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
>>>>  Main PID: 51617 (pacemakerd)
>>>>     Tasks: 1
>>>>    Memory: 3.3M
>>>>    CGroup: /system.slice/pacemaker.service
>>>>            └─51617 /usr/sbin/pacemakerd -f
>>>>
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing pengine process (pid=51528)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing lrmd process (pid=51542)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing stonithd process (pid=51558)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing attrd process (pid=51559)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing cib process (pid=51560)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Tracking existing crmd process (pid=51566)
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Quorum acquired
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node whale.private state is now member
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node swir.private state is now member
>>>> May 29 17:21:45 rider.private pacemakerd[51617]:   notice: Node rider.private state is now member
> I grok that you've, in parallel, started asking about this part also
> on the systemd ML, and I redirected that thread here (but my message
> still didn't hit here for being stuck in the moderation queue, since
> I use different addresses on these two lists -- you can still respond
> right away to that as readily available via said systemd list, just
> make sure you only target users at cl.o, it was really unrelated to
> systemd).
>
> In a nutshell, we want to know how you get into such situation that
> entirely detached subdaemons would be flowing in your environment,
> prior to starting pacemaker.service (or after stopping it).
> That's rather unexpected.
> If you can dig up traces of any pacemaker associated processes
> (search pattern: pacemaker*|attrd|cib|crmd|lrmd|stonithd|pengine)
> dying (+ the messages logged immediately before that if at all),
> it could help us diagnose your situation.

I think... it's time. I cannot afford to investigate it by reverting to
the state in which it failed. It should be easy for a developer to
reproduce in a lab: time was not in sync between the three nodes, with a
discrepancy of a few minutes between them.
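For what it's worth, the kind of log search Jan suggests could be run as
below. This is only a sketch: the sample lines are made up to stand in for
real journalctl or /var/log/messages output, and in practice you would pipe
the actual log through the same pattern.

```shell
# Search pattern from Jan's message; the printf lines are fabricated
# samples standing in for real log output.
pattern='pacemaker|attrd|cib|crmd|lrmd|stonithd|pengine'
printf '%s\n' \
  'May 29 17:21:40 rider crmd[51566]: (sample line) process exited' \
  'May 29 17:21:40 rider kernel: unrelated noise' \
  | grep -E "$pattern"
```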

On the one node with the crippled systemd service I was getting:

$ pcs status --all
Error: cluster is not currently running on this node

If it really is time, then maybe some checks should be put in place
(in pacemaker/corosync). Everybody knows how vital time synchronization
is for everything, but sometimes it escapes our attention; such checks
would be of great value.
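As a rough illustration of the sort of check I mean (this is only a
sketch, not anything pacemaker or corosync actually ships): compare epoch
timestamps between nodes against a tolerance. The two timestamps would in
practice come from e.g. `ssh node.private date +%s` and a local `date +%s`;
the numbers below are made-up examples.

```shell
# skew_ok TS_A TS_B TOLERANCE_SECONDS
# Succeeds if the absolute difference between the two epoch
# timestamps is within the tolerance.
skew_ok() {
  local a=$1 b=$2 tol=$3
  local d=$((a - b))
  # ${d#-} strips a leading minus sign, i.e. absolute value
  [ "${d#-}" -le "$tol" ]
}

# Example with fabricated timestamps three minutes apart:
if skew_ok 1559146905 1559147085 60; then
  echo "in sync"
else
  echo "CLOCKS DIFFER"
fi
```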

many thanks, L

p.s. Be aware of the time, always!

>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/


