[ClusterLabs] Antw: Re: Trouble with drbd/pacemaker: switch to secondary/secondary

Fri Oct 21 08:33:27 EDT 2016

On 19/10/2016 at 08:53, Ulrich Windl wrote:
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.10.2016 at 17:07 in message
> <9d3b547c-6035-e41d-18ef-9950db01e9dc at redhat.com>:
>> On 10/14/2016 03:22 PM, Anne Nicolas wrote:
> 
> [...]
>>> cluster logs are flooded by :
>>> Oct 14 17:42:28 [3445] bzvairsvr      attrd:   notice:
>>> attrd_trigger_update:    Sending flush op to all hosts for:
>>> master-drbdserv (10000)
>>> Oct 14 17:42:28 [3445] bzvairsvr      attrd:   notice:
>>> attrd_perform_update:    Sent update master-drbdserv=10000 failed:
>>> Transport endpoint is not connected
>>
>> This is strange, and the cause of the problem. A master/slave resource
>> agent will try to set node attributes indicating which node should
>> become the master. Here, we see that this is failing -- it appears attrd
>> (Pacemaker's node attribute daemon) is unable to talk to any other daemons.
>>
>> I'm not sure why this would happen, especially if the rest of the
>> daemons do not have a problem talking to each other. But that's where
>> you need to investigate.
> 
> In my limited experience, it's a bad idea to route I/O traffic and cluster communication over the same link: we had cases where cluster communication (especially when using SCTP) showed errors when traffic was high. Maybe that applies here...
> 
>>
>> One thing I would say is that 1.1.8 is really old at this point, which
>> means you're using the "legacy" attrd, which I'm not very familiar with.
> 
> I agree: Even SLES11 SP4 uses old software, but it's at "pacemaker-1.1.12-13.1" at least. Things _really_ got better with later releases.
> 

I finally updated the Pacemaker package to the latest version. Things are
much more responsive and all my problems are gone. Thanks a lot for your
advice. Now I just need to propose some backport packages to my
distribution :)


-- 
Anne Nicolas
http://mageia.org