[ClusterLabs] Antw: Re: Trouble with drbd/pacemaker: switch to secondary/secondary
Anne Nicolas
ennael1 at gmail.com
Fri Oct 21 12:33:27 UTC 2016
On 19/10/2016 at 08:53, Ulrich Windl wrote:
>>>> Ken Gaillot <kgaillot at redhat.com> wrote on 18.10.2016 at 17:07 in message
> <9d3b547c-6035-e41d-18ef-9950db01e9dc at redhat.com>:
>> On 10/14/2016 03:22 PM, Anne Nicolas wrote:
>
> [...]
>>> cluster logs are flooded by :
>>> Oct 14 17:42:28 [3445] bzvairsvr attrd: notice:
>>> attrd_trigger_update: Sending flush op to all hosts for:
>>> master-drbdserv (10000)
>>> Oct 14 17:42:28 [3445] bzvairsvr attrd: notice:
>>> attrd_perform_update: Sent update master-drbdserv=10000 failed:
>>> Transport endpoint is not connected
>>
>> This is strange, and the cause of the problem. A master/slave resource
>> agent will try to set node attributes indicating which node should
>> become the master. Here, we see that this is failing -- it appears attrd
>> (Pacemaker's node attribute daemon) is unable to talk to any other daemons.
>>
>> I'm not sure why this would happen, especially if the rest of the
>> daemons do not have a problem talking to each other. But that's where
>> you need to investigate.
>
> In my limited experience, it's a bad idea to route I/O traffic and cluster communication over the same link: we had cases where cluster communication (especially when using SCTP) showed errors when traffic was high. Maybe that applies here...
>
>>
>> One thing I would say is that 1.1.8 is really old at this point, which
>> means you're using the "legacy" attrd, which I'm not very familiar with.
>
> I agree: Even SLES11 SP4 uses old software, but it's at "pacemaker-1.1.12-13.1" at least. Things _really_ got better with later releases.
>
I finally updated the Pacemaker package to the latest version. Things are
much more responsive and all my problems are gone. Thanks a lot for your
advice. Now I just need to propose some backport packages to my
distribution :)
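
For anyone who runs into the same flood of attrd messages, here is a minimal sketch for counting failed attribute updates in a cluster log. It assumes each log entry sits on a single line and is worded like the attrd_perform_update message quoted above; other Pacemaker versions may phrase it differently. The function name count_failed_updates is hypothetical, not part of Pacemaker.

```python
import re
from collections import Counter

# Pattern modeled on the attrd message quoted in this thread (assumption:
# one log entry per line; other Pacemaker versions may word it differently).
FAILED_UPDATE = re.compile(
    r"attrd_perform_update:\s+Sent update\s+"
    r"(?P<name>[^=\s]+)=(?P<value>\S+)\s+failed:\s+(?P<error>.+)"
)


def count_failed_updates(lines):
    """Count failed attrd updates per (attribute name, error text) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_UPDATE.search(line)
        if match:
            key = (match.group("name"), match.group("error").strip())
            counts[key] += 1
    return counts


# Sample entries rebuilt from the log excerpt quoted earlier in the thread.
sample_lines = [
    "Oct 14 17:42:28 [3445] bzvairsvr attrd: notice: attrd_trigger_update: "
    "Sending flush op to all hosts for: master-drbdserv (10000)",
    "Oct 14 17:42:28 [3445] bzvairsvr attrd: notice: attrd_perform_update: "
    "Sent update master-drbdserv=10000 failed: "
    "Transport endpoint is not connected",
]

for (name, error), n in count_failed_updates(sample_lines).most_common():
    print(f"{n:6d}  {name}: {error}")
```

To scan a real log, pass an open file object instead of sample_lines. A count that keeps growing for the same attribute would suggest that attrd still cannot reach the other daemons.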
--
Anne Nicolas
http://mageia.org