[Pacemaker] Problem with configuring stonith rcd_serial

Eberhard Kuemmerle E.Kuemmerle at fz-juelich.de
Wed Nov 3 12:08:50 EDT 2010


On 3 Nov 2010 11:06, Dejan Muhamedagic wrote:
> On Tue, Nov 02, 2010 at 06:45:08PM +0100, Dejan Muhamedagic wrote:
>
>> On Tue, Nov 02, 2010 at 04:26:40PM +0100, Eberhard Kuemmerle wrote:
>>
>>> On 2 Nov 2010 16:15 02.11.2010 16:18, Eberhard Kuemmerle wrote:
>>>
>>>> Hi,
>>>> here is what you requested:
>>>>
>>>> TEST 1:
>>>> stonith -t rcd_serial -p "test /dev/ttyS0 rts 2000" test
>>>> ** (process:2928): DEBUG: rcd_serial_set_config:called
>>>> Alarm clock
>>>> # echo $?
>>>> 142
>>>>
>>>> TEST 2:
>>>> stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts" msduration="2000" -S
>>>> ** (process:6851): DEBUG: rcd_serial_set_config:called
>>>> stonith: rcd_serial device OK.
>>>> # echo $?
>>>> 0
>>>>
>>>> TEST 3:
>>>> stonith -t rcd_serial hostlist="node2" ttydev="/dev/ttyS0" dtr_rts="rts" msduration="2000" -T reset node2
>>>> ** (process:8142): DEBUG: rcd_serial_set_config:called
>>>> Alarm clock
>>>> # echo $?
>>>> 142
>>>>
>>>> TEST 1 as well as TEST 2 caused a reboot of node2!
>>>>
>>>>
>>> SORRY, that's wrong!
>>> I wanted to say:
>>> TEST 1 as well as TEST 3 caused a reboot of node2!
>>>
>> Well, then there seems to be a problem with rcd_serial.
>> According to the exit code (142 = 128 + 14), it seems like the
>> plugin instance gets killed by the ALRM signal. The signal
>> should've been caught, but there is something wrong with the
>> registration of the signal handler.
>>
>> Looks like this fails unexpectedly:
>>
>> #if !defined(HAVE_POSIX_SIGNALS)
>>
>> because our autoconf doesn't do tests for signal implementation.
>>
>> Can you please try the attached patch? You'll have to rebuild
>> the package for that.
>>
> In case you were wondering which patch, here it finally is.
>
> Thanks,
>
> Dejan
>
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: have-posix-signals.patch
> Type: text/x-diff
> Size: 1032 bytes
> Desc: not available
> URL: <http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20101103/a5cd5005/attachment-0001.bin>
>
Wow, success!

With your patch, and after additionally replacing 'dtr|rts' with
'dtr_rts' in rcd_serial.c, everything works fine!
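
For anyone reading along, the exit code 142 (128 + 14) from TEST 1 and
TEST 3 can be reproduced outside of stonith. The following is only a
minimal sketch, not the rcd_serial code: it uses Python child processes
as stand-ins for the plugin, and the names shell_status, run_child,
ARM_TIMER and INSTALL_HANDLER are made up for this illustration. It
assumes a Linux-like system where SIGALRM is signal 14.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for the plugin: arm a timer, then wait.
ARM_TIMER = (
    "import signal, time; "
    "signal.setitimer(signal.ITIMER_REAL, 0.05); "
    "time.sleep(0.3)"
)
# Same child, but with a SIGALRM handler registered before the timer fires.
INSTALL_HANDLER = (
    "import signal; "
    "signal.signal(signal.SIGALRM, lambda signum, frame: None); "
)


def shell_status(returncode: int) -> int:
    """Convert subprocess's returncode to the value a shell shows in $?.

    subprocess reports death by signal N as -N; shells report 128 + N.
    """
    return 128 - returncode if returncode < 0 else returncode


def run_child(code: str) -> int:
    """Run the snippet in a fresh interpreter and return its returncode."""
    return subprocess.run([sys.executable, "-c", code]).returncode


# No handler: the default action for SIGALRM terminates the process.
uncaught = run_child(ARM_TIMER)
# Handler registered: the alarm is caught and the process exits normally.
caught = run_child(INSTALL_HANDLER + ARM_TIMER)

print("without handler:", shell_status(uncaught))
print("with handler:   ", shell_status(caught))
```

The first child corresponds to what the "Alarm clock" message suggests
happened before the patch (handler not registered), the second to the
behaviour after it.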

There are still some strange entries in /var/log/messages, but the
STONITH action is performed correctly!

Just for your information, here are the messages:

Nov  3 16:41:50 node2 pengine: [5327]: WARN: stage6: Scheduling Node node1 for STONITH
Nov  3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could not parse (0 2): ** (process:8669): DEBUG: rcd_serial_set_config:called
Nov  3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could not parse (3 18): (process:8669): DEBUG: rcd_serial_set_config:called
Nov  3 16:41:50 node2 stonith-ng: [5323]: WARN: parse_host_line: Could not parse (0 0):
Nov  3 16:41:50 node2 pengine: [5327]: WARN: process_pe_message: Transition 102: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-0.bz2
Nov  3 16:41:52 node2 crmd: [5328]: notice: crmd_peer_update: Status update: Client node1/crmd now has status [offline] (DC=true)
Nov  3 16:41:52 node2 crmd: [5328]: notice: run_graph: Transition 102 (Complete=11, Pending=0, Fired=0, Skipped=23, Incomplete=11, Source=/var/lib/pengine/pe-warn-0.bz2): Stopped
Nov  3 16:41:52 node2 lrmd: [5325]: ERROR: crm_abort: crm_strdup_fn: Triggered assert at utils.c:964 : src != NULL
Nov  3 16:41:52 node2 lrmd: [5325]: ERROR: crm_strdup_fn: Could not perform copy at st_client.c:514 (stonith_api_device_metadata)
Nov  3 16:41:52 node2 lrmd: [5325]: WARN: stonith_api_device_metadata: no short description in rcd_serial's metadata.

Thank you very much!
  Eberhard