[ClusterLabs] sbd: Cannot open watchdog device: /dev/watchdog

Muhammad Sharfuddin M.Sharfuddin at nds.com.pk
Tue Jan 3 21:57:19 UTC 2017


On 01/04/2017 01:15 AM, Klaus Wenninger wrote:
> On 01/03/2017 09:02 PM, Muhammad Sharfuddin wrote:
>> On 01/03/2017 09:49 PM, Kristoffer Grönlund wrote:
>>> Muhammad Sharfuddin <M.Sharfuddin at nds.com.pk> writes:
>>>
>>>> Hello,
>>>>
>>>> pacemaker does not start on this machine (Fujitsu PRIMERGY RX2540 M1),
>>>> with the following error in the logs:
>>>>
>>>> sbd: [13236]: ERROR: Cannot open watchdog device: /dev/watchdog: No such
>>>> file or directory
>>> Does /dev/watchdog exist?
>> No
>> ls -l /dev/watch*
>> ls: cannot access /dev/watch*: No such file or directory
> Then you probably don't have one at all.
> Maybe there is no hardware, the driver is not loaded, or e.g.
> udev doesn't create the node for some reason.
> For a test at least you can try loading softdog.
>
> [kwenning at kwenning pacemaker]$ sudo modprobe softdog
> [sudo] password for kwenning:
> [kwenning at kwenning pacemaker]$ ls -l /dev/watchdog
> crw-------. 1 root root 10, 130 Jan  3 13:39 /dev/watchdog
On the node where I am getting the sbd error, loading the "softdog"
module fixes the issue.
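
The check-then-load sequence suggested above can be wrapped in a small
script. This is a minimal sketch, not something shipped with sbd: the
function name watchdog_status is hypothetical, and softdog is a
software-only watchdog, so treat it as a test aid rather than a
replacement for the hardware driver.

```shell
#!/bin/sh
# Hypothetical helper (not part of sbd): report whether a watchdog
# device node exists. Defaults to /dev/watchdog, the path from the error.
watchdog_status() {
    dev="${1:-/dev/watchdog}"
    if [ -c "$dev" ]; then
        # -c is true only for character devices, which is what a
        # watchdog node is (note the leading "c" in "crw-------").
        echo "present: $dev"
        return 0
    fi
    echo "missing: $dev"
    return 1
}
```

If the node is missing, `watchdog_status || sudo modprobe softdog` loads
the software watchdog as in the quoted example; a second call to
watchdog_status should then report the node as present.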

Now the strange part: this is happening on a single node only, i.e.
only one node reports the error:
sbd: ERROR: Cannot open watchdog device: /dev/watchdog: No such file or
directory

On another node (100% identical: same OS, software, configuration, and
hardware), /dev/watchdog is also missing, yet pacemaker and sbd start
there without any error and without loading the "softdog" module.
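
One way to narrow down why two supposedly identical nodes behave
differently is to capture the same facts on each node and diff the
results. The sketch below is only a starting point under stated
assumptions: sbd_settings is a hypothetical helper, and
/etc/sysconfig/sbd is assumed to be where this distribution keeps the
sbd configuration; adjust the path if yours differs.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) SBD_* settings
# from an sbd configuration file, sorted so that the output from two
# nodes can be compared with diff.
# The default path /etc/sysconfig/sbd is an assumption about this system.
sbd_settings() {
    conf="${1:-/etc/sysconfig/sbd}"
    if [ ! -r "$conf" ]; then
        echo "unreadable: $conf"
        return 1
    fi
    grep '^SBD_' "$conf" | sort
}
```

Running `sbd_settings > /tmp/sbd.$(hostname)` on both nodes, alongside
the `lsmod | egrep "(wd|dog)"` output, gives two sets of files to diff.
A difference in the settings or in the loaded modules could explain why
only one node tries to open /dev/watchdog.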

>>> If you have more than one watchdog device, you can configure
>>> sbd to use a different device using the -w option.
>>>
>>> Cheers,
>>> Kristoffer
>>>
>>>> System Info:
>>>>
>>>> sbd-1.2.1-8.7.x86_64  corosync-2.3.3-7.12.x86_64 pacemaker-1.1.12-7.1.x86_64
>>>>
>>>> lsmod | egrep "(wd|dog)"
>>>> iTCO_wdt               13480  0
>>>> iTCO_vendor_support    13718  1 iTCO_wdt
>>>>
>>>> dmidecode | grep -A3 '^System Information'
>>>> System Information
>>>>           Manufacturer: FUJITSU
>>>>           Product Name: PRIMERGY RX2540 M1
>>>>           Version: GS01
>>>>
>>>> logs:
>>>>
>>>> 2017-01-03T21:00:26.890503+05:00 prdnode1 sbd: [13235]: info: Watchdog
>>>> enabled.
>>>> 2017-01-03T21:00:26.899817+05:00 prdnode1 sbd: [13238]: info: Servant
>>>> starting for device
>>>> /dev/disk/by-id/wwn-0x600000e00d280000002825b500000000-part1
>>>> 2017-01-03T21:00:26.900175+05:00 prdnode1 sbd: [13238]: info: Device
>>>> /dev/disk/by-id/wwn-0x600000e00d280000002825b500000000-part1 uuid:
>>>> fda42d64-ca74-4578-90c8-976ea7ff5f6e
>>>> 2017-01-03T21:00:26.900418+05:00 prdnode1 sbd: [13239]: info: Monitoring
>>>> Pacemaker health
>>>> 2017-01-03T21:00:27.901022+05:00 prdnode1 sbd: [13236]: ERROR: Cannot
>>>> open watchdog device: /dev/watchdog: No such file or directory
>>>> 2017-01-03T21:00:27.912098+05:00 prdnode1 sbd: [13236]: WARN: Servant
>>>> for pcmk (pid: 13239) has terminated
>>>> 2017-01-03T21:00:27.941950+05:00 prdnode1 sbd: [13236]: WARN: Servant
>>>> for /dev/disk/by-id/wwn-0x600000e00d280000002825b500000000-part1 (pid:
>>>> 13238) has terminated
>>>> 2017-01-03T21:00:27.949401+05:00 prdnode1 sbd.sh[13231]: sbd failed;
>>>> please check the logs.
>>>> 2017-01-03T21:00:27.992606+05:00 prdnode1 sbd.sh[13231]: SBD failed to
>>>> start; aborting.
>>>> 2017-01-03T21:00:27.993061+05:00 prdnode1 systemd[1]: sbd.service:
>>>> control process exited, code=exited status=1
>>>> 2017-01-03T21:00:27.993339+05:00 prdnode1 systemd[1]: Failed to start
>>>> Shared-storage based fencing daemon.
>>>> 2017-01-03T21:00:27.993610+05:00 prdnode1 systemd[1]: Dependency failed
>>>> for Pacemaker High Availability Cluster Manager.
>>>> 2017-01-03T21:00:27.994054+05:00 prdnode1 systemd[1]: Unit sbd.service
>>>> entered failed state.
>>>>
>>>> please help.
>>>>
>>>> -- 
>>>> Regards,
>>>>
>>>> Muhammad Sharfuddin
>>>> <http://www.nds.com.pk>
>>>>
>>>> _______________________________________________
>>>> Users mailing list: Users at clusterlabs.org
>>>> http://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://bugs.clusterlabs.org
>> Regards,
>>
>> Muhammad Sharfuddin
>>
>>
--
Regards,

Muhammad Sharfuddin



