[Pacemaker] ccm returning with exit code 100 and system rebooting

akshay punja akshay.punja at gmail.com
Tue Jan 18 06:25:38 EST 2011


Hi,

Thanks for the help,

As suggested, I have changed "crm on" to "crm respawn". After the
configuration change, the rebooting has stopped.
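
For reference, the change is a one-line edit to the heartbeat configuration. The fragment below is only a sketch: the file location (/etc/ha.d/ha.cf) is an assumption based on the usual packaging, and the rest of the file is left out.

```
# /etc/ha.d/ha.cf (path assumed; other directives omitted)

# Before: a failing child such as ccm is treated as fatal and the node reboots.
# crm on

# After: heartbeat restarts failed children instead of rebooting,
# so the node stays up long enough to inspect the failure.
crm respawn
```

Heartbeat has to be restarted for the new setting to take effect.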

I am using Tomcat, Apache httpd and MySQL master-slave replication. I
have set this up in multiple environments and it is working fine. I am seeing
this issue on only one of the nodes, so I isolated that node to find a
solution. The log file only prints info and warn messages; is there a way to
enable debug logging too?

*ERROR: Unable to set scheduler parameters.: Operation not permitted*  - we
are seeing this message in healthy clusters in other environments too, so I
do not think it is the root cause.


bash-3.2# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root      2662 21268  0 11:17 pts/0    00:00:00 ps -ef
root         1     0  0 11:01 ?        00:00:00 init [3]
root     20382     1  0 11:01 ?        00:00:00 /usr/sbin/syslog_adapter
root     20389     1  0 11:01 ?        00:00:00 syslogd -m 0 -p /dev/log_adapted
root     20400     1  0 11:01 ?        00:00:00 logmgr
root     21236  2953  0 11:02 pts/0    00:00:00 /bin/sh /bin/console
root     21268 21236  0 11:02 pts/0    00:00:00 /bin/bash --login
root     30746     1  0 11:12 ?        00:00:00 heartbeat: master control proces
root     30748 30746  0 11:12 ?        00:00:00 heartbeat: FIFO reader
root     30749 30746  0 11:12 ?        00:00:00 heartbeat: write: ucast eth0
root     30750 30746  0 11:12 ?        00:00:00 heartbeat: read: ucast eth0
root     32731 30746  0 11:14 ?        00:00:00 /usr/lib/heartbeat/lrmd -r
root     32732 30746  0 11:14 ?        00:00:00 /usr/lib/heartbeat/stonithd

Log file after the configuration change:
Jan 18 11:12:47 mysqlis1 heartbeat: [30745]: info: Pacemaker support:
respawn
Jan 18 11:12:47 mysqlis1 heartbeat: [30745]: info: Pacemaker support: false
Jan 18 11:12:47 mysqlis1 heartbeat: [30745]: WARN: Logging daemon is
disabled --enabling logging daemon is recommended
Jan 18 11:12:47 mysqlis1 heartbeat: [30745]: info:
**************************
Jan 18 11:12:47 mysqlis1 heartbeat: [30745]: info: Configuration validated.
Starting heartbeat 3.0.2
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: heartbeat: version 3.0.2
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: Heartbeat generation:
1293183350
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: glib: ucast: write socket
priority set to IPTOS_LOWDELAY on eth0
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: glib: ucast: bound send
socket to device: eth0
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: glib: ucast: bound
receive socket to device: eth0
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: glib: ucast: started on
port 694 interface eth0 to 172.21.52.135
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info:
G_main_add_TriggerHandler: Added signal manual handler
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info:
G_main_add_TriggerHandler: Added signal manual handler
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: ERROR: Unable to set scheduler
parameters.: Operation not permitted
Jan 18 11:12:47 mysqlis1 heartbeat: [30746]: info: Local status now set to:
'up'
Jan 18 11:12:48 mysqlis1 heartbeat: [30748]: ERROR: Unable to set scheduler
parameters.: Operation not permitted
Jan 18 11:12:48 mysqlis1 heartbeat: [30749]: ERROR: Unable to set scheduler
parameters.: Operation not permitted
Jan 18 11:12:48 mysqlis1 heartbeat: [30750]: ERROR: Unable to set scheduler
parameters.: Operation not permitted
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: node mysql3: is dead
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Comm_now_up(): updating
status to active
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Local status now set to:
'active'
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/ccm" (100,101)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/cib" (100,101)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/lrmd -r" (0,0)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/stonithd" (0,0)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/attrd" (100,101)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: info: Starting child client
"/usr/lib/heartbeat/crmd" (100,101)
Jan 18 11:14:48 mysqlis1 heartbeat: [32729]: info: Starting
"/usr/lib/heartbeat/ccm" as uid 100  gid 101 (pid 32729)
Jan 18 11:14:48 mysqlis1 heartbeat: [32730]: info: Starting
"/usr/lib/heartbeat/cib" as uid 100  gid 101 (pid 32730)
Jan 18 11:14:48 mysqlis1 heartbeat: [32731]: info: Starting
"/usr/lib/heartbeat/lrmd -r" as uid 0  gid 0 (pid 32731)
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: enabling coredumps
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: G_main_add_SignalHandler:
Added signal handler for signal 10
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: G_main_add_SignalHandler:
Added signal handler for signal 12
Jan 18 11:14:48 mysqlis1 lrmd: [32731]: info: Started.
Jan 18 11:14:48 mysqlis1 heartbeat: [32732]: info: Starting
"/usr/lib/heartbeat/stonithd" as uid 0  gid 0 (pid 32732)
Jan 18 11:14:48 mysqlis1 heartbeat: [32733]: info: Starting
"/usr/lib/heartbeat/attrd" as uid 100  gid 101 (pid 32733)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/ccm process 32729 exited with return code 100.
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/cib process 32730 exited with return code 100.
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/attrd process 32733 exited with return code 100.
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: G_main_add_SignalHandler:
Added signal handler for signal 10
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: G_main_add_SignalHandler:
Added signal handler for signal 12
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: crm_cluster_connect:
Connecting to Heartbeat
Jan 18 11:14:48 mysqlis1 heartbeat: [32734]: info: Starting
"/usr/lib/heartbeat/crmd" as uid 100  gid 101 (pid 32734)
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/crmd process 32734 exited with return code 100.
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: register_heartbeat_conn:
Hostname: mysqlis1
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: register_heartbeat_conn:
UUID: d26dfd2b-5412-42b5-84d2-86567676c849
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: notice:
/usr/lib/heartbeat/stonithd start up successfully.
Jan 18 11:14:48 mysqlis1 stonithd: [32732]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
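
In case it helps when scanning logs like the one above, here is a minimal Python sketch that lists the managed child processes that exited with a non-zero return code. The function name `failed_children` is hypothetical, and the pattern is based only on the "Managed ... exited with return code" lines shown in this log. It assumes an unquoted log, so mail quote prefixes such as "> >>" would have to be stripped first.

```python
import re

# Matches heartbeat's report of a managed child exiting, for example:
# "WARN: Managed /usr/lib/heartbeat/ccm process 32729 exited with return code 100."
EXIT_RE = re.compile(
    r"Managed\s+(?P<path>\S+)\s+process\s+(?P<pid>\d+)\s+"
    r"exited with return code\s+(?P<code>\d+)"
)


def failed_children(log_text):
    """Return (path, pid, code) for each managed child that exited non-zero."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [
        (m.group("path"), int(m.group("pid")), int(m.group("code")))
        for m in EXIT_RE.finditer(flat)
        if int(m.group("code")) != 0
    ]


if __name__ == "__main__":
    sample = """\
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/ccm process 32729 exited with return code 100.
Jan 18 11:14:48 mysqlis1 heartbeat: [30746]: WARN: Managed
/usr/lib/heartbeat/cib process 32730 exited with return code 100.
"""
    for path, pid, code in failed_children(sample):
        print(f"{path} (pid {pid}) exited with code {code}")
```

Run against the log above, this would report ccm, cib, attrd and crmd, which are the four children started as uid 100 / gid 101.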

Regards,
Akshay

On Tue, Jan 18, 2011 at 2:25 PM, Andrew Beekhof <andrew at beekhof.net> wrote:

> On Tue, Jan 18, 2011 at 4:04 AM, akshay punja <akshay.punja at gmail.com>
> wrote:
> > Please let me know if any one has solved this issue.
>
> Can you try "crm respawn" instead of "crm on" so the node stays up
> long enough to see why the ccm is unhappy.
>
> Lars, you really ought to think about changing the default behavior
> and adding "crm fatal" or something.
>
> > CCM exiting with return code 100 and system rebooting
> >
> > On Mon, Jan 17, 2011 at 1:29 PM, akshay punja <akshay.punja at gmail.com>
> > wrote:
> >>
> >> Hi All,
> >>
> >> We are using pacemaker (pacemaker-1.0.9.1-1.15.el5.i386.rpm) with
> >> heartbeat (heartbeat-3.0.3-2.3.el5.i386.rpm) for a production deployment.
> >>
> >> Note: we are using two nodes in a cluster and hosting a bunch of
> >> applications on the HA.
> >>
> >> We are seeing a strange rebooting of one of the nodes: "Managed
> >> /usr/lib/heartbeat/ccm process 22115 exited with return code 100." What
> >> could be the possible issue, and how could we fix it?
> >>
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17619]: info: Pacemaker support:
> >> yes
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17619]: info: Pacemaker support:
> >> false
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17619]: WARN: Logging daemon is
> >> disabled --enabling logging daemon is recommended
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17619]: info:
> >> **************************
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17619]: info: Configuration
> >> validated. Starting heartbeat 3.0.2
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: heartbeat: version
> >> 3.0.2
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: Heartbeat generation:
> >> 1293182645
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: glib: ucast: write
> >> socket priority set to IPTOS_LOWDELAY on eth0
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: glib: ucast: bound
> >> send socket to device: eth0
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: glib: ucast: bound
> >> receive socket to device: eth0
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: glib: ucast: started
> >> on port 694 interface eth0 to 172.21.52.135
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info:
> >> G_main_add_TriggerHandler: Added signal manual handler
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info:
> >> G_main_add_TriggerHandler: Added signal manual handler
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info:
> >> G_main_add_SignalHandler: Added signal handler for signal 17
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: ERROR: Unable to set
> >> scheduler parameters.: Operation not permitted
> >> Jan 17 07:50:38 mysqlis1 heartbeat: [17620]: info: Local status now set
> >> to: 'up'
> >> Jan 17 07:50:39 mysqlis1 heartbeat: [17627]: ERROR: Unable to set
> >> scheduler parameters.: Operation not permitted
> >> Jan 17 07:50:39 mysqlis1 heartbeat: [17629]: ERROR: Unable to set
> >> scheduler parameters.: Operation not permitted
> >> Jan 17 07:50:39 mysqlis1 heartbeat: [17628]: ERROR: Unable to set
> >> scheduler parameters.: Operation not permitted
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: WARN: node mysql3: is dead
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Comm_now_up():
> >> updating status to active
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Local status now set
> >> to: 'active'
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/ccm" (100,101)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/cib" (100,101)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/lrmd -r" (0,0)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/stonithd" (0,0)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/attrd" (100,101)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: info: Starting child client
> >> "/usr/lib/heartbeat/crmd" (100,101)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19576]: info: Starting
> >> "/usr/lib/heartbeat/ccm" as uid 100  gid 101 (pid 19576)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19577]: info: Starting
> >> "/usr/lib/heartbeat/cib" as uid 100  gid 101 (pid 19577)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19578]: info: Starting
> >> "/usr/lib/heartbeat/lrmd -r" as uid 0  gid 0 (pid 19578)
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: G_main_add_SignalHandler:
> >> Added signal handler for signal 15
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: G_main_add_SignalHandler:
> >> Added signal handler for signal 17
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: enabling coredumps
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: G_main_add_SignalHandler:
> >> Added signal handler for signal 10
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: G_main_add_SignalHandler:
> >> Added signal handler for signal 12
> >> Jan 17 07:52:39 mysqlis1 lrmd: [19578]: info: Started.
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19579]: info: Starting
> >> "/usr/lib/heartbeat/stonithd" as uid 0  gid 0 (pid 19579)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19580]: info: Starting
> >> "/usr/lib/heartbeat/attrd" as uid 100  gid 101 (pid 19580)
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: WARN: Managed
> >> /usr/lib/heartbeat/ccm process 19576 exited with return code 100.
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [17620]: EMERG: Rebooting system.
> >> Reason: /usr/lib/heartbeat/ccm
> >> Jan 17 07:52:39 mysqlis1 stonithd: [19579]: info:
> >> G_main_add_SignalHandler: Added signal handler for signal 10
> >> Jan 17 07:52:39 mysqlis1 stonithd: [19579]: info:
> >> G_main_add_SignalHandler: Added signal handler for signal 12
> >> Jan 17 07:52:39 mysqlis1 stonithd: [19579]: info: crm_cluster_connect:
> >> Connecting to Heartbeat
> >> Jan 17 07:52:39 mysqlis1 heartbeat: [19581]: info: Starting
> >> "/usr/lib/heartbeat/crmd" as uid 100  gid 101 (pid 19581)
> >> Jan 17 07:52:41 mysqlis1 heartbeat: [17620]: EMERG: ALL REBOOT OPTIONS
> >> FAILED: /sbin/reboot -nf returned 0
> >> Jan 17 07:52:41 mysqlis1 stonithd: [19579]: ERROR:
> >> register_heartbeat_conn: Cannot sign on with heartbeat:
> >> Jan 17 07:52:41 mysqlis1 stonithd: [19579]: ERROR: failed to connect to
> >> cluster
> >> Jan 17 07:52:41 mysqlis1 stonithd: [19579]: ERROR:
> >> /usr/lib/heartbeat/stonithd abnormally abort.
> >> Jan 17 07:52:42 mysqlis1 heartbeat: [17627]: CRIT: Emergency Shutdown:
> >> Master Control process died.
> >> Jan 17 07:52:42 mysqlis1 heartbeat: [17627]: CRIT: Killing pid 17620
> >> with SIGTERM
> >> Jan 17 07:52:42 mysqlis1 heartbeat: [17627]: CRIT: Killing pid 17628
> >> with SIGTERM
> >> Jan 17 07:52:42 mysqlis1 heartbeat: [17627]: CRIT: Killing pid 17629
> >> with SIGTERM
> >> Jan 17 07:52:42 mysqlis1 heartbeat: [17627]: CRIT: Emergency
> >> Shutdown(MCP dead): Killing ourselves.
> >>
> >> Regards,
> >> Akshay
> >>
> >>
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs:
> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> >
> >
>