[Pacemaker] [Openais] Linux HA on debian sparc
william felipe_welter
wfelipew at gmail.com
Tue Jun 7 07:44:02 EDT 2011
Two more questions: will the patch for the mmap calls go into mainline
development for all architectures?
Is there any problem if I send this patch to the Debian project?
2011/6/3 Steven Dake <sdake at redhat.com>:
> On 06/02/2011 08:16 PM, william felipe_welter wrote:
>> Well,
>>
>> With this patch, the pacemakerd process now starts and brings up its
>> other processes (crmd, lrmd, pengine, ...), but after pacemakerd
>> forks, the forked pacemakerd process dies with "signal 10, Bus
>> error". In the log, the Pacemaker processes (crmd, lrmd,
>> pengine, ...) cannot connect to the openais plugin (possibly because
>> of the death of the pacemakerd process).
>> This time, though, the forked pacemakerd generates a core dump when it dies.
>>
>> gdb -c "/usr/var/lib/heartbeat/cores/root/ pacemakerd 7986" -se
>> /usr/sbin/pacemakerd :
>> GNU gdb (GDB) 7.0.1-debian
>> Copyright (C) 2009 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "sparc-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /usr/sbin/pacemakerd...done.
>> Reading symbols from /usr/lib64/libuuid.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib64/libuuid.so.1
>> Reading symbols from /usr/lib/libcoroipcc.so.4...done.
>> Loaded symbols for /usr/lib/libcoroipcc.so.4
>> Reading symbols from /usr/lib/libcpg.so.4...done.
>> Loaded symbols for /usr/lib/libcpg.so.4
>> Reading symbols from /usr/lib/libquorum.so.4...done.
>> Loaded symbols for /usr/lib/libquorum.so.4
>> Reading symbols from /usr/lib64/libcrmcommon.so.2...done.
>> Loaded symbols for /usr/lib64/libcrmcommon.so.2
>> Reading symbols from /usr/lib/libcfg.so.4...done.
>> Loaded symbols for /usr/lib/libcfg.so.4
>> Reading symbols from /usr/lib/libconfdb.so.4...done.
>> Loaded symbols for /usr/lib/libconfdb.so.4
>> Reading symbols from /usr/lib64/libplumb.so.2...done.
>> Loaded symbols for /usr/lib64/libplumb.so.2
>> Reading symbols from /usr/lib64/libpils.so.2...done.
>> Loaded symbols for /usr/lib64/libpils.so.2
>> Reading symbols from /lib/libbz2.so.1.0...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libbz2.so.1.0
>> Reading symbols from /usr/lib/libxslt.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib/libxslt.so.1
>> Reading symbols from /usr/lib/libxml2.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib/libxml2.so.2
>> Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libc.so.6
>> Reading symbols from /lib/librt.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib/librt.so.1
>> Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libdl.so.2
>> Reading symbols from /lib/libglib-2.0.so.0...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libglib-2.0.so.0
>> Reading symbols from /usr/lib/libltdl.so.7...(no debugging symbols
>> found)...done.
>> Loaded symbols for /usr/lib/libltdl.so.7
>> Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib/ld-linux.so.2
>> Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libpthread.so.0
>> Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libm.so.6
>> Reading symbols from /usr/lib/libz.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /usr/lib/libz.so.1
>> Reading symbols from /lib/libpcre.so.3...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libpcre.so.3
>> Reading symbols from /lib/libnss_compat.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_compat.so.2
>> Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libnsl.so.1
>> Reading symbols from /lib/libnss_nis.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib/libnss_nis.so.2
>> Reading symbols from /lib/libnss_files.so.2...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib/libnss_files.so.2
>> Core was generated by `pacemakerd'.
>> Program terminated with signal 10, Bus error.
>> #0 cpg_dispatch (handle=17861288972693536769, dispatch_types=7986) at cpg.c:339
>> 339 switch (dispatch_data->id) {
>> (gdb) bt
>> #0 cpg_dispatch (handle=17861288972693536769, dispatch_types=7986) at cpg.c:339
>> #1 0xf6f100f0 in ?? ()
>> #2 0xf6f100f4 in ?? ()
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>>
>>
>>
>> I took a look at cpg.c and saw that dispatch_data is acquired
>> by the coroipcc_dispatch_get function (defined in lib/coroipcc.c):
>>
>> do {
>> error = coroipcc_dispatch_get (
>> cpg_inst->handle,
>> (void **)&dispatch_data,
>> timeout);
>>
>>
>>
>
> Try the recent patch sent to fix alignment.
>
> Regards
> -steve
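
A bus error at a line that only reads a structure field, as in the backtrace above, is the usual symptom of a misaligned load, which SPARC faults on and x86 tolerates. The following is a minimal sketch of two common remedies, not the corosync alignment patch itself: `align_up` and `read_u32_at` are hypothetical helper names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/*
 * Read a 32-bit value from an arbitrary byte offset in a buffer.
 * memcpy copies byte by byte as far as the language is concerned, so the
 * compiler never emits a misaligned 32-bit load for this access.
 */
static uint32_t read_u32_at(const unsigned char *buf, size_t offset)
{
    uint32_t value;

    assert(buf != NULL);
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

Padding each message length with something like `align_up(len, 8)` keeps the next header on an aligned boundary, while the `memcpy` form is the fallback when the layout cannot be changed.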
>
>>
>> Abridged log:
>> ...
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering f to 10
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 10
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including f
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 10
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child:
>> Forked child 7991 for process lrmd
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info:
>> update_node_processes: Node xxxxxxxxxx now has process list:
>> 00000000000000000000000000100112 (was
>> 00000000000000000000000000100102)
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 10 to 11
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 11
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 11
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child:
>> Forked child 7992 for process attrd
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info:
>> update_node_processes: Node xxxxxxxxxx now has process list:
>> 00000000000000000000000000101112 (was
>> 00000000000000000000000000100112)
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 11 to 12
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 12
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 12
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child:
>> Forked child 7993 for process pengine
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info:
>> update_node_processes: Node xxxxxxxxxx now has process list:
>> 00000000000000000000000000111112 (was
>> 00000000000000000000000000101112)
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 12 to 13
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 13
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 13
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: start_child:
>> Forked child 7994 for process crmd
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info:
>> update_node_processes: Node xxxxxxxxxx now has process list:
>> 00000000000000000000000000111312 (was
>> 00000000000000000000000000111112)
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 xxxxxxxxxx pacemakerd: [7986]: info: main: Starting mainloop
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 13 to 14
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 14
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 14
>> Jun 02 23:12:20 corosync [CPG ] got mcast request on 0x62500
>> Jun 02 23:12:20 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering 14 to 15
>> Jun 02 23:12:20 corosync [TOTEM ] Delivering MCAST message with seq 15
>> to pending delivery queue
>> Jun 02 23:12:20 corosync [TOTEM ] releasing messages up to and including 15
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: Invoked:
>> /usr/lib64/heartbeat/stonithd
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info:
>> crm_log_init_worker: Changed active directory to
>> /usr/var/lib/heartbeat/cores/root
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info: get_cluster_type:
>> Cluster type is: 'openais'.
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info:
>> crm_cluster_connect: Connecting to cluster infrastructure: classic
>> openais (with plugin)
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info:
>> init_ais_connection_classic: Creating connection to our Corosync
>> plugin
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: crm_log_init_worker:
>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: retrieveCib: Reading
>> cluster configuration from: /usr/var/lib/heartbeat/crm/cib.xml
>> (digest: /usr/var/lib/heartbeat/crm/cib.xml.sig)
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: retrieveCib: Cluster
>> configuration not found: /usr/var/lib/heartbeat/crm/cib.xml
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: readCibXmlFile: Primary
>> configuration corrupt or unusable, trying backup...
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: get_last_sequence:
>> Series file /usr/var/lib/heartbeat/crm/cib.last does not exist
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile: Backup
>> file /usr/var/lib/heartbeat/crm/cib-99.raw not found
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: WARN: readCibXmlFile:
>> Continuing with an empty configuration.
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <cib epoch="0" num_updates="0" admin_epoch="0"
>> validate-with="pacemaker-1.2" >
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <configuration >
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <crm_config />
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <nodes />
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <resources />
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <constraints />
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> </configuration>
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk]
>> <status />
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: readCibXmlFile[on-disk] </cib>
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: validate_with_relaxng:
>> Creating RNG parser context
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: info:
>> init_ais_connection_classic: Connection to our AIS plugin (9) failed:
>> Doesn't exist (12)
>> Jun 02 23:12:20 xxxxxxxxxx stonith-ng: [7989]: CRIT: main: Cannot sign
>> in to the cluster... terminating
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: Invoked:
>> /usr/lib64/heartbeat/crmd
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: Invoked:
>> /usr/lib64/heartbeat/pengine
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: crm_log_init_worker:
>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: crm_log_init_worker:
>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: main: CRM Hg Version:
>> e872eeb39a5f6e1fdb57c3108551a5353648c4f4
>>
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: main: Checking for
>> old instances of pengine
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug:
>> init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /usr/var/run/crm/pengine
>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: info: enabling coredumps
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: info: crmd_init: Starting crmd
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug:
>> init_client_ipc_comms_nodispatch: Could not init comms on:
>> /usr/var/run/crm/pengine
>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: debug: main: run the loop...
>> Jun 02 23:12:20 xxxxxxxxxx lrmd: [7991]: info: Started.
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: debug: main: Init server comms
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: s_crmd_fsa: Processing
>> I_STARTUP: [ state=S_STARTING cause=C_STARTUP origin=crmd_init ]
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action:
>> actions:trace: // A_LOG
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action:
>> actions:trace: // A_STARTUP
>> Jun 02 23:12:20 xxxxxxxxxx pengine: [7993]: info: main: Starting pengine
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_startup:
>> Registering Signal Handlers
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_startup: Creating
>> CIB and LRM objects
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: do_fsa_action:
>> actions:trace: // A_CIB_START
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /usr/var/run/crm/cib_rw
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Could not init comms on:
>> /usr/var/run/crm/cib_rw
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw:
>> Connection to command channel failed
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /usr/var/run/crm/cib_callback
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Could not init comms on:
>> /usr/var/run/crm/cib_callback
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw:
>> Connection to callback channel failed
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw:
>> Connection to CIB failed: connection failed
>> Jun 02 23:12:20 xxxxxxxxxx crmd: [7994]: debug: cib_native_signoff:
>> Signing out of the CIB Service
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: debug: activateCibXml:
>> Triggering CIB write for start op
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: startCib: CIB
>> Initialization completed successfully
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: get_cluster_type:
>> Cluster type is: 'openais'.
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info:
>> init_ais_connection_classic: Creating connection to our Corosync
>> plugin
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: info:
>> init_ais_connection_classic: Connection to our AIS plugin (9) failed:
>> Doesn't exist (12)
>> Jun 02 23:12:20 xxxxxxxxxx cib: [7990]: CRIT: cib_init: Cannot sign in
>> to the cluster... terminating
>> Jun 02 23:12:21 corosync [CPG ] exit_fn for conn=0x62500
>> Jun 02 23:12:21 corosync [TOTEM ] mcasted message added to pending queue
>> Jun 02 23:12:21 corosync [TOTEM ] Delivering 15 to 16
>> Jun 02 23:12:21 corosync [TOTEM ] Delivering MCAST message with seq 16
>> to pending delivery queue
>> Jun 02 23:12:21 corosync [CPG ] got procleave message from cluster
>> node 1377289226
>> Jun 02 23:12:21 corosync [TOTEM ] releasing messages up to and including 16
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: Invoked:
>> /usr/lib64/heartbeat/attrd
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: crm_log_init_worker:
>> Changed active directory to /usr/var/lib/heartbeat/cores/hacluster
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Starting up
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: get_cluster_type:
>> Cluster type is: 'openais'.
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info:
>> init_ais_connection_classic: Creating connection to our Corosync
>> plugin
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info:
>> init_ais_connection_classic: Connection to our AIS plugin (9) failed:
>> Doesn't exist (12)
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: ERROR: main: HA Signon failed
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Cluster connection active
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: info: main: Accepting
>> attribute updates
>> Jun 02 23:12:21 xxxxxxxxxx attrd: [7992]: ERROR: main: Aborting startup
>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /usr/var/run/crm/cib_rw
>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Could not init comms on:
>> /usr/var/run/crm/cib_rw
>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug: cib_native_signon_raw:
>> Connection to command channel failed
>> Jun 02 23:12:21 xxxxxxxxxx crmd: [7994]: debug:
>> init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /usr/var/run/crm/cib_callback
>> ...
>>
>>
>> 2011/6/2 Steven Dake <sdake at redhat.com>:
>>> On 06/01/2011 11:05 PM, william felipe_welter wrote:
>>>> I recompiled my kernel without hugetlb, and the result is the same.
>>>>
>>>> My test program still prints:
>>>> PATH=/dev/shm/teste123XXXXXX
>>>> page size=20000
>>>> fd=3
>>>> ADDR_ORIG:0xe000a000 ADDR:0xffffffff
>>>> Erro
>>>>
>>>> And Pacemaker still fails because of the mmap error:
>>>> Could not initialize Cluster Configuration Database API instance error 2
>>>>
>>>
>>> Give the patch I posted recently a spin - corosync WFM with this patch
>>> on sparc64 with hugetlb set. Please report back results.
>>>
>>> Regards
>>> -steve
>>>
>>>> To confirm that hugetlb is disabled, here is my /proc/meminfo:
>>>> MemTotal: 33093488 kB
>>>> MemFree: 32855616 kB
>>>> Buffers: 5600 kB
>>>> Cached: 53480 kB
>>>> SwapCached: 0 kB
>>>> Active: 45768 kB
>>>> Inactive: 28104 kB
>>>> Active(anon): 18024 kB
>>>> Inactive(anon): 1560 kB
>>>> Active(file): 27744 kB
>>>> Inactive(file): 26544 kB
>>>> Unevictable: 0 kB
>>>> Mlocked: 0 kB
>>>> SwapTotal: 6104680 kB
>>>> SwapFree: 6104680 kB
>>>> Dirty: 0 kB
>>>> Writeback: 0 kB
>>>> AnonPages: 14936 kB
>>>> Mapped: 7736 kB
>>>> Shmem: 4624 kB
>>>> Slab: 39184 kB
>>>> SReclaimable: 10088 kB
>>>> SUnreclaim: 29096 kB
>>>> KernelStack: 7088 kB
>>>> PageTables: 1160 kB
>>>> Quicklists: 17664 kB
>>>> NFS_Unstable: 0 kB
>>>> Bounce: 0 kB
>>>> WritebackTmp: 0 kB
>>>> CommitLimit: 22651424 kB
>>>> Committed_AS: 519368 kB
>>>> VmallocTotal: 1069547520 kB
>>>> VmallocUsed: 11064 kB
>>>> VmallocChunk: 1069529616 kB
>>>>
>>>>
>>>> 2011/6/1 Steven Dake <sdake at redhat.com>:
>>>>> On 06/01/2011 07:42 AM, william felipe_welter wrote:
>>>>>> Steven,
>>>>>>
>>>>>> cat /proc/meminfo
>>>>>> ...
>>>>>> HugePages_Total: 0
>>>>>> HugePages_Free: 0
>>>>>> HugePages_Rsvd: 0
>>>>>> HugePages_Surp: 0
>>>>>> Hugepagesize: 4096 kB
>>>>>> ...
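
The `Hugepagesize` line above is the value under discussion (4096 kB, i.e. 4 MB). As a minimal sketch, assuming the caller has already read `/proc/meminfo` into a string, the value can be extracted as follows; `hugepagesize_kb` is a hypothetical helper name, not part of corosync or Pacemaker.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the Hugepagesize value in kB found in meminfo-style text,
 * or -1 if the line is absent or cannot be parsed.
 */
static long hugepagesize_kb(const char *meminfo)
{
    const char *key = "Hugepagesize:";
    const char *line;
    long kb;

    assert(meminfo != NULL);
    line = strstr(meminfo, key);
    if (line == NULL)
        return -1;
    if (sscanf(line + strlen(key), "%ld", &kb) != 1)
        return -1;
    return kb;
}
```

A return value of -1 matches the later `/proc/meminfo` listing in this thread, where the HugePages lines no longer appear after the kernel was rebuilt without hugetlb.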
>>>>>>
>>>>>
>>>>> It definitely requires a kernel compile and setting the config option to
>>>>> off. I don't know the debian way of doing this.
>>>>>
>>>>> The only reason you may need this option is if you have very large
>>>>> memory sizes, such as 48GB or more.
>>>>>
>>>>> Regards
>>>>> -steve
>>>>>
>>>>>> It's 4 MB.
>>>>>>
>>>>>> How can I disable hugetlb? (By passing CONFIG_HUGETLBFS=n to the
>>>>>> kernel at boot?)
>>>>>>
>>>>>> 2011/6/1 Steven Dake <sdake at redhat.com <mailto:sdake at redhat.com>>
>>>>>>
>>>>>> On 06/01/2011 01:05 AM, Steven Dake wrote:
>>>>>> > On 05/31/2011 09:44 PM, Angus Salkeld wrote:
>>>>>> >> On Tue, May 31, 2011 at 11:52:48PM -0300, william felipe_welter
>>>>>> wrote:
>>>>>> >>> Angus,
>>>>>> >>>
>>>>>> >>> I wrote a test program (based on the code in coroipcc.c), and I am
>>>>>> >>> now sure that there is a problem with the mmap system call on sparc.
>>>>>> >>>
>>>>>> >>> Source code of my test program:
>>>>>> >>>
>>>>>> >>> #include <stdint.h>
>>>>>> >>> #include <stdlib.h>
>>>>>> >>> #include <sys/mman.h>
>>>>>> >>> #include <stdio.h>
>>>>>> >>>
>>>>>> >>> #define PATH_MAX 36
>>>>>> >>>
>>>>>> >>> int main()
>>>>>> >>> {
>>>>>> >>>
>>>>>> >>> int32_t fd;
>>>>>> >>> void *addr_orig;
>>>>>> >>> void *addr;
>>>>>> >>> char path[PATH_MAX];
>>>>>> >>> const char *file = "teste123XXXXXX";
>>>>>> >>> size_t bytes=10024;
>>>>>> >>>
>>>>>> >>> snprintf (path, PATH_MAX, "/dev/shm/%s", file);
>>>>>> >>> printf("PATH=%s\n",path);
>>>>>> >>>
>>>>>> >>> fd = mkstemp (path);
>>>>>> >>> printf("fd=%d \n",fd);
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> addr_orig = mmap (NULL, bytes, PROT_NONE,
>>>>>> >>> MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> addr = mmap (addr_orig, bytes, PROT_READ | PROT_WRITE,
>>>>>> >>> MAP_FIXED | MAP_SHARED, fd, 0);
>>>>>> >>>
>>>>>> >>> printf("ADDR_ORIG:%p ADDR:%p\n",addr_orig,addr);
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> if (addr != addr_orig) {
>>>>>> >>> printf("Erro");
>>>>>> >>> }
>>>>>> >>> }
>>>>>> >>>
>>>>>> >>> Results on x86:
>>>>>> >>> PATH=/dev/shm/teste123XXXXXX
>>>>>> >>> fd=3
>>>>>> >>> ADDR_ORIG:0x7f867d8e6000 ADDR:0x7f867d8e6000
>>>>>> >>>
>>>>>> >>> Results on sparc:
>>>>>> >>> PATH=/dev/shm/teste123XXXXXX
>>>>>> >>> fd=3
>>>>>> >>> ADDR_ORIG:0xf7f72000 ADDR:0xffffffff
>>>>>> >>
>>>>>> >> Note: 0xffffffff == MAP_FAILED
>>>>>> >>
>>>>>> >> (from man mmap)
>>>>>> >> RETURN VALUE
>>>>>> >> On success, mmap() returns a pointer to the mapped area. On
>>>>>> >> error, the value MAP_FAILED (that is, (void *) -1) is returned,
>>>>>> >> and errno is set appropriately.
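
Since the test program compares `addr` with `addr_orig` but never looks at `errno`, the reason for the failure is lost. The following is a minimal sketch of checking against `MAP_FAILED` and returning the error code; `map_fixed_shared` is a hypothetical wrapper introduced for illustration, using the same flags as the second `mmap` call in the test program.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `bytes` of file `fd` at the fixed address `hint`.
 * Returns 0 and stores the mapping in *out on success; on failure
 * returns the errno value set by mmap and stores NULL in *out.
 */
static int map_fixed_shared(void *hint, size_t bytes, int fd, void **out)
{
    void *addr;

    assert(out != NULL);
    addr = mmap(hint, bytes, PROT_READ | PROT_WRITE,
                MAP_FIXED | MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        *out = NULL;
        return errno;   /* e.g. EINVAL or EBADF; pass to strerror() to print */
    }
    *out = addr;
    return 0;
}
```

Printing `strerror()` of the returned value next to the two addresses would show which error the sparc kernel reports for the failing call.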
>>>>>> >>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> But I am wondering whether it is really necessary to call mmap twice.
>>>>>> >>> What is the reason for calling mmap a second time with the address
>>>>>> >>> returned by the first call?
>>>>>> >>>
>>>>>> >>>
>>>>>> >> Well there are 3 calls to mmap()
>>>>>> >> 1) one to allocate 2 * what you need (in pages)
>>>>>> >> 2) maps the first half of the mem to a real file
>>>>>> >> 3) maps the second half of the mem to the same file
>>>>>> >>
>>>>>> >> The point is that when you write to an address past the end of the
>>>>>> >> first half of memory, the third mmap takes care of it by mapping
>>>>>> >> the address back to the top of the file for you. This means you
>>>>>> >> don't have to worry about ring buffer wrapping, which can be a headache.
>>>>>> >>
>>>>>> >> -Angus
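
The three mmap() calls described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the corosync or libqb code: `ring_map` is a hypothetical name, the backing file is created under /tmp rather than /dev/shm, and `size` is assumed to be a multiple of the page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one temporary file twice, back to back, so that an access just past
 * the first half lands at the start of the same file.
 * Returns the base address, or NULL on any failure.
 */
static unsigned char *ring_map(size_t size)
{
    char path[] = "/tmp/ring-XXXXXX";   /* hypothetical file name */
    unsigned char *base;
    int fd;

    assert(size > 0 && size % (size_t)sysconf(_SC_PAGESIZE) == 0);

    fd = mkstemp(path);
    if (fd == -1)
        return NULL;
    unlink(path);                        /* file lives on until fd and maps go away */
    if (ftruncate(fd, (off_t)size) == -1)
        goto fail_close;

    /* 1) reserve 2 * size bytes of contiguous address space */
    base = mmap(NULL, 2 * size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (base == MAP_FAILED)
        goto fail_close;

    /* 2) first half maps the file, 3) second half maps the same file again */
    if (mmap(base, size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_SHARED, fd, 0) == MAP_FAILED ||
        mmap(base + size, size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_SHARED, fd, 0) == MAP_FAILED) {
        munmap(base, 2 * size);
        goto fail_close;
    }
    close(fd);
    return base;

fail_close:
    close(fd);
    return NULL;
}
```

With this layout a write to `base[size + i]` is visible at `base[i]`, which is why a reader or writer never has to split a record at the wrap point. Some architectures may require fixed shared mappings to be aligned to a boundary larger than the page size, in which case `size` would need to be rounded up accordingly.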
>>>>>> >>
>>>>>> >
>>>>>> > interesting this mmap operation doesn't work on sparc linux.
>>>>>> >
>>>>>> > Not sure how I can help here - Next step would be a follow up with the
>>>>>> > sparc linux mailing list. I'll do that and cc you on the message - see
>>>>>> > if we get any response.
>>>>>> >
>>>>>> > http://vger.kernel.org/vger-lists.html
>>>>>> >
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> 2011/5/31 Angus Salkeld <asalkeld at redhat.com
>>>>>> <mailto:asalkeld at redhat.com>>
>>>>>> >>>
>>>>>> >>>> On Tue, May 31, 2011 at 06:25:56PM -0300, william felipe_welter
>>>>>> wrote:
>>>>>> >>>>> Thanks Steven,
>>>>>> >>>>>
>>>>>> >>>>> Now I am trying to run on the MCP:
>>>>>> >>>>> - Uninstalled pacemaker 1.0
>>>>>> >>>>> - Compiled and installed 1.1
>>>>>> >>>>>
>>>>>> >>>>> But now I have a problem initializing pacemakerd: "Could not
>>>>>> >>>>> initialize Cluster Configuration Database API instance error 2".
>>>>>> >>>>> Debugging with gdb, I see that the error is in confdb; more
>>>>>> >>>>> specifically, the errors start in coroipcc.c at these lines:
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> 448 if (addr != addr_orig) {
>>>>>> >>>>> 449 goto error_close_unlink; <- enter here
>>>>>> >>>>> 450 }
>>>>>> >>>>>
>>>>>> >>>>> Any idea what could cause this?
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>> I tried porting a ringbuffer (www.libqb.org) to sparc and had the same
>>>>>> >>>> failure.
>>>>>> >>>> There are 3 mmap() calls and on sparc the third one keeps failing.
>>>>>> >>>>
>>>>>> >>>> This is a common way of creating a ring buffer, see:
>>>>>> >>>>
>>>>>> http://en.wikipedia.org/wiki/Circular_buffer#Exemplary_POSIX_Implementation
>>>>>> >>>>
>>>>>> >>>> I couldn't get it working in the short time I tried. It's probably
>>>>>> >>>> worth looking at the clib implementation to see why it's failing
>>>>>> >>>> (I didn't get to that).
>>>>>> >>>>
>>>>>> >>>> -Angus
>>>>>> >>>>
>>>>>>
>>>>>> Note: we believe we have sorted this out. Your kernel has hugetlb enabled,
>>>>>> probably with 4MB pages. This requires corosync to allocate 4MB pages.
>>>>>>
>>>>>> Can you verify your hugetlb settings?
>>>>>>
>>>>>> If you can turn this option off, you should have at least a working
>>>>>> corosync.
>>>>>>
>>>>>> Regards
>>>>>> -steve
>>>>>> >>>>
>>>>>> >>>> _______________________________________________
>>>>>> >>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>> <mailto:Pacemaker at oss.clusterlabs.org>
>>>>>> >>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>> >>>>
>>>>>> >>>> Project Home: http://www.clusterlabs.org
>>>>>> >>>> Getting started:
>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>> >>>> Bugs:
>>>>>> >>>>
>>>>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> William Felipe Welter
>>>>>> >>> ------------------------------
>>>>>> >>> Consultor em Tecnologias Livres
>>>>>> >>> william.welter at 4linux.com.br <mailto:william.welter at 4linux.com.br>
>>>>>> >>> www.4linux.com.br <http://www.4linux.com.br>
>>>>>> >>
>>>>>> >>> _______________________________________________
>>>>>> >>> Openais mailing list
>>>>>> >>> Openais at lists.linux-foundation.org
>>>>>> <mailto:Openais at lists.linux-foundation.org>
>>>>>> >>> https://lists.linux-foundation.org/mailman/listinfo/openais
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
--
William Felipe Welter
------------------------------
Consultor em Tecnologias Livres
william.welter at 4linux.com.br
www.4linux.com.br