[Pacemaker] Cluster Summit Report

Fabio M. Di Nitto fdinitto at redhat.com
Fri Oct 10 00:53:45 EDT 2008

Hi all,

The general feeling was that the Cluster Summit was a very good
experience for everybody, and that the work done during those 3 days
would have taken months over normal communication media. Of the 3
scheduled days, only 2 and a half were needed, as people were far more
efficient than expected. Many of the pre-scheduled discussions were
dropped naturally because they were absorbed into other talks, settled
there, or made obsolete at the source. Having people from different
environments, with different experience and use cases, made a huge
difference.

While we discussed everything in much greater technical detail, this
is a short summary of what will happen (in no particular order):

Tree splitting:
- This item should be first and last at the same time.
  As a consequence of what has been decided, almost all trees will need
  to be divided and reorganized differently.
  As an example, RedHat specific bits will remain in one tree, while
  common components (such as dlm and fencing+fencing agents) will live
  in their own separate projects.
  Details of the split are still to be determined. Low-hanging fruit
  will be done first (gnbd and gfs* for example).

- We discussed using clusterlabs.org as the go-to page for users,
  listing the versions of the latest (stable) components from all
  sources. The openSUSE Build Service could then be used as a hosting
  provider for this "community distro".

- For the heartbeat tree, all that will eventually remain in it is the
  heartbeat "cluster infrastructure layer" (can't drop for backwards
  compatibility for a while).

- Eventually some core libraries will migrate into corosync.

- fabbione to coordinate the splitting.

- lmb will coordinate the Linux-HA split and help with the build service
  stuff (if we go ahead with that).

Standard fencing:
- fencing daemon, libraries and agents will be merged (from RedHat and
  heartbeat) into two new projects (so that agents can be released
  independently from the daemon/libs).

- fencing project will grow a simulator for regression testing (honza).
  The simulator will be a simple set of scripts that collect outputs
  from all known fencing devices and pass them back to the agents to
  test their functionality. While not perfect, it will still allow
  basic regression testing. We discussed this in terms of rewriting the
  RAs as simple python classes, which would interact with the world
  through IO abstractions (which would then be easy to capture/replay).

- honzaf will write up an ABI/API for the agents which merges both
  functionalities and features.

- Agents may need to be rewritten/re-factored as part of the merge;
  some of the C plug-ins might become python classes, etc.

- lmb, dejan, honza and dct to work on it.
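
To make the capture/replay idea above a bit more concrete, here is a
minimal sketch, not an agreed design. Every name in it (DeviceIO,
RecordingIO, ReplayIO, PowerSwitchAgent, FakeSwitch) and the
"off N"/"status N" command set are hypothetical and stand in for
whatever the real agent API ends up being. The agent only talks to the
device through an IO object, so a recorded conversation can later be
replayed without any hardware.

```python
class DeviceIO:
    """Hypothetical interface an agent uses to talk to a fencing device."""

    def send(self, command):
        raise NotImplementedError


class RecordingIO(DeviceIO):
    """Wraps another DeviceIO and records every (command, reply) pair."""

    def __init__(self, transport):
        self.transport = transport
        self.transcript = []

    def send(self, command):
        reply = self.transport.send(command)
        self.transcript.append((command, reply))
        return reply


class ReplayIO(DeviceIO):
    """Feeds previously captured replies back to the agent, in order."""

    def __init__(self, transcript):
        self.pending = list(transcript)

    def send(self, command):
        if not self.pending:
            raise AssertionError("unexpected command: %r" % (command,))
        expected, reply = self.pending.pop(0)
        if command != expected:
            raise AssertionError("expected %r, got %r" % (expected, command))
        return reply


class FakeSwitch(DeviceIO):
    """Stands in for real hardware; invented protocol, for the demo only."""

    def __init__(self):
        self.ports = {}

    def send(self, command):
        verb, port = command.split()
        if verb == "off":
            self.ports[int(port)] = "OFF"
            return "OK\n"
        if verb == "status":
            return self.ports.get(int(port), "ON") + "\n"
        return "ERROR\n"


class PowerSwitchAgent:
    """Toy fencing agent: powers a port off and verifies the result."""

    def __init__(self, io):
        self.io = io

    def status(self, port):
        return self.io.send("status %d" % port).strip()

    def off(self, port):
        self.io.send("off %d" % port)
        return self.status(port) == "OFF"


if __name__ == "__main__":
    # Capture a conversation once (here against the fake device) ...
    capture = RecordingIO(FakeSwitch())
    PowerSwitchAgent(capture).off(3)
    # ... then replay it as a regression test, with no device attached.
    replay = ReplayIO(capture.transcript)
    print(PowerSwitchAgent(replay).off(3))
```

In this sketch the replay side also checks that the agent sends the
same commands in the same order, so a behaviour change in the agent
shows up as a failed replay.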

Release time lines:
- As the trees will merge and split into separate projects, RMs will
  coordinate their efforts to make sure the new work is as modular as
  possible.

- All releases will be available in a neutral area for users to
  download in one shot, as discussed previously.

Standard logging:
- Everybody to standardize on logsys.

- The log recorder is worth mentioning here - buffering debug logging so
  that it can be dumped (retroactively) when a fault is encountered.
  Very useful feature.

- heartbeat has a hb_report feature to gather logs, configurations,
  stack traces from core dumps etc. from all cluster nodes; that will
  be extended over time to support all this too.

- New features will be required in logsys to improve the user 
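
For readers unfamiliar with the log recorder idea mentioned above, the
following is a rough illustration in Python. It is not the logsys API;
the LogRecorderHandler class, its parameters and the choice of ERROR as
the trigger level are assumptions made for this example. Debug records
are kept in a bounded in-memory buffer and only written out, after the
fact, when a fault-level record arrives.

```python
import collections
import logging


class LogRecorderHandler(logging.Handler):
    """Hypothetical handler: buffer recent records, dump them on a fault."""

    def __init__(self, target, capacity=1000, trigger=logging.ERROR):
        super().__init__(level=logging.DEBUG)
        self.target = target      # handler that does the real output
        self.trigger = trigger    # level that counts as a "fault" here
        # Bounded buffer: the oldest records fall off once it is full.
        self.buffer = collections.deque(maxlen=capacity)

    def emit(self, record):
        self.buffer.append(record)
        if record.levelno >= self.trigger:
            self.dump()

    def dump(self):
        # Write out everything buffered so far, oldest first.
        while self.buffer:
            self.target.handle(self.buffer.popleft())


if __name__ == "__main__":
    out = logging.StreamHandler()
    out.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    log = logging.getLogger("recorder-demo")
    log.setLevel(logging.DEBUG)
    log.addHandler(LogRecorderHandler(out, capacity=100))
    log.debug("joining membership")   # buffered, nothing printed yet
    log.debug("token received")       # buffered
    log.error("lost token")           # fault: buffer is dumped now
```

The trade-off is memory for context: nothing is written in the normal
case, but the debug history leading up to a fault is still available.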

Init scripts:
- agreed that all init scripts shipped from upstream need to be LSB 
  compliant and work in a distribution independent way. Users should 
  not need to care when installing from our tarballs.

- With portable packages, any differences should be hidden in there.

Packaging from upstream:
- in order to speed up adoption, our plan is to ship .spec and debian/
  packaging directly from upstream, with support from packagers. This
  will greatly reduce the time it takes for an upstream release to
  reach users who do not like installing manually.
  Packages can be built using the openSUSE build service to avoid
  requiring new infrastructure.

Standard quorum service:
- Chrissie to implement the service within corosync/openais.

- API has been discussed and explained in depth.

Standard configuration:
- New stack will standardize on CIB (from pacemaker). CIB is approx. a 
  ccsd on steroids.

- fabbione to look into CIB, and port libccs to libcib.

- chrissie to port LDAP loader to CIB.

Common shell scripting library for RAs:
- Agreed to merge and review all RAs. This is a natural step as
  rgmanager will be deprecated.

- lon and dejan to work on it.

Clustered Samba:
- More detailed investigation is needed; in short, performance testing
  is required.

- Might require RA.

- Investigate benefit from infiniband.

- Nice to see samba integrated with corosync/openais.

Split site:
- There are 2 main scenarios for split site:
  - Metropolitan Area Clusters: "low" latency, redundancy affordable
  - Wide Area Clusters: high latency, expensive redundancy

  Each case has different problems (such as latency and speed of the
  links). We will start by tackling "remote" and service/application
  fail-over only. Data Replication will come later as users demand it.

- lmb to write the code for the "3rd site quorum" service tied into 
  pacemaker resource dependency framework.

- Identified need for some additional RAs to coordinate routing/address
  resolution switch-over; interfacing with routing protocols
  (BGP4/OSPF/etc) and DNS.
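
As a rough illustration of why a third site helps in the split-site
case, here is a small sketch. It is not the design discussed at the
summit: the function names, the one-vote arbitrator and the strict
majority rule are all assumptions made for this example. With two
equally sized sites, neither side has a majority once the inter-site
link fails; a single extra vote at a third location lets at most one
side keep running services.

```python
def has_quorum(reachable_votes, total_votes):
    """Strict majority: more than half of all configured votes."""
    return 2 * reachable_votes > total_votes


def site_may_run_services(local_votes, remote_votes,
                          sees_remote, sees_arbitrator):
    """Decide, from one site's point of view, whether it holds quorum.

    Assumption: the third site contributes exactly one vote.
    """
    total = local_votes + remote_votes + 1
    reachable = local_votes
    if sees_remote:
        reachable += remote_votes
    if sees_arbitrator:
        reachable += 1
    return has_quorum(reachable, total)


if __name__ == "__main__":
    # Two sites with 2 votes each; the link between them is down.
    # Site A still reaches the third site, site B does not.
    print("site A:", site_may_run_services(2, 2, False, True))
    print("site B:", site_may_run_services(2, 2, False, False))
```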


- corosync release cycles
  - "Flatiron" to be released in time for February (+ Wilson/openAIS)
  - Need to understand effects of RDMA versus IP over infiniband

- openSharedRoot presentation
  - Lots of unsolved issues, mostly related to clunky CDSL emulation,
    and the need to bring up significant portions of the stack before
    mounting root

- NTT:
  - Raised lots of issues about supportability too
  - NTT will drive a stonith agent which works nicely with crashdumps 
