[ClusterLabs] Users Digest, Vol 115, Issue 10

Peter Romancik promanci at redhat.com
Tue Aug 27 12:35:08 UTC 2024


On Sat, Aug 24, 2024 at 12:02 PM Angelo Ruggiero via Users <
users at clusterlabs.org> wrote:

> Thanks Peter for taking the time. Some follow-up questions, then I will
> reformulate my summary again 🙂 and also put in the quorum device (2-node
> cluster) and the web UI to pcsd.
>
> Thanks for your patience...
>
>
>    - There are two Unix domain sockets used by pcsd:
>       - For communication between Python and Ruby pcsd
>
> Ah....
>
>       ("/var/run/pcsd-ruby.socket"). This is due to some legacy parts of
> the
>       daemon still running in Ruby. So, if Python gets a request that
> should be
>       handled by Ruby daemon, it forwards it using this socket.
>
> Just out of interest, what software goes to the Ruby part?
>

I'm not sure what you mean by "software." However, certain functionality,
such as synchronization of the 'known-hosts' file across nodes, is still
implemented in the legacy Ruby part of pcsd.

>       - For communication between clients and pcsd
>       ("/var/run/pcsd.socket"). However, pcs does not currently use this
>       socket to communicate with the local pcsd.
>
> Ah, so pcs uses the Ruby Unix socket?
>

This socket is only used to forward HTTPS requests between pcsd running in
Python and the legacy part of pcsd running in Ruby.
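
To make the mechanism concrete, here is a simplified sketch (not the actual
pcsd code) of forwarding a request over that Unix domain socket in Python;
only the socket path comes from the description above, everything else is
illustrative:

    # Simplified sketch, not pcsd's real implementation: forward a raw
    # HTTP request to the legacy Ruby daemon over its Unix domain socket.
    import socket

    RUBY_SOCKET = "/var/run/pcsd-ruby.socket"

    def forward_to_ruby(raw_http_request: bytes) -> bytes:
        """Send the request to the Ruby daemon and return its raw reply."""
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.connect(RUBY_SOCKET)
            s.sendall(raw_http_request)
            s.shutdown(socket.SHUT_WR)  # signal end of request
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks)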


> Who does use the pcsd.socket then?
>

Currently, only the pcs Web UI running in the Cockpit web console
(https://cockpit-project.org/).


>    - All the communication in the network commands is done using TLS, as
>    you mentioned. However, the communication between pcs and the local
>    pcsd is not done using a Unix domain socket. The Unix domain socket is
>    only used for the already mentioned Python-to-Ruby daemon communication.
>
> So TLS is just used on the network, for pcs to pcsd or for browser to the
> pcsd web UI, to define "all" 🙂 Not for corosync. Right?
> And then that would mean pcs to pcsd locally is also via the TCP socket?
>
>
Sorry for not being entirely exact. I was speaking in the context of pcs
and pcsd, so by "all" I meant the communication between pcs and pcsd. The
communication between pcs and pcsd is done using HTTPS (HTTP + TLS).


> - The commands can even be run on a node that will not be part of the
> cluster after the setup. In the case of 'pcs cluster setup,' it must run on
> a node with all the intended cluster nodes authenticated.
>
> Ok... i.e. the pcs-to-pcsd communication is via the network, as you
> previously said.
>
> - There are no Corosync files involved in 'pcs host auth'. Being
> authenticated to pcsd and being a part of a cluster are separate things.
> There also seems to be a little confusion about what the two mentioned
> commands do. So here is an overview of how they work:
>
> Ah.... Being authenticated to pcsd and being part of a cluster are
> separate things... That helps, thanks.
> So the overall design is that pcs connects to all nodes, even the local
> node, securely using tokens via TLS, and tells those nodes to write the
> corosync and pacemaker authkey files, update known hosts, ...
>

Yes, pcs uses HTTPS (HTTP + TLS) to communicate with all the nodes. It
sends HTTPS requests telling the nodes to save the keys, update
known-hosts, ...
The tokens are carried in the cookies of the HTTPS requests.

>    - 'pcs host auth <list of hosts>'
>       1. For each node in the list of hosts (even the local node)
>          1. pcs sends https request for authentication
>          2. Remote node authenticates using PAM (username + password)
>          3. Remote node generates a token, saves it, and sends it in
>          response
>       2. The local known-hosts file is updated with the tokens received
>       from the responses and distributed to all nodes
>    - 'pcs cluster setup <list of hosts>'
>       1. Requests for destroying cluster and removal of pcsd config files
>       are sent to all nodes
>          - in case there is some old unwanted configuration on the nodes
>       2. The local known-hosts file is distributed to all nodes
>       3. corosync_authkey and pacemaker_authkey are generated and
>       distributed to all nodes
>          - each node receives the keys and saves them
>       4. A new pcsd TLS certificate and key are generated and distributed
>          - so that all nodes have the same certificate
>          - each remote node saves the files and reloads the TLS certificate
>          used by the daemon
>       5. corosync_conf is generated and distributed to all nodes
>          - Again, each node receives the config file and saves it
>
>
>    a. In step 3, who is generating the keys, i.e. the pcs command?
>
>
The keys are generated by the pcs command (the Python os.urandom() function
is used to generate them).
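
As a rough sketch of that step (the key lengths here are an assumption, not
necessarily what pcs uses):

    # Illustrative only; the exact key lengths used by pcs are assumed here.
    import os

    corosync_authkey = os.urandom(256)   # later saved as /etc/corosync/authkey
    pacemaker_authkey = os.urandom(256)  # later saved as /etc/pacemaker/authkey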

>          b. When you say distribute/sent above, is that via the
>          pcs-to-pcsd connection?
>

Distribute, in this case, means that HTTPS requests with the needed data
are sent to all the cluster nodes.
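
As a minimal sketch of what "distribute" means in practice; the URL path,
payload layout and cookie name are purely hypothetical, the port is assumed
to be pcsd's default, and the token-in-cookie part matches the description
above:

    # Hypothetical sketch of pushing data to every cluster node over HTTPS.
    import requests

    def distribute_keys(nodes, tokens, corosync_key, pacemaker_key):
        for node in nodes:
            requests.post(
                f"https://{node}:2224/hypothetical/save-keys",  # path invented
                files={
                    "corosync_authkey": corosync_key,
                    "pacemaker_authkey": pacemaker_key,
                },
                cookies={"token": tokens[node]},  # token from known-hosts
                verify=False,  # pcsd certificates are often self-signed
            )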


>          c. Regarding the tokens.
>               c.1 I can imagine that the pcs-to-pcsd protocol has the
>               ability to ask pcs to prompt for a password, and then pcsd
>               does the PAM validation and returns a token.
>               c.2 And I assume that the pcs-to-pcsd protocol allows the
>               token to be presented when some action needs to be
>               authenticated, e.g. setting a node to standby.
>

Pcs prompts for a username and password locally, then sends them to all
nodes via HTTPS requests. Pcsd running on a node handles this request and
provides the username and password to PAM. If the username and password are
correct, then pcsd generates a token. The token is saved on the node and
sent back in the response so it can be stored in the 'known-hosts' file. The
tokens from the 'known-hosts' file are then added to the cookie of the HTTPS
requests when communicating with the node.
So, as an example:

   - When we want to communicate with node 'A', we look into the
   known-hosts file to find a token for 'A'
   - We send an HTTPS request with the token for 'A' in the cookie.
   - Node 'A' receives the request and checks if the token from the cookie
   matches its saved token. If not, the request is rejected.
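
On the receiving side, that check boils down to comparing the presented
token with the stored one. A minimal sketch; whether pcsd uses a
constant-time comparison like this is an assumption:

    # Node-side sketch of the token check described above; illustrative only.
    import hmac

    def request_is_authenticated(cookie_token: str, stored_token: str) -> bool:
        # Reject the request if the token from the cookie does not match
        # the token the node saved at authentication time.
        return hmac.compare_digest(cookie_token, stored_token)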



>          d. Would you, or anyone, know where I can find a description of
> the pcs-to-pcsd protocol? I could not find it.
>

The communication is done through a REST API. However, the API is not
properly documented.


>   e. In step 4, just the certificate is sent to all nodes and the private
> key is kept local, right? I.e. all nodes are generating their own
> public/private key pair.
>

Both the certificate and the key are sent to all nodes. However, I
overlooked that this option is not enabled by default. By default, every
node creates its own self-signed cert (and key) during the startup of pcsd.
But, having the same certificate and key on all nodes can be useful, for
example, when using pcs Web UI:

   - The UI can be accessed using a floating IP running as a resource in
   the cluster. When the IP is moved, the user's browser connects to the new
   node. If the node has a different certificate, the browser warns that the
   cert has changed. By syncing the certificates, the transition to another
   node is a more seamless experience for the user since there is no warning
   about the certificate.
   - The command 'pcs pcsd sync-certificates' can be used to sync all nodes
   to use the same certificate and key (certificate and key from the local
   node are sent to all the nodes in the cluster).

To sync the certs during cluster setup, the variable
PCSD_SSL_CERT_SYNC_ENABLED needs to be set in the pcsd systemd environment
file.
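
For example, assuming the environment file is /etc/sysconfig/pcsd (as on
RHEL; the path and value are assumptions, adjust for your distribution):

    # /etc/sysconfig/pcsd -- illustrative; set before 'pcs cluster setup'
    PCSD_SSL_CERT_SYNC_ENABLED=true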

>               (I might be asked to use our own CA and not the self-signed
> one from Missouri that I saw when I dumped the cert file).
>               I am a bit confused here, to be honest.
>

There is a command for loading a custom certificate if you need to use your
own certificates: 'pcs pcsd certkey <certificate file> <key file>'
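
For example (the file paths are just placeholders):

    pcs pcsd certkey /path/to/your-cluster.crt /path/to/your-cluster.key
    pcs pcsd sync-certificates   # then push the same cert and key to all nodes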


>
>
> regards
> Angelo
>
>
> On Fri, Aug 16, 2024 at 2:41 PM Angelo M Ruggiero via Users <
> users at clusterlabs.org> wrote:
>
> > Hello,
> >
> > I have been learning and playing with Pacemaker. It's great. We are
> > going to use it in SAP R3/HANA on RHEL8, hopefully in the next few months.
> >
> > I am trying to make sure I know how it works from a security point of
> > view, as in my world I have to explain it to the security powers that be....
> >
> > So I have been looking at the man pages, netstat-ing, tcpdump-ing,
> > lsof-ing, etc., and looking at the code as far as I can.
> >
> > Here is an initial sort of description of what actually happens during
> > the initial setup until all processes are up and "trusted"; thereafter,
> > with resources, it is less of an issue.
> >
> > I know it is somehow not exact enough. But I need some sort of pointers
> > or some basic corrections, then I will make it better. Happy to
> > contribute something here if people think it valuable.
> > I got some pics as well.
> >
> > Just to be clear, I do not have a problem; it is all working.
> >
> > So can someone help me review the below?
> >
> >    1. packages pcs, pacemaker, corosync, ... installed on each host;
> >       hacluster password set and pcsd started
> >    2. On one of the intended cluster hosts.... pcs host auth <list of
> >       hosts>
> >       1. pcs(1) connects to the local pcsd(8) via the root-only-writable
> >          unix domain socket
> >       2. the local pcsd connects to each remote host on port 2244 via TLS
> >          and the configured cipher
> >          1. the remote pcsd via PAM requests uid/password authentication
> >             (hacluster and the above-set passwd)
> >             1. if successful, the remote pcsd
> >                1. writes its own entry into the local
> >                   /var/lib/pcsd/known_hosts
> >                2. writes the node list entry into
> >                   /etc/corosync/corosync.conf
> >                3. if there is no /etc/corosync/authkey, corosync-keygen
> >                   is run to generate and write the key
> >             2. the local pcsd
> >                1. also writes the remote host's entry to the remote pcsd
> >                2. writes the node list entry into
> >                   /etc/corosync/corosync.conf
> >                3. if there is no /etc/corosync/authkey, corosync-keygen
> >                   is run to generate and write the key
> >    3. On one of the intended cluster hosts... pcs cluster setup <list of
> >       hosts>
> >       1. pcs(1) connects to the local pcsd(8) via the root-only-writable
> >          unix domain socket
> >       2. allocates a random /etc/pacemaker/authkey
> >       3. connects to each of the list of hosts via TLS and, for each,
> >          1. presents the remote host token from the previously set up
> >             known_hosts entry for authentication
> >          2. presents the /etc/pacemaker/authkey if not yet on the remote
> >             host
> >          3. sends the configuration data
> >
> >
> > Angelo
> >
> >
> >
> >
>
> ------------------------------
>
> Message: 2
> Date: Thu, 22 Aug 2024 08:58:39 +0200
> From: Oyvind Albrigtsen <oalbrigt at redhat.com>
> To: Cluster Labs - All topics related to open-source clustering
>         welcomed <users at clusterlabs.org>
> Subject: Re: [ClusterLabs] a request for "oracle" resource agent
>
> I can't remember anyone requesting this, but it should be fairly simple
> to implement.
>
> You can add a "mode" parameter to the metadata with
> OCF_RESKEY_mode_default="running", and add an expected_status
> variable in instance_live() that will be OPEN if it's set to running,
> and the expected state for standby when it's set to standby, and
> replace the 3 OPENs in the function with $expected_status.
>
>
> Oyvind
>
> On 19/08/24 17:10 GMT, Fabrizio Ermini wrote:
> >Hi!
> >I am trying to set up an Oracle instance managed by a Pacemaker cluster.
> >It's a task that I have performed several times with no issue, but in this
> >particular case I have a non-standard requirement: since the instance
> >would sometimes take the role of a standby database, the "start" actions
> >should NOT open the DB instance, just mount it.
> >
> >Are you aware of a way to make this happen? I thought initially to just
> >comment out the "open" command in the resource script, but of course this
> >would not work since the monitor operations would report the unopened
> >instance as an error.
> >
> >In any case let me know if you would be interested in adding this as a
> >feature if I manage to successfully make it work.
> >
> >Thanks for your time and effort, and best regards
> >Fabrizio
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 22 Aug 2024 09:33:54 +0200
> From: Oyvind Albrigtsen <oalbrigt at redhat.com>
> To: Cluster Labs - All topics related to open-source clustering
>         welcomed <users at clusterlabs.org>
> Subject: Re: [ClusterLabs] a request for "oracle" resource agent
>
> Feel free to make a Pull Request against the repository, and I can give
> some feedback, and we can either merge it if there are others wanting
> the feature, or close it if not (it will still be available so people
> can search for oracle in the Pull Requests section and find the code).
>
>
> Oyvind
>

