Admin Guide Pacemaker

From HPC Wiki
Revision as of 19:22, 30 October 2020 by Robert-schade-e757@uni-paderborn.de (talk | contribs) (Author Roland Pabel)


Pacemaker

Pacemaker is mature software for setting up a high-availability (HA) cluster on Linux. On CentOS 7 the packages are in the default repository; on CentOS 8 they were moved to the High Availability repository.

Pacemaker consists of the services corosync (synchronisation service), pcsd (PCS GUI and remote configuration interface) and pacemaker (Pacemaker High Availability Cluster Manager). The setup of Pacemaker is described on many websites; the Red Hat documentation is particularly detailed.

Nomenclature

  • Resource: A resource can be a service, a mount point, or basically anything for which a script (a resource agent) exists that describes how the resource works. These scripts are usually saved under /usr/lib/ocf/resource.d/ and referenced according to their path (e.g. ocf::heartbeat:Filesystem for the file /usr/lib/ocf/resource.d/heartbeat/Filesystem).
  • Clone: By default, a resource runs only once in an HA cluster. If a resource should run on all nodes of the HA cluster, for example ntpd, a clone is used.
  • Group: A collection of resources that run on the same node and are started in order (the order in which they were added to the group). If a resource in a group fails or is disabled, all resources that depend on it, i.e. those after it in the group, are stopped.
  • Meta Properties: Properties of a resource, group, clone, etc. that are evaluated only by Pacemaker, not by the resource itself.
  • Constraints: Pacemaker starts resources according to the constraints set by the administrator. A constraint can be based on hosts, time, other resources (colocation), etc.
  • Stickiness: Resources have a meta property called “stickiness” (resource-stickiness), which is just a number. It expresses how strongly a resource prefers to stay on the node where it is currently running: the resource is only moved if the preference for another node outweighs this number. Its purpose is to keep resources from hopping between nodes just because some minor detail in the cluster changes.
  • Fail Counts: If a resource fails, a per-node counter is incremented, and the resource is usually banned from a node where it has failed in the past until the fail count is cleared.
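
As a rough sketch of how some of these terms map to pcs commands: the resource names ntpd and group1-server, the stickiness value 100 and the run_pcs dry-run helper are assumptions made for illustration and are not part of the original guide.

```shell
# Hypothetical helper: run pcs, or only print the command when DRY_RUN=1.
run_pcs() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "pcs $*"
    else
        pcs "$@"
    fi
}

nomenclature_examples() {
    # Clone: create a resource from the systemd unit ntpd,
    # then let it run on all nodes instead of only one.
    run_pcs resource create ntpd systemd:ntpd
    run_pcs resource clone ntpd

    # Stickiness: set the resource-stickiness meta property
    # (the value 100 is arbitrary).
    run_pcs resource meta group1-server resource-stickiness=100

    # Fail counts: show the counter, then clear it so that the resource
    # may run again on a node where it failed before.
    run_pcs resource failcount show group1-server
    run_pcs resource cleanup group1-server
}
```

With DRY_RUN=1 set, calling nomenclature_examples only prints the commands, which is a cheap way to review them before touching a live cluster.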

Configuration

Pacemaker is configured using pcs or the crm (cluster resource manager) tools. pcs is just a Python frontend for the crm commands, but it is much easier to use. For example, pcs status is basically just crm_mon -1otf, but a much clearer command. The crm tools are not known for being admin-friendly, so pcs is recommended. pcs has very good built-in help and a good man page.

Tips

  • Resources and their dependencies should be put into groups together. Do not group resources according to type.

    DO:

     Resource Group: group1
       group1-fs   (ocf::heartbeat:Filesystem):    Started
       group1-ip   (ocf::heartbeat:IPaddr2):       Started
       group1-server  (systemd:group1-service):    Started
     Resource Group: group2
       group2-fs   (ocf::heartbeat:Filesystem):    Started
       group2-ip   (ocf::heartbeat:IPaddr2):       Started
       group2-server  (systemd:group2-service):    Started

    DON'T:

     Resource Group: group-fs
       group1-fs   (ocf::heartbeat:Filesystem):    Started
       group2-fs   (ocf::heartbeat:Filesystem):    Started
     Resource Group: group1
       group1-ip   (ocf::heartbeat:IPaddr2):       Started
       group1-server  (systemd:group1-service):    Started
     Resource Group: group2
       group2-ip   (ocf::heartbeat:IPaddr2):       Started
       group2-server  (systemd:group2-service):    Started

     Colocation Constraints:
       group1 with group-fs (score:INFINITY)
       group2 with group-fs (score:INFINITY)
     Ordering Constraints:
       start group-fs then start group1 (kind:Mandatory)
       start group-fs then start group2 (kind:Mandatory)

  • Try to use constraints as little as possible. There are valid cases for them (for example, when several resources depend on one but there is no order among the resources), but a group is usually preferable to colocation and ordering constraints.
  • All systemd services are automatically available as resources. If this is not sufficient, writing your own resource scripts is very easy. The only important detail to look out for is this:
    # Monitor _MUST!_ differentiate correctly between running
    # (SUCCESS), failed (ERROR) or _cleanly_ stopped (NOT RUNNING).
    # That is THREE states, not just yes/no.
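
A minimal sketch of such a three-state monitor, assuming a hypothetical service that writes its PID to a file and removes that file on a clean stop. The numeric exit codes are the standard OCF values; the function name myservice_monitor and the PID file path are made up for illustration.

```shell
# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical PID file of the service being monitored.
PIDFILE="${PIDFILE:-/var/run/myservice.pid}"

myservice_monitor() {
    # State 1: no PID file, so the service is cleanly stopped (NOT RUNNING).
    if [ ! -f "$PIDFILE" ]; then
        return $OCF_NOT_RUNNING
    fi

    # State 2: PID file exists and the process is alive (SUCCESS).
    pid=$(cat "$PIDFILE")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi

    # State 3: PID file was left behind but the process is gone (ERROR).
    return $OCF_ERR_GENERIC
}
```

A real agent would call this function from its monitor action and would also have to implement at least start, stop and meta-data.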

For an example, see generic-script on GitHub.
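
Returning to the first tip, the recommended layout (one group per service, containing its file system, IP address and server) could be built with pcs roughly as follows. This is only a sketch: the device, mount point, file system type and IP address are placeholder values, run_pcs is a hypothetical dry-run helper, and the --group option is assumed to be available as in the pcs versions shipped with CentOS 7 and 8.

```shell
# Hypothetical helper: run pcs, or only print the command when DRY_RUN=1.
run_pcs() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "pcs $*"
    else
        pcs "$@"
    fi
}

# Build group1 in dependency order: file system, then IP address, then service.
# Resources are started in the order in which they are added to the group.
create_group1() {
    run_pcs resource create group1-fs ocf:heartbeat:Filesystem \
        device=/dev/vg0/group1 directory=/srv/group1 fstype=xfs --group group1
    run_pcs resource create group1-ip ocf:heartbeat:IPaddr2 \
        ip=192.0.2.10 cidr_netmask=24 --group group1
    run_pcs resource create group1-server systemd:group1-service --group group1
}
```

Because the group already implies colocation and ordering, no separate constraints are needed for this layout.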