Appendix L: Automated Instance Recovery

Overview

As of the 19.04 charm release, with OpenStack Stein and later, Masakari can be a deployed to provide automated instance recovery for guests using shared storage. Masakari responds to two different failures types, individual guest failure and the loss of an entire compute node.

Warning

These charms bring forward upstream Masakari features which need to be carefully considered and pre-validated in test labs by cloud operators. Further upstream Masakari development, charm feature work and scenario validation is likely going to be necessary before the solution can be considered mature on the whole.

STONITH

It is important that guests using shared storage cannot continue to run in the event of a compute node becoming isolated. The risk being that masakari attempts to bring the same guest up on a new compute node when the old one is still running which could lead to data corruption. To ensure that this does not occur stonith can be setup for the compute nodes. For stonith to be configured the maas_url and maas_credentials config option must be set in the hacluster charm related to the masakari charm. Also the enable-stonith config option should be set to True in the pacemaker-remote charm.

Deployment

Three new charms are needeed to deploy this solution: masakari, masakari-monitors and pacemaker-remote. The masakari charm provides api services and is a principal or standalone charm. The masakari-monitors charm is deployed as a subordinate to the nova-compute charm as it monitors nova-compute directly and sends messages to the masakari API charm. The pacemaker-remote charm is also a subordinate to the nova-compute charm and is required to monitor the compute nodes health.

Below is an overlay which can be used to add masakari to an existing deployment:

machines:
  '0':
    series: bionic
  '1':
    series: bionic
  '2':
    series: bionic
  '3':
    series: bionic
relations:
- - nova-compute:juju-info
  - masakari-monitors:container
- - masakari:ha
  - hacluster:ha
- - keystone:identity-credentials
  - masakari-monitors:identity-credentials
- - nova-compute:juju-info
  - pacemaker-remote:juju-info
- - hacluster:pacemaker-remote
  - pacemaker-remote:pacemaker-remote
- - masakari:identity-service
  - keystone:identity-service
- - masakari:shared-db
  - mysql:shared-db
- - masakari:amqp
  - rabbitmq-server:amqp
series: bionic
applications:
  masakari-monitors:
    charm: cs:masakari-monitors
  hacluster:
    charm: cs:hacluster
    options:
      maas_url: <INSERT MAAS URL>
      maas_credentials: <INSERT MAAS API KEY>
  pacemaker-remote:
    charm: cs:pacemaker-remote
    options:
      enable-stonith: True
      enable-resources: False
  masakari:
    charm: cs:masakari
    series: bionic
    num_units: 3
    options:
      openstack-origin: cloud:bionic-stein
      vip: <INSERT VIP(S)>
    bindings:
      public: public
      admin: admin
      internal: internal
      shared-db: internal
      amqp: internal
    to:
    - 'lxd:1'
    - 'lxd:2'
    - 'lxd:3'

Warning

The bundle above with need customising to correct maas_url, maas_credentials and vip settings. The machine mappings will almost certainly need updating too.

To use the overlay with an existing model remember to use the –map-machines switch to juju

$ juju deploy base.yaml --overlay masakari-overlay.yaml --map-machines=existing

Configuring Masakari

In Masakari the compute nodes are grouped into failover segments. In the event of a failure guests are moved onto other nodes within the same segment. Which compute node is chosen to house the evacuated guests is determined by the recovery method of that segment.

‘AUTO’ Recovery Method

With auto recovery the guests are relocated to any of the available nodes in the same segment. The problem with this approach is that there is no guarantee that resources will be available to accommodate guests from a failed compute node.

To configure a group of compute hosts for auto recovery, first create a segment with the recovery method set to auto:

$ openstack segment create segment1 auto COMPUTE
+-----------------+--------------------------------------+
| Field           | Value                                |
+-----------------+--------------------------------------+
| created_at      | 2019-04-12T13:59:50.000000           |
| updated_at      | None                                 |
| uuid            | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| name            | segment1                             |
| description     | None                                 |
| id              | 1                                    |
| service_type    | COMPUTE                              |
| recovery_method | auto                                 |
+-----------------+--------------------------------------+

Next the hypervisors need to be added into the segment, these should be referenced by their unqualified hostname:

$ openstack segment host create tidy-goose COMPUTE SSH 691b8ef3-7481-48b2-afb6-908a98c8a768
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| created_at          | 2019-04-12T14:18:24.000000           |
| updated_at          | None                                 |
| uuid                | 11b85c9d-2b97-4b83-b773-0e9565e407b5 |
| name                | tidy-goose                           |
| type                | COMPUTE                              |
| control_attributes  | SSH                                  |
| reserved            | False                                |
| on_maintenance      | False                                |
| failover_segment_id | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
+---------------------+--------------------------------------+

Repeat above for all remaining hypervisors:

$ openstack segment host list 691b8ef3-7481-48b2-afb6-908a98c8a768
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid                                 | name       | type    | control_attributes | reserved | on_maintenance | failover_segment_id                  |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 75afadbb-67cc-47b2-914e-e3bf848028e4 | frank-colt | COMPUTE | SSH                | False    | False          | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| 11b85c9d-2b97-4b83-b773-0e9565e407b5 | tidy-goose | COMPUTE | SSH                | False    | False          | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| f1e9b0b4-3ac9-4f07-9f83-5af2f9151109 | model-crow | COMPUTE | SSH                | False    | False          | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+

‘RESERVED_HOST’ Recovery Method

With reserved_host recovery compute hosts are allocated as reserved which allows an operator to guarantee there is sufficient capacity available for any guests in need of evacuation.

Firstly create a segment with the reserved_host recovery method:

$ openstack segment create segment1 reserved_host COMPUTE -c uuid -f value
2598f8aa-3612-4731-9716-e126ca6cc280

Add a host using the –reserved switch to indicate that it will act as a standby:

$ openstack segment host create model-crow --reserved True COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280

Add the remaining hypervisors as before:

$ openstack segment host create frank-colt COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280
$ openstack segment host create tidy-goose COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280

Listing the segment hosts shows that model-crow is a reserved host:

$ openstack segment host list 2598f8aa-3612-4731-9716-e126ca6cc280
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid                                 | name       | type    | control_attributes | reserved | on_maintenance | failover_segment_id                  |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 4769e08c-ed52-440a-866e-832b977aa5e2 | tidy-goose | COMPUTE | SSH                | False    | False          | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| 90aedbd2-e03b-4dbd-b330-a1c848f300df | frank-colt | COMPUTE | SSH                | False    | False          | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| c77574cc-b6e7-440e-9c86-84e91981f15e | model-crow | COMPUTE | SSH                | True     | False          | 2598f8aa-3612-4731-9716-e126ca6cc280 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+

Finally disable the reserved host in nova so that it remains available for failover:

$ openstack compute service set --disable model-crow nova-compute
$ openstack compute service list
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| ID | Binary         | Host                | Zone     | Status   | State | Updated At                 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
|  1 | nova-scheduler | juju-44b912-3-lxd-3 | internal | enabled  | up    | 2019-04-13T10:59:10.000000 |
|  5 | nova-conductor | juju-44b912-3-lxd-3 | internal | enabled  | up    | 2019-04-13T10:59:08.000000 |
|  7 | nova-compute   | tidy-goose          | nova     | enabled  | up    | 2019-04-13T10:59:11.000000 |
|  8 | nova-compute   | frank-colt          | nova     | enabled  | up    | 2019-04-13T10:59:05.000000 |
|  9 | nova-compute   | model-crow          | nova     | disabled | up    | 2019-04-13T10:59:12.000000 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+

When a compute node failure is detected, masakari will disable the failed node and enable the reserve node in nova. After simulating a failure of frank-colt the service list now looks like this:

$ openstack compute service list
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| ID | Binary         | Host                | Zone     | Status   | State | Updated At                 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
|  1 | nova-scheduler | juju-44b912-3-lxd-3 | internal | enabled  | up    | 2019-04-13T11:05:20.000000 |
|  5 | nova-conductor | juju-44b912-3-lxd-3 | internal | enabled  | up    | 2019-04-13T11:05:28.000000 |
|  7 | nova-compute   | tidy-goose          | nova     | enabled  | up    | 2019-04-13T11:05:21.000000 |
|  8 | nova-compute   | frank-colt          | nova     | disabled | down  | 2019-04-13T11:03:56.000000 |
|  9 | nova-compute   | model-crow          | nova     | enabled  | up    | 2019-04-13T11:05:22.000000 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+

Since the reserved host has now been enabled and is hosting evacuated guests, masakari has removed the reserved flag from it. Masakari has also placed the failed node in maintenance mode.

$ openstack segment host list 2598f8aa-3612-4731-9716-e126ca6cc280
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid                                 | name       | type    | control_attributes | reserved | on_maintenance | failover_segment_id                  |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 4769e08c-ed52-440a-866e-832b977aa5e2 | tidy-goose | COMPUTE | SSH                | False    | False          | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| 90aedbd2-e03b-4dbd-b330-a1c848f300df | frank-colt | COMPUTE | SSH                | False    | True           | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| c77574cc-b6e7-440e-9c86-84e91981f15e | model-crow | COMPUTE | SSH                | False    | False          | 2598f8aa-3612-4731-9716-e126ca6cc280 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+

‘AUTO_PRIORITY’ and ‘RH_PRIORITY’ Recovery Methods

These methods appear to chain the previous methods together. So, auto_priority attempts to move the guest using the auto method first and if that fails it tries the reserved_host method. rh_priority does the same thing but in the reverse order. See Masakari Pike Release Note for details.

Individual Instance Recovery

Finally, to use the masakari feature which reacts to a single guest failing rather than a whole hypervisor, the guest(s) need to be marked with a small piece of metadata:

$ openstack server set --property HA_Enabled=True server_120419134342