Appendix L: Automated Instance Recovery¶
Overview¶
As of the 19.04 charm release, with OpenStack Stein and later, Masakari can be a deployed to provide automated instance recovery for guests using shared storage. Masakari responds to two different failures types, individual guest failure and the loss of an entire compute node.
Warning
These charms bring forward upstream Masakari features which need to be carefully considered and pre-validated in test labs by cloud operators. Further upstream Masakari development, charm feature work and scenario validation is likely going to be necessary before the solution can be considered mature on the whole.
STONITH¶
It is important that guests using shared storage cannot continue to run in the event of a compute node becoming isolated. The risk being that masakari attempts to bring the same guest up on a new compute node when the old one is still running which could lead to data corruption. To ensure that this does not occur stonith can be setup for the compute nodes. For stonith to be configured the maas_url and maas_credentials config option must be set in the hacluster charm related to the masakari charm. Also the enable-stonith config option should be set to True in the pacemaker-remote charm.
Deployment¶
Three new charms are needeed to deploy this solution: masakari, masakari-monitors and pacemaker-remote. The masakari charm provides api services and is a principal or standalone charm. The masakari-monitors charm is deployed as a subordinate to the nova-compute charm as it monitors nova-compute directly and sends messages to the masakari API charm. The pacemaker-remote charm is also a subordinate to the nova-compute charm and is required to monitor the compute nodes health.
Below is an overlay which can be used to add masakari to an existing deployment:
machines:
'0':
series: bionic
'1':
series: bionic
'2':
series: bionic
'3':
series: bionic
relations:
- - nova-compute:juju-info
- masakari-monitors:container
- - masakari:ha
- hacluster:ha
- - keystone:identity-credentials
- masakari-monitors:identity-credentials
- - nova-compute:juju-info
- pacemaker-remote:juju-info
- - hacluster:pacemaker-remote
- pacemaker-remote:pacemaker-remote
- - masakari:identity-service
- keystone:identity-service
- - masakari:shared-db
- mysql:shared-db
- - masakari:amqp
- rabbitmq-server:amqp
series: bionic
applications:
masakari-monitors:
charm: cs:masakari-monitors
hacluster:
charm: cs:hacluster
options:
maas_url: <INSERT MAAS URL>
maas_credentials: <INSERT MAAS API KEY>
pacemaker-remote:
charm: cs:pacemaker-remote
options:
enable-stonith: True
enable-resources: False
masakari:
charm: cs:masakari
series: bionic
num_units: 3
options:
openstack-origin: cloud:bionic-stein
vip: <INSERT VIP(S)>
bindings:
public: public
admin: admin
internal: internal
shared-db: internal
amqp: internal
to:
- 'lxd:1'
- 'lxd:2'
- 'lxd:3'
Warning
The bundle above with need customising to correct maas_url, maas_credentials and vip settings. The machine mappings will almost certainly need updating too.
To use the overlay with an existing model remember to use the –map-machines switch to juju
$ juju deploy base.yaml --overlay masakari-overlay.yaml --map-machines=existing
Configuring Masakari¶
In Masakari the compute nodes are grouped into failover segments. In the event of a failure guests are moved onto other nodes within the same segment. Which compute node is chosen to house the evacuated guests is determined by the recovery method of that segment.
‘AUTO’ Recovery Method¶
With auto recovery the guests are relocated to any of the available nodes in the same segment. The problem with this approach is that there is no guarantee that resources will be available to accommodate guests from a failed compute node.
To configure a group of compute hosts for auto recovery, first create a segment with the recovery method set to auto:
$ openstack segment create segment1 auto COMPUTE
+-----------------+--------------------------------------+
| Field | Value |
+-----------------+--------------------------------------+
| created_at | 2019-04-12T13:59:50.000000 |
| updated_at | None |
| uuid | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| name | segment1 |
| description | None |
| id | 1 |
| service_type | COMPUTE |
| recovery_method | auto |
+-----------------+--------------------------------------+
Next the hypervisors need to be added into the segment, these should be referenced by their unqualified hostname:
$ openstack segment host create tidy-goose COMPUTE SSH 691b8ef3-7481-48b2-afb6-908a98c8a768
+---------------------+--------------------------------------+
| Field | Value |
+---------------------+--------------------------------------+
| created_at | 2019-04-12T14:18:24.000000 |
| updated_at | None |
| uuid | 11b85c9d-2b97-4b83-b773-0e9565e407b5 |
| name | tidy-goose |
| type | COMPUTE |
| control_attributes | SSH |
| reserved | False |
| on_maintenance | False |
| failover_segment_id | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
+---------------------+--------------------------------------+
Repeat above for all remaining hypervisors:
$ openstack segment host list 691b8ef3-7481-48b2-afb6-908a98c8a768
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid | name | type | control_attributes | reserved | on_maintenance | failover_segment_id |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 75afadbb-67cc-47b2-914e-e3bf848028e4 | frank-colt | COMPUTE | SSH | False | False | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| 11b85c9d-2b97-4b83-b773-0e9565e407b5 | tidy-goose | COMPUTE | SSH | False | False | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
| f1e9b0b4-3ac9-4f07-9f83-5af2f9151109 | model-crow | COMPUTE | SSH | False | False | 691b8ef3-7481-48b2-afb6-908a98c8a768 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
‘RESERVED_HOST’ Recovery Method¶
With reserved_host recovery compute hosts are allocated as reserved which allows an operator to guarantee there is sufficient capacity available for any guests in need of evacuation.
Firstly create a segment with the reserved_host recovery method:
$ openstack segment create segment1 reserved_host COMPUTE -c uuid -f value
2598f8aa-3612-4731-9716-e126ca6cc280
Add a host using the –reserved switch to indicate that it will act as a standby:
$ openstack segment host create model-crow --reserved True COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280
Add the remaining hypervisors as before:
$ openstack segment host create frank-colt COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280
$ openstack segment host create tidy-goose COMPUTE SSH 2598f8aa-3612-4731-9716-e126ca6cc280
Listing the segment hosts shows that model-crow is a reserved host:
$ openstack segment host list 2598f8aa-3612-4731-9716-e126ca6cc280
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid | name | type | control_attributes | reserved | on_maintenance | failover_segment_id |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 4769e08c-ed52-440a-866e-832b977aa5e2 | tidy-goose | COMPUTE | SSH | False | False | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| 90aedbd2-e03b-4dbd-b330-a1c848f300df | frank-colt | COMPUTE | SSH | False | False | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| c77574cc-b6e7-440e-9c86-84e91981f15e | model-crow | COMPUTE | SSH | True | False | 2598f8aa-3612-4731-9716-e126ca6cc280 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
Finally disable the reserved host in nova so that it remains available for failover:
$ openstack compute service set --disable model-crow nova-compute
$ openstack compute service list
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| 1 | nova-scheduler | juju-44b912-3-lxd-3 | internal | enabled | up | 2019-04-13T10:59:10.000000 |
| 5 | nova-conductor | juju-44b912-3-lxd-3 | internal | enabled | up | 2019-04-13T10:59:08.000000 |
| 7 | nova-compute | tidy-goose | nova | enabled | up | 2019-04-13T10:59:11.000000 |
| 8 | nova-compute | frank-colt | nova | enabled | up | 2019-04-13T10:59:05.000000 |
| 9 | nova-compute | model-crow | nova | disabled | up | 2019-04-13T10:59:12.000000 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
When a compute node failure is detected, masakari will disable the failed node and enable the reserve node in nova. After simulating a failure of frank-colt the service list now looks like this:
$ openstack compute service list
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| ID | Binary | Host | Zone | Status | State | Updated At |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
| 1 | nova-scheduler | juju-44b912-3-lxd-3 | internal | enabled | up | 2019-04-13T11:05:20.000000 |
| 5 | nova-conductor | juju-44b912-3-lxd-3 | internal | enabled | up | 2019-04-13T11:05:28.000000 |
| 7 | nova-compute | tidy-goose | nova | enabled | up | 2019-04-13T11:05:21.000000 |
| 8 | nova-compute | frank-colt | nova | disabled | down | 2019-04-13T11:03:56.000000 |
| 9 | nova-compute | model-crow | nova | enabled | up | 2019-04-13T11:05:22.000000 |
+----+----------------+---------------------+----------+----------+-------+----------------------------+
Since the reserved host has now been enabled and is hosting evacuated guests, masakari has removed the reserved flag from it. Masakari has also placed the failed node in maintenance mode.
$ openstack segment host list 2598f8aa-3612-4731-9716-e126ca6cc280
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| uuid | name | type | control_attributes | reserved | on_maintenance | failover_segment_id |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
| 4769e08c-ed52-440a-866e-832b977aa5e2 | tidy-goose | COMPUTE | SSH | False | False | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| 90aedbd2-e03b-4dbd-b330-a1c848f300df | frank-colt | COMPUTE | SSH | False | True | 2598f8aa-3612-4731-9716-e126ca6cc280 |
| c77574cc-b6e7-440e-9c86-84e91981f15e | model-crow | COMPUTE | SSH | False | False | 2598f8aa-3612-4731-9716-e126ca6cc280 |
+--------------------------------------+------------+---------+--------------------+----------+----------------+--------------------------------------+
‘AUTO_PRIORITY’ and ‘RH_PRIORITY’ Recovery Methods¶
These methods appear to chain the previous methods together. So, auto_priority attempts to move the guest using the auto method first and if that fails it tries the reserved_host method. rh_priority does the same thing but in the reverse order. See Masakari Pike Release Note for details.
Individual Instance Recovery¶
Finally, to use the masakari feature which reacts to a single guest failing rather than a whole hypervisor, the guest(s) need to be marked with a small piece of metadata:
$ openstack server set --property HA_Enabled=True server_120419134342