Role - backup_and_restore

Role Documentation

Welcome to the “backup_and_restore” role documentation.

Role Defaults

This section highlights all of the defaults and variables set within the “backup_and_restore” role.

# All variables intended for modification should be placed in this file.
tripleo_backup_and_restore_hide_sensitive_logs: '{{ hide_sensitive_logs | default(true)
  }}'
tripleo_backup_and_restore_debug: '{{ ((ansible_verbosity | int) >= 2) | bool }}'
tripleo_controller_group_name: "{{ controller_group_name | default('Controller') }}"

# Set the container command line entry-point
tripleo_container_cli: "{{ container_cli | default('podman') }}"
tripleo_container_cli_flags: ''
# Stop and start all running services before the backup is run.
tripleo_backup_and_restore_service_manager: true

# If this is false, the overcloud is backed up by stopping it completely. Enable it to back up
# only one node at a time, keeping the controllers active for the duration of the backup.
tripleo_backup_and_restore_enable_snapshots: true

# Set the name of the mysql container
tripleo_backup_and_restore_mysql_container: mysql

# Default name for the Undercloud mysql DB backup file
tripleo_backup_and_restore_mysql_backup_file: openstack-backup-mysql.sql

# Default name for the Undercloud mysql DB grants file
tripleo_backup_and_restore_mysql_grants_file: openstack-backup-mysql-grants.sql

# All variables within this role should have a prefix of "tripleo_backup_and_restore"
# By default this should be the Undercloud node
tripleo_backup_and_restore_server: 192.168.24.1
tripleo_backup_and_restore_shared_storage_folder: /ctl_plane_backups
tripleo_backup_and_restore_shared_storage_subfolders: []
tripleo_backup_and_restore_clients_nets: [192.168.24.0/24, 10.0.0.0/24, 172.16.0.0/24]
tripleo_backup_and_restore_rear_simulate: false
tripleo_backup_and_restore_using_uefi_bootloader: 0
tripleo_backup_and_restore_exclude_paths_common: [/data/*, /tmp/*, '{{ tripleo_backup_and_restore_shared_storage_folder
    }}/*']
tripleo_backup_and_restore_exclude_paths_controller_non_bootstrapnode: false
tripleo_backup_and_restore_exclude_paths_controller: [/var/lib/mysql/*]
tripleo_backup_and_restore_exclude_paths_compute: [/var/lib/nova/instances/*]
tripleo_backup_and_restore_hiera_config_file: /etc/puppet/hiera.yaml

# This var is a dictionary of the configuration of the /etc/rear/local.conf
# The key:value will be interpreted as key=value on the configuration file.
# To set that the value is a string, it needs to be single quoted followed by
# double quoted as it will be interpreted by BASH.
tripleo_backup_and_restore_local_config:
  ISO_DEFAULT: '"automatic"'
  OUTPUT: ISO
  BACKUP: NETFS
  BACKUP_PROG_COMPRESS_OPTIONS: ( --gzip)
  BACKUP_PROG_COMPRESS_SUFFIX: '".gz"'
  OUTPUT_URL: '{{ tripleo_backup_and_restore_output_url }}'
  ISO_PREFIX: '{{ tripleo_backup_and_restore_hostname.stdout }}'
  BACKUP_URL: '{{ tripleo_backup_and_restore_backup_url }}'
  BACKUP_PROG_CRYPT_ENABLED: '{{ tripleo_backup_and_restore_crypt_backup_enabled |
    default(false) }}'
  BACKUP_PROG_CRYPT_KEY: "{{ tripleo_backup_and_restore_crypt_backup_password | default('REPLACE_ME')\
    \ }}"

# This var is used to define the commands to be run for preparing the network
# during the restoration phase. Because ReaR does not support ovs, it is required
# to setup the network for connecting to the backup node.
# This is configured on /etc/rear/local.conf
# as an example
# ('ip l a br-ex type bridge' 'ip l s ens3 up' 'ip l s br-ex up' 'ip l s ens3 master br-ex' 'dhclient br-ex')
tripleo_backup_and_restore_network_preparation_commands: ()

# This var is a dictionary of the configuration of the /etc/rear/rescue.conf
# The key:value will be interpreted as key=value on the configuration file.
# To set that the value is a string, it needs to be single quoted followed by
# double quoted as it will be interpreted by BASH.
tripleo_backup_and_restore_rescue_config: {}

tripleo_backup_and_restore_output_url: nfs://{{ tripleo_backup_and_restore_server
  }}{{ tripleo_backup_and_restore_shared_storage_folder }}
tripleo_backup_and_restore_backup_url: nfs://{{ tripleo_backup_and_restore_server
  }}{{ tripleo_backup_and_restore_shared_storage_folder }}

# Ceph authentication backup file
tripleo_backup_and_restore_ceph_auth_file: ceph_auth_export.bak

# Ceph backup file
tripleo_backup_and_restore_ceph_backup_file: /var/lib/ceph.tar.gz

# Ceph directory to back up
tripleo_backup_and_restore_ceph_path: /var/lib/ceph

# If firewalld is active, set the zone in which the NFS server ports need to be opened
tripleo_backup_and_restore_firewalld_zone: libvirt

# The ReaR rpm installs a cronjob at 1:30 each day by default. This variable deactivates that behaviour.
tripleo_backup_and_restore_remove_default_cronjob: true

# Skip the ping test to the server on rear setup
tripleo_backup_and_restore_skip_server_test: false

# How many seconds to wait for pcs cluster stop to finish
tripleo_backup_and_restore_pcs_timeout: 3600

# Date argument to get the string of the backup
tripleo_backup_and_restore_date_argument: '%Y%m%d%H%M'

# Enable historical backups
tripleo_backup_and_restore_historical: true

# Cron schedule. By default, the job runs weekly at midnight on Sundays
tripleo_backup_and_restore_cron: 0 0 * * 0

# The user that will run the backup command. If empty, root will run the backup command
tripleo_backup_and_restore_cron_user: stack

# Any extra parameters that will be added to the backup command when it is executed by cron
tripleo_backup_and_restore_cron_extra: ''

# The role which handles the ceph on the controllers
tripleo_backup_and_restore_ceph_mon_role: ceph_mon

# The cephadm path
tripleo_backup_and_restore_cephadm_path: /usr/sbin/cephadm

# The name of the node to restore
tripleo_backup_and_restore_overcloud_restore_name: undercloud

# Ironic images path
tripleo_backup_and_restore_ironic_images_path: /var/lib/ironic/images

# Restore retries
tripleo_backup_and_restore_restore_retries: 300

# Restore delay
tripleo_backup_and_restore_restore_delay: 10

# Galera retries
tripleo_backup_and_restore_galera_retries: 300

# Galera delay
tripleo_backup_and_restore_galera_delay: 10

# Ironic subdirectory where the kernel and initrd are uploaded
backup_and_restore_history_path: ''

# Ceph cluster name
tripleo_backup_and_restore_ceph_cluster: ceph
tripleo_backup_and_restore_ceph_admin_keyring: /etc/ceph/{{ tripleo_backup_and_restore_ceph_cluster
  }}.client.admin.keyring
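
The quoting rule for tripleo_backup_and_restore_local_config is easy to get wrong, so the following is a minimal sketch of an override rather than a recommended configuration. It assumes Ansible's default behaviour of replacing a dictionary wholesale instead of merging it, which means an override has to repeat every key it wants to keep. The only addition is USER_INPUT_TIMEOUT, borrowed from the Ironic example later in this document. The comments describe the lines expected in /etc/rear/local.conf according to the key=value rule stated above.

```yaml
# Hypothetical override, placed in a vars file or in a play's vars section.
tripleo_backup_and_restore_local_config:
  # Plain scalars are expected to be written as OUTPUT=ISO and BACKUP=NETFS.
  OUTPUT: ISO
  BACKUP: NETFS
  # The single quotes are consumed by YAML and the double quotes are kept,
  # so the file should receive ISO_DEFAULT="automatic".
  ISO_DEFAULT: '"automatic"'
  # Bash array syntax, quoted here so that YAML treats it as one string.
  BACKUP_PROG_COMPRESS_OPTIONS: '( --gzip)'
  BACKUP_PROG_COMPRESS_SUFFIX: '".gz"'
  # Entries copied unchanged from the role defaults.
  OUTPUT_URL: '{{ tripleo_backup_and_restore_output_url }}'
  ISO_PREFIX: '{{ tripleo_backup_and_restore_hostname.stdout }}'
  BACKUP_URL: '{{ tripleo_backup_and_restore_backup_url }}'
  BACKUP_PROG_CRYPT_ENABLED: '{{ tripleo_backup_and_restore_crypt_backup_enabled | default(false) }}'
  BACKUP_PROG_CRYPT_KEY: "{{ tripleo_backup_and_restore_crypt_backup_password | default('REPLACE_ME') }}"
  # Added key, taken from the Ironic example: expected as USER_INPUT_TIMEOUT=10.
  USER_INPUT_TIMEOUT: "10"
```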

Role Variables: redhat.yml

# While options found within the vars/ path can be overridden using extra
# vars, items within this path are considered part of the role and not
# intended to be modified.

# All variables within this role should have a prefix of "tripleo_{{ role_name | replace('-', '_') }}"

tripleo_backup_and_restore_rear_packages:
- rear
- syslinux
- xorriso
- nfs-utils
- lftp
- grub2-tools-extra
tripleo_backup_and_restore_nfs_packages:
- nfs-utils
tripleo_backup_and_restore_uefi_packages:
- dosfstools
- efibootmgr
- grub2-efi-x64-modules

Molecule Scenarios

Molecule is being used to test the “backup_and_restore” role. The following section highlights the drivers in service and provides an example playbook showing how the role is leveraged.

Scenario: cli_undercloud_backup_db

Molecule Inventory
hosts:
  all:
    hosts:
      instance:
        ansible_host: localhost
  Undercloud:
    hosts:
      instance:
        ansible_host: localhost
Example cli_undercloud_backup_db playbook
- name: Converge
  become: true
  hosts: all

- import_playbook: ../../../../playbooks/cli-undercloud-db-backup.yaml

Scenario: default

Molecule Inventory
hosts:
  all:
    hosts:
      instance:
        ansible_host: localhost
Example default playbook
- name: Converge
  become: true
  hosts: all
  roles:
  - role: backup_and_restore
    tripleo_backup_and_restore_server: localhost
    tripleo_backup_and_restore_rear_simulate: true
    tripleo_backup_and_restore_hiera_config_file: '{{ ansible_user_dir }}/hiera.yaml'

Usage

This Ansible role performs the following tasks:

  1. Install an NFS server.

  2. Install ReaR.

  3. Perform a ReaR backup.

This example is meant to describe a very simple use case in which the user needs to create a set of recovery images from the control plane nodes.

First, the user needs to have access to the environment Ansible inventory.

We will use the tripleo-ansible-inventory command to generate the inventory file.

tripleo-ansible-inventory \
  --ansible_ssh_user heat-admin \
  --static-yaml-inventory ~/tripleo-inventory.yaml

In this particular case, we don’t have an additional NFS server to store the backups from the control plane nodes, so we will install the NFS server on the Undercloud node (any other node can also serve as the NFS storage backend).

First, we need to create an Ansible playbook to specify that we will install the NFS server in the Undercloud node.

cat <<'EOF' > ~/bar_nfs_setup.yaml
# Playbook
# We will set up the NFS server on the Undercloud node
# (we don't have any other place at the moment to do this)
- become: true
  hosts: undercloud
  name: Setup NFS server for ReaR
  roles:
  - role: backup_and_restore
EOF
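
If a dedicated NFS host is available, the same play could target that host instead. The following is a minimal sketch, assuming a hypothetical inventory host named backup-nfs; the host name is a placeholder and the variable values are the role defaults, not a tested configuration.

```yaml
# Sketch only: "backup-nfs" is a placeholder inventory host.
- name: Setup NFS server for ReaR on a dedicated host
  hosts: backup-nfs
  become: true
  vars:
    # Shared folder used for the backups (role default shown).
    tripleo_backup_and_restore_shared_storage_folder: /ctl_plane_backups
    # Client networks (first entry of the role default list);
    # adjust to the networks of the nodes being backed up.
    tripleo_backup_and_restore_clients_nets:
      - 192.168.24.0/24
  roles:
    - role: backup_and_restore
```

The plays that install ReaR and create the backups would then need tripleo_backup_and_restore_server set to the address of that host, because the default output and backup URLs are built from this variable.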

Then, we will create another playbook that selects the nodes on which we would like to install ReaR.

cat <<'EOF' > ~/bar_rear_setup.yaml
# Playbook
# We install and configure ReaR on the control plane nodes,
# as they are the only nodes we want to back up for now.
- become: true
  hosts: Controller
  name: Install ReaR
  roles:
  - role: backup_and_restore
EOF
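
The role defaults wire BACKUP_PROG_CRYPT_ENABLED and BACKUP_PROG_CRYPT_KEY to two optional variables, so backup encryption can presumably be switched on from this play. A minimal sketch, assuming the password is stored in a hypothetical vaulted variable named vault_bar_crypt_password:

```yaml
# Sketch only: vault_bar_crypt_password is a hypothetical variable that
# would be defined in a vars file encrypted with Ansible Vault.
- name: Install ReaR with encrypted backups
  hosts: Controller
  become: true
  vars:
    tripleo_backup_and_restore_crypt_backup_enabled: true
    tripleo_backup_and_restore_crypt_backup_password: "{{ vault_bar_crypt_password }}"
  roles:
    - role: backup_and_restore
```

If the password variable is left unset, the defaults fall back to the REPLACE_ME placeholder, which is not suitable for a real deployment.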

Now we create the playbook to create the actual backup.

cat <<'EOF' > ~/bar_rear_create_restore_images.yaml
# Playbook
# We run ReaR in the control plane nodes.
- become: true
  hosts: ceph_mon
  name: Backup ceph authentication
  tasks:
    - name: Backup ceph authentication role
      include_role:
        name: backup_and_restore
        tasks_from: ceph_authentication
      tags:
      - bar_create_recover_image

- become: true
  hosts: Controller
  name: Create the recovery images for the control plane
  roles:
  - role: backup_and_restore
EOF

The last step is to run the previously created playbooks, filtering by the corresponding tag.

First, we configure the NFS server.

# Configure NFS server in the Undercloud node
ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_setup_nfs_server \
    ~/bar_nfs_setup.yaml

Then, we install ReaR in the desired nodes.

# Configure ReaR in the control plane
ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_setup_rear \
    ~/bar_rear_setup.yaml

Lastly, we execute the actual backup step, with or without Ceph.

# Create recovery images of the control plane
ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_create_recover_image \
    ~/bar_rear_create_restore_images.yaml
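
For an environment without Ceph, the first play of ~/bar_rear_create_restore_images.yaml has nothing to act on. Ansible normally skips a play whose host pattern matches no hosts, so the original playbook should still work; the reduced form below is a sketch that only makes the intent explicit.

```yaml
# Variant of ~/bar_rear_create_restore_images.yaml for deployments
# without Ceph: the ceph_authentication play is omitted.
- name: Create the recovery images for the control plane
  hosts: Controller
  become: true
  roles:
    - role: backup_and_restore
```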

Ironic Usage

This Ansible role makes the most of the Ironic/metalsmith service on the Undercloud to automate the restoration of the nodes. The workflow consists of the following steps:

  1. Install an NFS server as a data backup.

  2. Install an NFS server on the Undercloud.

  3. Install and configure ReaR.

  4. Perform a ReaR backup.

  5. Restore a Node.

Firstly, the user needs to have access to the environment Ansible inventory.

We will use the tripleo-ansible-inventory command to generate the inventory file.

tripleo-ansible-inventory \
  --stack overcloud \
  --ansible_ssh_user heat-admin \
  --static-yaml-inventory ~/tripleo-inventory.yaml

Secondly, we need to create an Ansible playbook to specify that we will install the NFS server in the Undercloud node.

cat <<'EOF' > ~/bar_nfs_setup.yaml
# Playbook
# We will setup the NFS node in the Undercloud node
# (we don't have any other place at the moment to do this)
- become: true
  hosts: backupServer
  name: Setup NFS server for ReaR
  roles:
  - role: backup_and_restore
EOF
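
The play above targets a backupServer group. If the inventory generated by tripleo-ansible-inventory does not contain such a group, it has to be added by hand. A minimal sketch, assuming the generated file uses top-level group keys in the Ansible YAML inventory format; the host name, address, and user are placeholders.

```yaml
# Appended to ~/tripleo-inventory.yaml. All values are placeholders.
backupServer:
  hosts:
    backup-nfs:
      ansible_host: 192.168.24.50
      ansible_user: stack
```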

Then, we need to install and configure the NFS server.

# Install and Configure NFS server node
ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_setup_nfs_server \
    ~/bar_nfs_setup.yaml

The Undercloud needs to be configured to integrate ReaR with Ironic. The first step is the creation of the playbook.

cat <<'EOF' > ~/prepare-undercloud-pxe.yaml
---
- name: TripleO PXE installation and configuration.
  hosts: Undercloud
  become: true
  vars:
    tripleo_backup_and_restore_shared_storage_folder: "{{ tripleo_backup_and_restore_ironic_images_path }}"
    tripleo_backup_and_restore_shared_storage_subfolders: ["pxelinux.cfg"]
  roles:
    - role: backup_and_restore
EOF

After the playbook is created, run ansible-playbook to apply the changes.

ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_setup_nfs_server \
    ~/prepare-undercloud-pxe.yaml

Now, the overcloud nodes need to be configured. As before, the playbook is created first.

cat <<'EOF' > ~/cli-overcloud-conf-ironic.yaml
---
- name: Get Undercloud data
  hosts: Undercloud
  tasks:
    - name: Get networking
      setup:
        gather_subset: network
      tags:
        - never

- name: TripleO Ironic ReaR installation and configuration on Overcloud
  hosts: Controller
  become: true
  vars:
    tripleo_backup_and_restore_pxe_output_url: "nfs://{{ hostvars['undercloud']['ansible_facts']['br_ctlplane']['ipv4']['address'] }}{{ tripleo_backup_and_restore_ironic_images_path }}"
    tripleo_backup_and_restore_local_config:
      OUTPUT: PXE
      OUTPUT_PREFIX_PXE: $HOSTNAME
      BACKUP: NETFS
      PXE_RECOVER_MODE: '"unattended"'
      PXE_CREATE_LINKS: '"IP"'
      USE_STATIC_NETWORKING: y
      PXE_CONFIG_GRUB_STYLE: y
      KERNEL_CMDLINE: '"unattended"'
      POST_RECOVERY_SCRIPT: poweroff
      USER_INPUT_TIMEOUT: "10"
      PXE_TFTP_URL: "{{ tripleo_backup_and_restore_pxe_output_url }}"
      BACKUP_URL: "{{ tripleo_backup_and_restore_backup_url }}"
      PXE_CONFIG_URL: "{{ tripleo_backup_and_restore_pxe_output_url }}/pxelinux.cfg"
  roles:
    - role: backup_and_restore
EOF

Install and configure ReaR on the overcloud controller nodes. If the nodes use OVS, ReaR does not know how to configure the network, so the tripleo_backup_and_restore_network_preparation_commands variable needs to be configured.

ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_setup_rear \
    ~/cli-overcloud-conf-ironic.yaml \
    -e "tripleo_backup_and_restore_network_preparation_commands=\"('ip l a br-ex type bridge' 'ip l s ens3 up' 'ip l s br-ex up' 'ip l s ens3 master br-ex' 'dhclient br-ex')\""
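
Passing a Bash array through -e requires two layers of escaping. The same value could instead be set in the vars section of the Controller play. The fragment below is a sketch that reuses the commands from the command line above; the interface names ens3 and br-ex are examples and depend on the node.

```yaml
# Fragment for the Controller play in ~/cli-overcloud-conf-ironic.yaml.
# The outer double quotes belong to YAML; the parentheses and single
# quotes are part of the value, which is a Bash array of commands.
vars:
  tripleo_backup_and_restore_network_preparation_commands: "('ip l a br-ex type bridge' 'ip l s ens3 up' 'ip l s br-ex up' 'ip l s ens3 master br-ex' 'dhclient br-ex')"
```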

Ready-made playbooks are available to perform a backup of the nodes.

ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    --extra="ansible_ssh_common_args='-o StrictHostKeyChecking=no'" \
    --become \
    --become-user root \
    --tags bar_create_recover_image \
    /usr/share/ansible/tripleo-playbooks/cli-overcloud-backup.yaml

Similarly, there is a playbook to restore a node. The tripleo_backup_and_restore_overcloud_restore_name variable is the name, UUID, or hostname of the node that is going to be restored.

ansible-playbook \
    -v -i ~/tripleo-inventory.yaml \
    /usr/share/ansible/tripleo-playbooks/cli-overcloud-restore-node.yml \
    -e "tripleo_backup_and_restore_overcloud_restore_name=control-0"
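
When several restore settings need to change, a vars file may be easier to maintain than repeated -e flags. The following is a sketch, assuming a hypothetical file ~/restore-vars.yaml that is passed to ansible-playbook with -e @~/restore-vars.yaml. The retry and delay values are the role defaults; their exact meaning (presumably the number of attempts and the seconds between them) is an assumption.

```yaml
# ~/restore-vars.yaml (hypothetical file name)
# Name, UUID, or hostname of the node to restore.
tripleo_backup_and_restore_overcloud_restore_name: control-0
# Role defaults, repeated here so they can be tuned in one place.
tripleo_backup_and_restore_restore_retries: 300
tripleo_backup_and_restore_restore_delay: 10
```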