2024.1 Series Release Notes

18.2.0-25

Upgrade Notes

  • Support for failing execution early if fact collection fails on any of the hosts by setting kolla_ansible_setup_any_errors_fatal to true has been removed. This is due to Ansible’s any_errors_fatal parameter not being templated, resulting in the value always being interpreted as true, even though the default value of kolla_ansible_setup_any_errors_fatal is false.

    Equivalent behaviour is possible by setting the maximum failure percentage to 0. This may be done specifically for fact gathering using gather_facts_max_fail_percentage or globally using kolla_max_fail_percentage.
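
The equivalence with a maximum failure percentage of 0 can be sketched in Python. This is only an illustration of the documented Ansible rule that a play aborts once the percentage of failed hosts exceeds the configured maximum. The function name is hypothetical and this is not Ansible's own implementation.

```python
def play_should_abort(failed_hosts: int, total_hosts: int,
                      max_fail_percentage: float) -> bool:
    """Return True when the share of failed hosts exceeds the threshold.

    Hypothetical helper for illustration only.
    """
    if total_hosts == 0:
        return False
    failed_percentage = 100.0 * failed_hosts / total_hosts
    # Strictly greater than: the threshold must be exceeded, not just reached.
    return failed_percentage > max_fail_percentage


# With a threshold of 0, a single failed host is enough to abort.
print(play_should_abort(failed_hosts=1, total_hosts=10, max_fail_percentage=0))
# With no failures, the play continues.
print(play_should_abort(failed_hosts=0, total_hosts=10, max_fail_percentage=0))
```

Because the comparison is strict, a threshold of 0 aborts on the first failure, which matches the behaviour previously intended for kolla_ansible_setup_any_errors_fatal.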

Bug Fixes

  • Fixes an issue with ironic dnsmasq failing to start in deployments using podman because it requires the NET_RAW capability. See LP#2055282.

  • Fixes keystone service configuration for haproxy when using federation. LP#2058656

  • Fixes mariadb’s backup failure due to missing CREATE privileges on the mariadb_backup_history table. LP#2061889

  • Fixes the MariaDB recovery issue when kolla-ansible is running from a docker container. LP#2073370

  • The openvswitch role now sets external-ids:hostname to {{ ansible_facts.fqdn }} instead of {{ ansible_facts.hostname }}, because Neutron uses FQDN-based hostnames in the requested-chassis field. LP#2080552

  • Fixes an issue during fact gathering when using the --limit argument where a host that fails to gather facts could cause another host to fail during delegated fact gathering.

  • Adds skip_kpartx yes to the defaults section of multipath.conf to prevent kpartx from scanning multipath devices. This unblocks the multipathd del map operation that os-brick uses during volume detach operations. LP#2078973 <https://launchpad.net/bugs/2078973>

  • Adds octavia_interface_wait_timeout to control the timeout of octavia-interface.service, so that the unit can wait until the openvswitch agent has finished syncing and octavia-lb-net is reachable from the host. Also sets the restart policy for this unit to on-failure. LP#2067036

  • Fixes an Octavia service upgrade issue where the upgrade could fail when the Octavia persistence database user is missing. LP#2065591

  • Fixes unreliable health checks for neutron_ovn_agent and neutron_ovn_metadata_agent. The checks now test the OVS DB connection instead of the OVN southbound DB connection. LP#2084128

  • Fixes an issue, when using podman, with named volumes that use a mode specifier. See LP#2054834 for more details.

  • Fixes parsing of JSON output of inner modules called by kolla-toolbox when data was returned on standard error. LP#2080544
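
For readers unfamiliar with the volume mode specifier mentioned in the podman fix above (LP#2054834), the following Python sketch splits a volume string of the common `source:target[:mode]` form. It is an illustration only, not the kolla-ansible code. The function, class, and example volume names are hypothetical.

```python
from typing import NamedTuple, Optional


class VolumeSpec(NamedTuple):
    source: str
    target: str
    mode: Optional[str]


def parse_volume_spec(spec: str) -> VolumeSpec:
    """Split 'source:target' or 'source:target:mode' into its parts."""
    parts = spec.split(":")
    if len(parts) == 2:
        return VolumeSpec(parts[0], parts[1], None)
    if len(parts) == 3:
        return VolumeSpec(parts[0], parts[1], parts[2])
    raise ValueError(f"unsupported volume specification: {spec!r}")


def is_named_volume(volume: VolumeSpec) -> bool:
    """Treat a source that is not an absolute path as a named volume."""
    return not volume.source.startswith("/")


# A named volume with a read-only mode specifier (hypothetical names).
example = parse_volume_spec("example_volume:/var/lib/example:ro")
print(example, is_named_volume(example))
```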

18.2.0

New Features

  • Modifies public API firewalld rules to be applied immediately to a running firewalld service. This requires firewalld to be running, but avoids reloading firewalld, which is disruptive due to the way in which firewalld builds its firewall chains.

Bug Fixes

  • Fixes an issue deploying OpenSearch with TLS enabled on the internal VIP.

  • Fixes handling of openvswitch on manila-share nodes. LP#1993285

  • Adds database configuration necessary for barbican. LP#2072554

  • Fixes the Python requests library issue when using custom CA by adding the REQUESTS_CA environment variable to the kolla-toolbox container. See LP#1967132

  • Fixes the detection of the Nova Compute Ironic service when a custom host option is set in the service config file. See LP#2056571

  • Removes the default /tmp/ mountpoint from the horizon container. This change is made to harden the container and prevent potential security issues. For more information, see the Bug Report: LP#2068126.

  • Fixed an issue with the prometheus.yml template which would break when deploying alertmanager.

  • Fixes an issue where OVN northbound or southbound database deployment could fail when a new leader is elected. LP#2059124

18.1.0

Upgrade Notes

  • MariaDB backup now uses the same image as the running MariaDB server. The following variables relating to MariaDB backups are no longer used and have been removed:

    • mariabackup_image

    • mariabackup_tag

    • mariabackup_image_full

Bug Fixes

  • Fixes trove module imports. Path to the modules needed by trove-api changed in source trove package so the configuration was updated. LP#1937120

  • Modifies the MariaDB backup procedure to use the same container image as the running MariaDB server container. This should prevent compatibility issues that may cause the backup to fail.

  • Fixes a bug in kolla_podman_worker where missing commas in a list of strings caused implicit concatenation of items that should have been separate. LP#2067278

  • Fixes configuration of CloudKitty when internal TLS is enabled. LP#1998831

  • Fixes the comparison of container dimensions when values such as 1g are set in the container dimensions configuration. Previously, 1g from the configuration was compared against 1073741824 as reported by docker inspect, so the docker container was restarted even when nothing had changed.

  • Fixes the keystone port in skyline-console pointing to the wrong endpoint port. LP#2069855

  • Fixes the kolla systemd unit template so that restarting docker.service no longer restarts all kolla services. LP#2065168

  • All stable RabbitMQ feature flags are now enabled during deployments, reconfigures, and upgrades. As such, the variable rabbitmq_feature_flags is no longer required. This is a partial fix to RabbitMQ SLURP support. LP#2049512

  • Fixes skyline’s old format of stop task. It used docker_container which would cause problems with podman deployments.
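
The container dimensions fix above comes down to normalising both values to the same unit before comparing them. A minimal Python sketch of that idea follows. It assumes binary suffixes (k, m, g, t), and the helper names are hypothetical. It is not the code used by kolla-ansible.

```python
import re

# Multipliers for the size suffixes assumed in this sketch (powers of 1024).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def to_bytes(value) -> int:
    """Convert values such as '1g', '512m' or 1073741824 to a byte count."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)([kmgt]?)b?", str(value).strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix]


def dimensions_differ(configured, inspected) -> bool:
    """Compare two sizes after normalising both to bytes."""
    return to_bytes(configured) != to_bytes(inspected)


# '1g' in the configuration and 1073741824 from inspection are the same size.
print(dimensions_differ("1g", 1073741824))
```

Comparing the raw values would report a difference here. Comparing the normalised byte counts does not, so no restart would be triggered.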

18.0.0

New Features

  • Added a new variable prometheus_ceph_exporter_interval for controlling the scrape interval of Ceph metrics.

  • Exposed a flag, bifrost_enable_ironic_inspector, to enable ironic-inspector in Bifrost. This option defaults to True as it can be useful for backwards compatibility. It still allows using native in-band inspection when Ironic Inspector is enabled by setting inspect_interface to agent. Please see the Ironic documentation for more details.

  • Implemented [Configure tap-as-a-service plugin on neutron containers]. Added the necessary changes and configurations to use the neutron plugin, tap-as-a-service, for creating port mirrors using openstack tap commands. For more information, refer to the blueprint configure-taas-plugin.

  • Implemented [Enable Fluentd Plugin Systemd]. Added the necessary changes and configurations to use the fluentd plugin, systemd, to read logs from /var/log/journal by default. This allows us to read and manipulate these logs for monitoring purposes.

    These logs will be sent to OpenSearch by default. To disable this behavior, set the value of the variable enable_fluentd_systemd to false in the configuration file /etc/kolla/globals.yml.

    By default, when enabling central logging, we also enable the systemd plugin. To disable this behavior when central logging is enabled, set the value of the variable enable_fluentd_systemd to false in the configuration file /etc/kolla/globals.yml.

    fluent-plugin-systemd source: https://github.com/fluent-plugin-systemd/fluent-plugin-systemd

    For more information, refer to the blueprint enable-fluent-plugin-systemd.

  • New variables have been added to be used by the neutron role, neutron_dns_integration and neutron_dns_domain. They allow enabling/disabling internal/external DNS integrations, or their combinations.

  • Configures the log level field for the Grafana OpenSearch datasource. This allows for logs to be coloured based on log level. To apply this you need to delete the datasource and reconfigure grafana.

  • Removed configuration and deployment of prometheus-haproxy-exporter as its repository is now archived. We now use the native support for Prometheus which is now built into HAProxy. For consistency this is exposed on the prometheus_haproxy_exporter_port port. prometheus-haproxy-exporter containers and config are automatically removed.

  • Elevated access for project-scoped service role in Ironic has been enabled. Ironic recently started to enforce new policies and scopes, and it is one of the few OpenStack projects that require system scope for some admin-related API calls. However, Ironic has also begun to allow project-scoped behavior for service roles by setting rbac_service_role_elevated_access. This change enables this setting to achieve similar behavior for service role as other OpenStack projects.

  • The service role has been added to Ironic service users. Ironic recently enforced new policy validation and added support for service roles.

  • Support has been added for setting the max fail percentage for Ansible plays via kolla_max_fail_percentage. It can also be set on a per-service basis, e.g., nova_max_fail_percentage.

  • Set a log retention policy for OpenSearch via Index State Management (ISM). See the documentation for details.

  • Support for neutron-fwaas v2 has been re-added. Set enable_neutron_fwaas: yes to enable.

  • Configuration has been added to connect Skyline’s Prometheus to make the Monitor Center work. The latest Skyline Console now includes a Monitor Center in the administrator view that displays information from Prometheus. For this to work, the Prometheus connection needs to be set up in skyline.yaml.

  • The ability to override Skyline configuration files has been added. You can supply your own versions of nginx.conf for Skyline Console, gunicorn.py, and skyline.yaml for the Skyline API Server. Place the files in the skyline subfolder of your Kolla config directory; skyline.yaml will be merged with the Kolla-provided version.

  • More services supported by Skyline have been added to the configuration, making them accessible to Skyline’s frontend console. The new services include Barbican, Designate, Masakari, and Swift or Ceph RGW. If both Swift and Ceph RGW are enabled, only Swift is configured.

  • Some of the Skyline Console logos can now be replaced with your own. See the reference documentation for details.

  • Single Sign-On (SSO) is enabled in Skyline Console if Keystone federation is enabled and at least one identity provider with the openid protocol is set up. Skyline Console’s redirect URI is added to Keystone’s trusted dashboards.

Upgrade Notes

  • The minimum supported Ansible version is now 8 (ansible-core 2.15), and the maximum supported is 9 (ansible-core 2.16).

  • Support for deploying Freezer has been dropped.

  • Support for deploying Murano has been dropped. Additionally, support for deploying outward RabbitMQ (only used for Murano) has been dropped as well.

  • Support for deploying Sahara has been dropped.

  • Support for deploying Senlin has been dropped.

  • Support for deploying Solum has been dropped.

  • Support for deploying Vitrage has been dropped.

  • The default value of the configuration variable designate_enable_notifications_sink has been changed to no. This variable controls whether notifications for Designate are configured in Neutron and Nova, and whether designate-sink is deployed, which is now optional.

    Operators who want to keep the previous behavior should set this to true.

  • The grafana volume is no longer used. If you wish to automatically remove the old volume, set grafana_remove_old_volume to true. Note that doing this will lose any plugins installed via the CLI directly and not through Kolla. If you have previously installed Grafana plugins via the Grafana UI or CLI, you must change to installing them at image build time. The Grafana volume, which contains existing custom plugins, will be automatically removed in the D release.

  • Due to the change from using the prometheus-haproxy-exporter to using the native support for Prometheus which is now built into HAProxy, metric names may have been replaced and/or removed, and in some cases the metric names may have remained the same but the labels may have changed. Alerts and dashboards may also need to be updated to use the new metrics. Please review any configuration that references the old metrics as this is not a backwards compatible change.

  • The Horizon role has been reworked to the preferred local_settings.d configuration model. Files local_settings and custom_local_settings have been renamed to _9998-kolla-settings.py and _9999-custom-settings.py respectively. Users who use Horizon’s custom configuration must change the names of those files in /etc/kolla/config/horizon as well.

  • Added log retention in OpenSearch, previously handled by Elasticsearch Curator. By default the soft and hard retention periods are 30 and 60 days respectively. If you are upgrading from Elasticsearch, and have previously configured elasticsearch_curator_soft_retention_period_days or elasticsearch_curator_hard_retention_period_days, those variables will be used instead of the defaults. You should migrate your configuration to use the new variable names before the Caracal release.

  • If credentials are updated in passwords.yml, kolla-ansible is now able to update these credentials in the keystone database and in the on-disk config files.

    The changes to passwords.yml are applied once kolla-ansible -i INVENTORY reconfigure has been run.

    If you want to revert to the old behavior - credentials not automatically updating during reconfigure if they changed in passwords.yml - you can specify this by setting update_keystone_service_user_passwords: false in your globals.yml.

    Notice that passwords are only changed if you change them in passwords.yml. This mechanism is not a complete solution for automatic credential rollover. No passwords are changed if you do not change them inside passwords.yml.
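
For the Horizon file renames described above, a small helper can work out which custom files still carry the old names. The sketch below only computes the rename plan from a list of file names and does not touch the filesystem. The function and constant names are hypothetical, and the old and new file names are taken from the note above.

```python
from typing import Iterable, List, Tuple

# Old name -> new name, as listed in the upgrade note.
HORIZON_RENAMES = {
    "local_settings": "_9998-kolla-settings.py",
    "custom_local_settings": "_9999-custom-settings.py",
}


def plan_horizon_renames(existing_files: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (old, new) pairs for files that still use the old names.

    A file is skipped when the new name already exists, to avoid
    overwriting a file that has already been migrated.
    """
    present = set(existing_files)
    return [
        (old, new)
        for old, new in HORIZON_RENAMES.items()
        if old in present and new not in present
    ]


# Example with a hypothetical directory listing.
print(plan_horizon_renames(["custom_local_settings", "other.conf"]))
```

In practice the input could come from os.listdir("/etc/kolla/config/horizon"), with each returned pair passed to os.rename after review.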

Deprecation Notes

  • Support for deploying Masakari is no longer deprecated. The Masakari CI scenarios are now working again, and commitment has been made to improve the health of the project.

Bug Fixes

  • Adds conditionals for IPv6 sysctl settings on systems that have IPv6 disabled in the kernel. Changing IPv6-related sysctl settings on those systems leads to errors. LP#1906306

  • Fixed nova-cell not updating the cell0 database address when VIP changes. For more details, refer to LP#1915302.

  • Fixes non-persistent Neutron agent state data. LP#2009884

  • Fixes an inability to override Horizon policy files, caused by the change in list concatenation format starting with ansible-core 2.13. See LP#2045660 for more details.

  • Fixes long service restarts when using systemd. LP#2048130

  • Fixes an issue with high CPU usage of the cAdvisor container by setting the per-container housekeeping interval to the same value as the Prometheus scrape interval. LP#2048223

  • Fixes Nova operations using the scp command, such as cold migration or resize, on Debian Bookworm. LP#2048700

  • Fixes the configuration of nova-compute and nova-compute-ironic to enable exposing vendordata over configdrive. LP#2049607

  • Fixes mariadb role deployment when using Ansible check mode. LP#2052501

  • Fixed an issue with openvswitch bridge creation when neutron_bridge_name was specified as two bridges. For details, see LP#2056332.

  • Fixed the use of Redis as coordination backend. For details, see LP#2056667.

  • Fixed the wrong configuration of the ovs-dpdk service, which broke the deployment of Kolla-Ansible. For details, see bug 2058372.

  • Fixes an incorrect condition in the Podman code path that prevented retrieval of facts for all containers when no names were provided. LP#2058492

  • Fixes a bug where the loadbalancer upgrade task failed when podman was used as the container engine. LP#2063896

  • Updated configuration of service user tokens for all Nova and Cinder services to stop using admin role for service_token and use service role.

    See LP#2004555 and LP#2049762 for more details.

  • Fixes usage audit notifications being enabled when they are not needed. See LP#2049503.

  • Fixed the cinder-backup service when Swift is used with TLS enabled. LP#2051986

  • Fixes an idempotency issue in the OpenSearch upgrade tasks where subsequent runs of kolla-ansible upgrade would leave shard allocation disabled. LP#2049512

  • Fixes Docker health check for the sahara_engine container. LP#2046268

  • Fixed a trove deployment bug where the trove guest-agent failed to connect to RabbitMQ due to the absence of the oslo_messaging_rabbit config in guest-agent.conf. See bug 2048822 for details.

  • Fixed trove failing to discover the swift endpoint due to the absence of service_credentials in guest-agent.conf. See bug 2048829 for details.

  • Fixed an issue where the MariaDB Cluster recovery process would fail if the sequence number was not found in the logs. The recovery process now checks the complete log file for the sequence number and recovers the cluster. See LP#1821173 for details.

  • Updated the default Grafana OpenSearch datasource configuration to use values for OpenSearch that work out of the box. Replaced the Elasticsearch values that were previously being used. The new configuration can be applied by deleting your datasource and reconfiguring Grafana through Kolla Ansible. To prevent dashboards from breaking when the datasource is deleted, one should use datasource variables <https://grafana.com/docs/grafana/latest/dashboards/variables/add-template-variables/#add-a-data-source-variable> in Grafana. See bug 2039500 <https://bugs.launchpad.net/kolla-ansible/+bug/2039500>.

  • Fixed bug #2039498 where the Grafana docker volume was bind mounted over Grafana plugins installed at image build time. This was fixed by copying the dashboards into the container from an existing bind mount instead of using the grafana volume. However, this leaves behind the volume, which can be removed by setting grafana_remove_old_volume to true. Please note that any plugins installed via the CLI directly and not through Kolla will be lost when doing this. In a future release, grafana_remove_old_volume will default to true.

  • Added log retention in OpenSearch, previously handled by Elasticsearch Curator, now using Index State Management (ISM) OpenSearch bundled plugin. LP#2047037.

  • Support for friendly labels has been added for Prometheus Ironic exporter and Alertmanager metrics. See LP#2041855 for details.

  • Changes to service user passwords in passwords.yml will now be applied when reconfiguring services.

    This behaviour can be reverted by setting update_keystone_service_user_passwords: false.

    Fixes LP#2045990
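
The MariaDB cluster recovery fix listed above (LP#1821173) checks the complete log file for the sequence number. The idea can be sketched in Python as a scan over every log line. The log line format used here ("WSREP: Recovered position: <uuid>:<seqno>") is an assumption for illustration, as are the function name and sample data. This is not the kolla-ansible implementation.

```python
import re
from typing import Iterable, Optional

# Assumed log line format: "WSREP: Recovered position: <uuid>:<seqno>"
_RECOVERED = re.compile(r"WSREP: Recovered position: [0-9a-fA-F-]+:(-?\d+)")


def find_recovered_seqno(log_lines: Iterable[str]) -> Optional[int]:
    """Scan all lines and return the last recovered sequence number, if any."""
    seqno = None
    for line in log_lines:
        match = _RECOVERED.search(line)
        if match:
            # Keep scanning so that the most recent entry wins.
            seqno = int(match.group(1))
    return seqno


# Hypothetical log excerpt: the position is not on the final line.
sample_log = [
    "[Note] InnoDB: Buffer pool(s) load completed",
    "[Note] WSREP: Recovered position: "
    "6c1f0a2e-0000-11ee-8c99-0242ac120002:42",
    "[Note] Shutdown complete",
]
print(find_recovered_seqno(sample_log))
```

Scanning the whole file rather than only its tail means the sequence number is still found when other messages follow it.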