2024.2 Series Release Notes

19.0.0

New Features

  • Adds docker_image_name_prefix, which allows a prefix to be defined for image names.

  • With the boolean parameter fluentd_enable_watch_timer it is now possible to enable the additional watch timer of Fluentd.

    The default value of fluentd_enable_watch_timer is set to false.

    More details about the watch timer in Fluentd can be found at https://docs.fluentd.org/input/tail#enable_watch_timer.

  • Adds prometheus_node_exporter_targets_extra to add additional scrape targets to the node exporter job. See the Kolla Ansible documentation (reference/logging-and-monitoring/prometheus-guide.html) for more information.

  • Blackbox monitoring endpoint configuration is now automated for many common services. The default endpoint list, prometheus_blackbox_exporter_endpoints_default, varies according to the services that are enabled. Custom endpoints can be added to prometheus_blackbox_exporter_endpoints_custom.

  • The ceilometer/pipeline.yaml file is now handled as a template file.

  • Adds a new variable, cinder_cluster_name, which controls the name of the cinder-volume High Availability cluster (for those backend drivers that support it). It replaces any user-defined config overrides that may already have implemented this feature.

  • Adds support for configuring CloudKitty to use OpenSearch as a storage backend.

  • Adds support for configuring Huawei backends in Cinder. The extra configuration XML files provided by cinder_backend_huawei_xml_files are copied into cinder-volume containers during deploy when cinder_backend_huawei is true.

    Blueprint cinder-huawei-backend

  • Modifies public API firewalld rules to be applied immediately to a running firewalld service. This requires firewalld to be running, but avoids reloading firewalld, which is disruptive due to the way in which firewalld builds its firewall chains.

  • Adds the ability to provide the NTP (time source) server for multiple DHCP ranges in the Ironic Inspector DHCP server.

  • Adds a neutron_physical_networks variable for customising Neutron physical network names. The default behaviour of using physnet1 to physnetN is unchanged.

  • Implements Jinja filters for service dicts. Using the select_services_enabled_and_mapped_to_host filter removes some of the overhead caused by Ansible skipping items in tasks. With a larger number of hosts, this overhead is not insignificant. Use of the service_enabled_and_mapped_to_host filter is mostly cosmetic and has no effect on performance.

    Blueprint performance-improvements

  • Implements the ability to use internal frontend TLS between a Kolla service and ProxySQL. This does not enable TLS by itself; TLS must be enabled by per-service patches that turn it on in the MySQL connection strings.

  • Implements TLS between Keystone and ProxySQL

  • Introduces new variables mariadb_monitor_connect_interval, mariadb_monitor_galera_healthcheck_interval, mariadb_monitor_galera_healthcheck_timeout, mariadb_monitor_galera_healthcheck_max_timeout_count, mariadb_monitor_ping_interval, mariadb_monitor_ping_timeout, and mariadb_monitor_ping_max_failures. These allow faster detection of issues in Galera clusters, reducing downtime to 10 seconds.

  • Added options to configure mariadb_shun_on_failures, mariadb_connect_retries_delay, and mariadb_connect_retries_on_failure for enhanced control over ProxySQL’s shun behavior. These adjustments help manage failover responses effectively. For more details, see the ProxySQL documentation.

  • Adds the proxysql_prometheus_exporter configuration parameter, which can be used to configure Prometheus to scrape ProxySQL metrics endpoints. The default value of proxysql_prometheus_exporter is set to the combined values of enable_prometheus and enable_proxysql.

  • Added NVMe-TCP as a new transport for the Pure Storage FlashArray Cinder driver.

  • Allows the service-cert-copy role to copy certificates to non-HAProxy containers. Partially implements blueprint mariadb-ssl-support <https://blueprints.launchpad.net/kolla-ansible/+spec/mariadb-ssl-support>

  • kolla-ansible now validates the Prometheus configuration files when called via kolla-ansible -i $inventory validate-config. This validation is done by running the promtool check config command. See the documentation for the kolla-ansible validate-config command for details.
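
The effect of pre-filtering service dicts, as described in the Jinja filter note above, can be sketched in plain Python. This is a simplified illustration rather than the actual filter code: the function signature, the enabled and group keys, and the sample data are assumptions made for the example.

```python
def select_services_enabled_and_mapped_to_host(services, host_groups):
    """Return only the services that are enabled and mapped to this host.

    Hypothetical simplification: the real filter takes its host information
    from the Ansible/Jinja context rather than from an explicit argument.
    """
    return {
        name: service
        for name, service in services.items()
        if service.get("enabled") and service.get("group") in host_groups
    }


# Illustrative data only; not taken from any real role.
services = {
    "svc-api": {"enabled": True, "group": "control"},
    "svc-agent": {"enabled": True, "group": "compute"},
    "svc-extra": {"enabled": False, "group": "control"},
}

selected = select_services_enabled_and_mapped_to_host(services, {"control"})
print(sorted(selected))  # ['svc-api']
```

Looping over the pre-filtered dict means a task sees one item per relevant service, instead of evaluating and skipping every service on every host.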

Upgrade Notes

  • prometheus_blackbox_exporter_endpoints will now be automatically populated with endpoints for many common services. Custom endpoints should be migrated to prometheus_blackbox_exporter_endpoints_custom to avoid overriding the default configuration.

  • Changes the strategy of installing projects in dev mode in containers. Instead of bind mounting the project’s git repository to the venv of the container, the repository is bind mounted to /dev-mode/<project_name>, from which it is installed using pip on every startup of the container by the kolla_install_projects script. Also updates docs to reflect the changes.

  • MariaDB backup now uses the same image as the running MariaDB server. The following variables relating to MariaDB backups are no longer used and have been removed:

    • mariabackup_image

    • mariabackup_tag

    • mariabackup_image_full

  • A precheck for running Cinder HA has been introduced, which checks for multiple cinder-volume instances and fails when cinder_cluster_name is unset. For configuration guidelines, see the HA section of the Cinder guide. To disable the precheck, set cinder_cluster_skip_precheck to true.

  • To use OpenSearch for CloudKitty storage, set cloudkitty_storage_backend to opensearch. The following variables have been added and may need to be updated unless the default configuration is used:

    • cloudkitty_opensearch_index_name

    • cloudkitty_opensearch_url

    • cloudkitty_opensearch_cafile

    • cloudkitty_opensearch_insecure_connections

  • Support for OpenEuler as a host operating system has been dropped because it does not provide a recent Python version (3.10+), which is required by ansible-core 2.16 and later.

  • Support for Python 3.8 and 3.9 has been dropped. The minimum version of Python now supported by Kolla Ansible is Python 3.10.

  • If you have old clients that do not support the new TLS settings, you can revert to the old behaviour by setting the following variable in your globals.yml:

    kolla_haproxy_ssl_settings: legacy

    or, if you want to keep at least some of the improved security settings:

    kolla_haproxy_ssl_settings: intermediate

    See LP#2060787

  • Rewrites the kolla-ansible CLI in Python. Moving the CLI to Python allows for easier maintenance and a larger feature set. The CLI was built using the cliff package that is used in the openstack and kayobe commands.

    This patch introduces a few breaking changes stemming from the nature of the cliff package:

    • the order of parameters must be kolla-ansible <action> <arguments>

    • mariadb_backup and mariadb_recovery are now mariadb-backup and mariadb-recovery

    The --key parameter has also been dropped as it was duplicating --vault-password-file.

  • Support for failing execution early if fact collection fails on any of the hosts by setting kolla_ansible_setup_any_errors_fatal to true has been removed. This is due to Ansible’s any_errors_fatal parameter not being templated, resulting in the value always being interpreted as true, even though the default value of kolla_ansible_setup_any_errors_fatal is false.

    Equivalent behaviour is possible by setting the maximum failure percentage to 0. This may be done specifically for fact gathering using gather_facts_max_fail_percentage or globally using kolla_max_fail_percentage.

  • Global variable distro_python_version now defaults to “3”.

  • The default value of the config option enable_proxysql has been changed to yes, which means that MySQL connections will now be handled by ProxySQL by default instead of HAProxy. Users who wish to retain load balancing of MySQL connections through HAProxy must set enable_proxysql to no. As a result of this change, the config option enable_mariadb_clustercheck is also dynamically changed to no. Users who still wish to maintain mariadb_clustercheck can override this config option in the configuration. However, with ProxySQL, mariadb_clustercheck is no longer needed and can be manually removed.

  • Adds support for Ubuntu Noble Numbat 24.04 as a host operating system.
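
The equivalence between a maximum failure percentage of 0 and fatal-on-any-error behaviour, mentioned in the note on kolla_ansible_setup_any_errors_fatal, can be illustrated with a small sketch. It assumes the documented Ansible rule that a play aborts once the share of failed hosts exceeds the configured percentage; the function name and arguments are hypothetical.

```python
def play_aborts(failed_hosts, total_hosts, max_fail_percentage):
    """Model the max_fail_percentage rule: abort when the limit is exceeded."""
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    failed_percentage = 100.0 * failed_hosts / total_hosts
    return failed_percentage > max_fail_percentage


print(play_aborts(0, 50, 0))   # False: no failures, the play continues
print(play_aborts(1, 50, 0))   # True: a single failure exceeds 0 percent
print(play_aborts(1, 50, 10))  # False: 2 percent is within a 10 percent limit
```

With the limit at 0, any failed host pushes the percentage above the threshold, which is why gather_facts_max_fail_percentage or kolla_max_fail_percentage set to 0 can stand in for the removed option.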

Bug Fixes

  • Fixes an issue deploying OpenSearch with TLS enabled on the internal VIP.

  • Fixes an issue with ironic dnsmasq failing to start in deployments using podman because it requires the NET_RAW capability. See LP#2055282.

  • Fixes a problem where changes to a package file manifest were not reflected in dev-mode-enabled containers. LP#1814515

  • Puts memcache_security_strategy in a single place, all.yml. For possible config options, see the docs.

    LP#1850733

  • Fixes trove module imports. The path to the modules needed by trove-api changed in the Trove source package, so the configuration was updated. LP#1937120

  • Fixes handling of openvswitch on manila-share nodes. LP#1993285

  • Fixes kolla-ansible removing inventory file placed in /etc/kolla/<inventory>. See LP#2052706 for more details.

  • Fixes the incorrect dictionary key reference in ‘Copy Policy File’ task. LP#2054867

  • Modifies the MariaDB backup procedure to use the same container image as the running MariaDB server container. This should prevent compatibility issues that may cause the backup to fail.

  • Fixes keystone service configuration for haproxy when using federation. LP#2058656

  • Fixes mariadb’s backup failure due to missing CREATE privileges on the mariadb_backup_history table. LP#2061889

  • Fixes a bug where the loadbalancer upgrade task failed when podman was used as the container engine. LP#2063896

  • Fixes a bug in kolla_podman_worker, where missing commas in a list of strings caused implicit concatenation of items that should be separate. LP#2067278

  • Fixes redundant copying of grafana custom config files. LP#2067999

  • Fixes podman failure when enable_container_healthchecks is set to “no”. LP#2071912

  • Adds database configuration necessary for barbican. LP#2072554

  • Fixes the MariaDB recovery issue when kolla-ansible is running from a docker container. LP#2073370

  • Fixes the libvirt secret volume being busy while secrets are changing. LP#2073678

  • Fixes an issue in PodmanWorker where it did not set the KOLLA_SERVICE_NAME environment variable when creating a new container. Additionally, two methods were moved from DockerWorker to ContainerWorker as they are applicable to both engines.

  • Fixes indentation in haproxy configuration. LP#2080034

  • Fixes an issue where backend-related certificates are attempted to be copied when kolla_copy_ca_into_containers is enabled but kolla_enable_tls_backend is disabled. LP#2080381

  • The openvswitch role now sets external-ids:hostname to {{ ansible_facts.fqdn }} instead of {{ ansible_facts.hostname }}, because Neutron uses FQDN-based hostnames in the requested-chassis field. LP#2080552

  • Fixes ProxySQL being unable to bind due to the incorrect format of IPv6 addresses in the mysql_ifaces configuration. LP#2081106

  • Fixes a typo in the cinder_backend_pure_nvme_tcp section to correct the pure_nvme transport. LP#2081149

  • Adds the missing logrotate config for redis. LP#2084523

  • Fixes the Python requests library issue when using custom CA by adding the REQUESTS_CA environment variable to the kolla-toolbox container. See LP#1967132

  • Fixes configuration of CloudKitty when internal TLS is enabled. LP#1998831

  • Fixes an issue during fact gathering when using the --limit argument where a host that fails to gather facts could cause another host to fail during delegated fact gathering.

  • Fixes an issue with setting up OIDC-based Keystone federation against an IdP where there are multiple OIDC groups that are separated by a custom delimiter. Adds the variable keystone_federation_oidc_claim_delimiter to set the custom value. LP#2080394

  • Adds skip_kpartx yes to the defaults section of multipath.conf to prevent kpartx from scanning multipath devices and to unblock the multipathd del map operation of os-brick for volume detaching operations. LP#2078973 <https://launchpad.net/bugs/2078973>

  • Adds octavia_interface_wait_timeout to control the octavia-interface.service timeout, so that the service can wait until the openvswitch agent sync has finished and octavia-lb-net is reachable from the host. Also sets the restart policy for this unit to on-failure. LP#2067036

  • Fixes the comparison of container dimensions when values such as 1g are set in the container dimensions configuration. Previously the Docker container was restarted even with no changes, because the configured 1g was compared with 1073741824, the value shown by docker inspect.

  • Fixes the detection of the Nova Compute Ironic service when a custom host option is set in the service config file. See LP#2056571

  • Fixes keystone port in skyline-console pointing to wrong endpoint port. LP#2069855

  • Fixes the kolla systemd unit template to prevent all kolla services from restarting when docker.service restarts. LP#2065168

  • Fixes unreliable health checks for neutron_ovn_agent and neutron_ovn_metadata_agent. The checks now test the OVS DB connection instead of the OVN southbound DB connection. LP#2084128

  • Fixes an issue, when using podman, with named volumes that use a mode specifier. See LP#2054834 for more details.

  • Removes the default /tmp/ mountpoint from the horizon container. This change is made to harden the container and prevent potential security issues. For more information, see the Bug Report: LP#2068126.

  • Configures Heat with [volumes]/backups_enabled based on whether the cinder-backup service is enabled.

  • Fixes parsing of JSON output of inner modules called by kolla-toolbox when data was returned on standard error. LP#2080544

  • Fixed an issue with the prometheus.yml template which would break when deploying alertmanager.

  • Updates proxysql.yaml.j2 to use mariadb_port for backends. This fixes setups where database_port and mariadb_port differ.

  • All stable RabbitMQ feature flags are now enabled during deployments, reconfigures, and upgrades. As such, the variable rabbitmq_feature_flags is no longer required. This is a partial fix to RabbitMQ SLURP support. LP#2049512

  • The nova_upgrade_checks container now uses a newly generated config.json.

  • Fixes the old format of skyline’s stop task, which used docker_container and would cause problems with podman deployments.

  • Fixes a bug where the IP address comparison was not done properly for the variable kolla_same_external_internal_vip. The comparison now uses the ipaddr filter instead. For details see LP#2076889.

  • Fixes an issue where OVN northbound or southbound database deployment could fail when a new leader is elected. LP#2059124
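
The implicit concatenation behind LP#2067278 is a general Python pitfall: adjacent string literals are joined at compile time, so a missing comma silently merges two list items. The list contents below are invented for illustration and are not taken from kolla_podman_worker.

```python
# A comma is missing after "beta", so Python joins "beta" and "gamma".
broken = [
    "alpha",
    "beta"
    "gamma",
]

fixed = [
    "alpha",
    "beta",
    "gamma",
]

print(broken)                   # ['alpha', 'betagamma']
print(len(broken), len(fixed))  # 2 3
```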
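
The container dimensions fix amounts to normalising both values to the same unit before comparing them. A minimal sketch, assuming binary (1024-based) suffixes as in the 1g versus 1073741824 example; the helper names and the set of accepted suffixes are hypothetical and do not reproduce the actual kolla-ansible code.

```python
# Hypothetical suffix table; binary multiples, matching 1g == 1073741824.
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def to_bytes(value):
    """Convert values such as '1g', '512m' or 1073741824 to an integer byte count."""
    text = str(value).strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def dimensions_differ(configured, inspected):
    """Compare two size values numerically instead of as raw strings."""
    return to_bytes(configured) != to_bytes(inspected)


print("1g" != "1073741824")                   # True: a naive comparison sees a change
print(dimensions_differ("1g", 1073741824))    # False: same size once normalised
print(dimensions_differ("512m", 1073741824))  # True: a genuine change
```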
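
One way a string comparison of VIP addresses can go wrong is when the same address is written in two textual forms. The sketch below uses Python's standard ipaddress module as a stand-in for the ipaddr filter; the addresses come from documentation ranges and the helper name is hypothetical. Whether this was the exact failure mode in LP#2076889 is an assumption.

```python
import ipaddress


def same_vip(external_vip, internal_vip):
    """Return True when both strings denote the same IP address."""
    return ipaddress.ip_address(external_vip) == ipaddress.ip_address(internal_vip)


external = "2001:db8::1"
internal = "2001:0db8:0000:0000:0000:0000:0000:0001"

print(external == internal)                  # False: the strings differ
print(same_vip(external, internal))          # True: the addresses are identical
print(same_vip("192.0.2.10", "192.0.2.11"))  # False: two different addresses
```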