Zed Series Release Notes¶
15.6.0-24¶
Bug Fixes¶
Adds conditionals for IPv6 sysctl settings on systems that have IPv6 disabled in the kernel. Changing IPv6-related sysctl settings on those systems leads to errors. LP#1906306
Fixes ovs-dpdk images pull. LP#2041864
Starting with ansible-core 2.13, the list concatenation format changed, which made it impossible to override Horizon policy files. See LP#2045660 for more details.
Fixes the configuration of nova-compute and nova-compute-ironic to enable exposing vendordata over configdrive. LP#2049607
Fixes behaviour of Change Password screen in Horizon until bug #2073639 is resolved. LP#2073159
Fixes the Python requests library issue when using custom CA by adding the REQUESTS_CA environment variable to the kolla-toolbox container. See LP#1967132
Fixes the ‘cinder-backup’ service when Swift is used with TLS enabled. LP#2051986
Adds skip_kpartx yes to the defaults section of multipath.conf to prevent kpartx from scanning multipath devices and to unblock the multipathd del map operation of os-brick during volume detach operations. LP#2078973
Fixes an idempotency issue in the OpenSearch upgrade tasks where subsequent runs of kolla-ansible upgrade would leave shard allocation disabled. LP#2049512
Fixed an issue where the MariaDB Cluster recovery process would fail if the sequence number was not found in the logs. The recovery process now checks the complete log file for the sequence number and recovers the cluster. See LP#1821173 for details.
Removes the default /tmp/ mountpoint from the horizon container. This change is made to harden the container and prevent potential security issues. For more information, see the Bug Report: LP#2068126.
Fixes parsing of JSON output of inner modules called by kolla-toolbox when data was returned on standard error. LP#2080544
15.6.0¶
Upgrade Notes¶
If credentials are updated in passwords.yml, kolla-ansible is now able to update these credentials in the Keystone database and in the on-disk config files.
The changes to passwords.yml are applied once kolla-ansible -i INVENTORY reconfigure has been run.
If you want to revert to the old behaviour (credentials not automatically updating during reconfigure if they changed in passwords.yml), you can set update_keystone_service_user_passwords: false in your globals.yml.
Note that passwords are only changed if you change them in passwords.yml. This mechanism is not a complete solution for automatic credential rollover.
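As a minimal sketch of the opt-out described in this note, the following globals.yml fragment keeps the old behaviour. Only the variable name and value come from the note; the file path in the comment is an assumption about a typical deployment.

```yaml
# globals.yml (commonly /etc/kolla/globals.yml; adjust to your environment)
# Do not push changed service user passwords to Keystone on reconfigure.
update_keystone_service_user_passwords: false
```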
Bug Fixes¶
Fixes mariadb role deployment when using Ansible check mode. LP#2052501
Updated configuration of service user tokens for all Nova and Cinder services to stop using admin role for service_token and use service role.
See LP#2004555 and LP#2049762 for more details.
Adds the Keystone service role. Keystone has created the service role during bootstrap since Bobcat. The service role is needed for SLURP upgrades from Antelope to work. This role is also needed in Antelope and Zed for Cinder for proper service token support. LP#2049762
Changes to service user passwords in passwords.yml will now be applied when reconfiguring services. This behaviour can be reverted by setting update_keystone_service_user_passwords: false. Fixes LP#2045990
15.5.0¶
Bug Fixes¶
Fixes usage audit notifications being enabled when they are not needed. See LP#2049503.
15.4.0¶
New Features¶
The new command kolla-ansible rabbitmq-reset-state has been added. It force-resets the state of RabbitMQ. This is primarily designed to be used when enabling HA queues, see docs: https://docs.openstack.org/kolla-ansible/latest/reference/message-queues/rabbitmq.html#high-availability
Updates apache grok pattern to match the size of response in bytes, time taken to serve the request and user agent.
Masakari coordination backend can now be configured via masakari_coordination_backend variable. Coordination is optional and can now be set to either redis or etcd.
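A minimal sketch of selecting a coordination backend for Masakari in globals.yml. The variable name and the two allowed values come from the note above; choosing redis here is only an illustration.

```yaml
# globals.yml
# Coordination is optional; allowed values per the note are "redis" or "etcd".
masakari_coordination_backend: "redis"
```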
Sets a log retention policy for OpenSearch via Index State Management (ISM). See the documentation for details.
Adds the ability to configure RabbitMQ via rabbitmq_extra_config, which can be overridden in globals.yml.
In the configuration template of the Senlin service, the cafile parameter is now set by default in the authentication section. This allows self-signed certificates on the internal Keystone endpoint to be used with the Senlin service as well.
Upgrade Notes¶
Added log retention in OpenSearch, previously handled by Elasticsearch Curator. By default the soft and hard retention periods are 30 and 60 days respectively. If you are upgrading from Elasticsearch, and have previously configured elasticsearch_curator_soft_retention_period_days or elasticsearch_curator_hard_retention_period_days, those variables will be used instead of the defaults. You should migrate your configuration to use the new variable names before the Caracal release.
Bug Fixes¶
Fixes MariaDB backup when enable_proxysql is enabled.
Fixes a Keystone task that connected via SSH instead of running locally. LP#2004224
Fixes 504 timeout when scraping openstack exporter. Ensures that HAProxy server timeout is the same as the scrape timeout for the openstack exporter backend. LP#2006051
Fixes non-persistent Neutron agent state data. LP#2009884
Fix issue with octavia security group rules creation when using IPv6 configuration for octavia management network. See LP#2023502 for more details.
Fixes glance-api failing to start the privsep daemon when cinder_backend_ceph is set to true. See LP#2024541 for more details.
Adds host and mariadb_port to the wsrep sync status check so that non-standard ports can be used for MariaDB deployments. LP#2024554
Fixes an issue with high CPU usage of the cAdvisor container by setting the per-container housekeeping interval to the same value as the Prometheus scrape interval. LP#2048223
Fixes glance image import LP#2048525.
Fixes an issue where Prometheus would fail to scrape the OpenStack exporter when using internal TLS with an FQDN. LP#2008208
Fixes Docker health check for the sahara_engine container. LP#2046268
Fixes an issue where Fluentd was parsing Horizon WSGI application logs incorrectly. Horizon error logs are now written to horizon-error.log instead of horizon.log. See LP#1898174
Added log retention in OpenSearch, previously handled by Elasticsearch Curator, now using Index State Management (ISM) OpenSearch bundled plugin. LP#2047037.
Fixes an issue where Prometheus scraping of Etcd metrics would fail if Etcd TLS is enabled. LP#2036950
15.3.0¶
New Features¶
Added capability to specify custom kernel modules for Neutron:
neutron_modules_default: lists default modules.
neutron_modules_extra: for custom modules and parameters.
Added a Neutron check for ML2/OVS and ML2/OVN presence at the start of the deploy phase. It will fail if neutron_plugin_agent is set to ovn and an ML2/OVS container is detected. Where neutron_plugin_agent is set to openvswitch, the check will fail when it detects an ML2/OVN container or any of the OVN-specific volumes.
Upgrade Notes¶
The default Keystone user role has been changed from the deprecated _member_ role to the member role.
The ironic_tftp service no longer binds to 0.0.0.0. By default it uses the IP address of the api_interface. To revert to the old behaviour, please set ironic_tftp_interface_address: 0.0.0.0 in globals.yml.
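For operators who need the previous bind behaviour, the override described in this note can be sketched as the following globals.yml fragment. The variable and value are taken from the note; quoting the address as a string is a stylistic choice.

```yaml
# globals.yml
# Make ironic_tftp listen on all addresses again (pre-15.3.0 behaviour).
ironic_tftp_interface_address: "0.0.0.0"
```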
Before upgrading to the Zed release of Kolla-Ansible on Ubuntu, ensure that Elasticsearch indexes created in version 6 or earlier are reindexed. OpenSearch 2.x does not support these older indexes. A precheck for this scenario has now been introduced.
Configure Nova libvirt.num_pcie_ports to 16 by default. Nova currently sets ‘num_pcie_ports’ to “0” (defaults to libvirt’s “1”), which is not sufficient for hotplug use with ‘q35’ machine type.
Changes the default value of the Nova libvirt driver setting skip_cpu_compare_on_dest to true. With the libvirt driver, during live migration, skip comparing guest CPU with the destination host. When using QEMU >= 2.9 and libvirt >= 4.4.0, libvirt will do the correct thing with respect to checking CPU compatibility on the destination host during live migration.
Security Issues¶
Restricts access to the /server-status page, which OpenStack HTTP services exposed by default through HAProxy on the public endpoint. This fixes the issue for Ubuntu/Debian installations; Rocky Linux/CentOS are not affected. LP#1996913
Bug Fixes¶
Fixes issues with OVN NB/SB DB deployment, where first node needs to be rebootstrapped. LP#1875223
enable_keystone_federation and keystone_enable_federation_openid have not been explicitly handled as bool in various templates in the keystone role so far. LP#2036390
Fixes an issue where Kolla set the producer tasks to None, which disabled all Designate producer tasks. LP#1879557
Fixes ironic_tftp binding to all IP addresses on the system. Added ironic_tftp_interface, ironic_tftp_address_family and ironic_tftp_interface_address parameters to set the address for the ironic_tftp service. LP#2024664
Fixes the OpenSearch migration process by adding a precheck for Elasticsearch indexes whose version is too old for OpenSearch 2.x.
Fixes an issue where a Docker health check wasn’t configured for the OpenSearch Dashboards container. See bug 2028362.
Fixes an issue where ‘q35’ libvirt machine type VM could not hotplug more than one PCIe device at a time.
Fixes an issue where the keepalived track script fails in a single-controller environment and the keepalived VIP goes into BACKUP state. The keepalived_track_script_enabled variable has been introduced (default: true), which can be used to disable track scripts in the keepalived configuration. LP#2025219
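A minimal sketch of disabling the track scripts on a single-controller deployment, using the variable introduced by this fix. Whether disabling is appropriate depends on your environment.

```yaml
# globals.yml
# Default is true; set to false to omit track scripts from keepalived config.
keepalived_track_script_enabled: false
```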
Fixes an issue where an OVS-DPDK task had a different name to how it was being notified.
When upgrading Nova to a new release, we use the tool nova-status upgrade check to make sure that there are no nova-compute services that are older than N-1 releases. This was performed using the current nova-api container, so computes which will be too old after the upgrade were not caught. Now the upgraded nova-api container image is used, so older computes are identified correctly. LP#1957080
15.2.0¶
New Features¶
Since the fix for CVE-2022-29404, the default value for the LimitRequestBody directive in the Apache HTTP Server has been changed from 0 (unlimited) to 1073741824 (1 GiB). This limits the size of images (for example) uploaded in Horizon. Now this limit can be configured via horizon_httpd_limitrequestbody. LP#2012588
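As an illustration of the new setting, the fragment below raises the upload limit in globals.yml. The variable name comes from the note; the 2 GiB value is a hypothetical choice, expressed in bytes like the Apache default quoted above.

```yaml
# globals.yml
# Apache LimitRequestBody for Horizon, in bytes.
# 2147483648 = 2 GiB (illustrative value; the Apache default is 1073741824).
horizon_httpd_limitrequestbody: 2147483648
```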
etcd is now exposed internally via HAProxy on etcd_client_port.
Added two new flags to alter behaviour in RabbitMQ:
rabbitmq_message_ttl_ms, which lets you set a TTL on messages.
rabbitmq_queue_expiry_ms, which lets you set an expiry time on queues.
See https://www.rabbitmq.com/ttl.html for more information on both.
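A short sketch of how the two flags might be set in globals.yml. The variable names come from the note; both values are illustrative only and are given in milliseconds, as the _ms suffix indicates.

```yaml
# globals.yml
# Discard messages that stay queued longer than 10 minutes (illustrative).
rabbitmq_message_ttl_ms: 600000
# Delete queues that go unused for 1 hour (illustrative).
rabbitmq_queue_expiry_ms: 3600000
```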
The config option rabbitmq_ha_replica_count is added, to allow for changing the replication factor of mirrored queues in RabbitMQ. While the flag is unset, the queues are mirrored across all nodes using “ha-mode”:“all”. Note that this only has an effect if the flag om_enable_rabbitmq_high_availability is set to True, as otherwise queues are not mirrored.
The config option rabbitmq_ha_promote_on_shutdown has been added, which allows changing the RabbitMQ definition ha-promote-on-shutdown. By default ha-promote-on-shutdown is “when-synced”. We recommend changing this to be “always”. This basically means we don’t mind losing some messages, instead we give priority to rabbitmq availability. This is most relevant when restarting rabbitmq, such as when upgrading. Note that setting the value of this flag, even to the default value of “when-synced”, will cause RabbitMQ to be restarted on the next deploy. For more details please see: https://www.rabbitmq.com/ha.html#cluster-shutdown
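The mirroring options from the two notes above could be combined in globals.yml roughly as follows. The variable names and the "always" recommendation come from the notes; the replica count of 2 is a hypothetical value, and setting the promote option triggers a RabbitMQ restart on the next deploy, as described.

```yaml
# globals.yml
# Mirroring options only take effect when HA queues are enabled.
om_enable_rabbitmq_high_availability: true
# Mirror each queue to 2 nodes instead of all nodes (illustrative value).
rabbitmq_ha_replica_count: 2
# Prefer availability over message safety when a node shuts down.
rabbitmq_ha_promote_on_shutdown: "always"
```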
Services using etcd3gw via tooz now use etcd via haproxy. This removes a single point of failure, where we hardcoded the first etcd host for backend_url.
Upgrade Notes¶
Default tags of neutron_tls_proxy and glance_tls_proxy have been changed to haproxy_tag, as both services are using the haproxy container image. Any custom tag overrides for those services should be altered before upgrade.
Security Issues¶
The kolla-genpwd, kolla-mergepwd, kolla-readpwd and kolla-writepwd commands now create or update passwords.yml with correct permissions. They also display a warning message about incorrect permissions.
Bug Fixes¶
The precheck for RabbitMQ failed incorrectly when kolla_externally_managed_cert was set to true. LP#1999081
Fixes removal of Elasticsearch and Kibana loadbalancer configs during migration to OpenSearch, when those services are running on a dedicated monitoring node.
Fixes creation of the SASL account before the config file is ready. LP#2015589
Sets correct permissions for the opensearch-dashboard data location. LP#2020152 https://bugs.launchpad.net/kolla-ansible/+bug/2020152
The flags --db-nb-pid and --db-sb-pid have been corrected to be --db-nb-pidfile and --db-sb-pidfile respectively. See here for reference: https://github.com/ovn-org/ovn/blob/6c6a7ad1c64a21923dc9b5bea7069fd88bcdd6a8/utilities/ovn-ctl#L1045 LP#2018436
Configuration of service user tokens for all Nova and Cinder services is now done automatically, to ensure security of block-storage volume data.
See LP#2004555 for more details.
Fixes deployment when using Ansible check mode. LP#2002661
Fixes the incorrect endpoint URLs and service type information for the Cyborg service in the Keystone. LP#2020080
Sets the etcd internal hostname and cacert for deployments with internal TLS enabled. This allows services to work with etcd when coordination is enabled for TLS internal deployments. Without this fix, the coordination backend fails to connect to etcd and the service itself crashes.
Fixes the OpenSearch migration process, including the case where Elasticsearch data is located in a regular folder instead of a Docker volume. The process now also checks whether there is data to migrate.
When upgrading or deploying RabbitMQ, the policy ha-all is cleared if om_enable_rabbitmq_high_availability is set to false.
15.1.0¶
New Features¶
Adds the flag om_enable_rabbitmq_high_availability. Setting this to true will enable both durable queues and classic mirrored queues in RabbitMQ. Note that classic queue mirroring and transient (aka non-durable) queues are deprecated and subject to removal in RabbitMQ version 4.0 (date of release unknown). Changes the pattern used in classic mirroring to exclude some queue types. This pattern is: ^(?!(amq\\.)|(.*_fanout_)|(reply_)).*
Adds the ovn-monitor-all variable, a boolean value that tells whether ovn-controller should unconditionally monitor all records in OVS databases. Setting ovn-monitor-all to ‘true’ will remove some CPU load from the OVN Southbound DB but will result in more updates coming to ovn-controller. This might be helpful in large deployments with many compute hosts.
Bug Fixes¶
Fixes the kolla_docker module, which did not take the common_options parameter into account, so the module’s default values were always used. LP#2003079
The value of [oslo_messaging_rabbit] heartbeat_in_pthread is explicitly set to either true for wsgi applications, or false otherwise.
Fixes an issue with Octavia config generation when using octavia_auto_configure and the genconfig command. Note that access to the OpenStack API is necessary for Octavia auto configuration to work, even when generating config. See LP#1987299 for more details.
Fixes OVN deployment order - as recommended in OVN docs. LP#1979329
Fixes an issue where some prechecks would fail or not run when running in check mode. LP#2002657
Prevent haproxy-config role from attempting to configure firewalld during a kolla-ansible genconfig. LP#2002522
15.0.0¶
New Features¶
Adds a set of variables to control the cinder backend name, as used in cinder.conf. This is the name you use when setting the volume_backend_name property on volume types. Details are in the cinder guide section of the documentation.
Enables configuring firewalld for external API services. Extracts the required services and checks the external port, then adds the ports to a firewalld zone. Assumes that firewalld has been installed and configured beforehand. The variable disable_firewall is disabled by default to preserve backwards compatibility, but it is good practice to have the system firewall configured.
Adds support for deploying OpenSearch and OpenSearch dashboards. These services directly replace ElasticSearch and Kibana which are now end-of-life. Support for sending logs to a remote ElasticSearch (or OpenSearch) cluster is maintained.
Allow cinder-volume to be configured to use Pure Storage FlashArray with either the iSCSI or FC driver.
Adds the possibility of including custom alert notification templates with Prometheus Alertmanager.
Adds a new, disabled by default, option for Prometheus OpenStack exporter, named “enable_prometheus_openstack_exporter_external”. This option allows exposing OpenStack exporter through HAProxy, and may be used to expose OpenStack metrics to an existing Prometheus server outside the OpenStack cloud, instead of using the default one provided by OpenStack.
Adds a new flag, openvswitch_ovs_vsctl_wrapper_enabled, which will install a wrapper script to /usr/bin/ovs-vsctl to docker exec into the openvswitchd container.
Adds the prometheus_scrape_interval configuration option. The default is set to 60s. This configures the default scrape interval for all jobs.
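A minimal sketch of overriding the scrape interval in globals.yml. The option name and its 60s default come from the note; the 30s value is purely illustrative.

```yaml
# globals.yml
# Default is "60s"; applies to all Prometheus scrape jobs.
prometheus_scrape_interval: "30s"
```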
Adds the bifrost_deploy_verbosity parameter, which allows changing the verbosity of the Bifrost bootstrap task. The default value is -vvvv.
Adds support for configuring the CloudKitty fetcher using cloudkitty_fetcher_backend.
New switches have been added to control deployment of the Masakari monitors. The deployment of each type of monitor can be controlled individually via enable_masakari_instancemonitor and enable_masakari_hostmonitor. By default, both are set to true when the deployment of Masakari is enabled via enable_masakari.
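As a sketch of the switches described in this note, the following globals.yml fragment deploys Masakari with the instance monitor only. The variable names come from the note; disabling the host monitor is just one possible combination.

```yaml
# globals.yml
enable_masakari: true
# Both monitors default to true when Masakari is enabled.
enable_masakari_instancemonitor: true
enable_masakari_hostmonitor: false
```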
Sanity checks have been removed. These “smoke tests” were originally implemented for barbican, cinder, glance and keystone.
Kolla Ansible now supports failing execution early if fact collection fails on any of the hosts. This is to avoid late failures due to missing facts (especially cross-host). This is possible by setting kolla_ansible_setup_any_errors_fatal: true. Do note this still supports host fact caching and it will not affect scenarios with all facts cached (as there is no task to fail).
Adds a new variable ceilometer_prometheus_pushgateway_options. It is a dictionary whose keys and respective values are added to the pushgateway’s URL, checking that no “None” value is being set.
For example, the following configuration:
    ceilometer_prometheus_pushgateway_host: "127.0.0.1"
    ceilometer_prometheus_pushgateway_port: "9091"
    ceilometer_prometheus_pushgateway_options:
      timeout: 180
      max_retries:
      verify_ssl: yes
results in the following URL:
    prometheus://127.0.0.1:9091/ \
      metrics/job/openstack-telemetry/?timeout=180&verify_ssl=True
Adds support for managing resource providers via config files.
Adds support for setting up arbitrary HAProxy services in active/passive mode.
Implements container healthchecks for mariadb-server service. See blueprint
Adds support for configuring a coordination backend for Ironic Inspector via the ironic_coordination_backend variable. Possible values are redis or etcd.
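A minimal sketch of selecting the coordination backend for Ironic Inspector in globals.yml. The variable and its possible values come from the note; etcd is chosen here only as an example.

```yaml
# globals.yml
# Possible values per the note: "redis" or "etcd".
ironic_coordination_backend: "etcd"
```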
Adds support for multiple DHCP ranges in the Ironic Inspector DHCP server.
Adds ironic_http_interface/ironic_http_interface_address parameters to set the addresses for the ironic_http service.
Support for both PXE and iPXE enabled in Ironic at the same time.
Adds variables to configure whether monitoring services should be exposed externally:
enable_grafana_external
enable_kibana_external
enable_prometheus_alertmanager_external
Adds support for configuring a number of UDP workers for Designate’s bind9 backend via the designate_backend_bind9_workers variable.
Adds support for configuring the OpenStack Compute API microversion used by the OpenStack exporter for Prometheus using the prometheus_openstack_exporter_compute_api_version variable. The default value is latest, matching the default behaviour of the exporter.
Adds the ovn-openflow-probe-interval variable. It sets the inactivity probe interval of the OpenFlow connection to the OpenvSwitch integration bridge, in seconds. If the value is zero, it disables the connection keepalive feature. The default value is 60 seconds.
Adds support for deploying prometheus-msteams, which can be used to forward Prometheus Alertmanager notifications to Microsoft Teams. It is enabled by setting enable_prometheus_msteams to true.
Adds the ability to configure ProxySQL’s max replication lag via the configuration value proxysql_backend_max_replication_lag, which is set to the default value as per documentation. If it is greater than 0, ProxySQL will regularly monitor replication lag and, if it goes beyond the configured threshold, will temporarily shun the host until replication catches up. Please see the official upgrade notes for more detail.
Upgrade Notes¶
If you are currently deploying ElasticSearch with Kolla Ansible, you should back up the data before starting the upgrade. The contents of the ElasticSearch data volume will be automatically moved to the OpenSearch volume. The ElasticSearch, ElasticSearch Curator and Kibana containers will be removed automatically. The inventory must be updated so that the elasticsearch group is renamed to opensearch, and the kibana group is renamed to opensearch-dashboards.
Enables TLS by default in Bifrost. Bifrost is now configured to enable TLS for the services it deploys, and generate self-signed certificates for them. TLS may be disabled by setting enable_tls to false in /etc/kolla/config/bifrost/bifrost.yml.
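The opt-out mentioned in this note can be sketched as the following Bifrost override file. The path, key and value are all taken from the note.

```yaml
# /etc/kolla/config/bifrost/bifrost.yml
# Disable TLS (and self-signed certificate generation) for Bifrost services.
enable_tls: false
```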
image_upload_use_cinder_backend = True is no longer set on Cinder’s default Ceph RBD backend; the common upstream default is now used (False currently). See also LP#1991516
Kolla Ansible no longer sets show_multiple_locations = True by default when Glance’s Ceph RBD backend is enabled. This was applied as a fix, but operators must note that this, in turn, disables the Cinder and Nova optimisations. On the other hand, these optimisations might have been causing trouble for other operators. Please see the linked bug report. Operators relying on this feature can set the flag themselves using service config overrides. LP#1992153
Modifies the default value of enable_hacluster from no to yes if masakari-hostmonitor is enabled. LP#1934149
Sanity checks have been removed because they were broken.
The Nova legacy service and its endpoints are no longer advertised by default. To revert to the old behaviour, please set nova_enable_nova_legacy_service: true in globals.yml.
The variable keystone_token_provider does not exist anymore, because there is no alternative.
OpenStack Monasca is no longer supported by Kolla Ansible. Support for deploying kafka, storm and zookeeper has been dropped since they have been used only with Monasca. Post-upgrade cleanup of those services can be done using kolla-ansible monasca_cleanup. For details, please see the Monasca guide.
Modifies the default lease time of the Ironic Inspector DHCP server to 10 minutes. This is small enough to use small pools of IP addresses for inspection but gives more room for the inspection to succeed. This default can be changed globally via the ironic_dnsmasq_dhcp_default_lease_time variable or per range via the lease_time parameter.
ironic_dnsmasq_dhcp_range and ironic_dnsmasq_default_gateway have been replaced by ironic_dnsmasq_dhcp_ranges. For example, if you have:
    ironic_dnsmasq_dhcp_range: "10.42.0.2,10.42.0.254,255.255.255.0"
    ironic_dnsmasq_default_gateway: "10.42.0.1"
replace it with:
    ironic_dnsmasq_dhcp_ranges:
      - range: "10.42.0.2,10.42.0.254,255.255.255.0"
        routers: "10.42.0.1"
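Building on the multiple-range support and the per-range lease_time parameter mentioned in these notes, a deployment with two inspection networks might look roughly like this. The second subnet and both lease values are hypothetical, and the dnsmasq-style duration format (for example 30m) is an assumption that should be checked against the Ironic guide.

```yaml
# globals.yml
# Global default lease time; the "10m" value format is an assumption.
ironic_dnsmasq_dhcp_default_lease_time: "10m"
ironic_dnsmasq_dhcp_ranges:
  - range: "10.42.0.2,10.42.0.254,255.255.255.0"
    routers: "10.42.0.1"
  # Hypothetical second range with its own lease time.
  - range: "10.43.0.2,10.43.0.254,255.255.255.0"
    routers: "10.43.0.1"
    lease_time: "30m"
```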
Ironic volumes related to PXE (TFTP) and iPXE & direct deploy (HTTP) are refactored to share a common parent path at /var/lib/ironic. This is done to support both PXE and iPXE at the same time. Operators doing advanced customisations might need to review the relevant defaults section.
Upgrades of Ironic will now wait for nodes in wait states to change their state. This is to improve the user experience by avoiding breaking processes being waited on. This can be disabled by setting ironic_upgrade_skip_wait_check to yes.
Ironic containers related to PXE (TFTP) and iPXE & direct deploy (HTTP) are renamed to better reflect their role: ironic_pxe is now ironic_tftp, while ironic_ipxe is now ironic_http. Operators doing advanced customisations might need to review the relevant defaults section. Additionally, their respective host groups have changed analogously: ironic-pxe is now ironic-tftp, and ironic-ipxe is now ironic-http.
Keystone’s admin endpoint is no longer created by default. Operators of existing deployments may wish to remove it after the upgrade completes. Operators with external services that rely on the availability of Keystone’s admin endpoint may set keystone_create_admin_endpoint to true to keep creating the admin endpoint, but such support will be removed after Zed.
Keystone’s admin interface no longer points to a separate port. On upgrade, the port is preserved to maintain intermediate compatibility. Users are advised to run the deploy and post-deploy commands afterwards to ensure the port is cleaned up. For more information, please refer to the docs. Please note that the relevant variables keystone_admin_port, keystone_admin_url and admin_protocol are no longer used and are deprecated for removal after Zed. Please cease their usage in your customisations.
Starting with Zed, Neutron marked the linuxbridge ML2 driver experimental. The Kolla team has decided to honour the upstream’s decision and make sure users are aware they are using a badly supported driver instead of having it configured out of the box. Thus, all users of this driver are advised to get acquainted with Neutron docs and proceed accordingly.
The ovn role has been split into ovn-controller and ovn-db roles, therefore users that have ovn_extra_volumes configured need to adapt their config to use ovn_db_extra_volumes or ovn_controller_extra_volumes.
For OVN, the default value of openflow-probe-interval was changed to 60 seconds. Use the ovn-openflow-probe-interval variable to override.
Prometheus has been switched to active/passive mode. This is enabled by default but can be turned off by setting prometheus_active_passive to no. See bug 1928193.
Prometheus Alertmanager has been switched to active/passive mode. This is enabled by default but can be turned off by setting prometheus_alertmanager_active_passive to no.
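A sketch of turning off active/passive mode for both services in globals.yml. The variable names and the value no come from the two notes above; quoting the value as a string is an assumption about local style, since an unquoted no is also read as a YAML boolean.

```yaml
# globals.yml
# Both default to active/passive behind HAProxy; "no" restores load balancing
# across all instances.
prometheus_active_passive: "no"
prometheus_alertmanager_active_passive: "no"
```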
The deprecated enable_ironic_ipxe variable has been removed. iPXE still works by default and it can be disabled by setting the more-aptly-named ironic_dnsmasq_serve_ipxe to false.
The deprecated storage_interface variable has been removed. Please set swift_storage_interface directly.
Deprecated sysctl knobs related to ip_forward and rp_filter were removed.
The InfluxDB variable infuxdb_internal_endpoint has been renamed to the correctly spelled influxdb_internal_endpoint. Operators might need to review the relevant variable.
Deprecation Notes¶
enable_ironic_ipxe is deprecated in favour of ironic_dnsmasq_serve_ipxe, which reflects the effect better. enable_ironic_ipxe will be removed in Zed.
enable_ironic_pxe_uefi is deprecated and will be removed in Zed. This variable is not documented and results in a broken PXE setup for Ironic Inspector. The recommended way to support EFI/UEFI deployments in Ironic Inspector is to stay with the recommended default of iPXE in Ironic Inspector (see docs on ironic_dnsmasq_serve_ipxe).
At the April 2022 PTG, the deprecation and removal of the sanity checks was confirmed. Therefore kolla-ansible check can no longer be used.
Variables keystone_admin_port, keystone_admin_url and admin_protocol are deprecated for removal after Zed.
Security Issues¶
Kolla Ansible used to run Ironic’s tftpd as an (unprivileged) root user. Now, it will explicitly use the nobody user.
Bug Fixes¶
The scrape interval for the Prometheus data source in Grafana is now set to prometheus_scrape_interval. This fixes issues with dashboards that use the $__rate_interval Grafana variable, as the default scrape interval of 60s does not match the Grafana default of 15s.
Fixes an issue in the bifrost_deploy container where passwords generated by Bifrost were not persistent beyond the lifetime of the container. This is generally not a problem unless you access the Ironic or Inspector APIs outside of the Bifrost playbooks. LP#1983356
Fixes the issue of exponential growth of /run/openvswitch mounts when kolla-toolbox container is restarted. LP#1979295
Fixes LP#1982777. Set multipathd user_friendly_names to “no” to make os-brick able to resize volumes online. Adds ability to override multipathd config.
Fixed bug #1987982. This bug caused the database log_bin_trust_function_creators variable not to be set back to “OFF” after a keystone upgrade.
image_upload_use_cinder_backend = True is no longer set on Cinder’s default Ceph RBD backend. Related ERRORs and WARNINGs in Cinder and Glance logs are prevented. LP#1991516
Kolla Ansible no longer sets show_multiple_locations = True by default when Glance’s Ceph RBD backend is enabled. This caused various issues with the services running with the recommended Ceph permissions. LP#1992153
Fixes missing logrotate configuration for proxysql logs. LP#1995248
Fixes an issue when masakari-hostmonitor is enabled while corosync/pacemaker is not deployed. LP#1934149
Fixes an issue with recovering multi-node MariaDB Galera cluster.
Adds configuration necessary for application credential access rules to properly function. LP#1965111
Fixes an issue with the Alertmanager external Web URL being unconfigurable. A new variable prometheus_alertmanager_external_url has been introduced that users can use to set web.external-url to public.
Fixes an issue where Ironic Inspector could be configured without authentication in a multi-region environment in a region without a local Keystone service.
Fixes Keystone OIDC failing to validate JWT because of missing key on Azure auth-oidc endpoint. Adds new variable containing JWKS uri that delivers missing keys. LP#1990375
Fixes missing [taskflow] section in masakari.conf.j2. LP#1966536
Fixes Zun capsules losing network namespaces after restarting the zun_cni_daemon container.
After an extended disruption to the Fluentd-ElasticSearch central logging pipeline, the buffer of unsent log data can grow large enough that transferring it takes longer than the default Fluentd request timeout (5 seconds). The default request timeout value is raised to 60s, and made configurable using the new parameter fluentd_elasticsearch_request_timeout. LP#1983031
Increases prometheus_openstack_exporter_timeout to 45 seconds to reduce the odds of scrape failures on deployments with a large number of OpenStack resources. LP#1976629
Fixes Ironic API healthchecks when backend TLS encryption is enabled. LP#1990819
Removes the dhcp-sequential-ip configuration option from ironic_dnsmasq to avoid a race condition offering the same IP address to multiple hosts being inspected at the same time.
Fixes an issue with ironic-inspector using the wrong option to configure the interface used to communicate with the Ironic API. LP#1995246
Fixes an issue with ironic-neutron-agent using the wrong option to configure the interface used to communicate with the Ironic API. LP#1990675
If ironic_enabled_notification_topics is set to true, ironic_notification_level is set to info in order to ensure that Ironic actually sends out notifications. See bug 1969826 for details.
Fixes the monitor: kolla label being added to external_labels by default. The default Prometheus configuration should not include environment-specific details. external_labels is now optional, and any <labelname>: <labelvalue> pairs can be added to it. LP#1944699
Fixes an issue with Masakari instance monitor when libvirt SASL is enabled. libvirt SASL was enabled by default in a recent change to Kolla Ansible. LP#1965754
Fixes an issue where a failure of any Nova compute service to register itself would cause only the host querying the nova API to fail. Now, only hosts that fail to register will fail the Kolla Ansible run. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. LP#1940119
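The stricter alternative described in this fix can be sketched as the following globals.yml fragment. The variable name and value come from the note.

```yaml
# globals.yml
# Fail every host in a cell if any compute service fails to register.
nova_compute_registration_fatal: true
```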
The Prometheus OpenStack exporters are now behind HAProxy, providing a unique time series in the Prometheus database. This also ensures that only one exporter queries the OpenStack APIs at any given time interval. With the previous behaviour each OpenStack exporter was scraped at the same time. This caused each exporter to query the OpenStack APIs simultaneously, introducing unnecessary load and duplicate time series in the Prometheus database due to the instance label being unique for each exporter. LP#1972818
Fixes an issue with misaligned data points in grafana when loadbalancing over multiple prometheus server instances. See bug 1928193.
Fixes an issue with Alertmanager silence creation leading to a 404 page. LP#1987866
Other Notes¶
Sets the balancing algorithm to round-robin for Horizon if memcached is enabled. LP#1990523
tools/ovs-dpdkctl.sh moved to ansible/roles/ovs-dpdk/files/ovs-dpdkctl.sh
Rocky Linux 9 based images are now recommended (instead of CentOS Stream ones).