Train Series Release Notes¶
9.3.2-21¶
New Features¶
Adds a new flag, docker_disable_default_network, which defaults to no. Docker uses 172.17.0.0/16 by default for bridge networking on docker0, and this might cause routing problems for operator networks. Setting this flag to yes will disable Docker’s bridge networking. This feature will be enabled by default from the Wallaby 12.0.0 release.
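A minimal sketch of opting in early, assuming the flag is set in globals.yml like other Kolla Ansible options (the file is an assumption; this note does not name it):

```yaml
# globals.yml (assumed location for this override)
# Disable Docker's default bridge network (docker0, 172.17.0.0/16 by default).
docker_disable_default_network: "yes"
```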
Upgrade Notes¶
The baremetal role now uses the CentOS 8 package repository for Docker CE (compared to 7 previously).
Security Issues¶
Adds mitigation for the Apache Log4j2 Remote Code Execution (RCE) Vulnerability in Elasticsearch - CVE-2021-44228.
Bug Fixes¶
Fixes an issue when Swift is deployed with the S3 Token Middleware enabled. LP#1862765
Fixes an issue where Docker was configured after startup on Debian/Ubuntu, which resulted in iptables rules being created before they were disabled. LP#1923203
Fixes iscsid failing in current CentOS 8 based images due to pid file being needlessly set. LP#1933033
Fixes an issue where access to the console of any Zun container failed when kolla_enable_tls_external is true. The fix sets the protocol for the wsproxy base_url in zun.conf according to the value of kolla_enable_tls_external. LP#1957117
Adds a new flag, docker_disable_ip_forward, which defaults to no. Setting it to yes disables Docker’s ip-forward option, which otherwise makes Docker set the net.ipv4.ip_forward sysctl to 1. This protects against creating all-forwarding hosts. LP#1931615
Adds back the option to configure the RabbitMQ clustering interface via Kolla. LP#1900160 <https://bugs.launchpad.net/kolla-ansible/+bug/1900160>
Fixes an issue seen when using Jinja2 3.1.0.
9.3.2¶
Bug Fixes¶
Fixes a bug where sriov_agent.ini was not copied due to a Permission denied error. LP#1923467
Fixes an issue where Docker Python SDK 5.0.0 was failing due to missing six. A constraint was introduced to install a version lower than 5.x. LP#1928915
Fixes more-than-2-node RabbitMQ upgrade failing randomly. LP#1930293.
Fixes a potential issue with Alertmanager in non-HA deployments. In this scenario, the peer gossip protocol is now disabled and Alertmanager won’t try to form a cluster with non-existent instances. LP#1926463
Fixes some configuration issues around Barbican logging. LP#1891343
Fixes some configuration issues around Cinder logging. LP#1916752
Fixes an issue with keepalived which was not recreated during an upgrade if configuration is unchanged. LP#1928362
Fixes an issue with executing kolla-ansible when installed via pip install --user. LP#1915527
Removes whitespace around equal signs in zookeeper.cfg, which was preventing the zkCleanup.sh script from running correctly.
9.3.1¶
Bug Fixes¶
Fixes the wrong configuration of the ovs-dpdk service, which broke the deployment of kolla-ansible. For more details please see bug 1908850.
9.3.0¶
New Features¶
Adds a new flag, docker_disable_default_iptables_rules, which defaults to no. Docker manipulates iptables rules by default to provide network isolation, and this might cause problems if the host already has an iptables based firewall. A common problem is that Docker sets the default policy of the FORWARD chain in the filter table to DROP. Setting docker_disable_default_iptables_rules to yes will disable Docker’s iptables manipulation. This feature will be enabled by default from the Victoria 11.0.0 release.
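As a hedged illustration, a host that already runs its own iptables based firewall could opt in before Victoria. The sketch assumes the flag is placed in globals.yml, which this note does not state:

```yaml
# globals.yml (assumed location for this override)
# Stop Docker from manipulating iptables, e.g. from setting the
# FORWARD chain policy in the filter table to DROP.
docker_disable_default_iptables_rules: "yes"
```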
Improves performance of the common role by generating all fluentd configuration in a single file.
Improves performance of the common role by generating all logrotate configuration in a single file.
Upgrade Notes¶
The default value of REST_API_REQUIRED_SETTINGS was synchronized with Horizon. You may want to review settings exposed by the updated configuration.
Security Issues¶
The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.
Bug Fixes¶
Adds support for using bifrost-deploy behind a proxy. It uses the existing container_proxy variable.
Fixes handling of /dev/kvm permissions to be more robust against host-level actions. LP#1681461
Reworks the Keystone fernet bootstrap, which tended to fail on multinode setups. See bug 1846789 for details.
IPv6 fully-routed topology (/128 addressing) is now allowed (where applicable). LP#1848941
Fixes deployment of fluentd without any enabled OpenStack services. LP#1867953
Adds a kolla-ansible internal logrotate configuration for Logstash. Logstash 2.4 uses a logrotate configuration integrated in the container, which tries to rotate logs in /var/log/logstash, while the logs of Logstash deployed by kolla-ansible are in /var/log/kolla/logstash. LP#1886787
Fixes --configdir parameter to apply to default passwords.yml location. LP#1887180
fluentd is now logging to /var/log/kolla/fluentd/fluentd.log instead of stdout. LP#1888852
Fixes deploy-containers action missing for the Masakari role. LP#1889611
Fixes an issue where the keystone container would be stuck in a restart loop with a message that the fernet key is stale. LP#1895723
Fixes haproxy_single_service_split template to work with default for mode (http). LP#1896591
Fixed invalid fernet cron file path on Debian/Ubuntu from /var/spool/cron/crontabs/root/fernet-cron to /var/spool/cron/crontabs/root. LP#1898765
Adds with_first_found to the placement-api WSGI configuration in the placement role, to allow users to override it. LP#1898766
RabbitMQ services are now restarted serially to avoid a split brain. LP#1904702
Fixes LP#1906796 by adding notice and note loglevels to monasca log-metrics drop configuration
Fixes Swift’s stop action. It will no longer try to start swift-object-updater container again. LP#1906944
Fixes an issue with the kolla-ansible prechecks command with Docker 20.10. LP#1907436
Fixes an issue with kolla-ansible mariadb_recovery when the mariadb container does not exist on one or more hosts. LP#1907658
Fixes Freezer deployment failure when using kolla_dev_mod. LP#1888242
Fixes issues with some CloudKitty commands trying to connect to an external TLS endpoint using HTTP. LP#1888544
Fixes an issue where Docker may fail to start if iptables is not installed. LP#1899060
The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.
Fixes an issue when deleting evacuated instances with encrypted block devices. LP#1891462
Fixes an issue where Keystone Fernet key rotation may fail due to a permission denied error if the Keystone rotation happens before the Keystone container starts. LP#1888512
Fixes an issue with Keystone startup when Fernet key rotation does not occur within the configured interval. This may happen due to one of the Keystone hosts being down at the scheduled time of rotation, or due to uneven intervals between cron jobs. LP#1895723
Fixes an issue where Grafana instances would race to bootstrap the Grafana DB. See LP#1888681.
Fixes LP#1892210 where the number of open connections to Memcached from neutron-server would grow over time until reaching the maximum set by memcached_connection_limit (5000 by default), at which point the Memcached instance would stop working.
Fixes an issue where no redundant replicas were created in a multi-node cluster when Kafka default topic creation was used to create a Kafka topic. LP#1888522. This affects Monasca, which uses Kafka, and was previously masked by the legacy Kafka client used by Monasca, which has since been upgraded in Ussuri. Monasca users with multi-node Kafka clusters should consult the Kafka documentation to increase the number of replicas.
Fixes an issue where the br_netfilter kernel module was not loaded on compute hosts. LP#1886796
Prevents adding a new Keystone host to an existing cluster when not targeting all Keystone hosts (e.g. due to --limit or --serial arguments), to avoid overwriting existing Fernet keys. LP#1891364
Reduce the use of SQLAlchemy connection pooling, to improve service reliability during a failover of the controller with the internal VIP. LP#1896635
No longer configures the Prometheus OpenStack exporter to use the prometheus Docker volume, which was never required.
Updates the default value of REST_API_REQUIRED_SETTINGS in Horizon local_settings, which enables some features such as selecting the default boot source for instances. LP#1891024
9.2.0¶
New Features¶
Adds ability to provide a custom elasticsearch config.
Adds Elasticsearch Curator for managing aggregated log data.
Kolla Ansible now checks that the local Ansible Python environment is coherent, i.e. that the Ansible in use can see Kolla Ansible. LP#1856346
Upgrade Notes¶
Avoids unnecessary fact gathering using the setup module. This should improve the performance of environments using fact caching and the Ansible smart fact gathering policy. See blueprint for details.
Adds elasticsearch_use_v6 and kibana_use_v6 flags which can be set to true to deploy the elasticsearch6 and kibana6 images on CentOS 7 or 8. These flags are true by default on CentOS 8, and false elsewhere. The services should be upgraded from 5.x to 6.x via kolla-ansible upgrade elasticsearch,kibana, and this can be used to provide an Elasticsearch 6.x cluster that is compatible between CentOS 7 and 8.
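For a CentOS 7 deployment that should move to the 6.x images ahead of a CentOS 8 migration, the two flags could be set as sketched below. Placing them in globals.yml is an assumption; the note names only the flags and their values:

```yaml
# globals.yml (assumed location for this override)
# Both flags default to false on CentOS 7; true selects the
# elasticsearch6 and kibana6 images.
elasticsearch_use_v6: true
kibana_use_v6: true
```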
In the previous stable release, the octavia user was no longer given the admin role in the admin project, and a task was added to remove the role during upgrades. However, the octavia configuration was not updated to use the service project, causing load balancer creation to fail.
There is also an issue for existing deployments in simply switching to the service project. While existing load balancers appear to continue to work, creating new load balancers fails due to the security group belonging to the admin project. For this reason, Train and Stein have been reverted to use the admin project by default, while from the Ussuri release the service project will be used by default.
To provide flexibility, an octavia_service_auth_project variable has been added. In the Train and Stein releases this is set to admin by default, and from Ussuri it will be set to service by default. For users of Train and Stein, octavia_service_auth_project may be set to service in order to avoid a breaking change during the Ussuri upgrade.
To switch an existing deployment from using the admin project to the service project, it will at least be necessary to create the required security group in the service project, and update octavia_amp_secgroup_list to this group’s ID. Ideally the Amphora flavor and network would also be recreated in the service project, although this does not appear to be necessary for operation, and will impact existing Amphorae.
See bug 1873176 for details.
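The switch to the service project might look like the following sketch. The security group ID is a placeholder for the group you create in the service project, and the use of globals.yml is an assumption:

```yaml
# globals.yml (assumed location for these overrides)
# Use the service project now, to avoid a breaking change at the Ussuri upgrade.
octavia_service_auth_project: "service"
# Placeholder: ID of the security group created in the service project.
octavia_amp_secgroup_list: "<security-group-id>"
```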
Changes the default value of kibana_elasticsearch_ssl_verify from false to true. LP#1885110
Apache ZooKeeper will now be automatically deployed whenever Apache Storm is enabled.
When deploying Monasca with Logstash 6 (the default for CentOS 8), any custom Logstash 2 configuration for Monasca will need to be updated to work with Logstash 6. Please consult the documentation.
Bug Fixes¶
Fixes Kibana deployment with the new E*K stack (6+). LP#1799689
Fixes Grafana datasource update. LP#1881890
Removes the chrony package and AppArmor profile from the Docker host if containerized chrony is enabled. LP#1882513
Escapes table names in mariadb upgrade procedure. LP#1883141
Fixes an issue with Manila deployment starting openvswitch and neutron-openvswitch-agent containers when enable_manila_backend_generic was set to False. LP#1884939
Fixes the Elasticsearch Curator cron schedule run. LP#1885732
Fixes an incorrect configuration for nova-conductor when a custom Nova policy was applied, preventing the nova_conductor container from starting successfully. LP#1886170
Adds missing “become: true” to some VMware related tasks: Copying VMware vCenter CA file and Copying over nsx.ini.
Fixes Nova deployment failure when using kolla_dev_mod.
In line with clients for other services used by Magnum, the Cinder and Octavia clients also use endpoint_type = internalURL. In the same vein, they also use the globally defined openstack_region_name.
Fixes the default CloudKitty configuration, which included the gnocchi_collector and keystone_fetcher options that were deprecated in Stein and removed in Train. See bug 1876985 for details.
Fixes an issue with Cinder upgrades that would cause online schema migration to fail. LP#1880753
Fixes the Cyborg API container failing to load the API paste file. For details please see bug 1874028.
Fixes the configuration of the etcd service so that its protocol is independent of the value of the internal_protocol parameter. The etcd service is not load balanced by HAProxy, so there is no proxy layer to do TLS termination when internal_protocol is configured to be https.
Fixes an issue where fernet_token_expiry would fail the pre-checks despite being set to a valid value. Please see bug 1856021 for more details.
The kolla_logs Docker volume is now mounted into the Elasticsearch container to expose logs which were previously written erroneously to the container filesystem (bug 1859162). It is up to the user to migrate any existing logs if they so desire and this should be done before applying this fix.
In the previous stable release, the octavia user was no longer given the admin role in the admin project, and a task was added to remove the role during upgrades. However, the octavia configuration was not updated to use the service project, causing load balancer creation to fail. See upgrade notes for details. LP#1873176
Fixes an issue with RabbitMQ where tags would be removed from the openstack user after deploying Nova. This prevents the user from accessing the RabbitMQ management UI. LP#1875786
Adds a new variable fluentd_elasticsearch_cacert, which defaults to the value of openstack_cacert. If set, this will be used to set the path of the CA certificate bundle used by Fluentd when communicating with Elasticsearch. LP#1885109
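A sketch of pointing Fluentd at a dedicated CA bundle. The path is hypothetical and the use of globals.yml is an assumption; when the variable is left unset, the value of openstack_cacert applies:

```yaml
# globals.yml (assumed location for this override)
# Hypothetical path; substitute the location of your own CA bundle.
fluentd_elasticsearch_cacert: "/path/to/ca-bundle.pem"
```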
Improves error reporting in kolla-genpwd and kolla-mergepwd when input files are not in the expected format. LP#1880220.
Fixes Magnum trust operations in multi-region deployments.
Deploys Apache ZooKeeper if Apache Storm is enabled explicitly. ZooKeeper would only be deployed if Apache Kafka was also enabled, which is often done implicitly by enabling Monasca.
When deploying Elasticsearch 6 (the default for CentOS 8), Logstash 2 was deployed by default, which is not compatible with Elasticsearch 6. Logstash 6 is now deployed by default when using CentOS 8 containers.
9.1.0¶
New Features¶
Adds support for CentOS 8 as a host Operating System and base container image. This is the only major version of CentOS supported from the Ussuri release. The Train release supports both CentOS 7 and 8 hosts, and provides a route for migration.
Add Object Storage service (Swift) support for Ironic.
Adds a new variable, openstack_tag, which is used as the default Docker image tag in place of openstack_release. The default value is openstack_release, with a suffix set via openstack_tag_suffix. The suffix is empty except on CentOS 8 where it is set to -centos8. This allows for the availability of images based on CentOS 7 and 8.
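To make the tag composition concrete, here is a sketch of the values involved on a CentOS 8 host. The values restate the defaults described in this note; the commented override is illustrative, and the use of globals.yml is an assumption:

```yaml
# Defaults described in the note, shown for a CentOS 8 deployment.
openstack_release: "train"
openstack_tag_suffix: "-centos8"
# Resulting default image tag: train-centos8
# An explicit override is also possible (illustrative value):
# openstack_tag: "train-centos8"
```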
Upgrade Notes¶
Some images are supported on CentOS 7 but lack suitable packages on CentOS 8, and are therefore not supported on CentOS 8. See Kolla release notes for details.
Adds a rabbitmq_use_3_7_24_on_centos7 flag which can be set to true to deploy the rabbitmq-3.7.24 image on CentOS 7. The image should be deployed via kolla-ansible upgrade, and can be used to provide a RabbitMQ cluster that is compatible with the CentOS 8 rabbitmq image.
Support for the SCSI target daemon (tgtd) has been removed for CentOS/RHEL 8. The default value of cinder_target_helper is now lioadm on CentOS/RHEL 8, but remains as tgtadm on other platforms.
The octavia user is no longer given the admin role in the admin project. Octavia does not require this role and instead uses the octavia user with the admin role in the service project. During an upgrade the octavia user is removed from the admin project. See bug 1873176 for details.
Security Issues¶
Fixes leak of RabbitMQ password into Ansible logs. LP#1865840
Bug Fixes¶
Fixes the Cyborg conductor failing to communicate with Placement. See bug 1873717.
Fixes the Cyborg agent failing to start the privsep daemon by adding the privileged capability to the Cyborg agent. See bug 1873715.
Adds necessary region_name to octavia.conf when enable_barbican is set to true. LP#1867926
Adds /etc/timezone to Debian/Ubuntu containers. LP#1821592
Fixes an issue with Nova live migration not using migration_interface_address even when TLS was not used. When migrating an instance to a newly added compute host, if addressing depended on /etc/hosts and it had not been updated on the source compute host to include the new compute host, live migration would fail. This did not affect DNS-based name resolution. Similarly, Nova live migration would fail if the address in DNS or /etc/hosts was not the same as migration_interface_address due to user customization. LP#1729566
Fix qemu loading of ceph.conf (permission error). LP#1861513
Remove /run bind mounts in Neutron services causing dbus host-level errors and add /run/netns for neutron-dhcp-agent and neutron-l3-agent. LP#1861792
Fixes an issue where old fluentd configuration files would persist in the container across restarts despite being removed from the node_custom_config directory. LP#1862211
Use more permissive regex to remove the offending 127.0.1.1 line from /etc/hosts. LP#1862739
Each Prometheus mysqld exporter now points to its local mysqld instance (MariaDB) instead of the VIP address. LP#1863041
Cinder Backup now has access to kernel modules, e.g. to load the iscsi_tcp module. LP#1863094
Makes RabbitMQ hostname address resolution precheck stronger by requiring uniqueness of resolution to avoid later issues. LP#1863363
Fix protocol used by neutron-metadata-agent to connect to Nova metadata service. This possibly affected internal TLS setup. Fixes LP#1864615
Fixes haproxy role to avoid restarting haproxy service multiple times in a single Ansible run. LP#1864810 LP#1875228
Fixes an issue with deploying Grafana when using IPv6. LP#1866141
Fixes elasticsearch deployment in IPv6 environments. LP#1866727
Fixes failure to deploy telegraf with monitoring of zookeeper due to wrong variable being referenced. LP#1867179
Fixes a ceph deployment reconfiguration error, where the Gathering OSDs step would fail due to the Kolla-Ansible user not having access to /var/lib/ceph/osd/_FSID_/whoami. LP#1867946
Fix missing glance_ca_certificates_file variable in glance.conf. LP#1869133
Adds the client ca_cert file in Heat. LP#1869137
Fixes designate-worker not to use etcd as its coordination backend because it is not supported by Designate (no group membership support available via tooz). LP#1872205
Fixes source-IP-based load balancing for Horizon when using the “split” HAProxy service template.
Fixes issue where HAProxy would have no backend servers in its config files when using the “split” config template style.
Manage nova scheduler workers through openstack_service_workers variable. LP#1873753
Remove the meta field of the Swift rings from the default rsync_module template. Having it by default, undocumented, can lead to unexpected behavior when the Swift documentation states that this field is not processed.
Fix elasticsearch schema in fluentd when kolla_enable_tls_internal is true.
Fixes an issue with HAProxy prechecks when scaling out using --limit or --serial. LP#1868986.
Fixes an issue with the HAProxy monitor VIP precheck when some instances of HAProxy are running and others are not. See bug 1866617.
Fixes MariaDB issues in multinode scenarios which affected deployment, reconfiguration, upgrade and Galera cluster resizing. They were usually manifested by WSREP issues in various places and could lead to need to recover the Galera cluster. Note these issues were due to how MariaDB was handled during Kolla Ansible runs and did not affect Galera cluster during normal operations unless MariaDB was later touched by Kolla Ansible. Users wishing to run actions on their Galera clusters using Kolla Ansible are strongly advised to update. For details please see the following Launchpad bug records: bug 1857908 and bug 1859145.
Fixes an issue with Nova when deploying new compute hosts using --limit. LP#1869371.
Adapt Octavia to the latest dual CA certificate configuration. The following files should exist in /etc/kolla/config/octavia/:
client.cert-and-key.pem
client_ca.cert.pem
server_ca.cert.pem
server_ca.key.pem
See the Octavia documentation for details on generating these files.
Since OpenStack services can now be configured to use TLS enabled REST endpoints, URLs should be constructed using the {{ internal_protocol }} and {{ external_protocol }} configuration parameters.
Construct service REST API URLs using kolla_internal_fqdn instead of kolla_internal_vip_address. Otherwise SSL validation will fail when certificates are issued using domain names.
Fixes an issue with the kolla-ansible stop command where it may fail trying to stop non-existent containers. LP#1868596.
Fixes gnocchi-api script name for Ubuntu/Debian binary deployments. LP#1861688
Fixes an issue where host configuration tasks (sysctl, loading kernel modules) could be performed during the kolla-ansible genconfig command. See bug 1860161 for details.
Fixes an issue where openstack_release was set to master by default, resulting in containers tagged master being deployed. It has been changed to train. The same applies to kolla_source_version, which affects development mode. See bug 1866054 for details.
Fixes an issue with port prechecks for the Placement service. See bug 1861189 for details.
Removes the [http]/max-row-limit = 10000 setting from the default InfluxDB configuration, which resulted in the CloudKitty v1 API returning only 10000 dataframes when using InfluxDB as a storage backend. See bug 1862358 for details.
Skydive’s API and the web UI now rely on Keystone for authentication. Only users in the Keystone project defined by skydive_admin_tenant_name will be able to authenticate. See LP#1870903 <https://launchpad.net/bugs/1870903> for more details.
masakari-monitor will now use the internal API to reach masakari-api. LP#1858431
Switch endpoint_type from public to internal for octavia communicating with the barbican service. See bug 1875618 for details.
9.0.1¶
Bug Fixes¶
External Ceph: the cinder keyring is now also copied to nova-compute. Since Train, nova-compute also needs the cinder key when the rbd user is set to Cinder, because volume/pool checks have been moved to use the rbd Python library. Fixes LP#1859408
Adds configuration to set also_notifies within the pools.yaml file when using the Infoblox backend for Designate.
Pushing a DNS NOTIFY packet to the master does not cause the DNS update to be propagated onto other nodes within the cluster. This means each node needs a DNS NOTIFY packet otherwise users may be given a stale DNS record if they query any worker node. For details please see bug 1855085
Fixes an issue with Docker client timeouts where Docker reports ‘Read timed out’. The client timeout may be configured via docker_client_timeout. The default timeout has been increased to 120 seconds. See bug for details.
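If 120 seconds is still too short on a slow host, the timeout could be raised further, as in this sketch. The value 300 is an illustrative choice, not a recommendation from this note, and the use of globals.yml is an assumption:

```yaml
# globals.yml (assumed location for this override)
# Docker client timeout in seconds; the new default is 120.
docker_client_timeout: 300
```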
Fixes IPv6 deployment on CentOS 7. The issues with RabbitMQ and MariaDB have been worked around. For details please see the following Launchpad bug records: bug 1848444, bug 1848452, bug 1856532 and bug 1856725.
Fixes an issue with fluentd parsing of WSGI logs for Aodh, Masakari, Qinling, Vitrage and Zun. See bug 1720371 for details.
Fixes glance_api to run as privileged and adds missing mounts so it can use an iscsi cinder backend as its store. LP#1855695
When upgrading from Rocky to Stein HAProxy configuration moves from using a single configuration to assembling a file from snippets for each service. Applying the HAProxy tag to the entire play ensures that HAProxy configuration is generated for all services when the HAProxy tag is specified. For details please see bug 1855094.
Fixes an issue with the ironic_ipxe container serving instance images. See bug 1856194 for details.
Fixes templating of Prometheus configuration when Alertmanager is disabled. In a deployment where Prometheus is enabled and Alertmanager is disabled, templating of the Prometheus configuration failed because the variable prometheus_alert_rules does not contain the key files. LP#1854540
9.0.0¶
Prelude¶
The Kolla Ansible 9.0.0 release is the first release in the Train cycle. Highlights include support for deployment of the Masakari instance High Availability service and Qinling Function as a Service. It is now possible to deploy multiple Nova cells, and the control plane may be configured to communicate via IPv6.
New Features¶
Adds a new kolla-ansible subcommand: deploy-containers. This action only performs the container comparison and deploys new containers if that comparison detects that a change is needed. It should be used to quickly deploy updated container images where no new config changes are needed.
Adds variables horizon_wsgi_processes and horizon_wsgi_threads to configure the number of processes and threads for WSGI in the Horizon container.
Adds configuration variables kolla_enable_tls_internal, kolla_internal_fqdn_cert, and kolla_internal_fqdn_cacert to optionally enable TLS encryption for OpenStack endpoints on the internal API network.
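A sketch of how the three variables might be combined. Both file paths are hypothetical placeholders, and both the quoted "yes" value and the use of globals.yml are assumptions based on how other Kolla Ansible flags in these notes are set:

```yaml
# globals.yml (assumed location for these overrides)
kolla_enable_tls_internal: "yes"
# Hypothetical paths; substitute the locations of your own files.
kolla_internal_fqdn_cert: "/path/to/internal-cert.pem"
kolla_internal_fqdn_cacert: "/path/to/internal-ca.pem"
```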
Adds support for overriding the dnsmasq.conf configuration file used by the Neutron DHCP agent via {{ node_custom_config }}/neutron/dnsmasq.conf or {{ node_custom_config }}/neutron/{{ inventory_hostname }}/dnsmasq.conf.
Adds support for deploying Qinling. Qinling is an OpenStack project to provide “Function as a Service”. This project aims to provide a platform to support serverless functions.
Add support to Kolla-Ansible for Cloudkitty InfluxDB storage system deployment.
Kolla Ansible can now configure deployed docker for Zun. Enable docker_configure_for_zun (disabled by default to retain backwards compatibility).
Adds support for providing custom configuration options for the Docker daemon via the docker_custom_config variable (JSON formatted).
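As an illustration only, the variable could carry a JSON object of Docker daemon options. The debug option is a standard Docker daemon setting chosen here as an example; it is not taken from this note, and the use of globals.yml is an assumption:

```yaml
# globals.yml (assumed location for this override)
# JSON is valid YAML, so the value can be written as a JSON object.
# "debug" is an example daemon option, used purely for illustration.
docker_custom_config: {"debug": true}
```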
Neutron port_forwarding service plugin, and l3 extension can be enabled with variable enable_neutron_port_forwarding.
Merge action plugins (for config/ini and yaml files) now allow relative imports in the same way that the upstream template module does, e.g. one can now include a subtemplate from the same directory as the base template.
HAProxy - Add the ability to define custom HAProxy services in {{ node_custom_config }}/haproxy/services.d/
Adds support for deployment in an IPv6 networking environment.
For details of IPv6 support consult the relevant docs.
Adds support for configuring libvirt with TLS support. This allows for secure communication between nova-compute and libvirt as well as between libvirt on different hypervisors, during live-migration. The default configuration passes data in plain text, over TCP, without authentication.
Adds support for deploying Masakari, the instance high availability service.
Cinder coordination backend can now be configured via cinder_coordination_backend variable. Coordination is optional and can now be set to either redis or etcd.
Adds support for configuring a coordination backend for Designate via the designate_coordination_backend variable. Coordination is mandatory when multiple workers are deployed as in a multinode environment. Possible values are redis or etcd.
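A minimal sketch selecting Redis as the Designate coordination backend. It assumes the chosen backend service is itself deployed and that the variable lives in globals.yml; neither is stated in this note:

```yaml
# globals.yml (assumed location for this override)
# Possible values per the note: redis or etcd.
designate_coordination_backend: "redis"
```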
Adds initial support for deployment of multiple Nova cells. Cells allow Nova clouds to scale to large sizes, by sharding the compute hosts into multiple pools called cells.
This feature is still evolving, and the associated configuration may be liable to change in the next release to support more advanced deployment topologies.
Adds support for deploying Prometheus blackbox exporter
An example blackbox-exporter module has been added (disabled by default) called os_endpoint. This allows for the probing of endpoints over HTTP and HTTPS. This can be used to monitor that OpenStack endpoints return a status code of either 200 or 300, and the word ‘versions’ in the payload.
Adds support for passing extra options to Prometheus.
It is now possible to pass RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS to RabbitMQ server’s Erlang VM via the newly introduced rabbitmq_server_additional_erl_args variable. See Kolla Ansible docs RabbitMQ section for details.
Adds a standardised method to configure notifications for different services.
Adds support for configuring additional Docker volumes for Kolla containers. These are configured via <service_name>_extra_volumes.
Adds a new variable to be used by the swift role, swift_extra_ring_files. It allows passing additional ring files to be deployed in the context of a multi-policy setup.
Adds support for Swift Recon.
Adds the necessary configuration to the Swift account, container and object configuration files to enable the Swift recon cli.
In order to give the object server on each Swift host access to the recon files, a Docker volume is mounted into each container which generates them. The volume is then mounted read only into the object server container. Note that multiple containers append to the same file. This should not be a problem since Swift uses a lock when appending.
Example usage: sudo docker exec swift_object_server swift-recon --all
Adds support for the Swift S3 API, enabled via the enable_swift_s3api flag.
Adds support for configuration of a trusted CA certificate file. CA bundle file must be added to both the Horizon and Kolla Toolbox containers for this to work correctly.
Adds support for iSCSI-based (including LVM) Cinder volumes to Zun deployment.
Upgrade Notes¶
Updates the minimum required version of Ansible to 2.6.
Removes support for RabbitMQ from the Bifrost container. During the Train cycle, Bifrost switched its default to use JSON-RPC rather than RabbitMQ for internal Ironic communication. This simplifies the deployment and should improve reliability.
Sets RabbitMQ cluster_partition_handling to pause_minority. This is to avoid split-brain. The setting is overridable using custom config. Note this new config requires at least a 3-node RabbitMQ cluster to provide HA (High Availability). See production architecture guide for more info.
Modifies the default storage backend for Cloudkitty to InfluxDB, to match the default in Cloudkitty from Stein onwards. This is controlled via cloudkitty_storage_backend. To use the previous default, set cloudkitty_storage_backend to sqlalchemy. See bug 1838641 for details.
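Keeping the previous backend across the upgrade could be sketched as follows, assuming the variable is set in globals.yml:

```yaml
# globals.yml (assumed location for this override)
# Retain the pre-Train default instead of the new InfluxDB default.
cloudkitty_storage_backend: "sqlalchemy"
```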
Modifies the default value for openstack_release from auto to the name of the release (e.g. train), or master on the master branch. The value of auto will no longer detect the version of the kolla-ansible Python package.
Docker engine configuration changes are now applied in /etc/docker/daemon.json file instead of altering the systemd unit (which gets removed if present).
Increases the default value of docker_graceful_timeout from 10 to 60. This sets the time that docker will wait for a container to gracefully stop before issuing a KILL signal.
RHEL-based targets no longer require EPEL repository. It can be safely removed from target hosts if not used otherwise.
InfluxDB TSI is now enabled by default. It is recommended for all customers by InfluxData. If you do not want to enable it you can set the variable influxdb_enable_tsi to False in globals.yml. Instructions to migrate existing data to the new, disk-based format can be found at https://docs.influxdata.com/influxdb/v1.7/administration/upgrading/. If you do not follow the migration procedure, InfluxDB should continue to work, but this is not recommended.
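The opt-out described above amounts to a single globals.yml entry, sketched here:

```yaml
# globals.yml
# Opt out of the InfluxDB TSI index (not recommended, per the note above).
influxdb_enable_tsi: False
```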
The Keystone fernet key rotation scheduling algorithm has been modified to avoid issues with over-rotation of keys.
The variables fernet_token_expiry, fernet_token_allow_expired_window and fernet_key_rotation_interval may be set to configure the token expiry and key rotation schedule. By default, fernet_token_expiry is 86400, fernet_token_allow_expired_window is 172800, and fernet_key_rotation_interval is the sum of these two variables. This allows for the minimum number of active keys, which is 3. See bug 1809469 for details.
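For reference, the defaults quoted above work out as follows when written explicitly. This is only a sketch: setting these variables is optional, and the values are taken to be in seconds, as is usual for Keystone token lifetimes.

```yaml
# globals.yml
fernet_token_expiry: 86400                 # 1 day
fernet_token_allow_expired_window: 172800  # 2 days
# Sum of the two values above: 86400 + 172800 = 259200 (3 days)
fernet_key_rotation_interval: 259200
```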
MariaDB is now exposed via HAProxy on the database_port and not the mariadb_port. Out of the box these are both the same, but if you have customised mariadb_port so that it is different to the database_port, and you have a service talking to it via HAProxy on that port, then you should review your configuration.
Modifies the path for custom configuration of swift.conf from /etc/kolla/config/swift/<service>.conf to /etc/kolla/config/swift/<service>/swift.conf, to avoid a collision with custom configuration for <service>.conf. Here, <service> may be proxy-server, account-*, container-* or object-*.
Freezer now uses MariaDB as the default database backend.
Elasticsearch remains as an optional backend due to the requirement of Freezer to use Elasticsearch version 2.3.0. Elasticsearch in kolla-ansible is 5.6.x and that doesn’t work with Freezer.
New variables have been added: freezer_database_backend, freezer_database_name, freezer_database_user, freezer_database_address, freezer_elasticsearch_replicas, freezer_es_protocol, freezer_es_address, freezer_es_port
The default connection limit for HAProxy backends is 2000; however, MariaDB defaults to a maximum of 10000 connections. The HAProxy limit has been changed to match the MariaDB limit.
haproxy_max_connections has also been increased to 40000 to accommodate this.
When installing kolla-ansible from source, the kolla_ansible Python module must now be installed in addition to the Python dependencies listed in requirements.txt. This is done via:
pip install /path/to/kolla-ansible
If the git repository is in the current directory, use the following to avoid installing the package from PyPI:
pip install ./kolla-ansible
Changes the database backup procedure to use mariabackup, which is compatible with MariaDB 10.3. The qpress-based compression used previously is now replaced with gzip. The documented restore procedure has been modified accordingly. See the Mariabackup documentation for further information.
Removes the cinder_iscsi_helper variable, which was deprecated in the Stein cycle in favour of cinder_target_helper.
The Heat role has stopped disabling deprecated plugins. To apply this change to existing deployments, the file /etc/kolla/heat-engine/_deprecated.yaml is automatically removed during the upgrade.
Removes support for installing Docker using the legacy Docker packages, and the variable docker_legacy_packages. Docker is now always installed using the Community Edition (CE) packages.
The nova-consoleauth service is no longer deployed. This has been deprecated in Nova since Rocky and has not been used by other Nova services since.
The legacy upgrade method for Nova has been removed in favour of the rolling upgrade, which has been the default since Stein. nova_enable_rolling_upgrade should no longer be set.
Support for deployment of OracleLinux containers has been removed.
The Neutron LBaaS project was retired. Upgrading a deployment to the Train release will not upgrade Neutron LBaaS. Learn more about its retirement and Octavia as its successor at https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation
In Train, Tacker started using the local filesystem to store VNF packages and CSAR files. Kolla Ansible provides no shared filesystem capabilities, hence only one instance of each Tacker service is deployed, all on the same host. Previous multinode deployments will be scaled down when running an upgrade.
Deprecation Notes
Deprecates support for deploying Ceph. In a future release support for deploying Ceph will be removed from Kolla Ansible. Prior to this we will ensure a migration path to another tool such as Ceph Ansible is available. For new deployments it is recommended to use another tool to deploy Ceph to avoid a future migration. This can be integrated with OpenStack by following the external Ceph guide.
Support for deploying ONOS integration config is deprecated. In the Ussuri cycle it will be removed. The Neutron (networking) project has not supported ONOS since 2017.
Support for deploying OpenDaylight (ODL) is deprecated. In the Ussuri cycle support for deploying ODL will be removed. The version of ODL provided by Kolla has not been supported by Neutron since the Rocky release. It is recommended to use ODL upstream documentation to get it deployed.
Configuring the Docker daemon via docker_custom_option (used in the systemd unit file) is deprecated in favour of the docker_custom_config variable, which adds options to /etc/docker/daemon.json.
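A minimal sketch of the replacement variable, assuming docker_custom_config takes a mapping whose keys are written into daemon.json. The two keys shown (debug and log-opts) are standard Docker daemon options chosen purely for illustration and do not come from these notes.

```yaml
# globals.yml
# Assumption: each key below ends up as an entry in /etc/docker/daemon.json.
docker_custom_config:
  debug: true
  log-opts:
    max-size: "50m"
```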
The enable_xtrabackup variable is deprecated in favour of enable_mariabackup.
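In globals.yml the rename might look like the sketch below; the "yes" value follows the convention used for other enable flags in these notes.

```yaml
# globals.yml
# Deprecated name, to be dropped from existing configuration:
# enable_xtrabackup: "yes"
enable_mariabackup: "yes"
```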
Removes the hnas_iscsi cinder backend. The Hitachi NAS Platform iSCSI driver was marked as not supported by Cinder in the Ocata release.
Neutron FWaaS v1 was deprecated and has been removed since the Stein cycle by [0]. The related options have therefore been removed from Kolla.
The Neutron LBaaS project was retired and support for it in Kolla-Ansible removed.
The enable_cadf_notifications variable is deprecated. CADF is the default notification format in keystone. To enable keystone notifications, users should now set keystone_default_notifications_topic_enabled to yes or enable Ceilometer via enable_ceilometer.
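As a sketch, the replacement setting described above would be written in globals.yml as:

```yaml
# globals.yml
# Replaces the deprecated enable_cadf_notifications variable.
keystone_default_notifications_topic_enabled: "yes"
```

Enabling Ceilometer through enable_ceilometer is the alternative named in the note.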
Bug Fixes
Adds system hostnames to /etc/hosts, if different from short hostnames. This can fix live migration of Nova instances in some contexts. See bug 1830023 for details.
When etcd is used with cinder_coordination_backend and/or designate_coordination_backend, the config has been changed to use the etcd3gw (aka etcd3+http) tooz coordination driver instead of etcd3, due to issues with the latter's availability and stability. etcd3 does not handle eventlet-based services, such as cinder's and designate's, well. See bugs 1852086 and 1854932 for details. See also the tooz change introducing etcd3gw.
Fixes an issue where a failure in pulling an image could lead to a container being removed and not replaced. See bug 1852572 for details.
Fixes Swift volume mounting failing on kernel 4.19 and later due to removal of nobarrier from XFS mount options. See bug 1800132 for details.
Other Notes
The upstream-deprecated Nova RetryFilter has been removed from Blazar-enabled and fake Nova config. It has no effect since Queens.
While Kolla Ansible now avoids duplicating Nova cells when messaging or database connection information is changed, operators of existing deployments should perform a manual cleanup of duplicate cells using the nova-manage cell_v2 command from a container running the nova_api image, leaving only two cells: one named cell0 and another one with the right connection information.
Tempest no longer disables IPv6 tests. The upstream default is used now.