Ussuri Series Release Notes¶
10.4.0-34¶
New Features¶
Adds support for libvirt SASL authentication. It is enabled by default. LP#1964013
Upgrade Notes¶
The addition of libvirt SASL authentication requires a new password in passwords.yml, libvirt_sasl_password. This may be generated using the existing kolla-genpwd and kolla-mergepwd tooling.
The addition of libvirt SASL authentication requires both the nova_libvirt and nova_compute containers to be updated simultaneously, using new images with the necessary Cyrus SASL dependencies, as well as configuration containing the SASL credentials.
Security Issues¶
Explicitly removes the net.ipv4.ip_forward sysctl from /etc/sysctl.conf on hosts with Neutron L3 Agent. In the absence of another source for this sysctl, it should revert to the default of 0 after the next reboot. This is a follow-up to a previous change which stopped setting the sysctl, but left existing systems with the original value of 1 set.
A deployer looking to more aggressively change the value may set neutron_l3_agent_host_ipv4_ip_forward to 0 using a Yoga release of Kolla Ansible. This option will be removed in future. Any deployments still relying on the previous value may set neutron_l3_agent_host_ipv4_ip_forward to 1. LP#1945453
Kolla Ansible used to run Ironic’s tftpd as an (unprivileged) root user. Now, it will explicitly use the nobody user.
Fixes an issue where the default configuration of libvirt did not use authentication for the API exposed over TCP on the internal API network. This allowed anyone with access to the internal API network read-write access to libvirt. While the internal API network is typically trusted, other services on this network generally at least require authentication.
SASL authentication is now enabled for libvirt by default. Kolla Ansible has supported libvirt TLS since the Train release, and this is recommended to provide a higher level of security. LP#1964013
Adds mitigation for the Apache Log4j2 Remote Code Execution (RCE) Vulnerability in Elasticsearch - CVE-2021-44228.
Bug Fixes¶
Removes custom value of max_allowed_secret_in_bytes in barbican.conf. The default maximum size in Barbican was doubled to avoid issues with some certificates. LP#1957795
Fixes broken elasticsearch_curator container by adding the necessary “LANG=en_US.UTF-8” to the crontab. LP#1919328
Fixes an inability to connect to the Zun console when kolla_enable_tls_external is true. Access to the console of any Zun container failed when kolla_enable_tls_external was true. This fix sets the protocol for wsproxy base_url in zun.conf according to the value of kolla_enable_tls_external. LP#1957117
Fixes bug #1987982, which caused the database log_bin_trust_function_creators variable not to be set back to “OFF” after a Keystone upgrade.
Adds back the option to configure the RabbitMQ clustering interface via Kolla. LP#1900160 (https://bugs.launchpad.net/kolla-ansible/+bug/1900160)
Fixes an issue seen when using Jinja2 3.1.0.
Fixes the configuration option setting the type of endpoint used by Neutron to send requests to Placement. LP#1960503
Fixes a configuration issue with Node Exporter causing all file system metrics of a host to be identical. LP#1961438
Fixes an issue where a failure of any Nova compute service to register itself would cause only the host querying the Nova API to fail. Now, only hosts that fail to register will fail the Kolla Ansible run. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. LP#1940119
The Prometheus OpenStack exporters are now behind HAProxy, providing a unique time series in the Prometheus database. This also ensures that only one exporter queries the OpenStack APIs in any given scrape interval. With the previous behaviour, each OpenStack exporter was scraped at the same time. This caused each exporter to query the OpenStack APIs simultaneously, introducing unnecessary load and duplicate time series in the Prometheus database, because the instance label was unique for each exporter. LP#1972818
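The duplicate time series described above follow from how a series is identified: by its metric name plus its full label set. The sketch below is only an illustration of that counting, not Prometheus or Kolla Ansible code. The metric name, host names and port are made up for the example.

```python
def count_series(samples):
    """Count distinct series, where a series is identified by its full label set."""
    return len({frozenset(labels.items()) for labels in samples})


# Hypothetical metric name and targets, used only for illustration.
METRIC = "openstack_nova_running_vms"

# Before: every exporter is scraped directly, so each sample carries its own instance label.
direct = [{"__name__": METRIC, "instance": f"controller{i}:9198"} for i in (1, 2, 3)]

# After: a single scrape target behind the load balancer, so there is one instance label.
via_vip = [{"__name__": METRIC, "instance": "internal-vip:9198"}]

print(count_series(direct), count_series(via_vip))
```

With three directly scraped exporters the same measurement is stored three times. With one target behind the load balancer it is stored once.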
10.4.0¶
New Features¶
Add new option prometheus_openstack_exporter_timeout to override default scrape_timeout for openstack exporter job.
Adds support for the Elasticsearch storage backend with CloudKitty. This feature lets you store CloudKitty rating documents directly within your Elasticsearch cluster.
If you already have an Elasticsearch cluster running for logging, it creates a new CloudKitty-specific index. That lets you use Kibana, Grafana or any other interface to browse your rating data and create appropriate dashboards, or build a billing service on top of it.
Adds support for Prometheus as a fetcher/collector for CloudKitty. This feature lets you use Prometheus metrics as your source of rating. Using Prometheus lets you rate pretty much any OpenStack object directly from the Kolla-provided exporters (Openstack_exporter) or your own custom exporters.
Adds config parameter haproxy_nova_spicehtml5_proxy_tunnel_timeout to configure the Tunnel TimeOut directive for the spicehtml5proxy haproxy service.
Adds a new variable, disable_firewall, which defaults to true. If set to false, then the host firewall will not be disabled during kolla-ansible bootstrap-servers.
Adds two new variables, service_images_pull_retries and service_images_pull_delay, which control the behaviour of image pulling tasks. These are useful if your registry is not 100% reliable (usually due to load). The defaults have been set to 3 retries and 5 seconds delay to ensure a better default experience (these are actually Ansible defaults when task retries are enabled).
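As a rough illustration of what a retry count and a delay mean for a flaky registry, here is a minimal sketch of a retry loop. It is not how Kolla Ansible pulls images (that is done with Ansible task retries). The pull_with_retries helper and the flaky_pull stand-in are hypothetical, and the exact way Ansible counts attempts may differ from this sketch.

```python
import time


def pull_with_retries(pull, retries=3, delay=5.0, sleep=time.sleep):
    """Call pull(), re-attempting up to `retries` times with `delay` seconds between attempts.

    Returns (result, attempts_made). Re-raises the last error once retries run out.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return pull(), attempts
        except Exception:
            if attempts > retries:
                raise
            sleep(delay)


# Demo: a stand-in "registry" that fails twice before succeeding.
outcomes = iter([IOError("busy"), IOError("busy"), "pulled"])


def flaky_pull():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


pauses = []  # record the requested delays instead of really sleeping
result, attempts = pull_with_retries(flaky_pull, retries=3, delay=5.0, sleep=pauses.append)
print(result, attempts, pauses)
```

In this run the third attempt succeeds after two 5-second pauses, so a briefly overloaded registry no longer fails the whole task.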
Adds support for configuring the filter and gather_subset arguments for the setup module via kolla_ansible_setup_filter and kolla_ansible_setup_gather_subset respectively. These can be used to reduce the number of facts, which can have a significant effect on performance of Ansible.
Adds a new variable, ironic_enable_keystone_integration. It adds Keystone connection information to ironic.conf when connecting to an existing Keystone (one that is not being installed at the same time).
Upgrade Notes¶
Updates all references to Ansible facts within Kolla Ansible from using individual fact variables to using the items in the ansible_facts dictionary. This allows users to disable fact variable injection in their Ansible configuration, which may provide some performance improvement. Check for facts referenced in local configuration files, and update to use ansible_facts before disabling fact variable injection.
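One way to audit local configuration templates for injected fact variables is a small rewrite helper like the sketch below. It is an illustration only and not part of Kolla Ansible. The KNOWN_FACTS set is a hypothetical, deliberately short list that you would extend yourself, and names such as ansible_user are connection variables rather than facts, so they are left alone.

```python
import re

# Hypothetical, deliberately short list of fact names to rewrite; extend as needed.
KNOWN_FACTS = {"hostname", "distribution", "default_ipv4"}

_FACT_RE = re.compile(r"\bansible_(" + "|".join(sorted(KNOWN_FACTS)) + r")\b")


def use_facts_dict(text):
    """Rewrite injected fact variables (ansible_<name>) to ansible_facts.<name>."""
    return _FACT_RE.sub(lambda m: "ansible_facts." + m.group(1), text)


template = "{{ ansible_hostname }} runs {{ ansible_distribution }} as {{ ansible_user }}"
converted = use_facts_dict(template)
print(converted)
```

Running the helper a second time leaves the text unchanged, so it is safe to apply repeatedly while reviewing overrides.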
Modifies the default value of ceph_nova_user from nova to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. Users who have overridden ceph_nova_keyring to use separate keyrings for Nova and Cinder should also override ceph_nova_user to match the Nova keyring. LP#1934145
Modifies the default value of rabbitmq_server_additional_erl_args from an empty string to +S 2:2 +sbwt none +sbwtdcpu none +sbwtdio none.
Security Issues¶
Fixes net.ipv4.ip_forward not to be enabled by Kolla Ansible on the default network namespace. It was enabled on hosts with Neutron L3 Agent (thus in most common setups with OVS and/or Linux Bridge, but not OVN) and allowed, unless users had extra iptables rules to avoid that, any traffic to be accepted for forwarding (as long as it was routable and passed other checks). Users of existing setups are advised to re-evaluate whether they need this sysctl enabled and disable it if not necessary. Kolla Ansible will simply no longer try to set this sysctl at all. Neutron L3 Agent handles forwarding enablement per managed namespace. LP#1945453
Bug Fixes¶
Fixes monasca-thresh to correctly submit the topology to Storm. The previous container ran the topology in local mode (within the container), and didn’t use the Storm cloud. The new container handles submitting the topology to Storm and also handles killing and replacing the topology when its configuration has changed. As a result, the monasca-thresh container is only used for submission, and exits after that’s completed. The logs for the topology will now be available in the Storm worker-artifact logs. LP#1808805
Fixes an issue where configuration in containers could become stale. This prevented containers with updated configuration from being restarted, e.g., if the kolla-ansible genconfig and kolla-ansible deploy-containers commands were used together. LP#1848775
Fixes elasticsearch fluentd output being enabled when elasticsearch is not enabled. LP#1927880
Fixes an issue seen when customising the Docker Yum repository URL on CentOS, where the docker_yum_gpgkey variable is not used consistently. LP#1934913
Fixes an issue where the SPICE console freezes after a while, see LP#1938549.
Fixed broken kolla-toolbox container when RabbitMQ is disabled and IPv6 is used. LP#1939883
Fixes mariadb-clustercheck not to run when there is no HAProxy. LP#1944114
No longer creates directories for haproxy and swift logs where they are not needed. LP#1945070
Fixes an error in the placement role which prevented deployment of the placement service when a custom policy file is used. LP#1948835
Fixes missing current Ansible version in the error message. LP#1948979
Fixes an issue with Cyborg deployment. LP#1937911
Fixes an issue with config.json for neutron-server when a VMware plugin agent is used.
Fixes an issue with the Neutron linuxbridge ML2 agent when neutron_external_interface includes multiple interfaces. LP#1863935
Fixes an issue with Manila configuration which was missing a [glance] section, preventing some drivers from operating.
Fixes an issue with default Nova configuration for Ceph where the RBD user is set to nova, but only a cinder keyring is copied. The default value of ceph_nova_user is changed to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. LP#1934145
Fixes an issue where RabbitMQ consumes a large amount of CPU, particularly on multi-core systems. The default RabbitMQ tuning assumes that RabbitMQ is running on a dedicated host, which is the opposite of a typical Kolla Ansible container setup. For more details on tuning RabbitMQ in your environment, please see: https://www.rabbitmq.com/runtime.html#busy-waiting https://www.rabbitmq.com/runtime.html#scheduling
Other Notes¶
Optimised image pulling to avoid looping over disabled services.
10.3.0¶
New Features¶
Adds the kolla_sysctl_conf_path variable, which allows customising the path to the sysctl.conf that will be modified by Kolla Ansible plays. The default is /etc/sysctl.conf, as it was before.
Adds a new flag, docker_disable_default_network, which defaults to no. Docker uses 172.17.0.0/16 by default for bridge networking on docker0, and this might cause routing problems for operator networks. Setting this flag to yes will disable Docker’s bridge networking. This feature will be enabled by default from the Wallaby 12.0.0 release.
Added a new haproxy configuration variable, haproxy_host_ipv4_tcp_retries2, which allows users to modify this kernel option. This option sets the maximum number of times a TCP packet is retransmitted in the established state before giving up. The default kernel value is 15, which corresponds to a duration of approximately 13 to 30 minutes, depending on the retransmission timeout. This variable can be used to mitigate an issue with stuck connections in case of VIP failover, see bug 1917068 for details.
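To see why the retransmission count translates into minutes, the following sketch sums an exponentially backed-off retransmission timeout. It assumes the usual Linux bounds of 200 ms minimum and 120 s maximum for the timeout and a connection whose timeout starts at the minimum. Real connections derive the timeout from the measured round-trip time, so actual durations are longer. The hypothetical_timeout function name is ours.

```python
def hypothetical_timeout(retries, rto_min=0.2, rto_max=120.0):
    """Seconds until a connection is given up after `retries` retransmissions.

    Assumes the timeout starts at rto_min, doubles after every retransmission,
    and is capped at rto_max. The final wait after the last retransmission counts too.
    """
    total = 0.0
    rto = rto_min
    for _ in range(retries + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


for retries in (15, 8, 6):
    seconds = hypothetical_timeout(retries)
    print(f"tcp_retries2={retries}: about {seconds:.1f} s ({seconds / 60:.1f} min)")
```

Under these assumptions the default of 15 gives roughly 15 minutes as a lower bound, which is why lowering the value makes connections that are stuck after a VIP failover fail much sooner.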
Adds the ability to override the automatic detection of fluentd_version and fluentd_binary. These can now be defined as extra variables. This removes the dependency of having docker configured for config generation.
Adds support for collecting Prometheus metrics from RabbitMQ. This is enabled by default when Prometheus and RabbitMQ are enabled, and may be disabled by setting enable_prometheus_rabbitmq_exporter to false.
Allows skipping and unsetting sysctl variables controlled by Kolla Ansible plays using the KOLLA_SKIP and KOLLA_UNSET values.
Bug Fixes¶
Fixes an issue with kolla-ansible bootstrap-servers if Zun is enabled where Zun-specific configuration for Docker was applied to all nodes. LP#1914378
Fixes an issue when Swift is deployed with S3 Token Middleware enabled. LP#1862765
Fixes the Northbound and Southbound database socket paths in OVN.
Fixes a chronyd crash loop if the server is rebooted (Debian). LP#1915528
Fixed an issue when Docker was configured after startup on Debian/Ubuntu, which resulted in iptables rules being created - before they were disabled. LP#1923203
Fixes a bug where sriov_agent.ini wasn’t copied due to a Permission denied error. LP#1923467
Fixed an issue where docker python SDK 5.0.0 was failing due to missing six - introduced a constraint to install version lower than 5.x. LP#1928915
Fixes more-than-2-node RabbitMQ upgrade failing randomly. LP#1930293.
Fixes Swift deploy when TLS enabled. Added the missing handler and corrected the container name. LP#1931097
Fixes missing region_name in keystone_auth sections. See bug 1933025 for details.
Fixes iscsid failing in current CentOS 8 based images due to pid file being needlessly set. LP#1933033
Fixes host bootstrap on Debian not removing the conflicting packages. It now behaves in accordance with the docs. LP#1933122
Fixes an issue where kolla-ansible exits with a zero exit code when executed with a bogus command name. LP#1929397
Fixes a potential issue with Alertmanager in non-HA deployments. In this scenario, the peer gossip protocol is now disabled and Alertmanager won’t try to form a cluster with non-existing other instances. LP#1926463
Adds a new flag, docker_disable_ip_forward, which defaults to no and can be used (by setting yes) to disable Docker’s ip-forward option, which makes Docker set the net.ipv4.ip_forward sysctl to 1. This is to protect from creating all-forwarding hosts. LP#1931615
Fixes an issue when generating /etc/hosts during kolla-ansible bootstrap-servers when one or more hosts has an api_interface with dashes (-) in its name. LP#1927357
Fixes some configuration issues around Barbican logging. LP#1891343
Fixes some configuration issues around Cinder logging. LP#1916752
Fixes the wrong configuration of the ovs-dpdk service, which broke the deployment of kolla-ansible. For more details please see bug 1908850.
Fixes an issue with keepalived which was not recreated during an upgrade if configuration is unchanged. LP#1928362
Fixes an issue with Magnum when TLS is enabled. LP#781062
Fixes an issue with executing kolla-ansible when installed via pip install --user. LP#1915527
Fixes an issue where masakari.conf was generated for the masakari-instancemonitor service but not used.
Fixes an issue where masakari-monitors.conf was generated for the masakari-api and masakari-engine services but not used.
Uses a consistent variable name for container dimensions for masakari-instancemonitor: masakari_instancemonitor_dimensions. The old name of masakari_monitors_dimensions is still supported.
Fixes an issue with Octavia deployment when using a custom service auth project. If octavia_service_auth_project is set to a project that does not exist, Octavia deployment would fail. The project is now created. LP#1922100
Fixes LP#1892376 by updating deprecated syntax in the Monasca Elasticsearch template.
Removes whitespace around equal signs in zookeeper.cfg which was preventing the zkCleanup.sh script from running correctly.
Other Notes¶
Following Cinder upstream, support for using ZFSSA with Cinder has been removed. ZFSSA was unsupported in Train and later removed in Ussuri.
10.2.0¶
New Features¶
Adds a new flag, docker_disable_default_iptables_rules, which defaults to no. Docker manipulates iptables rules by default to provide network isolation, and this might cause problems if the host already has an iptables based firewall. A common problem is that Docker sets the default policy of the FORWARD chain in the filter table to DROP. Setting docker_disable_default_iptables_rules to yes will disable Docker’s iptables manipulation. This feature will be enabled by default from the Victoria 11.0.0 release.
Improves performance of the common role by generating all fluentd configuration in a single file.
Improves performance of the common role by generating all logrotate configuration in a single file.
Known Issues¶
Since Ussuri, there is a bug in how Ceph (RBD) is handled with Cinder: the backend_host option is missing from the generated configuration for external Ceph. The symptoms are that volumes become unmanageable until extra admin action is taken. This does not affect the data plane: running virtual machines are not affected.
There is a related issue regarding active-active cinder-volume services (single-host cinder-volume not affected), which is that they should not have been configured with backend_host in the first place but with cluster and proper coordination instead. Some users might have customised their config already to address this issue.
The Kolla team is investigating the best way to address this for all its users. In the meantime, please ensure that, before upgrading to Ussuri, the backend_host option is set to its previous value (the default was rbd:volumes) via a config override.
For more details please refer to the referenced bug. Do note this issue affects both new deployments and upgrades. LP#1904062
Upgrade Notes¶
When deploying Monasca with Logstash 6, any custom Logstash 2 configuration for Monasca will need to be updated to work with Logstash 6. Please consult the documentation.
The baremetal role now uses the CentOS 8 package repository for Docker CE (compared to 7 previously).
The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible. Using public endpoints can be retained by setting the prometheus_openstack_exporter_endpoint_type variable to public.
The default value of REST_API_REQUIRED_SETTINGS was synchronized with Horizon. You may want to review settings exposed by the updated configuration.
Security Issues¶
The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.
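The difference between the two modes can be checked with Python’s standard stat module. This is a small illustrative sketch, and the who_can_read helper is our own name, not part of Kolla Ansible.

```python
import stat


def who_can_read(mode):
    """Return which permission classes have the read bit set in a numeric file mode."""
    classes = [("owner", stat.S_IRUSR), ("group", stat.S_IRGRP), ("other", stat.S_IROTH)]
    return [name for name, bit in classes if mode & bit]


print("644:", who_can_read(0o644))
print("600:", who_can_read(0o600))
```

With mode 644 the group and all other users can read the credentials. With mode 600 only the owning user can.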
Bug Fixes¶
Add support to use bifrost-deploy behind proxy. It uses existing container_proxy variable.
Fixes handling of /dev/kvm permissions to be more robust against host-level actions. LP#1681461
IPv6 fully-routed topology (/128 addressing) is now allowed (where applicable). LP#1848941
When deploying Elasticsearch 6, Logstash 2 was deployed by default which is not compatible with Elasticsearch 6. Logstash 6 is now deployed by default.
Fix Castellan (Barbican client) when used with enabled TLS. LP#1886615
Fixes --configdir parameter to apply to the default passwords.yml location. LP#1887180
fluentd is now logging to /var/log/kolla/fluentd/fluentd.log instead of stdout. LP#1888852
Fixes deploy-containers action missing for the Masakari role. LP#1889611
Fixes an issue where the keystone container would be stuck in a restart loop with a message that the fernet key is stale. LP#1895723
Fixes haproxy_single_service_split template to work with the default for mode (http). LP#1896591
Fixed invalid fernet cron file path on Debian/Ubuntu from /var/spool/cron/crontabs/root/fernet-cron to /var/spool/cron/crontabs/root. LP#1898765
Add with_first_found on placement for placement-api wsgi configuration to allow overwrite from users. LP#1898766
OVN will no longer schedule SNAT routers on compute nodes when neutron_ovn_distributed_fip is enabled. LP#1901960
RabbitMQ services are now restarted serially to avoid a split brain. LP#1904702
Fixes LP#1906796 by adding notice and note loglevels to monasca log-metrics drop configuration
Fixes Swift’s stop action. It will no longer try to start the swift-object-updater container again. LP#1906944
Fixes an issue with the kolla-ansible prechecks command with Docker 20.10. LP#1907436
Fixes an issue with kolla-ansible mariadb_recovery when the mariadb container does not exist on one or more hosts. LP#1907658
Fixes Freezer deployment failing when kolla_dev_mod is used. LP#1888242
Fixes issues with some CloudKitty commands trying to connect to an external TLS endpoint using HTTP. LP#1888544
Fixes an issue where Docker may fail to start if iptables is not installed. LP#1899060
The admin-openrc.sh file generated by kolla-ansible post-deploy was previously created with root:root ownership and 644 permissions. This would allow anyone with access to the same directory to read the file, including the admin credentials. The ownership of admin-openrc.sh is now set to the user executing kolla-ansible, and the file is assigned a mode of 600. This change can be applied by running kolla-ansible post-deploy.
Fixes an issue during deleting evacuated instances with encrypted block devices. LP#1891462
Fixes an issue where Keystone Fernet key rotation may fail due to permission denied error if the Keystone rotation happens before the Keystone container starts. LP#1888512
Fixes an issue with Keystone startup when Fernet key rotation does not occur within the configured interval. This may happen due to one of the Keystone hosts being down at the scheduled time of rotation, or due to uneven intervals between cron jobs. LP#1895723
Fixes an issue with Kibana upgrade on Debian/Ubuntu systems. LP#1901614
Reverts the arp_responder option setting to the default (‘False’) for the LinuxBridge agent, as this is known to cause problems with l2_population as well as other issues such as not being fully compatible with the allowed-address-pairs extension. LP#1892776
Fixes an issue with the Neutron Linux bridge ML2 driver where the firewall driver configuration was not applied. LP#1889455
Fixes an issue with Masakari and internal TLS where CA certificates were not copied into containers, and the path to the CA file was not configured. Depends on masakari bug 1873736 being fixed. LP#1888655
Fixes an issue where Grafana instances would race to bootstrap the Grafana DB. See LP#1888681.
Fixes LP#1892210 where the number of open connections to Memcached from neutron-server would grow over time until reaching the maximum set by memcached_connection_limit (5000 by default), at which point the Memcached instance would stop working.
Fixes an issue where, when Kafka default topic creation was used to create a Kafka topic, no redundant replicas were created in a multi-node cluster. LP#1888522. This affects Monasca, which uses Kafka, and was previously masked by the legacy Kafka client used by Monasca, which has since been upgraded in Ussuri. Monasca users with multi-node Kafka clusters should consult the Kafka documentation to increase the number of replicas.
Fixes an issue where the br_netfilter kernel module was not loaded on compute hosts. LP#1886796
The Prometheus OpenStack exporter now uses internal endpoints to communicate with OpenStack services, to match the configuration of other services deployed by Kolla Ansible.
Prevents adding a new Keystone host to an existing cluster when not targeting all Keystone hosts (e.g. due to --limit or --serial arguments), to avoid overwriting existing Fernet keys. LP#1891364
Reduce the use of SQLAlchemy connection pooling, to improve service reliability during a failover of the controller with the internal VIP. LP#1896635
No longer configures the Prometheus OpenStack exporter to use the prometheus Docker volume, which was never required.
Updates the default value of REST_API_REQUIRED_SETTINGS in Horizon local_settings, which enables some features such as selecting the default boot source for instances. LP#1891024
Other Notes¶
Add trove-guestagent.conf for trove
10.1.0¶
New Features¶
Adds ability to provide a custom elasticsearch config.
Upgrade Notes¶
Changes the default value of kibana_elasticsearch_ssl_verify from false to true. LP#1885110
Apache ZooKeeper will now be automatically deployed whenever Apache Storm is enabled.
Bug Fixes¶
Fixes an issue when using IP addresses instead of hostnames in the Ansible inventory. The OpenvSwitch role sets system-id based on inventory_hostname, which in the case of IP addresses is the first IP octet. Such a deployment would result in multiple OVN chassis with a duplicate name, e.g. “10”, connecting to the OVN Southbound database. This spawns high numbers of create/delete events in the Encap database table, leading to near 100% CPU usage of OVN/OVS/Neutron processes.
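The collision is easy to reproduce in isolation. The sketch below assumes the identifier was derived by keeping the text before the first dot of the inventory name, which matches the symptom described above but is our assumption about the mechanism. The host names are made up.

```python
from collections import Counter


def short_name(inventory_hostname):
    """Keep only the part before the first dot, as one would to shorten an FQDN."""
    return inventory_hostname.split(".")[0]


# Hypothetical inventory mixing IP addresses and a proper hostname.
hosts = ["10.0.0.11", "10.0.0.12", "compute01.example.com"]
ids = [short_name(h) for h in hosts]
duplicates = sorted(name for name, count in Counter(ids).items() if count > 1)

print(ids)
print("duplicate identifiers:", duplicates)
```

Shortening works as intended for the FQDN but maps both IP-addressed hosts to the same identifier, which is the duplicate chassis name described above.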
Fixes an issue with Manila deployment starting openvswitch and neutron-openvswitch-agent containers when enable_manila_backend_generic was set to False. LP#1884939
Fixes the Elasticsearch Curator cron schedule run. LP#1885732
Fixes an incorrect configuration for nova-conductor when a custom Nova policy was applied, preventing the nova_conductor container from starting successfully. LP#1886170
Fixes an incorrect Ceph keyring file configuration in gnocchi.conf, which prevented Gnocchi from connecting to Ceph. LP#1886711
In line with the clients for other services used by Magnum, the Cinder and Octavia clients now also use endpoint_type = internalURL. In the same vein, these clients also use the globally defined openstack_region_name.
Fixes the configuration of the etcd service so that its protocol is independent of the value of the internal_protocol parameter. The etcd service is not load balanced by HAProxy, so there is no proxy layer to do TLS termination when internal_protocol is configured to be https.
Fixes LP#1885885 where the default chunk size in the Monasca Fluentd output plugin increased from 8MB to 256MB for file buffering which exceeded the limit allowed by the Monasca Log / Unified API.
Adds a new variable fluentd_elasticsearch_cacert, which defaults to the value of openstack_cacert. If set, this will be used to set the path of the CA certificate bundle used by Fluentd when communicating with Elasticsearch. LP#1885109
Improves error reporting in kolla-genpwd and kolla-mergepwd when input files are not in the expected format. LP#1880220.
Fixes Magnum trust operations in multi-region deployments.
Deploys Apache ZooKeeper if Apache Storm is enabled explicitly. ZooKeeper would only be deployed if Apache Kafka was also enabled, which is often done implicitly by enabling Monasca.
10.0.0¶
Prelude¶
The Kolla Ansible 10.0.0 release is the first release in the Ussuri cycle. Notable changes include:
all playbooks and scripts now use Python 3 and support for Python 2 has been dropped
CentOS 8 is now supported as a host operating system and container image, and support for CentOS 7 has been dropped
Ceph deployment support has been dropped
configuration of external Ceph integration has been streamlined
initial support for TLS encryption of backend API services, providing end-to-end encryption of API traffic for Barbican, Cinder, Glance, Heat, Horizon, Keystone, Nova and Placement
support for deployment of Open Virtual Network (OVN) and integration of it with Neutron
New Features¶
Adds Elasticsearch Curator for managing aggregated log data.
Adds configuration variables cron_logrotate_rotation_interval and cron_logrotate_rotation_count to set the logrotate rotation interval and count.
Adds a mechanism to customize prometheus.yml. Please read the documentation for more details.
Adds support for two new Senlin services: senlin-conductor and senlin-health-manager. Both of these services are required for Senlin to be fully functional starting with the Ussuri release.
Adds a mechanism to copy user defined files via the extras directory of the Prometheus config. This can be useful for certain Prometheus config customizations that reference additional files. An example is setting up file based service discovery.
Adds a new variable, influxdb_datadir_volume. This allows you to control where the Docker volume for InfluxDB is created. One performance tuning option is to set this to a path on a high performance flash drive.
Adds a new variable, kafka_datadir_volume. This allows you to control where the Kafka data is stored. Generally you will want this to be a spinning disk, or an array of spinning disks.
Adds a new container, zun-cni-daemon, for the Zun service. This container is a daemon service for implementing the CNI plugin for Zun.
Allows operators to use custom parameters with the ceilometer-upgrade command. This is quite useful when using the dynamic pollster subsystem; that subsystem provides flexibility to create and edit pollster configs, which affects Gnocchi resource-type configurations. However, Ceilometer uses default and hard-coded resource-type configurations; an operator who customizes some of its default resource types can run into trouble during upgrades. Therefore, the only way to work around it is to use the --skip-gnocchi-resource-types flag.
Adds new checks to kolla-ansible prechecks that validate that expected Ansible groups exist.
Kolla Ansible now checks that the local Ansible Python environment is coherent, i.e. that the Ansible in use can see Kolla Ansible. LP#1856346
Adds support for CentOS 8 as a host Operating System and base container image. This is the only major version of CentOS supported from the Ussuri release. The Train release supports both CentOS 7 and 8 hosts, and provides a route for migration.
Introduces user modifiable variables instead of fixed names for Ceph keyring files used by external Ceph functionality.
Configures all OpenStack services to use the globally defined Certificate Authority file to verify HTTPS connections. The global CA file is configured by the openstack_cacert parameter.
When kolla_copy_ca_into_containers is configured to yes, the certificate authority files in /etc/kolla/certificates/ca will be copied into service containers to enable trust for those CA certificates. This is required for any certificates that are either self-signed or signed by a private CA, and are not already present in the service image trust store. Otherwise, either CA validation will need to be explicitly disabled or the path to the CA certificate must be configured in the service using the openstack_cacert parameter.
Adds a prune-images command for Docker image pruning on hosts. See blueprint for details.
Fluentd now buffers logs locally to file when the Monasca API is unreachable.
Adds configuration options to enable backend TLS encryption from HAProxy to the Keystone, Glance, Heat, Placement, Horizon, Barbican, and Cinder services. When used in conjunction with enabling TLS for service API endpoints, network communication will be encrypted end to end, from client through HAProxy to the backend service.
Delegates execution of the Ansible uri module to service containers using kolla_toolbox. This will enable any certificates that are already copied and extracted into the service container to be automatically validated. This is particularly useful in the case that the certificate is either self-signed or signed by a local (private) CA.
Introduce External Ceph user IDs as variables to allow non-standard Ceph authentication IDs in OpenStack service configuration without the need to override configuration files.
Adds a --clean argument to kolla-mergepwd. It allows cleaning old (no longer used) keys from the passwords file.
Adds support for generating self-signed certificates for both the internal and external (public) networks via the kolla-ansible certificates command. If they are the same network, then the certificate files will be the same.
Self-signed TLS certificates can be used to test TLS in a development OpenStack environment. The kolla-ansible certificates command will generate the required self-signed TLS certificates. This command has been updated to first create a self-signed root certificate authority. The command then generates the internal and external facing certificates and signs them using the root CA. If backend TLS is enabled, the command will generate the backend certificate and sign it with the root CA.
HAProxy - Add the ability to define custom HAProxy services in {{ node_custom_config }}/haproxy/services.d/
Adds a new precheck for supported host OS distributions. Currently supported distributions are CentOS/RHEL 8, Debian Buster and Ubuntu Bionic. This check can be disabled by setting prechecks_enable_host_os_checks to false.
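As a minimal sketch, assuming overrides are kept in the deployment's globals.yml (the file name is an assumption here, it is not stated in this note), the precheck could be disabled like this:

```yaml
# Assumed location: globals.yml
# Skip the supported host OS distribution precheck.
prechecks_enable_host_os_checks: false
```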
Adds support for deployment of OVN and integration of it with Neutron. This includes deployment of:
OVN databases (ovn-sb-db and ovn-nb-db)
Southbound and Northbound databases connector (ovn-northd)
Hypervisor components ovn-controller and neutron-ovn-metadata-agent
Add Object Storage service (Swift) support for Ironic.
Adds support for managing Ceilometer dynamic pollster configuration in Kolla Ansible. This feature will look for configurations in
{{ node_custom_config }}/ceilometer/pollster.d/
by default. If there are configs there, they are copied to the control nodes, to configure Ceilometer dynamic pollster sub-system.
Enable Galera node state checking by using
clustercheck
script that is used by HAProxy to define node up/down state.
Introduces a new configuration variable
mariadb_wsrep_extra_provider_options
allowing users to set additional WSREP options.
Adds support for the Neutron policy file in both .json and .yaml format.
Adds a new variable, openstack_tag, which is used as the default Docker image tag in place of openstack_release. The default value is openstack_release, with a suffix set via openstack_tag_suffix. The suffix is empty except on CentOS 8 where it is set to -centos8. This allows for the availability of images based on CentOS 7 and 8.
Prometheus server can now be disabled, allowing the exporters to be deployed without it. The default behaviour of deploying Prometheus server when Prometheus is enabled remains.
Known Issues¶
Python Requests library will not trust self-signed or privately signed CAs even if they are added into the OS trusted CA folder and update-ca-trust is executed. For services that rely on the Python Requests library, either CA verification must be explicitly disabled in the service or the path to the CA certificate must be configured using the
openstack_cacert
parameter.
Upgrade Notes¶
Adds a maximum supported version check for Ansible. Kolla Ansible now requires at least Ansible 2.8 and supports up to 2.9. See blueprint for details.
Avoids unnecessary fact gathering using the setup module. This should improve the performance of environments using fact caching and the Ansible smart fact gathering policy. See blueprint for details.
CentOS 7 is no longer supported as a host Operating System or base container image. CentOS users should migrate to CentOS 8. The Train release supports both CentOS 7 and 8 images, and provides a route for migration.
Some images were supported by CentOS 7 but lack suitable packages in CentOS 8, and are no longer supported for CentOS. See Kolla release notes for details.
Support for the SCSI target daemon (tgtd) has been removed for CentOS/RHEL 8. The default value of cinder_target_helper is now lioadm on CentOS/RHEL 8, but remains as tgtadm on other platforms.
For cinder (cinder-volume and cinder-backup), glance-api and manila, keyring behavior has changed: Kolla Ansible will no longer copy those keys using wildcards (ceph.*) and will instead use newly introduced variables. Your environment may be rendered unusable after an upgrade if the keys in /etc/kolla/config do not match the default values of the introduced variables.
The default migration_interface is moved from network_interface to api_interface, which is treated as the internal, secure network plane in most cases.
The gnocchi-statsd daemon is no longer enabled by default. If you are using the daemon, you will need to set
enable_gnocchi_statsd: "yes"
to continue using it in your deployment.
Erlang 22.x dropped support for HiPE so the
rabbitmq_hipe_compile
variable has been removed.
Changes default value of enable_haproxy_memcached to no. Memcached has not been accessed via haproxy since at least the Rocky release. Users depending on haproxy for memcached for other software may want to change this back to yes.
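For deployments that still rely on HAProxy fronting memcached, a sketch of restoring the previous behaviour, assuming the override is placed in globals.yml (an assumed location), could look like this:

```yaml
# Assumed location: globals.yml
# Restore the pre-Ussuri default of exposing memcached through HAProxy.
enable_haproxy_memcached: "yes"
```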
Python 2.7 support has been dropped. The last release of Kolla Ansible to support Python 2.7 is OpenStack Train. The minimum version of Python now supported by Kolla Ansible is Python 3.6.
The default behavior for generating the cinder.conf template has changed. An rbd-1 section will be generated when external Ceph functionality is used, i.e. cinder_backend_ceph is set to true. Previously it was only included when Kolla Ansible internal Ceph deployment mechanism was used.
The rbd section of nova.conf for nova-compute is now generated when nova_backend is set to "rbd". Previously it was only generated when both enable_ceph was "yes" and nova_backend was set to "rbd".
The
kolla_logs
Docker volume is now mounted into the Elasticsearch container to expose logs which were previously written erroneously to the container filesystem. It is up to the user to migrate any existing logs if they so desire and this should be done before applying this fix. LP#1859162
The default value for kolla_external_fqdn_cacert has been changed from “{{ node_config }}/certificates/haproxy-ca.crt” to “{{ node_config }}/certificates/ca/haproxy.crt”, and the default value for kolla_internal_fqdn_cacert has been changed from “{{ node_config }}/certificates/haproxy-ca-internal.crt” to “{{ node_config }}/certificates/ca/haproxy-internal.crt”.
These variables set the value for the OS_CACERT environment variable in admin-openrc.sh. This has been done to allow these certificates to be copied into containers when kolla_copy_ca_into_containers is true.
Replaced kolla_external_fqdn_cacert and kolla_internal_fqdn_cacert with kolla_admin_openrc_cacert, which by default is not set. OS_CACERT is now set to the value of kolla_admin_openrc_cacert in the generated admin-openrc.sh file.
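A hedged sketch of setting the new variable follows. The globals.yml location is an assumption, and the path is only illustrative: it reuses a CA path mentioned elsewhere in these notes and should be replaced with the CA bundle that actually signs the deployment's API certificates.

```yaml
# Assumed location: globals.yml
# Illustrative path only; point this at the CA certificate clients should trust.
# The value is exported as OS_CACERT in the generated admin-openrc.sh.
kolla_admin_openrc_cacert: "{{ node_config }}/certificates/ca/haproxy.crt"
```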
Glance deployment now uses Multi-Store support. Users that have default_stores in their service config overrides for glance-api.conf should remove it and use default_backend if needed.
The enable_cadf_notifications variable was removed. CADF is the default notification format in keystone. To enable keystone notifications, users can now set keystone_default_notifications_topic_enabled to yes or enable Ceilometer via enable_ceilometer.
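As a small sketch, assuming the setting is placed in globals.yml (an assumed location), Keystone notifications could be enabled without Ceilometer like this:

```yaml
# Assumed location: globals.yml
# Replaces the removed enable_cadf_notifications variable.
keystone_default_notifications_topic_enabled: "yes"
```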
Removes support for the enable_xtrabackup variable that was deprecated in favour of enable_mariabackup in the Train (9.0.0) release.
Support for deploying Ceph has been removed, after it was deprecated in Stein. Please use an external tool to deploy Ceph and integrate it with Kolla Ansible deployed OpenStack by following the external Ceph guide.
The octavia user is no longer given the admin role in the admin project. Octavia does not require this role and instead uses octavia user with admin role in service project. During an upgrade the octavia user is removed from the admin project.
For existing deployments this may cause problems, so an octavia_service_auth_project variable has been added which may be set to admin to return to the previous behaviour.
To switch an existing deployment from using the admin project to the service project, it will at least be necessary to create the required security group in the service project, and update octavia_amp_secgroup_list to this group’s ID. Ideally the Amphora flavor and network would also be recreated in the service project, although this does not appear to be necessary for operation, and will impact existing Amphorae.
See bug 1873176 for details.
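A minimal sketch of keeping the previous Octavia behaviour during an upgrade, assuming the override lives in globals.yml (an assumed location):

```yaml
# Assumed location: globals.yml
# Keep the octavia user authenticating against the admin project,
# as it did before this release.
octavia_service_auth_project: "admin"
```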
Support for configuration of Neutron related to integration with ONOS has been removed.
Support for deployment of OpenDaylight controller and configuration of Neutron related to integration with OpenDaylight have been removed.
Neutron Linux bridge and Open vSwitch Agents config has been split out into linuxbridge_agent.ini and openvswitch_agent.ini respectively. Please move your custom service config from ml2_conf.ini into those files.
The Monasca Log API has been removed. All logs now go to the unified Monasca API when Monasca is enabled. Any custom Fluentd configuration and inventory files will need to be updated. Any monasca_log_api containers will be removed automatically.
Deprecation Notes¶
Deprecates support for deploying with Hyper-V integrations. In Victoria support for these will be removed from Kolla Ansible.
This is dictated by lack of interest and maintenance.
See also the post to openstack-discuss
Deprecates support for deploying MongoDB. In Victoria support for deploying MongoDB will be removed from Kolla Ansible. Note CentOS 8 already lost support for MongoDB due to decisions made upstream.
This affects Panko as it will no longer be possible to get automatic deployment of MongoDB database for it. However, the default, SQL, backend is and will be supported via MariaDB.
MongoDB lost its position in OpenStack environment after controversial relicensing under their custom SSPL (Server Side Public License) which did not pass OSI (Open Source Initiative) validation.
The neutron-fwaas project was deprecated in the Neutron stadium and will be removed from stadium in the Wallaby cycle. The support for neutron-fwaas in the Neutron and Horizon roles is deprecated as of the Ussuri release and will be removed in the Wallaby cycle.
Deprecates support for deploying with VMware integrations. In Victoria support for these will be removed from Kolla Ansible.
This is dictated by lack of interest and maintenance.
See also the post to openstack-discuss
Deprecates support for deploying with XenAPI integrations. In Victoria support for these will be removed from Kolla Ansible.
This is dictated by lack of interest and maintenance, and upstream decision of deprecation by Nova (for the same reasons).
See also the post to openstack-discuss. And the Nova notice.
The
congress
project is no longer maintained. This has been retired since Victoria and has not been used by other OpenStack services since.
Customizing Neutron Linux bridge and Open vSwitch Agents config via ml2_conf.ini is deprecated. The config has been split out for these agents into linuxbridge_agent.ini and openvswitch_agent.ini respectively. In this release (Ussuri) custom service config ml2_conf.ini overrides will still be used when merging configs, but that functionality will be removed in the Victoria release.
Security Issues¶
Fixes leak of RabbitMQ password into Ansible logs. LP#1865840
Bug Fixes¶
Fixes an issue where the Cyborg conductor failed to communicate with Placement. See bug 1873717.
Fixes an issue where the Cyborg agent failed to start the privsep daemon, by adding the privileged capability for the Cyborg agent. See bug 1873715.
Adds necessary region_name to octavia.conf when enable_barbican is set to true. LP#1867926
Adds /etc/timezone to Debian/Ubuntu containers. LP#1821592
Fixes an issue with Nova live migration not using migration_interface_address even when TLS was not used. When migrating an instance to a newly added compute host, if addressing depended on /etc/hosts and it had not been updated on the source compute host to include the new compute host, live migration would fail. This did not affect DNS-based name resolution. Similarly, Nova live migration would fail if the address in DNS or /etc/hosts was not the same as migration_interface_address due to user customization. LP#1729566
Fixes Kibana deployment with the new E*K stack (6+). LP#1799689
Reworks Keystone fernet bootstrap which had tendencies to fail on multinode setups. See bug 1846789 for details.
Fix prometheus-openstack-exporter to use CA certificate.
Changes Manila cephfs share driver to
manila.share.drivers.cephfs.driver.CephFSDriver
, as the old driver was deprecated.
External Ceph: also copies the Cinder keyring to nova-compute. Since Train, nova-compute also needs the Cinder key when the rbd user is set to Cinder, because volume/pool checks have been moved to use the rbd Python library. Fixes LP#1859408
Fix qemu loading of ceph.conf (permission error). LP#1861513
Remove /run bind mounts in Neutron services causing dbus host-level errors and add /run/netns for neutron-dhcp-agent and neutron-l3-agent. LP#1861792
Fixes an issue where old fluentd configuration files would persist in the container across restarts despite being removed from the
node_custom_config
directory. LP#1862211
Use more permissive regex to remove the offending 127.0.1.1 line from /etc/hosts. LP#1862739
Each Prometheus mysqld exporter now points to its local mysqld instance (MariaDB) instead of the VIP address. LP#1863041
Cinder Backup now has access to kernel modules, e.g. to load the iscsi_tcp module. LP#1863094
Makes RabbitMQ hostname address resolution precheck stronger by requiring uniqueness of resolution to avoid later issues. LP#1863363
Fix protocol used by
neutron-metadata-agent
to connect to Nova metadata service. This possibly affected internal TLS setup. Fixes LP#1864615
Fixes haproxy role to avoid restarting haproxy service multiple times in a single Ansible run. LP#1864810 LP#1875228
Fixes an issue with deploying Grafana when using IPv6. LP#1866141
Fixes elasticsearch deployment in IPv6 environments. LP#1866727
Fixes failure to deploy telegraf with monitoring of zookeeper due to wrong variable being referenced. LP#1867179
Fixes deployment of fluentd without any enabled OpenStack services. LP#1867953
Fix missing glance_ca_certificates_file variable in glance.conf. LP#1869133
Add client ca_cert file in heat LP#1869137
Adds missing
vitrage-persistor
service, required by Vitrage deployments for storing data. LP#1869319
Fixes designate-worker not to use etcd as its coordination backend because it is not supported by Designate (no group membership support available via tooz). LP#1872205
Fixes Octavia in internally-signed (e.g. self-signed) cert TLS deployments by providing path to CA cert file in proper config places. LP#1872404
Fixes source-IP-based load balancing for Horizon when using the “split” HAProxy service template.
Fixes issue where HAProxy would have no backend servers in its config files when using the “split” config template style.
Manage nova scheduler workers through
openstack_service_workers
variable. LP#1873753
Fixes Grafana datasource update. LP#1881890
Removes the chrony package and AppArmor profile from the Docker host if containerized chrony is enabled. LP#1882513
Adds missing “become: true” to some VMware related tasks, namely Copying VMware vCenter CA file and Copying over nsx.ini.
Fixes Nova deployment failure when using kolla_dev_mode.
Remove the meta field of the Swift rings from the default rsync_module template. Having it by default, undocumented, can lead to unexpected behavior when the Swift documentation states that this field is not processed.
Fixes the default CloudKitty configuration, which included the
gnocchi_collector
andkeystone_fetcher
options that were deprecated in Stein and removed in Train. See bug 1876985 for details.
When etcd is used with cinder_coordination_backend and/or designate_coordination_backend, the config has been changed to use the etcd3gw (aka etcd3+http) tooz coordination driver instead of etcd3 due to issues with the latter’s availability and stability. etcd3 does not handle eventlet-based services well, such as cinder’s and designate’s. See bugs 1852086 and 1854932 for details. See also tooz change introducing etcd3gw.
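For context, a sketch of the kind of configuration this fix applies to, assuming the backends are selected in globals.yml (an assumed location). With these values the generated service configuration now uses the etcd3gw driver, and no extra setting is needed to opt in.

```yaml
# Assumed location: globals.yml
# Selecting etcd as the coordination backend; the tooz driver used
# behind it is now etcd3gw (etcd3+http) rather than etcd3.
cinder_coordination_backend: "etcd"
designate_coordination_backend: "etcd"
```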
Adds configuration to set also_notifies within the pools.yaml file when using the Infoblox backend for Designate.
Pushing a DNS NOTIFY packet to the master does not cause the DNS update to be propagated onto other nodes within the cluster. This means each node needs a DNS NOTIFY packet otherwise users may be given a stale DNS record if they query any worker node. For details please see bug 1855085
Fixes an issue with Docker client timeouts where Docker reports ‘Read timed out’. The client timeout may be configured via
docker_client_timeout
. The default timeout has been increased to 120 seconds. See bug for details.
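If 'Read timed out' errors persist on slow hosts, the timeout can be raised further. The sketch below assumes the override goes in globals.yml (an assumed location), and 300 is an arbitrary illustrative value, not a recommendation from this note.

```yaml
# Assumed location: globals.yml
# Docker client timeout in seconds; the default is now 120.
docker_client_timeout: 300
```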
Fixes IPv6 deployment on CentOS 7. The issues with RabbitMQ and MariaDB have been worked around. For details please see the following Launchpad bug records: bug 1848444, bug 1848452, bug 1856532 and bug 1856725.
Fixes an issue with Cinder upgrades that would cause online schema migration to fail. LP#1880753
Fixes an issue where the Cyborg API container failed to load the API paste file. For details please see bug 1874028.
Fix elasticsearch schema in fluentd when
kolla_enable_tls_internal
is true.
Fixes an issue where
fernet_token_expiry
would fail the pre-checks despite being set to a valid value. Please see bug 1856021 for more details.
Fixes an issue with HAProxy prechecks when scaling out using --limit or --serial. LP#1868986.
Fixes an issue with the HAProxy monitor VIP precheck when some instances of HAProxy are running and others are not. See bug 1866617.
The
kolla_logs
Docker volume is now mounted into the Elasticsearch container to expose logs which were previously written erroneously to the container filesystem. LP#1859162
Fixes MariaDB issues in multinode scenarios which affected deployment, reconfiguration, upgrade and Galera cluster resizing. They were usually manifested by WSREP issues in various places and could lead to need to recover the Galera cluster. Note these issues were due to how MariaDB was handled during Kolla Ansible runs and did not affect Galera cluster during normal operations unless MariaDB was later touched by Kolla Ansible. Users wishing to run actions on their Galera clusters using Kolla Ansible are strongly advised to update. For details please see the following Launchpad bug records: bug 1857908 and bug 1859145.
Fixes an issue with Nova when deploying new compute hosts using
--limit
. LP#1869371.
Adapts Octavia to the latest dual CA certificate configuration. The following files should exist in /etc/kolla/config/octavia/:
client.cert-and-key.pem
client_ca.cert.pem
server_ca.cert.pem
server_ca.key.pem
See the Octavia documentation for details on generating these files.
Fixes an issue with RabbitMQ where tags would be removed from the
openstack
user after deploying Nova. This prevents the user from accessing the RabbitMQ management UI. LP#1875786
Fixes an issue where a failure in pulling an image could lead to a container being removed and not replaced. See bug 1852572 for details.
Since OpenStack services can now be configured to use TLS enabled REST endpoints, URLs should be constructed using the {{ internal_protocol }} and {{ external_protocol }} configuration parameters.
Construct service REST API URLs using kolla_internal_fqdn instead of kolla_internal_vip_address. Otherwise SSL validation will fail when certificates are issued using domain names.
Fixes an issue with the
kolla-ansible stop
command where it may fail trying to stop non-existent containers. LP#1868596.
Fixes Swift volume mounting failing on kernel 4.19 and later due to removal of nobarrier from XFS mount options. See bug 1800132 for details.
Fixes an issue with fluentd parsing of WSGI logs for Aodh, Masakari, Qinling, Vitrage and Zun. See bug 1720371 for details.
Fixes gnocchi-api script name for Ubuntu/Debian binary deployments. LP#1861688
Fixes glance_api to run as privileged and adds missing mounts so it can use an iscsi cinder backend as its store. LP#1855695
When upgrading from Rocky to Stein HAProxy configuration moves from using a single configuration to assembling a file from snippets for each service. Applying the HAProxy tag to the entire play ensures that HAProxy configuration is generated for all services when the HAProxy tag is specified. For details please see bug 1855094.
Fixes an issue with the
ironic_ipxe
container serving instance images. See bug 1856194 for details.
Fixes an issue with Kibana deployment when
openstack_cacert
is unset. See bug 1864180 for details.
Fixes an issue with Monasca deployment where an invalid variable (
monasca_log_dir
) is referenced. See bug 1864181 for details.
Fixes an issue where host configuration tasks (sysctl, loading kernel modules) could be performed during the kolla-ansible genconfig command. See bug 1860161 for details.
Fixes an issue with port prechecks for the Placement service. See bug 1861189 for details.
Fixes templating of Prometheus configuration when Alertmanager is disabled. In a deployment where Prometheus is enabled and Alertmanager is disabled, templating the Prometheus configuration would fail because the variable prometheus_alert_rules does not contain the key files. LP#1854540
Removes the
[http]/max-row-limit = 10000
setting from the default InfluxDB configuration, which resulted in the CloudKitty v1 API returning only 10000 dataframes when using InfluxDB as a storage backend. See bug 1862358 for details.
Skydive’s API and the web UI now rely on Keystone for authentication. Only users in the Keystone project defined by skydive_admin_tenant_name will be able to authenticate. See LP#1870903 (https://launchpad.net/bugs/1870903) for more details.
Fixes an issue where Elasticsearch API requests made during Kibana, Elasticsearch and Monasca deployment could have an invalid body. See bug 1864177 for details.
masakari-monitor
will now use the internal API to reach masakari-api. LP#1858431
Switch endpoint_type from public to internal for octavia communicating with the barbican service. See bug 1875618 for details.