Wallaby Series (16.1.0 - 17.0.x) Release Notes¶
17.1.0-23¶
Upgrade Notes¶
When upgrading Ironic to address the qemu-img image conversion security issues, the ironic-python-agent ramdisks will also need to be upgraded.
As a result of security fixes to address qemu-img image conversion security issues, a new configuration parameter has been added to Ironic, [conductor]permitted_image_formats, with a default value of “raw,qcow2,iso”. Raw and qcow2 format disk images are the image formats the Ironic community has consistently stated as what is supported and expected for use with Ironic. These formats also match the formats which the Ironic community tests. Operators who leverage other disk image formats may need to modify this setting further.
Adds sha256, sha384 and sha512 as supported SNMPv3 authentication protocols to iRMC driver.
Security Issues¶
Ironic now checks the supplied image format value against the detected format of the image file, and will prevent deployments should the values mismatch. If being used with Glance and a mismatch in metadata is identified, it will require images to be re-uploaded with a new image ID to represent corrected metadata. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic always inspects the supplied user image content for safety prior to deployment of a node should the image pass through the conductor, even if the image is supplied in raw format. This is utilized to identify the format of the image and the overall safety of the image, such that source images with unknown or unsafe feature usage are explicitly rejected. This can be disabled by setting [conductor]disable_deep_image_inspection to True. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic also inspects images which would normally be provided as a URL for direct download by the ironic-python-agent ramdisk. This is enabled by default and increases the overall network traffic and disk space utilization of the conductor. This level of inspection can be disabled by setting [conductor]conductor_always_validates_images to False. Doing so is not advisable as Zed release and earlier ironic-python-agent ramdisks will not be made available due to backport regression risk. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic now explicitly enforces a list of permitted image types for deployment via the [conductor]permitted_image_formats setting, which defaults to “raw”, “qcow2”, and “iso”. While the project has classically always declared permissible images as “qcow2” and “raw”, it was previously possible to supply other image formats known to qemu-img, and the utility would attempt to convert the images. The “iso” support is required for “boot from ISO” ramdisk support.
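The two checks described above (declared format must match the detected format, and the format must be on the permitted list) can be sketched as follows. This is only an illustration: the function names are hypothetical and are not Ironic's internal API; only the option name and its default value come from these notes.

```python
# Hypothetical helpers; only the default value below is taken from the notes.
DEFAULT_PERMITTED_IMAGE_FORMATS = "raw,qcow2,iso"


def parse_permitted_formats(value):
    """Split a comma-separated option value into a set of format names."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def check_image_format(declared, detected, permitted):
    """Reject an image whose formats mismatch or are not permitted."""
    if declared != detected:
        raise ValueError(
            "declared format %r does not match detected format %r"
            % (declared, detected)
        )
    if detected not in permitted:
        raise ValueError("image format %r is not permitted" % detected)
    return detected
```

An operator who needs an additional format would extend the option value in the Ironic service configuration rather than patch a check like this one.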
Ironic now explicitly passes the source input format to executions of qemu-img to limit the permitted qemu disk image drivers which may evaluate an image to prevent any mismatched format attacks against qemu-img.
The ansible deploy interface example playbooks now supply an input format to execution of qemu-img. If you are using customized playbooks, please add “-f {{ ironic.image.disk_format }}” to your invocations of qemu-img. If you do not do so, qemu-img will automatically try to guess the format, which can lead to known security issues with the incorrect source format driver.
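For downstream tooling that shells out to qemu-img from Python rather than from a playbook, the same advice applies. The sketch below builds an argument list with an explicit input format; the helper name and the paths are hypothetical, while -f (input format) and -O (output format) are standard qemu-img convert options.

```python
def build_convert_command(source, dest, source_format, out_format="raw"):
    """Return an argv list for qemu-img convert with an explicit input format.

    Passing -f stops qemu-img from probing the image and guessing a driver.
    """
    return [
        "qemu-img", "convert",
        "-f", source_format,  # explicit input format, never guessed
        "-O", out_format,     # desired output format
        source, dest,
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True); passing a list instead of a shell string also avoids shell interpretation of the file names.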
Operators who have implemented any custom deployment drivers or additional functionality like machine snapshot should review their downstream code to ensure they are properly invoking qemu-img. If there are any questions or concerns, please reach out to the Ironic project developers.
Operators are reminded that they should utilize cleaning in their environments. Disabling any security features such as cleaning or image inspection is at your own risk. Should you have any issues with security related features, please don’t hesitate to open a bug with the project.
The [conductor]disable_deep_image_inspection setting is conveyed to the ironic-python-agent ramdisks automatically, and will prevent those operating ramdisks from performing deep inspection of images before they are written.
The [conductor]permitted_image_formats setting is conveyed to the ironic-python-agent ramdisks automatically. Should a need arise to explicitly permit an additional format, that should take place in the Ironic service configuration.
An issue in Ironic has been resolved where image checksums would not be checked prior to the conversion of an image to a raw format image from another image format.
With default settings, this normally would not take place. It required the image_download_source option to be set to local, which can be done at a node level for a single deployment, by default for that baremetal node in all cases, or via the [agent]image_download_source configuration option. By default, this setting is http.
This was in concert with the [DEFAULT]force_raw_images option when set to True, which caused Ironic to download and convert the file.
In a fully integrated context of Ironic’s use in a larger OpenStack deployment, where images are coming from the Glance image service, the previous pattern was not problematic. The overall issue was introduced as a result of the capability to supply, cache, and convert a disk image provided as a URL by an authenticated user.
Ironic will now validate the user supplied checksum prior to image conversion on the conductor. This can be disabled using the [conductor]disable_file_checksum configuration option.
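The ordering matters here: the checksum is verified on the downloaded file before any conversion touches it. A minimal sketch of such a check using the standard hashlib module is shown below; the function name and the sha256 default are assumptions for illustration and are not taken from Ironic's code.

```python
import hashlib


def verify_checksum(path, expected, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks and compare it with the supplied hex digest.

    Raises ValueError on mismatch so the caller can stop before conversion.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        # Read in chunks so large disk images are never loaded fully in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    if actual != expected.lower():
        raise ValueError("checksum mismatch for %s" % path)
    return actual
```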
Bug Fixes¶
Fixes multiple issues in the handling of images as it relates to the execution of the qemu-img utility, which is used for image format conversion, where a malicious user could craft a disk image to potentially extract information from an ironic-conductor process’s operating environment.
Ironic now explicitly enforces a list of approved image formats as a [conductor]permitted_image_formats list, which mirrors the image formats the Ironic project has historically tested and expressed as known working. Testing is not based upon file extension, but upon content fingerprinting of the disk image files. This is tracked as CVE-2024-44082 via bug 2071740.
Fixes a security issue where Ironic would fail to checksum disk image files it downloads when Ironic had been requested to download and convert the image to a raw image format. This required the image_download_source to be explicitly set to local, which is not the default.
This fix can be disabled by setting [conductor]disable_file_checksum to True, however this option will be removed in new major Ironic releases.
As a result of this, parity has been introduced to align Ironic to Ironic-Python-Agent’s support for checksums used by standalone users of Ironic. This includes support for remote checksum files to be supplied by URL, in order to prevent breaking existing users which may have inadvertently been leveraging the prior code path. This support can be disabled by setting [conductor]disable_support_for_checksum_files to True.
Fixes Ironic integration with Cinder because of changes which resulted as part of the recent Security related fix in bug 2004555. The work in Ironic to track this fix was logged in bug 2019892. Ironic now sends a service token to Cinder, which allows for access restrictions added as part of the original CVE-2023-2088 fix to be appropriately bypassed. Ironic was not vulnerable, but the restrictions added as a result did impact Ironic’s usage. This is because Ironic volume attachments are not on a shared “compute node”, but instead mapped to the physical machines and Ironic handles the attachment life-cycle after initial attachment.
Fixes bug of iRMC driver in parse_driver_info where, if FIPS is enabled, SNMP version is always required to be version 3 even though iRMC driver’s xxx_interface doesn’t use SNMP actually.
Fixes an issue where a System Scoped user could not trigger a node into a manageable state with cleaning enabled, as the Neutron client would attempt to utilize their user’s token to create the Neutron port for the cleaning operation, as designed. This is because with requests made in the system scope, there is no associated project and the request fails.
Ironic now checks if the request has been made with a system scope, and if so it utilizes the internal credential configuration to communicate with Neutron.
Fixes SNMPv3 message authentication and encryption functionality of iRMC driver. The SNMPv3 authentication between iRMC driver and iRMC was only by the security name with no passwords and encryption. To increase security, the following parameters are now added to the node’s driver_info, and can be used for authentication:
irmc_snmp_user
irmc_snmp_auth_password
irmc_snmp_priv_password
irmc_snmp_auth_proto (Optional, defaults to sha)
irmc_snmp_priv_proto (Optional, defaults to aes)
irmc_snmp_user replaces irmc_snmp_security. irmc_snmp_security will be ignored if irmc_snmp_user is set. irmc_snmp_auth_proto and irmc_snmp_priv_proto can also be set through the following options in the [irmc] section of /etc/ironic/ironic.conf:
snmp_auth_proto
snmp_priv_proto
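As an illustration of how these driver_info fields relate to one another, the sketch below resolves the effective SNMPv3 settings from a node's driver_info. It is not the iRMC driver's code: the function name and the sample values are hypothetical, and the [irmc] configuration fallbacks are left out for brevity. The field names and the sha and aes defaults come from the note above.

```python
def resolve_snmp_v3(driver_info):
    """Return the effective SNMPv3 settings for a node's driver_info dict."""
    # irmc_snmp_user takes precedence; irmc_snmp_security is only a fallback.
    user = driver_info.get("irmc_snmp_user") or driver_info.get("irmc_snmp_security")
    return {
        "user": user,
        "auth_password": driver_info.get("irmc_snmp_auth_password"),
        "priv_password": driver_info.get("irmc_snmp_priv_password"),
        "auth_proto": driver_info.get("irmc_snmp_auth_proto", "sha"),
        "priv_proto": driver_info.get("irmc_snmp_priv_proto", "aes"),
    }


# Sample values are placeholders, not real credentials.
example_driver_info = {
    "irmc_snmp_user": "snmp-admin",
    "irmc_snmp_security": "legacy-name",
    "irmc_snmp_auth_password": "example-auth-password",
    "irmc_snmp_priv_password": "example-priv-password",
    "irmc_snmp_auth_proto": "sha256",
}
```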
Modify iRMC driver to use ironic.conf [deploy] default_boot_mode to determine default boot_mode.
Fixes issues with Lenovo hardware where the system firmware may display a blue “Boot Option Restoration” screen after the agent writes an image to the host in UEFI boot mode, requiring manual intervention before the deployed node boots. This issue is rooted in multiple changes being made to the underlying NVRAM configuration of the node. Lenovo engineers have suggested to only change the UEFI NVRAM and not perform any further changes via the BMC to configure the next boot. Ironic now does such on Lenovo hardware. More information and background on this issue can be discovered in bug 2053064.
Fixes a race condition in PXE initialization where logic to retry what we suspect as potentially failed PXE boot operations was not consulting if an agent token had been established, which is the very first step in agent initialization.
Other Notes¶
Updates the minimum version of python-scciclient library to 0.10.1.
17.1.0¶
Upgrade Notes¶
On Wallaby release, to use certification file on HTTPS connection, iRMC driver requires python-scciclient version to be one of >=0.8.2,<0.9.0, >=0.9.5,<0.10.0 or >=0.10.1,<0.11.0 and packaging >=16.5
Security Issues¶
Modifies the irmc hardware type to include a capability to control enforcement of HTTPS certificate verification. By default this is enforced. python-scciclient version must be one of >=0.8.2,<0.9.0, >=0.9.5,<0.10.0, or >=0.10.1,<0.11.0, or certificate verification will not occur.
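The three permitted python-scciclient ranges are easy to misread, so a small check can help when auditing an environment. The sketch below encodes the ranges stated above as tuples; the function names are hypothetical, and the parser assumes purely numeric dotted versions (no pre-release or local suffixes).

```python
# Half-open ranges [low, high) copied from the upgrade note above.
PERMITTED_RANGES = [
    ((0, 8, 2), (0, 9, 0)),
    ((0, 9, 5), (0, 10, 0)),
    ((0, 10, 1), (0, 11, 0)),
]


def parse_version(text):
    """Convert '0.10.1' into (0, 10, 1); assumes numeric components only."""
    return tuple(int(part) for part in text.split("."))


def supports_cert_verification(version_text):
    """Return True when the version falls inside one of the permitted ranges."""
    version = parse_version(version_text)
    return any(low <= version < high for low, high in PERMITTED_RANGES)
```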
Bug Fixes¶
Fixes the logic for the anaconda deploy interface. If the ironic node’s instance_info doesn’t have both ‘stage2’ and ‘ks_template’ specified, we weren’t using the instance_info at all. This has been fixed to use the instance_info if it was specified. Otherwise, ‘stage2’ is taken from the image’s properties (assumed that it is set there). ‘ks_template’ value is from the image properties if specified there (since it is optional); else we use the config setting ‘[anaconda] default_ks_template’.
For the anaconda deploy interface, the ‘stage2’ directory was incorrectly being created using the full path of the stage2 file; this has been fixed.
The anaconda deploy interface expects the node’s instance_info to be populated with the ‘image_url’; this is now populated (via PXEAnacondaDeploy’s prepare() method).
For the anaconda deploy interface, when the deploy was finished and the bm node was being rebooted, the node’s provision state was incorrectly being set to ‘active’ – the provisioning state-machine mechanism now handles that.
For the anaconda deploy interface, the code that was doing the validation of the kickstart file was incorrect and resulted in errors; this has been addressed.
For the anaconda deploy interface, the ‘%traceback’ section in the packaged ‘ks.cfg.template’ file is deprecated and fails validation, so it has been removed.
The anaconda deploy interface was saving internal information in the node’s instance_info, in the user-facing ‘stage2’ and ‘ks_template’ fields. This broke rebuilds using a different image with different stage2 or template specified in the image properties. This has been fixed by saving the information in the node’s driver_internal_info instead.
Fixes rebooting into the agent after changing BIOS settings in fast-track mode with the redfish-virtual-media boot interface. Previously, the ISO would not be configured.
Fixes a bug in the anaconda deploy interface where the ‘ks_options’ key was not found when rendering the default kickstart template.
Fixes issue where PXEAnacondaDeploy interface’s deploy() method did not return states.DEPLOYWAIT so the instance went straight to ‘active’ instead of ‘wait call-back’.
Fixes an issue where the anaconda deploy interface mistakenly expected ‘squashfs_id’ instead of ‘stage2_id’ property on the image.
Fixes the heartbeat mechanism in the default kickstart template ks.cfg.template as the heartbeat API only accepts ‘POST’ and expects a mandatory ‘callback_url’ parameter.
Fixes handling of tarball images in anaconda deploy interface. Allows user specified file extensions to be appended to the disk image symlink. Users can now set the file extensions by setting the ‘disk_file_extension’ property on the OS image. This enables users to deploy tarballs with anaconda deploy interface.
Fixes issue where automated cleaning was not supported when anaconda deploy interface is used.
Fixed an issue where duplicate extra DHCP options were passed in the port update request to the Networking service. The duplicate DHCP options caused an error in the Networking service and node provisioning would fail. See bug: 2009774.
Fixes idrac-wsman management interface set_boot_device method that would fail deployment when there are existing jobs present with error “Failed to change power state to ‘power on’ by ‘rebooting’. Error: DRAC operation failed. Reason: Unfinished config jobs found: <list of existing jobs>. Make sure they are completed before retrying.”. Now there can be non-BIOS jobs present during deployment. This will still fail for cases when there are BIOS jobs present. In such cases, consider moving to idrac-redfish, which does not have this limitation when setting boot device.
Fixed an issue where provisioning/cleaning would fail on IPv6 routed provider networks. See bug: 2009773.
Fixes redfish and idrac-redfish RAID create_configuration, apply_configuration, delete_configuration clean and deploy steps to update node’s raid_config field at the end of the steps.
Fixes the determination of a failed RAID configuration task in the redfish hardware type. Prior to this fix the tasks that have failed were reported as successful.
Fixes the redfish hardware type RAID device creation and deletion when creating or deleting more than 1 logical disk on RAID controllers that require rebooting and do not allow more than 1 running task per RAID controller. Before this fix, the 2nd logical disk would fail to be created or deleted. With this change it is now possible to use the redfish raid interface on iDRAC systems.
Fixes the redfish-virtual-media boot interface to allow its use with iDRAC firmware from 6.00.00.00 (released June 2022), which fixes the virtual media boot issue that previously prevented iDRAC from working with redfish-virtual-media. Consider upgrading iDRAC firmware if not done already; otherwise an error will still occur when trying to use redfish-virtual-media with iDRAC.
Fixes an issue where clients would get a 404 due to the node pagination breaking at max_limit due to an uninitialised resource_url.
Fixes an issue where clients would get a 404 due to the port and portgroups pagination breaking at max_limit due to an uninitialised resource_url.
Fixes File name too long in the image caching code when a URL contains a long query string.
Fixes the initrd kernel parameter when booting ramdisk directly from Swift/RadosGW using iPXE. Previously it was always deploy_ramdisk, even when the actual file name is different.
Adds driver_info/irmc_verify_ca option to specify certification file. Default value of driver_info/irmc_verify_ca is True.
Fixes an issue with installation of Ansible in driver-requirements.txt on Python 3.8. Since the release of Ansible 6.0.0, significant backtracking occurred in the Pip resolver.
Fixes connection caching issues with Redfish BMCs where AccessErrors were previously not disqualifying the cached connection from being re-used. Ironic will now explicitly open a new connection instead of using the previous connection in the cache. Under normal circumstances, the sushy redfish library would detect and refresh sessions, however a prior case exists where it may not detect a failure and contain cached session credential data which is ultimately invalid, blocking future access to the BMC via Redfish until the cache entry expired or the ironic-conductor service was restarted. For more information please see story 2009719.
17.0.4¶
Upgrade Notes¶
The query pattern for the database when lists of nodes are retrieved has been changed to a more efficient pattern at scale, where a list of nodes is generated, and then additional queries are executed to composite this data together. This replaces a model where the database client in the conductor had to deduplicate the resulting data set, which is less efficient overall.
Critical Issues¶
Fixes upgrade failure caused by the missing version of BIOSSetting database objects.
Bug Fixes¶
Skips port creation during redfish inspect for devices reported without a MAC address.
Fixes potential cache coherency issues by caching the AgentClient per task, rather than globally.
Fixes a regression in the ramdisk deploy where custom kernel parameters were not used during inspection and cleaning.
Slow database retrieval of nodes has been addressed at the lower layer by explicitly passing and handling only the requested fields. The result is excess discarded work is not performed, making the overall process more efficient. This is particularly beneficial for OpenStack Nova’s synchronization with Ironic.
Fixes configuring Redfish RAID using interface_type when error “failed to find matching physical disks for all logical disks” occurs.
Fixes issue in idrac-redfish clean/deploy step import_configuration where partially successful jobs were treated as fully successful. Such jobs, completed with errors, are now treated as failures.
Fix idrac-redfish clean/deploy step import_configuration to handle completed import configuration tasks that are deleted by iDRAC before Ironic has checked the task’s status. Prior to iDRAC firmware version 5.00.00.00, completed tasks are deleted after 1 minute in iDRAC Redfish. That is not always sufficient to check their status in the periodic check, which runs every minute by default. Before this fix, the node got stuck in wait mode forever. This is fixed by failing the step with an error advising to decrease the periodic check interval or upgrade iDRAC firmware if not done already.
Fixes idrac-wsman BIOS and RAID interface steps to correctly check status of iDRAC job that completed with errors. Now these jobs are treated as failures. Before this fix, the node stayed in wait state as it was only checking for “Completed” or “Failed” job status, but not “Completed with Errors”.
Fixes idrac-wsman power interface to wait for the hardware to reach the target state before returning. For systems where soft power off at the end of deployment to boot to instance failed and forced hard power off was used, this left the node successfully deployed in off state without any errors. This broke other workflows expecting the node to be on and booted into the OS at the end of deployment. Additional information can be found in story 2009204.
When an http(s):// image is used, the cached copy of the image will always be updated if the HTTP server does not provide the last modification date and time. Previously the cached image would be considered up-to-date, which could cause invalid behavior if the image is generated on the fly or was modified while being served.
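The decision described above can be sketched as a small predicate: a missing or unparsable Last-Modified header means the cached copy is treated as stale. The function name is hypothetical and this is not Ironic's image cache code; email.utils.parsedate_to_datetime is the standard library parser for HTTP-style dates.

```python
from email.utils import parsedate_to_datetime


def cached_image_is_fresh(last_modified_header, cached_mtime):
    """Decide whether a cached image can be reused.

    last_modified_header: value of the HTTP Last-Modified header, or None.
    cached_mtime: modification time of the cached file as a POSIX timestamp.
    """
    if not last_modified_header:
        # No modification date from the server: always refresh the cache.
        return False
    try:
        remote = parsedate_to_datetime(last_modified_header).timestamp()
    except (TypeError, ValueError):
        # An unparsable date is treated the same as a missing one.
        return False
    return remote <= cached_mtime
```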
Improves record retrieval performance for baremetal nodes by enabling ironic to not make redundant calls as part of generating API result sets for the baremetal nodes endpoint.
Fixes the pattern of execution for periodic tasks such that the majority of drivers now evaluate if work needs to be performed in advance of creating a node task. Depending on the individual driver query pattern, this prevents excess database queries from being triggered with every task execution.
Removes unused local images after ejecting a virtual media device via the eject_vmedia vendor passthru call of the redfish vendor interface.
In Redfish RAID clean and deploy steps, non-RAID storage controllers are now skipped for RAID operations. In Redfish systems that do not implement SupportedRAIDTypes, they are still processed and could result in unexpected errors.
Retries ssl.SSLError when connecting to the agent.
Fixes an issue of powering off with the idrac-wsman management interface while the execution of a clear job queue cleaning step is proceeding. Prior to this fix, the clean step would fail when powering off a node.
Other Notes¶
The default database query pattern has been changed which will result in additional database queries when compositing lists of nodes by separately querying traits and tags. Previously this was a joined query which requires deduplication of the result set before building composite objects.
17.0.3¶
Security Issues¶
Fixes an issue with the /v1/nodes/detail endpoint where an authenticated user could explicitly ask for an instance_uuid lookup and the associated node would be returned to the user with sensitive fields redacted in the result payload if the user did not explicitly have owner or lessee permissions over the node. This is considered a low-impact low-risk issue as it requires the API consumer to already know the UUID value of the associated instance, and the returned information is mainly metadata in nature. More information can be found in Storyboard story 2008976.
Bug Fixes¶
If the agent accepts a command, but is unable to reply to Ironic (which sporadically happens because of the eventlet’s TLS implementation), we currently retry the request and fail because the command is already executing. Ironic now detects this situation by checking the list of executing commands after receiving a connection error. If the requested command is the last one, we assume that the command request succeeded.
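The detection rule in this note can be sketched as a predicate over the agent's command list. This is a simplified illustration, not the conductor's code: the function name is hypothetical, and the sketch assumes each command is a dict with a command_name key and that the list is ordered oldest first.

```python
def command_was_accepted(requested_name, executing_commands):
    """Return True if the most recent agent command matches the request.

    Used after a connection error to decide whether a retry is needed.
    """
    if not executing_commands:
        # Nothing recorded by the agent: the request never arrived.
        return False
    return executing_commands[-1].get("command_name") == requested_name
```

If the predicate returns True, the caller would skip the retry and wait for the command to complete instead of sending it a second time.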
When local boot is used (e.g. by default), the instance image validation now happens only in the deploy interface, not in the boot interface (as before). This means that the boot interface validation will now pass in many cases where it would previously fail.
Fixes an issue with the /v1/nodes/detail endpoint where requests for an explicit instance_uuid match would not follow the standard query handling path and thus not be filtered based on policy determined access level and node level owner or lessee fields appropriately. Additional information can be found in story 2008976.
No longer masks configdrive when sending the node’s record to in-band deploy steps.
Fixes handling of single-value (non-key-value) parameters in the [inspector]extra_kernel_params configuration options.
The behavior when a bootable iso ramdisk is provided behind an http server is to download and serve the image from the conductor; the image is removed only when the node is undeployed. In certain cases, for example on large deployments, this could cause undesired behaviors, like the conductor nodes running out of disk storage. To avoid this event we provide an option [deploy]ramdisk_image_download_source to be able to tell the ramdisk interface to directly use the bootable iso url from its original source instead of downloading it and serving it from the conductor node. The default behavior is unchanged.
Fixes sub-optimal Ironic API performance where Secure RBAC related field level policy checks were executing without first checking if there were field results. This helps improve API performance when only specific columns have been requested by the API consumer.
17.0.2¶
Bug Fixes¶
Fixes the idrac-wsman BIOS factory_reset clean and deploy step to indicate success and update the cached BIOS settings to their defaults only when the BIOS settings have actually been reset. See story 2008058 for more details.
Removes temporary cleaning information on starting or restarting cleaning.
Removes unnecessary delay before the start of the cleaning process when fast-track is used.
Correctly processes in-band deploy steps on fast-track deployment.
Correctly wipes agent token on inspection start and abort.
Fixes providing agent tokens with pre-built ISO images and the redfish-virtual-media boot interface.
17.0.0¶
Prelude¶
The Ironic community is proud to release Ironic 17.0!
Where if it were developer years instead of major versions, we would all be very afraid since it already has access to the car keys.
This release of Ironic includes numerous advancements which extend an operator’s ability to customize and further extend their deployment to meet their needs.
Redfish enhancements including Out of Band RAID configuration management and automatic setting of Secure Boot on nodes deployed using redfish.
Deployment enhancements including UEFI Partition Image handling, per-instance per-deployments of default interface selections, user requestable deploy_steps at deploy time, IPA file injection, and support for setting a node’s boot mode via instance_info.
Support for system scoped Role Based Access controls and project scoped access is available by default for associated nodes when the node owner or lessee fields are set. This effort alone added over 1,500 new unit tests.
Operator friendly fixes such as memory over-consumption guard for memory intensive tasks, vendor hardware aware handling to help address issues such as different settings being needed to invoke UEFI, and “lazy” loading of database attributes to reduce the overall database load.
Along with all of this massive amount of work, a number of bugs were fixed while we were along the road trip of this development cycle.
We sincerely hope you enjoy it!
New Features¶
It is now possible to configure a priority for both the delete and create configuration RAID cleaning steps which are disabled by default.
Adds import_configuration, export_configuration and import_export_configuration steps to idrac-redfish management interface. These steps allow the configuration from one system to be used as a template and replicated to other, similarly capable, systems. Currently, this feature is experimental.
Adds support for passing a kernel_append_param setting to the ilo-virtual-media and ilo-uefi-https boot interfaces using the configuration parameter [ilo]/kernel_append_param with the ilo and ilo5 hardware types.
Adds support for the discovery of PXE Enabled NICs using the idrac-redfish inspect interface with the idrac hardware type. With this feature, a port’s pxe_enabled status will be recorded on the bare metal port.
Adds support to manage certificates to the ilo5 hardware type. A new optional boolean driver_info parameter ilo_add_certificates is introduced which can be used by the user to request addition of certificates to the iLO with ilo-uefi-https boot interface.
Adds the [deploy]enable_nvme_secure_erase option which allows the operator to enable NVMe format option for all nodes being managed by the conductor.
Add anaconda deploy interface to Ironic. This driver will deploy the OS using Anaconda installer and kickstart file instead of IPA. To support this feature a new configuration group anaconda is added to Ironic configuration file along with default_ks_template configuration option.
The deploy interface uses heartbeat API to communicate. The kickstart template must include %pre %post %onerror and %traceback sections that should send status of the deployment back to Ironic API using heartbeats. An example of such calls to heartbeat API can be found in the default kickstart template. To enable anaconda to send status back to Ironic API via heartbeat, agent_status and agent_status_message are added to the heartbeat API. Use of these new parameters requires API microversion 1.72 or greater.
Adds support for fast-tracking to ansible deploy interface.
Allows providing a list of IPMI cipher suite versions via the new configuration option [ipmi]/cipher_suite_versions. The configuration is only used when ipmi_cipher_suite is not set in driver_info.
Adds a new disable_ramdisk parameter to the manual cleaning API. If set to true, IPA won’t get booted for cleaning. Only steps explicitly marked as compatible can be executed this way.
The parameter is available in the API version 1.70.
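A manual cleaning request that sets disable_ramdisk might be assembled as in the sketch below. The helper name and the step shown are hypothetical placeholders, and whether a given step is compatible with running without the ramdisk depends on the driver. The sketch assumes the request is sent as a PUT to the node's provision state endpoint with the microversion header set to 1.70.

```python
import json


def build_manual_clean_request(clean_steps, disable_ramdisk=False):
    """Return (headers, body) for a manual cleaning request."""
    headers = {
        # disable_ramdisk requires API version 1.70 or later.
        "X-OpenStack-Ironic-API-Version": "1.70",
        "Content-Type": "application/json",
    }
    body = {"target": "clean", "clean_steps": clean_steps}
    if disable_ramdisk:
        body["disable_ramdisk"] = True
    return headers, json.dumps(body)


# The step name below is a placeholder, not a real clean step.
headers, body = build_manual_clean_request(
    [{"interface": "management", "step": "example_step"}],
    disable_ramdisk=True,
)
```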
Provides operator ability to override URL settings required for provisioning/cleaning in the event of virtual media based deployment. These scenarios tend to require more delineation than more traditional deployments as they often have different environmental security requirements. Set these two new configuration options using an IP address that is available to these nodes (both the ramdisk and the BMCs):
[deploy]
external_http_url = <routable URL of the HTTP server>
external_callback_url = <routable URL of bare metal API>
Adds new GPU dynamic capabilities to ilo drivers inspection.
gpu_<vendor>_count: Integer
gpu_<gpu_device_name>_count: Integer
gpu_<gpu_device_name>: Boolean
Enhance idrac-wsman inspect hardware interface to report an additional GPU device namely GV100GL [Tesla V100 PCIe 16GB]. With this enhancement, the following GPU devices are reported:
TU104GL [Tesla T4]
GV100GL [Tesla V100 PCIe 16GB]
Adds basic support for managing RAID configuration via the Redfish out-of-band (OOB) management protocol to the idrac hardware type by adding a new interface named idrac-redfish. This requires iDRAC firmware greater than 4.40.00.00. The idrac hardware type now supports idrac-wsman, idrac, idrac-redfish, and no-raid interfaces in given priority order.
Allows node *_interface values to be overridden by values in a node instance_info field. This gives non-administrative users a temporary method of setting interface values.
The network data schema is now configurable via the new configuration option [api]network_data_schema.
Adds capability to use project scoped requests in concert with system scoped requests for a composite Role Based Access Control (RBAC) model. As Ironic is mainly an administrative service, this capability has only been extended to API endpoints which are not purely administrative in nature. This consists of the following API endpoints: nodes, ports, portgroups, volume connectors, volume targets, and allocations.
Project scoped requests for baremetal allocations will automatically record the project_id of the requestor as the owner of the node.
Adds support for automatic creation of ports for redfish enabled bare metal nodes using prior to ironic-inspector introspection. This feature is a part of redfish management interface.
Supplying configuration to the agent using the redfish-virtual-media boot interface now works through USB instead of floppy by default. Modern hardware (and even virtual machines) has limited support for floppies.
Adds support for pre-built ISO images to the redfish-virtual-media boot interface and its derivatives.
Adds a redfish native raid_interface to the redfish hardware type. See story 2003514 for details.
Note that common RAID cases have been tested, but cases that are more complex or rely on vendor-specific implementation details may not work as desired due to capability limitations.
Adds support for managing an iDRAC (reset, clear job queue, and reset to known good state) via the Redfish out-of-band (OOB) management protocol to the idrac hardware type. This is offered by the new idrac-redfish management hardware interface implementation cleaning steps: reset_idrac, clear_job_queue, and known_good_state. known_good_state both resets an iDRAC and clears its job queue.
Adds [conductor]clean_step_priority_override configuration parameter which allows the operator to define a custom order in which the cleaning steps are to run.
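As a rough illustration of how a priority override can change the cleaning order, the sketch below applies a mapping of overrides to a list of steps. The "interface.step" key format, the rule that priority 0 disables a step, and the helper name order_clean_steps are assumptions made for this example; the step names are only illustrative.

```python
def order_clean_steps(steps, overrides):
    """Apply priority overrides and return steps in execution order.

    steps: list of dicts with "interface", "step" and "priority" keys.
    overrides: dict mapping "interface.step" to a new priority
        (key format assumed for this sketch).
    """
    resolved = []
    for step in steps:
        key = f"{step['interface']}.{step['step']}"
        priority = overrides.get(key, step["priority"])
        # Assumption: a priority of 0 means the step does not run.
        if priority > 0:
            resolved.append({**step, "priority": priority})
    # Higher priority runs first.
    return sorted(resolved, key=lambda s: s["priority"], reverse=True)


steps = [
    {"interface": "deploy", "step": "erase_devices", "priority": 10},
    {"interface": "deploy", "step": "erase_devices_metadata", "priority": 99},
    {"interface": "management", "step": "reset_bios_to_default", "priority": 0},
]
overrides = {
    "deploy.erase_devices": 0,
    "management.reset_bios_to_default": 120,
}
for step in order_clean_steps(steps, overrides):
    print(step["interface"], step["step"], step["priority"])
```

Here the override both disables one default step and promotes a step that was disabled by default, which is the kind of reordering the option is meant to allow.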
The Baremetal API, provided by the ironic-api process, now supports use of system scoped keystone authentication for the following endpoints: nodes, ports, portgroups, chassis, drivers, driver vendor passthru, volume targets, volume connectors, conductors, allocations, events, deploy templates.
Introduces lazy-loading of ports, portgroups, volume connectors and volume targets in task manager. For periodic tasks which create a task manager object but don’t require the aforementioned data (e.g. power sync), this change should reduce the number of database interactions by around two-thirds, speeding up overall execution.
Adds support for multipath volumes. If the volume properties have multiple portals, Ironic generates multiple iSCSI URLs and appends them together for use in the generated iPXE file.
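A minimal sketch of the idea, assuming Cinder-style volume properties (target_portal or target_portals, target_iqn, target_lun) and the iPXE iSCSI URI layout iscsi:<server>:<protocol>:<port>:<LUN>:<target>. The helper name iscsi_urls and the space-separated join are assumptions for illustration, not a copy of Ironic's code.

```python
def iscsi_urls(properties):
    """Build one iPXE-style iSCSI URL per portal in the volume properties."""
    iqn = properties["target_iqn"]
    lun = properties["target_lun"]
    # Fall back to the single-portal key when no portal list is given.
    portals = properties.get("target_portals") or [properties["target_portal"]]
    urls = []
    for portal in portals:
        # Split on the last colon so the port is separated from the host.
        host, _, port = portal.rpartition(":")
        # The protocol field is left empty; the LUN is written in hex.
        urls.append(f"iscsi:{host}::{port}:{lun:x}:{iqn}")
    return urls


properties = {
    "target_iqn": "iqn.2010-10.org.openstack:volume-1",
    "target_lun": 1,
    "target_portals": ["192.0.2.10:3260", "192.0.2.11:3260"],
}
# Appended together, as the note describes (separator assumed).
print(" ".join(iscsi_urls(properties)))
```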
Known Issues
The addition of both project and system scoped Role Based Access Controls adds database queries when linked resources are accessed. For example, when attempting to access a port or portgroup, the associated node needs to be checked, as this helps govern overall access to the object for project scoped requests. This does not impact system scoped requests. Operators who adopt project scoped access may find it necessary to verify or add additional database indexes in relation to the node uuid column as well as the node_id field in any table which may receive heavy project scoped query activity. The ironic project anticipates that this will be a future work item to help improve database performance.
Upgrade Notes
The ilo-virtual-media and ilo-uefi-https boot interfaces do not use [pxe]pxe_append_params anymore. To pass kernel parameters, use the new configuration parameter [ilo]/kernel_append_param.
Legacy policy rules have been deprecated. Operators are advised to review and update any custom policy files in use. Please see Secure Role Based Access Controls for more information.
The functionality of using a port.extra vif_port_id value to signal and control a VIF attachment has been removed to support changing the permission model and access control policy. Use of vif_port_id outside of the VIF attachment/detachment workflow has been deprecated since the Ocata development cycle.
Deprecated policy rules are not expressed in the default policy file generated from the source code. The generated default policy file indicates the new default policies, with notes on the deprecated rules that oslo.policy falls back to until [oslo_policy]enforce_scope and [oslo_policy]enforce_new_defaults have been set to True. Please see the Victoria policy configuration documentation to reference prior policy configuration.
Operators are encouraged to move to system scope based authentication by setting [oslo_policy]enforce_scope and [oslo_policy]enforce_new_defaults. This requires a migration from using an admin project with the baremetal_admin and baremetal_observer roles. System wide administrators using system scoped admin and reader accounts supersede the deprecated model.
Deprecation Notes
Deprecates the ATA-specific agent_continue_if_ata_erase_failed agent option, which is replaced with agent_continue_if_secure_erase_failed. The new option supports both ATA and NVMe secure erase. To ensure a smooth migration to the new configuration option, operators need to upgrade the Ironic Python Agent image to the Wallaby release prior to upgrading Ironic Conductor to Xena.
Pre-RBAC support rules have been deprecated. These consist of:
admin_api
is_member
is_observer
is_node_owner
is_node_lessee
is_allocation_owner
These rules will likely be removed in the Xena development cycle. Operators are advised to review any custom policy rules for these rules and move to the Secure Role Based Access Controls model.
The node’s driver_info parameter config_via_floppy of the redfish-virtual-media boot interface has been renamed to config_via_removable. The old alias is deprecated.
Use of an admin project with ironic is deprecated. With this, the custom roles baremetal_admin and baremetal_observer are also deprecated. Please migrate to using a system scoped account with the admin and reader roles, respectively.
Security Issues
Ability to create an allocation has been restricted by a new policy rule baremetal::allocation::create_pre_rbac which prevents creation of allocations by any project administrator when operating with the new Role Based Access Control model. The use and enforcement of this rule is disabled when [oslo_policy]enforce_new_defaults is set, which also causes the owner field for allocations to be populated automatically. Most deployments should not encounter any issues with this security change, and the policy rule will be removed when support for the legacy baremetal_admin custom role has been removed.
Fixes an issue where ironic was not properly labeling dynamically built virtual media ramdisks with the signifier flag so the ramdisk understands it was booted from virtual media.
Bug Fixes
When using the Neutron DHCP driver, Ironic would only use the first fixed IP address to determine what IP versions are in use on the port. Now, it checks all the IP addresses and adds DHCP options for all IP versions.
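The difference between looking at only the first fixed IP and looking at all of them can be shown with a short sketch. It assumes the Neutron port representation where fixed_ips is a list of dicts with an ip_address key; the helper name ip_versions is hypothetical.

```python
import ipaddress


def ip_versions(fixed_ips):
    """Return the set of IP versions (4 and/or 6) used by a port."""
    return {ipaddress.ip_address(entry["ip_address"]).version
            for entry in fixed_ips}


fixed_ips = [
    {"ip_address": "192.0.2.15"},
    {"ip_address": "2001:db8::15"},
]
# Looking only at fixed_ips[0] would report IPv4 alone.
print(sorted(ip_versions(fixed_ips)))
```

With the full set of versions available, DHCP options can then be emitted once per version rather than only for the version of the first address.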
Rejects a configdrive that is not JSON, a URL or a base64 string. Previously, invalid JSON supplied to ironicclient could end up accepted as a configdrive, which would cause a failure much later.
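A rough sketch of this kind of early validation, assuming the configdrive arrives either as an already-parsed JSON object (a dict) or as a string. The function name classify_configdrive and the exact acceptance rules are assumptions for illustration, not Ironic's actual validation code.

```python
import base64
import binascii
from urllib.parse import urlparse


def classify_configdrive(value):
    """Return "json", "url" or "base64", or raise ValueError."""
    if isinstance(value, dict):
        return "json"
    if isinstance(value, str) and value:
        if urlparse(value).scheme in ("http", "https"):
            return "url"
        try:
            # validate=True rejects characters outside the base64 alphabet.
            base64.b64decode(value, validate=True)
            return "base64"
        except (binascii.Error, ValueError):
            pass
    raise ValueError(
        "configdrive must be a JSON object, a URL or a base64 string")


print(classify_configdrive({"meta_data": {}}))
print(classify_configdrive("https://example.com/configdrive.iso.gz"))
print(classify_configdrive("aGVsbG8="))
```

A truncated JSON document passed as a string, such as '{"meta_data": ', fails the base64 check and is rejected immediately instead of surfacing as a deployment failure later.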
Fixes the [deploy]configdrive_use_object_store option that was broken during the Python 3 transition.
Fixes a problem with the grub2 configuration file. Some newer versions of grub2 (e.g. 2.05 or 2.06-rc1) use grub.cfg-01-MAC, while older versions (e.g. 2.04) use MAC.conf, so Ironic now generates both paths in order to be compatible with both.
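The two naming schemes can be sketched as follows. The root directory /tftpboot, the lower-case dash-separated MAC in the grub.cfg-01-MAC form, and the helper name grub_config_paths are assumptions made for this example; the exact formatting Ironic uses may differ.

```python
import posixpath


def grub_config_paths(root_dir, mac):
    """Return both per-MAC grub config paths for a given MAC address."""
    dashed = mac.replace(":", "-").lower()
    return [
        # Name looked up by newer grub2 versions (formatting assumed).
        posixpath.join(root_dir, f"grub.cfg-01-{dashed}"),
        # Name looked up by older grub2 versions.
        posixpath.join(root_dir, f"{mac}.conf"),
    ]


for path in grub_config_paths("/tftpboot", "52:54:00:aa:bb:cc"):
    print(path)
```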
Fixes the missing boot_method ramdisk parameter for dynamically built virtual media payloads. This value must be set to vmedia for the ramdisk running on virtual media to understand it is executing from virtual media. This was fixed for cases where it is used with the redfish-virtual-media based boot interfaces as well as the ilo-virtual-media boot interface, which is where dynamic virtual media deployment/cleaning ramdisk generation is supported.
Fixes idrac-wsman BIOS apply_configuration and factory_reset clean and deploy steps to fail correctly in case of error when checking completed jobs. Before the fix, when a BIOS job failed, node cleaning or deployment failed with a timeout instead of the actual error in the cleaning or deploy step.
Adds handling of Redfish BMCs which lack a BootSourceOverrideMode flag, such that it is no longer a fatal error for a deployment if the BMC does not support this field. This is most common on BMCs which feature only a partial implementation of the ComputerSystem resource boot, but may also be observable on some older generations of BMCs which received updates to have partial Redfish support.
The fix for story 2008252 synced the boot mode after changing the boot device because Supermicro nodes reset the boot mode if not included in the boot device set. However, this can cause a problem on Dell nodes when changing the mode uefi->bios or bios->uefi; see story 2008712 for details. The syncing of the boot mode is now restricted to Supermicro.
Other Notes
Clean steps can now be marked with requires_ramdisk=False to make them compatible with the new disable_ramdisk argument of the manual cleaning API.
The API version of the Bare Metal API provided by the ironic-api service has been incremented to 1.71 to signify that the API supports System and Project scoped Role Based Access Controls, which is purely informational in nature, as the version itself cannot be used to change the API behavior for access controls. In excess of 1500 unit tests were added as part of the effort to implement Role Based Access Controls to help ensure the effort did not break the API behavior.