Ironic is an OpenStack project which provisions bare metal (as opposed to virtual) machines. It may be used independently or as part of an OpenStack Cloud, and integrates with the OpenStack Identity (keystone), Compute (nova), Network (neutron), Image (glance) and Object (swift) services.
When the Bare Metal service is appropriately configured with the Compute and Network services, it is possible to provision both virtual and physical machines through the Compute service’s API. However, the set of instance actions is limited because physical servers and switch hardware behave differently from virtual resources. For example, live migration cannot be performed on a bare metal instance.
The community maintains reference drivers that leverage open-source technologies (e.g. PXE and IPMI) to cover a wide range of hardware. Ironic’s pluggable driver architecture also allows hardware vendors to write and contribute drivers that may improve performance or add functionality not provided by the community drivers.
Here are a few use cases for bare metal (physical server) provisioning in a cloud; there are doubtless many more interesting ones:
The following diagram shows the relationships and how all services come into play during the provisioning of a physical server. (Note that Ceilometer and Swift can be used with Ironic, but are missing from this diagram.)
The diagram below shows the logical architecture. It shows the basic components that form the Ironic service, the relation of Ironic service with other OpenStack services and the logical flow of a boot instance request resulting in the provisioning of a physical server.
The Ironic service is composed of the following components:
As in Figure 1.2. Logical Architecture, a user request to boot an instance is passed to the Nova Compute service via Nova API and Nova Scheduler. The Compute service hands over this request to the Ironic service, where the request passes from the Ironic API, to the Conductor, to a Driver to successfully provision a physical server for the user.
Just as the Nova Compute service talks to various OpenStack services such as Glance, Neutron and Swift to provision a virtual machine instance, the Ironic service talks to the same OpenStack services for the image, network and other resources it needs to provision a bare metal instance.
PXE is part of the Wired for Management (WfM) specification developed by Intel and Microsoft. PXE enables a system’s BIOS and network interface card (NIC) to bootstrap a computer from the network instead of from a disk. Bootstrapping is the process by which a system loads the OS into local memory so that it can be executed by the processor. The ability to boot a system over a network simplifies server deployment and server management for administrators.
DHCP is a standardized networking protocol used on Internet Protocol (IP) networks for dynamically distributing network configuration parameters, such as IP addresses for interfaces and services. Using PXE, the BIOS uses DHCP to obtain an IP address for the network interface and to locate the server that stores the network bootstrap program (NBP).
NBP is equivalent to GRUB (GRand Unified Bootloader) or LILO (LInux LOader) - loaders which are traditionally used in local booting. Like the boot program in a hard drive environment, the NBP is responsible for loading the OS kernel into memory so that the OS can be bootstrapped over a network.
TFTP is a simple file transfer protocol that is generally used for automated transfer of configuration or boot files between machines in a local environment. In a PXE environment, TFTP is used to download NBP over the network using information from the DHCP server.
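To make the TFTP step more concrete, here is a minimal sketch of the read request (RRQ) a PXE client would send to fetch the NBP. It follows the packet layout in RFC 1350; the filename `pxelinux.0` is only an illustrative assumption, since the actual NBP name comes from the DHCP response. The sketch builds the packet bytes and does not send anything on the network.

```python
import struct

def build_tftp_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request (RRQ) packet as laid out in RFC 1350."""
    # Opcode 1 means RRQ; it is followed by the NUL-terminated file name
    # and the NUL-terminated transfer mode.
    return (
        struct.pack("!H", 1)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )

# "pxelinux.0" is an assumed example NBP file name, not taken from this document.
packet = build_tftp_rrq("pxelinux.0")
print(packet)
```

In a real PXE boot the client sends this datagram to UDP port 69 on the TFTP server named in the DHCP reply.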
IPMI is a standardized computer system interface used by system administrators for out-of-band management of computer systems and monitoring of their operation. It is a method to manage systems that may be unresponsive or powered off by using only a network connection to the hardware rather than to an operating system.
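As an illustration of out-of-band management, the sketch below assembles an `ipmitool` command line for querying or changing chassis power state over the network. It assumes the `ipmitool` CLI is the tool in use and that the BMC speaks IPMI v2.0 (`lanplus`); the address and credentials are placeholders. The command is only built and printed, not executed.

```python
import shlex

ALLOWED_ACTIONS = {"status", "on", "off", "soft"}

def ipmitool_power_command(host, user, password, action="status"):
    """Return the argv list for an ipmitool chassis power request."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN
        "-H", host,        # BMC address, not the host operating system
        "-U", user,
        "-P", password,
        "chassis", "power", action,
    ]

# 192.0.2.10 is a documentation-only address; the credentials are placeholders.
cmd = ipmitool_power_command("192.0.2.10", "admin", "secret")
print(shlex.join(cmd))
```

Because the request goes to the BMC rather than to the operating system, the same command can work on a node that is powered off or unresponsive.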
The Ironic RESTful API service is used to enroll hardware that Ironic will manage. A cloud administrator usually registers the hardware, specifying their attributes such as MAC addresses and IPMI credentials. There can be multiple instances of the API service.
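A rough sketch of what enrollment data might look like is given below. It builds the JSON bodies an administrator could send when creating a node and then a port for its NIC. The field names (`driver`, `driver_info`, `ipmi_address`, `node_uuid`, `address`) reflect my understanding of the Bare Metal API and should be checked against the API reference for your release; the driver name, addresses and UUID are placeholders.

```python
import json

def node_enrollment_body(driver, ipmi_address, ipmi_username, ipmi_password):
    """Body for creating a node (assumed to be POSTed to /v1/nodes)."""
    return {
        "driver": driver,
        "driver_info": {
            "ipmi_address": ipmi_address,
            "ipmi_username": ipmi_username,
            "ipmi_password": ipmi_password,
        },
    }

def port_body(node_uuid, mac_address):
    """Body for registering a NIC's MAC (assumed to be POSTed to /v1/ports)."""
    return {"node_uuid": node_uuid, "address": mac_address}

# All values below are placeholders chosen for illustration.
node = node_enrollment_body("ipmi", "192.0.2.10", "admin", "secret")
port = port_body("00000000-0000-0000-0000-000000000000", "52:54:00:12:34:56")
print(json.dumps(node, indent=2))
print(json.dumps(port, indent=2))
```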
The Ironic conductor service does the bulk of the work. For security reasons, it is advisable to place the conductor service on an isolated host, since it is the only service that requires access to both the data plane and IPMI control plane.
There can be multiple instances of the conductor service to support various classes of drivers and to manage failover. Instances of the conductor service should be on separate nodes. Each conductor can itself run many drivers to operate heterogeneous hardware. This is depicted in the following figure.
The API exposes a list of supported drivers and the names of conductor hosts servicing them.
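The driver listing can be turned around to answer "which drivers does each conductor host serve?". The sketch below assumes a response shaped as a `drivers` list whose entries carry `name` and `hosts` keys, which is how I understand the driver-list endpoint to respond; the sample data, including the host names, is invented for illustration.

```python
# Hypothetical sample shaped like a driver-list response; names are invented.
sample_response = {
    "drivers": [
        {"name": "ipmi", "hosts": ["conductor-a", "conductor-b"]},
        {"name": "redfish", "hosts": ["conductor-b"]},
    ]
}

def drivers_by_host(response):
    """Invert the driver -> hosts listing into host -> drivers."""
    mapping = {}
    for driver in response.get("drivers", []):
        for host in driver.get("hosts", []):
            mapping.setdefault(host, []).append(driver["name"])
    return mapping

print(drivers_by_host(sample_response))
```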
What happens when a boot instance request comes in? The diagram below walks through the steps involved in provisioning a bare metal instance.
These pre-requisites must be met before the deployment process:
This describes a typical ironic node deployment using PXE and the Ironic Python Agent (IPA). Depending on the ironic driver interfaces used, some of the steps might be marginally different; however, the majority of them will remain the same.
A boot instance request comes in via the Nova API, through the message queue to the Nova scheduler.
Nova scheduler applies filters and finds the eligible hypervisor. The nova scheduler also uses the flavor’s extra_specs, such as cpu_arch, to match the target physical node.
Nova compute manager claims the resources of the selected hypervisor.
Nova compute manager creates (unbound) tenant virtual interfaces (VIFs) in the Networking service according to the network interfaces requested in the nova boot request. One caveat is that the MAC addresses of these ports are randomly generated at first; when a VIF is attached to a node, its MAC is updated to match the MAC of the node’s network interface card (or bond).
Nova compute creates a spawn task that contains all the required information, such as which image to boot from. It invokes driver.spawn from the virt layer of Nova compute. During the spawn process, the virt driver does the following:
Nova’s ironic virt driver issues a deploy request via the Ironic API to the Ironic conductor servicing the bare metal node.
Virtual interfaces are plugged in and the Neutron API updates the DHCP port to set PXE/TFTP options. When the neutron network interface is used, ironic creates separate provisioning ports in the Networking service, whereas with the flat network interface, the ports created by nova are used both for provisioning and for deployed instance networking.
The ironic node’s boot interface prepares the (i)PXE configuration and caches the deploy kernel and ramdisk.
The ironic node’s management interface issues commands to enable network boot of a node.
The ironic node’s deploy interface caches the instance image (in the case of the iscsi deploy interface or most pxe_* classic drivers), and the kernel and ramdisk if needed (they are needed for netboot, for example).
The ironic node’s power interface instructs the node to power on.
The node boots the deploy ramdisk.
Depending on the exact driver used, either the conductor copies the image over iSCSI to the physical node (iSCSI deploy) or the deploy ramdisk downloads the image from a temporary URL (Direct deploy). The temporary URL can be generated by Swift API-compatible object stores, for example Swift itself or RadosGW.
The image deployment is done.
The node’s boot interface switches the PXE configuration to refer to the instance images (or, in the case of local boot, sets the boot device to disk), and asks the ramdisk agent to soft power off the node. If the soft power off by the ramdisk agent fails, the bare metal node is powered off via an IPMI/BMC call.
The deploy interface triggers the network interface to remove provisioning ports if they were created, and binds the tenant ports to the node if not already bound. Then the node is powered on.
Note
There are two power cycles during bare metal deployment: the node is first powered on to boot the ramdisk, and it is powered on a second time after the image is deployed.
The bare metal node’s provisioning state is updated to active.
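The scheduler step in the process above, where a flavor’s extra_specs are compared with node properties, can be pictured with a toy matcher. This is a simplified sketch and not the Nova scheduler’s actual filter logic: it assumes each node advertises a flat dictionary of properties and treats every extra_specs entry as an exact-match requirement. The node names and property values are invented.

```python
def node_matches_flavor(node_properties, extra_specs):
    """Return True when every extra_specs entry equals the node's property."""
    return all(
        key in node_properties and str(node_properties[key]) == str(value)
        for key, value in extra_specs.items()
    )

# Hypothetical inventory; names and values are made up for illustration.
nodes = {
    "node-1": {"cpu_arch": "x86_64", "cpus": 16},
    "node-2": {"cpu_arch": "aarch64", "cpus": 32},
}
flavor_extra_specs = {"cpu_arch": "x86_64"}

eligible = [
    name for name, props in nodes.items()
    if node_matches_flavor(props, flavor_extra_specs)
]
print(eligible)
```

Comparing values as strings keeps the sketch tolerant of specs that arrive as text, which is an assumption made here for simplicity.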
Below is the diagram that describes the above process.
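For the direct deploy case described in the steps above, the temporary URL handed to the deploy ramdisk is a signed link. The sketch below follows the signing scheme documented for Swift’s TempURL middleware (an HMAC-SHA1 over the method, expiry time and object path); the key, account, container and object names are placeholders, and a given deployment may use a different digest or additional query parameters.

```python
import hmac
from hashlib import sha1

def make_temp_url(key, method, path, expires):
    """Sign an object path in the style of Swift's TempURL middleware."""
    # The signed message is the method, expiry timestamp and path,
    # joined by newlines.
    hmac_body = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={signature}&temp_url_expires={expires}"

# Placeholder key, account, container and object names.
url = make_temp_url(
    key="secret-key",
    method="GET",
    path="/v1/AUTH_demo/images/instance.img",
    expires=1700000000,
)
print(url)
```

The expiry is a Unix timestamp, so the link stops working after that time without any further action by the service that issued it.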
The following two examples describe what ironic is doing in more detail, leaving out the actions performed by nova and some of the more advanced options.
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License. See all OpenStack Legal Documents.