We use the STATUS field on objects to indicate when a resource is ready by setting it to ACTIVE so external systems know when it’s safe to use that resource. Knowing when to set the status to ACTIVE is simple when there is only one entity responsible for provisioning a given object. When that entity has finished provisioning, we just update the STATUS directly to ACTIVE. However, there are resources in Neutron that require provisioning by multiple asynchronous entities before they are ready to be used, so managing the transition to the ACTIVE status becomes more complex. To handle these cases, Neutron has the provisioning_blocks module to track the entities that are still provisioning a resource.
The main example of this is with ML2, the L2 agents, and the DHCP agents. When a port is created and bound to a host, it’s placed in the DOWN status. The L2 agent now has to set up flows, security group rules, etc. for the port, and the DHCP agent has to set up a DHCP reservation for the port’s IP and MAC. Both agents must complete their work before the transition to ACTIVE; otherwise the port user (e.g. Nova) may attempt to use the port and not have connectivity. To solve this, the provisioning_blocks module is used to track the provisioning state of each agent, and the status is only updated when both complete.
To make use of the provisioning_blocks module, provisioning components should be added whenever there is work to be done by another entity before an object’s status can transition to ACTIVE. This is accomplished by calling the add_provisioning_component method for each entity. Then, as each entity finishes provisioning the object, the provisioning_complete method must be called to lift that entity’s provisioning block.
When the last provisioning block is removed, the provisioning_blocks module will trigger a callback notification containing the object ID for the object’s resource type with the event PROVISIONING_COMPLETE. A subscriber to this event can now update the status of this object to ACTIVE or perform any other necessary actions.
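The bookkeeping described above can be sketched with a small in-memory model. This is not Neutron's implementation: the ProvisioningBlocks class, its subscribe method, the callback signature, and the "widget" resource type are all hypothetical names chosen for illustration, and the real module persists its records in the database and publishes events through Neutron's callback registry. The sketch also assumes that only the call that removes the last block triggers the event.

```python
from typing import Callable, Dict, List, Set, Tuple

PROVISIONING_COMPLETE = "provisioning_complete"  # event name used in this sketch

# Hypothetical callback signature: (object_type, event, object_id)
Callback = Callable[[str, str, str], None]


class ProvisioningBlocks:
    """Toy in-memory stand-in for the provisioning_blocks module."""

    def __init__(self) -> None:
        # (object_type, object_id) -> entities that are still provisioning
        self._blocks: Dict[Tuple[str, str], Set[str]] = {}
        self._subscribers: Dict[str, List[Callback]] = {}

    def subscribe(self, object_type: str, callback: Callback) -> None:
        self._subscribers.setdefault(object_type, []).append(callback)

    def add_provisioning_component(
        self, object_id: str, object_type: str, entity: str
    ) -> None:
        self._blocks.setdefault((object_type, object_id), set()).add(entity)

    def provisioning_complete(
        self, object_id: str, object_type: str, entity: str
    ) -> None:
        key = (object_type, object_id)
        remaining = self._blocks.get(key)
        if not remaining or entity not in remaining:
            return  # no block held by this entity, nothing to lift
        remaining.remove(entity)
        if remaining:
            return  # other entities are still provisioning
        # Last block removed: notify subscribers for this resource type.
        del self._blocks[key]
        for callback in self._subscribers.get(object_type, []):
            callback(object_type, PROVISIONING_COMPLETE, object_id)


statuses = {"obj-1": "DOWN"}


def set_active(object_type: str, event: str, object_id: str) -> None:
    statuses[object_id] = "ACTIVE"


blocks = ProvisioningBlocks()
blocks.subscribe("widget", set_active)
blocks.add_provisioning_component("obj-1", "widget", "entity-a")
blocks.add_provisioning_component("obj-1", "widget", "entity-b")

blocks.provisioning_complete("obj-1", "widget", "entity-a")
status_after_first = statuses["obj-1"]   # still blocked by entity-b

blocks.provisioning_complete("obj-1", "widget", "entity-b")
status_after_last = statuses["obj-1"]    # subscriber has run
```

In this model the subscriber, not the tracker, decides what to do when provisioning completes, which matches the division of responsibility described above.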
A normal state transition will look something like the following:
For a more concrete example, see the section below.
ML2 makes use of the provisioning_blocks module to prevent the status of ports from being transitioned to ACTIVE until both the L2 agent and the DHCP agent have finished wiring a port.
When a port is created or updated, the following happens to register the DHCP agent’s provisioning blocks:
When a port is created or updated, the following happens to register the L2 agent’s provisioning blocks:
Once the DHCP agent has finished setting up the reservation, it calls dhcp_ready_on_ports via the RPC API with the port ID. The DHCP RPC handler receives this and calls ‘provisioning_complete’ in the provisioning module with the port ID and the ‘DHCP’ entity to remove the provisioning block.
Once the L2 agent has finished wiring the port, it calls the normal update_device_list (or update_device_up) via the RPC API. The RPC callbacks handler calls ‘provisioning_complete’ with the port ID and the ‘L2 Agent’ entity to remove the provisioning block.
On the ‘provisioning_complete’ call that removes the last record, the provisioning_blocks module emits a callback PROVISIONING_COMPLETE event with the port ID. A function subscribed to this in ML2 then calls update_port_status to set the port to ACTIVE.
At this point the normal notification is emitted to Nova allowing the VM to be unpaused.
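The port flow above can be walked through with a simplified simulation. Everything here is a hypothetical stand-in: FakeMl2Server, the Port dataclass, and the nova_notifications list do not exist in Neutron, and the handler methods only borrow the names dhcp_ready_on_ports and update_device_up from the text; their signatures are assumptions. The sketch also assumes every bound port receives both blocks.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

DHCP_ENTITY = "DHCP"
L2_AGENT_ENTITY = "L2 Agent"


@dataclass
class Port:
    port_id: str
    status: str = "DOWN"
    pending: Set[str] = field(default_factory=set)  # outstanding blocks


class FakeMl2Server:
    """Hypothetical in-memory stand-in for the server side of the flow."""

    def __init__(self) -> None:
        self.ports: Dict[str, Port] = {}
        self.nova_notifications: List[str] = []

    def create_bound_port(self, port_id: str) -> Port:
        # Register one block per agent that still has work to do.
        port = Port(port_id, pending={DHCP_ENTITY, L2_AGENT_ENTITY})
        self.ports[port_id] = port
        return port

    def dhcp_ready_on_ports(self, port_ids: List[str]) -> None:
        # Stand-in for the DHCP RPC handler.
        for port_id in port_ids:
            self._provisioning_complete(port_id, DHCP_ENTITY)

    def update_device_up(self, port_id: str) -> None:
        # Stand-in for the L2 agent RPC callback.
        self._provisioning_complete(port_id, L2_AGENT_ENTITY)

    def _provisioning_complete(self, port_id: str, entity: str) -> None:
        port = self.ports.get(port_id)
        if port is None or entity not in port.pending:
            return
        port.pending.discard(entity)
        if not port.pending:
            # Last block lifted: mark ACTIVE, then notify the port user.
            port.status = "ACTIVE"
            self.nova_notifications.append(port_id)


server = FakeMl2Server()
port = server.create_bound_port("port-1")

server.update_device_up("port-1")
status_after_l2 = port.status      # DHCP block is still outstanding

server.dhcp_ready_on_ports(["port-1"])
status_after_dhcp = port.status    # both agents have reported
```

The order in which the two agents report does not matter in this model; whichever call lifts the last block is the one that moves the port to ACTIVE.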
In the event that the DHCP or L2 agent is down, the port will not transition to the ACTIVE status (as is the case now if the L2 agent is down). Agents must account for this by telling the server that wiring has been completed after configuring everything during startup. This ensures that ports created on offline agents (or agents that crash and restart) eventually become active.
To account for server instability, the notifications about port wiring being complete must use RPC calls so the agent gets a positive acknowledgement from the server, and the agent must keep retrying until either the port is deleted or the call succeeds.
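The retry requirement can be sketched as a loop on the agent side. This is an illustration under stated assumptions rather than the agents' actual code: report_wiring_complete, ServerUnavailable, and PortNotFound are hypothetical names, the rpc_call argument stands in for whatever blocking RPC call the agent uses, and the exponential backoff is an added design choice that the text does not prescribe.

```python
import time
from typing import Callable, List


class ServerUnavailable(Exception):
    """Hypothetical error: the RPC call was not acknowledged."""


class PortNotFound(Exception):
    """Hypothetical error: the server reports the port was deleted."""


def report_wiring_complete(
    port_id: str,
    rpc_call: Callable[[str], None],
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry until the server acknowledges or the port no longer exists.

    Returns True on acknowledgement, False if the port was deleted.
    """
    delay = 1.0
    while True:
        try:
            rpc_call(port_id)  # blocking call, returns only on acknowledgement
            return True
        except PortNotFound:
            return False       # nothing left to report
        except ServerUnavailable:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # assumed backoff policy


attempts: List[str] = []


def flaky_rpc(port_id: str) -> None:
    # Fails twice to simulate an unstable server, then succeeds.
    attempts.append(port_id)
    if len(attempts) < 3:
        raise ServerUnavailable()


delays: List[float] = []
acknowledged = report_wiring_complete("port-1", flaky_rpc, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real waiting; a real agent would more likely fold the retry into its periodic sync loop.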
If an ML2 driver immediately places a bound port in the ACTIVE state (e.g. after calling a backend in update_port_postcommit), the provisioning blocks mechanism will not have any impact on that process.
[1] Provisioning Blocks Module: http://git.openstack.org/cgit/openstack/neutron/tree/neutron/db/provisioning_blocks.py