Conductor/node grouping affinity¶
https://storyboard.openstack.org/#!/story/2001795
This spec proposes adding a conductor_group property to nodes, which can be matched to one or more conductors configured with a matching conductor_group configuration option, to restrict control of those nodes to these conductors.
Problem description¶
Today, there is no way to control the conductor-to-node mapping. This is desirable for a number of reasons:
An operator may have an Ironic deployment that spans multiple sites. Without control of this mapping, images may be pulled over WAN links. This causes slower deployments and may be less secure.
Similarly, an operator may want to map nodes to conductors that are physically closer to the nodes in the same site, to reduce the number of network hops between the node and the conductor. A prime example of this would be to place a conductor in each rack, reducing the path to only go through the top-of-rack switch.
A deployer may have multiple networks for out-of-band control that must be completely isolated. This feature would allow isolating a conductor to a single out-of-band network.
A deployer may have multiple physical networks that not all conductors are connected to. By configuring the mapping correctly, conductors can manage only the nodes with which they can communicate. This is described further in another RFE[0].
Proposed change¶
We propose adding a conductor_group configuration option for the conductor, which is a single arbitrary string specifying some grouping characteristic of the conductor.
We also propose a conductor_group field in the node record, which will be used to map a node to a conductor. This matching will be case-insensitive, to make things a bit easier for operators.
A blank conductor_group field or config is the default. A conductor without a group can only manage nodes without a group, and a node without a group can only be managed by a conductor without a group.
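The matching rule above can be sketched in a few lines of Python. This is only an illustration of the intended semantics, not the proposed implementation; the function names normalize_group and can_manage are hypothetical, and treating None as the empty group is an assumption made for convenience.

```python
def normalize_group(group):
    """Lower-case a group name so that comparison is case-insensitive.

    Assumption: None is treated like the default (empty) group.
    """
    return (group or "").lower()


def can_manage(conductor_group, node_group):
    """Return True if a conductor in conductor_group may manage a node
    in node_group. An empty group only ever matches an empty group."""
    return normalize_group(conductor_group) == normalize_group(node_group)
```

With this rule, can_manage("Rack1", "rack1") is true, while a conductor with no group cannot manage a node in group "rack1", and vice versa.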
The hash ring will need to be modified to take grouping into account, as described below in the RPC API Impact section.
Alternatives¶
Another RFE[1] proposes a more elaborate system of hard and soft affinity, affinity and anti-affinity, and scoring of placement to a conductor with multiple tags. This is quite complex, and I don’t believe we’ll get it done in the short term. Completing the more basic work in this spec doesn’t block that more complex work, so we should take it one step at a time.
Data model impact¶
A conductor_group field will be added to the nodes table, as a VARCHAR(255). This will have a default of "", the empty string. This string will be used in the hash ring calculation, so there’s no sense in defaulting to NULL.
A conductor_group field will also be added to the conductors table, also as a VARCHAR(255). This will also have a default of "", the empty string. This will be used to build the hash ring to look up where nodes should land.
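As a rough sketch of the schema change, the following uses an in-memory SQLite database. Everything except the conductor_group columns is a placeholder: the other column names are hypothetical, the NOT NULL constraint is an assumption that follows from not wanting NULL, and the real change would be made through Ironic's own migrations rather than raw SQL.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (
        id INTEGER PRIMARY KEY,
        uuid VARCHAR(36),
        -- New column: empty string by default, never NULL (assumption).
        conductor_group VARCHAR(255) NOT NULL DEFAULT ''
    );
    CREATE TABLE conductors (
        id INTEGER PRIMARY KEY,
        hostname VARCHAR(255),
        conductor_group VARCHAR(255) NOT NULL DEFAULT ''
    );
""")

# A node created without a group gets the empty-string default.
conn.execute("INSERT INTO nodes (uuid) VALUES ('node-1')")
default_group = conn.execute(
    "SELECT conductor_group FROM nodes WHERE uuid = 'node-1'"
).fetchone()[0]
```

Note that SQLite does not enforce the VARCHAR(255) length, so the limit still has to be checked at the API layer.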
State Machine Impact¶
None.
REST API impact¶
The conductor_group field of the node will be added to the node object in the REST API, with a microversion as usual. It will be allowed in POST and PATCH requests. As with the database, it will be restricted to 255 characters. A conductor in the requested group must be available, because a conductor services node creation and updates and is selected via the hash ring.
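For illustration, a client-side helper that builds a PATCH body for this field might look like the sketch below. It assumes the usual JSON Patch format for node updates and a /conductor_group path; the helper name and the choice of the "replace" operation are hypothetical, and the microversion header is omitted because this spec does not fix its value.

```python
import json

# Mirrors the VARCHAR(255) limit described for the database column.
MAX_GROUP_LENGTH = 255


def conductor_group_patch(group):
    """Build a JSON Patch body that sets a node's conductor_group.

    Raises ValueError if the group exceeds the 255-character limit.
    """
    if len(group) > MAX_GROUP_LENGTH:
        raise ValueError(
            "conductor_group is limited to %d characters" % MAX_GROUP_LENGTH)
    patch = [{"op": "replace", "path": "/conductor_group", "value": group}]
    return json.dumps(patch)
```

The server would still have to reject the request if no conductor in the requested group is available.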
It’s worth noting that we would like to expose the grouping of conductors via the REST API eventually. However, the best way to do this isn’t immediately clear, so we leave it outside the scope of this spec for now. Another RFE[3] proposes a service management API that may be a good fit.
Client (CLI) impact¶
“ironic” CLI¶
None, it’s deprecated.
“openstack baremetal” CLI¶
The conductor_group field for a node will be exposed in the client output, and added to the node create and node set commands.
RPC API impact¶
This will affect which conductor is the destination for RPC calls corresponding to a given node; however, it won’t have a direct effect on the RPC API itself.
The hash ring will change such that the internal keys for the hash ring will now be of the structure "$conductor_group:$drivername". A colon (:) is used as the separator between the two, to eliminate conflicts between conductor groups and drivers or hardware types. For example, an agent_ilo key with no separator could mean a node with no group and the agent_ilo driver, or it could mean a node with group agent_ using the ilo hardware type. To handle upgrades, hash ring keys will be built without the conductor group while the service is pinned to a version before this feature, and built with the conductor group when the service is unpinned or pinned to a version after this feature is implemented.
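The key construction described above can be sketched as follows. The function name hash_ring_key and the use_groups flag are hypothetical; in practice the flag would be derived from the RPC version pin. Lower-casing the group inside the key is an assumption, made here so that the case-insensitive matching falls out of plain string comparison.

```python
def hash_ring_key(driver_name, conductor_group="", use_groups=True):
    """Build the internal hash ring key for a driver and conductor group.

    use_groups=False models a service pinned to an RPC version that
    predates this feature: the group is ignored and the key is the
    bare driver name, as before.
    """
    if not use_groups:
        return driver_name
    # The colon keeps "no group + agent_ilo" distinct from
    # "group agent_ + ilo".
    return "%s:%s" % (conductor_group.lower(), driver_name)
```

A node with no group and the agent_ilo driver yields ":agent_ilo", while a node in group agent_ using the ilo hardware type yields "agent_:ilo", so the two cases from the example no longer collide.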
We handle upgrades by ignoring grouping for services which have a pin in the RPC version that is less than the release with this feature. Once everything is upgraded and unpinned, we begin using the grouping tags configured.
Operators should leave a sufficient number of conductors available without a grouping tag configured to run the cluster, until nodes can be configured with the grouping tag. Any nodes without a grouping tag will only be managed by conductors without a grouping tag.
Driver API impact¶
Hash ring generation and lookup will include the grouping tag, as specified above in the RPC API Impact section.
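To show how the grouping tag changes lookup, here is a deliberately simplified sketch. It is not Ironic's hash ring: a plain hash-modulo pick over the sorted hostnames stands in for the real consistent hash ring, and the names build_rings and pick_conductor, as well as the shape of the input tuples, are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_rings(conductors):
    """Map each "group:driver" key to the sorted hostnames serving it.

    conductors: iterable of (hostname, conductor_group, driver_names).
    """
    rings = defaultdict(list)
    for hostname, group, drivers in conductors:
        for driver in drivers:
            rings["%s:%s" % (group.lower(), driver)].append(hostname)
    return {key: sorted(hosts) for key, hosts in rings.items()}


def pick_conductor(rings, node_uuid, node_group, driver):
    """Pick a conductor for a node from the ring for its group and driver.

    Stand-in for a consistent hash lookup: hash the node UUID and take
    it modulo the number of candidate hosts.
    """
    hosts = rings.get("%s:%s" % (node_group.lower(), driver))
    if not hosts:
        raise LookupError("no conductor serves this group and driver")
    digest = hashlib.sha256(node_uuid.encode("utf-8")).hexdigest()
    return hosts[int(digest, 16) % len(hosts)]
```

Because the group is part of the key, a node in group "rack1" can only ever land on a conductor configured with that group, and a node whose group has no conductors fails the lookup rather than falling back to an ungrouped conductor.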
Nova driver impact¶
This change is transparent to Nova.
Ramdisk impact¶
None.
Security impact¶
No direct impact; however, this provides another mechanism for securing a deployment by enabling logical infrastructure segregation.
Other end user impact¶
None.
Scalability impact¶
None.
Performance Impact¶
None.
Other deployer impact¶
Deployers that wish to use this feature will need to manage the process of labeling conductors and nodes to enable it, which may be a non-trivial task.
Developer impact¶
None.
Implementation¶
Assignee(s)¶
- Primary assignee: jroll
- Other contributors: dtantsur
Work Items¶
Add database fields.
Add conductor config and populate conductor DB field.
Change the hash ring calculation, and bump the RPC API so that we can pin during upgrades.
Add fields to the node and conductor objects.
Make the REST API changes.
Update the client library/CLI.
Document the feature.
Dependencies¶
None.
Testing¶
Unit tests should be sufficient, as that’s how we test our hash ring now. It’s difficult to test this with Tempest without exposing conductor grouping via the REST API.
Upgrades and Backwards Compatibility¶
This is described in the RPC API Impact section.
Documentation Impact¶
This should be documented in the install guide and admin guide.
References¶
[0] https://storyboard.openstack.org/#!/story/1734876
[1] https://storyboard.openstack.org/#!/story/1739426
[2] Notes from the Rocky PTG session: https://etherpad.openstack.org/p/ironic-rocky-ptg-location-awareness