Distributed OVSDB events handler
This document presents the problem and proposes a solution for handling OVSDB events in a distributed fashion in networking-ovn.
Problem description
In networking-ovn, the OVSDB Monitor class is responsible for listening to the OVSDB events and performing certain actions on them. We use it extensively for various tasks including critical ones such as monitoring for port binding events (in order to notify Neutron/Nova that a port has been bound to a certain chassis). Currently, this class uses a distributed OVSDB lock to ensure that only one instance handles those events at a time.
The problem with this approach is that it creates a bottleneck: even when multiple Neutron Workers are running, only one of them is actively handling those events. The problem becomes more pronounced with technologies such as containers, which rely on creating many ports at once and waiting for them to be bound.
Proposed change
In order to fix this problem, this document proposes using a Consistent Hash Ring to split the load of handling events across multiple Neutron Workers.
A new table called ovn_hash_ring will be created in the Neutron Database where the Neutron Workers capable of handling OVSDB events will be registered. The table will use the following schema:
| Column name | Type | Description |
|---|---|---|
| node_uuid | String | Primary key. The unique identification of a Neutron Worker. |
| hostname | String | The hostname of the machine this Node is running on. |
| created_at | DateTime | The time that the entry was created. For troubleshooting purposes. |
| updated_at | DateTime | The time that the entry was updated. Used as a heartbeat to indicate that the Node is still alive. |
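As a rough illustration, the schema above could be written as the following DDL. This is only a sketch that uses Python's built-in sqlite3 module so that it runs anywhere. The SQLite column types and the NOT NULL constraints are assumptions, and the real table would be defined through Neutron's own database models and migrations.

```python
import sqlite3

# Illustrative DDL only: the column names come from the table above,
# the types are SQLite approximations of String/DateTime and the
# NOT NULL constraints are an assumption.
SCHEMA = """
CREATE TABLE ovn_hash_ring (
    node_uuid  TEXT PRIMARY KEY,
    hostname   TEXT NOT NULL,
    created_at TIMESTAMP NOT NULL,
    updated_at TIMESTAMP NOT NULL
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# List the columns that were created, in declaration order.
columns = [row[1] for row in conn.execute("PRAGMA table_info(ovn_hash_ring)")]
print(columns)
```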
This table will be used to form the Consistent Hash Ring. Fortunately, the OpenStack tooz library already provides an implementation. It was contributed by the Ironic team, which also uses this data structure to spread the API request load across multiple Ironic Conductors.
Here’s how a Consistent Hash Ring from tooz works:
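```python
from tooz import hashring

# Build a ring from the IDs of the available workers.
hring = hashring.HashRing({'worker1', 'worker2', 'worker3'})

# Indexing the ring with a key returns the set of nodes responsible for it.
# Returns set(['worker3'])
hring[b'event-id-1']

# Returns set(['worker1'])
hring[b'event-id-2']
```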
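To make the idea behind the data structure more concrete, here is a toy consistent hash ring written with only the standard library. It is not the tooz implementation: the class name TinyHashRing, the use of SHA-256 and the number of points per node are all assumptions made for illustration. The demo shows the property that matters for this proposal: when a node leaves the ring, only the keys it owned are reassigned.

```python
import bisect
import hashlib


class TinyHashRing:
    """Toy consistent hash ring for illustration; NOT the tooz code."""

    def __init__(self, nodes, points_per_node=32):
        # Place every node on the ring at several pseudo-random points.
        self._ring = {}
        for node in nodes:
            for i in range(points_per_node):
                self._ring[self._hash(f"{node}-{i}".encode())] = node
        self._points = sorted(self._ring)

    @staticmethod
    def _hash(data):
        return int(hashlib.sha256(data).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring when needed.
        idx = bisect.bisect_left(self._points, self._hash(key))
        return self._ring[self._points[idx % len(self._points)]]


keys = [f"event-id-{i}".encode() for i in range(1000)]

ring = TinyHashRing(["worker1", "worker2", "worker3"])
before = {k: ring.get_node(k) for k in keys}

# worker3 goes away: rebuild the ring without it.
smaller = TinyHashRing(["worker1", "worker2"])
after = {k: smaller.get_node(k) for k in keys}

# Only the keys that worker3 used to own change owner.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```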
How OVSDB Monitor will use the Ring
Every instance of the OVSDB Monitor class will listen to a series of events from the OVSDB database. Each instance will have a unique ID registered in the Neutron database, and that ID will be part of the Consistent Hash Ring.
When an event arrives, each OVSDB Monitor instance will hash the event UUID and the ring will return one instance ID. Each instance then compares that ID with its own, and only the instance whose ID matches processes the event.
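The ownership check described above might look roughly like the sketch below. The function name should_handle and the FakeRing class are hypothetical. FakeRing only stands in for tooz's HashRing so that the example runs without extra dependencies, and the sketch assumes the ring is indexed with a bytes key and returns a set of node IDs, as in the tooz example earlier.

```python
def should_handle(ring, my_node_id, event_uuid):
    """Return True if this worker owns the event according to the ring.

    `ring` is anything that maps a bytes key to a set of node IDs.
    """
    return my_node_id in ring[event_uuid.encode("utf-8")]


class FakeRing:
    """Stand-in for a real hash ring, used only to run this demo."""

    def __init__(self, owners):
        self._owners = owners

    def __getitem__(self, key):
        return {self._owners[key]}


ring = FakeRing({b"event-a": "worker1", b"event-b": "worker2"})
workers = ["worker1", "worker2", "worker3"]

# Every worker sees every event, but only the owner acts on it.
handled_by = {
    event: [w for w in workers if should_handle(ring, w, event)]
    for event in ("event-a", "event-b")
}
print(handled_by)
```

Because every worker evaluates the same deterministic function over the same ring, no coordination between workers is needed to decide who processes an event.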
Verifying status of OVSDB Monitor instance
A new maintenance task will be created in networking-ovn which will update the updated_at column from the ovn_hash_ring table for the entries matching its hostname, indicating that all Neutron Workers running on that hostname are alive.
Note that only a single maintenance instance runs on each machine, which keeps the number of writes to the Neutron database low.
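A heartbeat of this kind boils down to a single UPDATE filtered by hostname. The sketch below uses sqlite3 and a hypothetical helper named touch_nodes purely for illustration. The actual task would go through Neutron's database layer, and how often it runs is left open here.

```python
import sqlite3
from datetime import datetime, timezone


def touch_nodes(conn, hostname, now):
    """Refresh updated_at for every node registered under `hostname`."""
    cur = conn.execute(
        "UPDATE ovn_hash_ring SET updated_at = ? WHERE hostname = ?",
        (now.isoformat(), hostname),
    )
    conn.commit()
    return cur.rowcount  # number of nodes that were refreshed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ovn_hash_ring ("
    "node_uuid TEXT PRIMARY KEY, hostname TEXT, "
    "created_at TEXT, updated_at TEXT)"
)
old = datetime(2020, 1, 1, tzinfo=timezone.utc).isoformat()
conn.executemany(
    "INSERT INTO ovn_hash_ring VALUES (?, ?, ?, ?)",
    [
        ("uuid-1", "host-a", old, old),
        ("uuid-2", "host-a", old, old),
        ("uuid-3", "host-b", old, old),
    ],
)

# One write refreshes both workers running on host-a.
now = datetime(2020, 1, 1, 0, 1, tzinfo=timezone.utc)
touched = touch_nodes(conn, "host-a", now)
print(touched)
```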
When forming the ring, the code should only consider entries whose updated_at value falls within a given timeout. Entries that haven’t been updated within that time won’t be part of the ring.
If the ring already exists it will be re-balanced.
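The liveness filter can be sketched as follows. The helper name live_nodes and the 60 second timeout are assumptions chosen for the example. The resulting set of node UUIDs is what would be handed to the hash ring constructor.

```python
from datetime import datetime, timedelta, timezone


def live_nodes(rows, now, timeout):
    """Return the node UUIDs whose heartbeat is within `timeout` of `now`.

    `rows` is an iterable of (node_uuid, updated_at) pairs.
    """
    return {uuid for uuid, updated_at in rows if now - updated_at <= timeout}


now = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
rows = [
    ("uuid-1", now - timedelta(seconds=10)),   # fresh heartbeat
    ("uuid-2", now - timedelta(seconds=50)),   # still within the timeout
    ("uuid-3", now - timedelta(seconds=300)),  # stale, left out of the ring
]

alive = live_nodes(rows, now, timeout=timedelta(seconds=60))
print(sorted(alive))
```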
Clean up and minimizing downtime window
Apart from heartbeating, we need to make sure that we remove the Nodes from the ring when the service is stopped or killed.
When the neutron-server service is stopped, all Nodes sharing the same hostname as the machine where the service is running will be removed from the ovn_hash_ring table. This is done by handling the SIGTERM signal. When this signal arrives, networking-ovn should invoke the clean up method and then let the process halt.
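In plain Python, the SIGTERM handling could be sketched like this. The names install_cleanup_handler and cleanup are hypothetical, and the sketch registers the handler directly with the signal module. How the hook would actually be wired into neutron-server's service framework is not covered here.

```python
import signal
import sys


def install_cleanup_handler(cleanup):
    """Run `cleanup` when SIGTERM arrives, then let the process exit."""

    def _handler(signum, frame):
        cleanup()    # e.g. delete this host's rows from ovn_hash_ring
        sys.exit(0)  # then halt as usual

    signal.signal(signal.SIGTERM, _handler)
    return _handler


# Demo: record the clean up instead of touching a database.
removed = []
handler = install_cleanup_handler(lambda: removed.append("host-a"))
```

SIGKILL cannot be caught this way, which is why the heartbeat timeout is still needed as a fallback.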
Unfortunately, nothing can be done in the case of a SIGKILL. The nodes will be left in the database and will remain part of the ring until the timeout is reached or the service is restarted. This introduces a window of time during which some events can be lost. The current implementation shares the same problem: if the instance holding the OVSDB lock is killed abruptly, events will be lost until the lock moves on to the next live instance. One could argue that the current implementation makes the problem worse, because all events are lost during that window, whereas with the distributed mechanism only some events are lost. As far as distributed systems go, this is a normal scenario and things are soon corrected.
Ideas for future improvements
This section contains some ideas that can be added on top of this work to further improve it:
- Listen to changes to the Chassis table in the OVSDB and force a ring re-balance when a Chassis is added to or removed from it.
- Cache the ring for a short while to minimize the database reads when the service is under heavy load.
- To further minimize or avoid event losses, it would be possible to cache the last X events so they can be reprocessed in case a node times out and the ring re-balances.
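As an illustration of the caching idea, the sketch below wraps a ring-building callable in a small time-based cache. The class name CachedRing, the build_ring callable and the 5 second lifetime are all hypothetical. The demo uses a fake clock so that it runs instantly.

```python
import time


class CachedRing:
    """Rebuild the ring at most once every `ttl_seconds`."""

    def __init__(self, build_ring, ttl_seconds, clock=time.monotonic):
        self._build_ring = build_ring  # callable that reads the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._ring = None
        self._built_at = None

    def get(self):
        now = self._clock()
        if self._ring is None or now - self._built_at >= self._ttl:
            self._ring = self._build_ring()
            self._built_at = now
        return self._ring


# Demo with a fake clock and a builder that records when it was called.
fake_now = [0.0]
builds = []


def build():
    builds.append(fake_now[0])
    return {"worker1", "worker2"}


cache = CachedRing(build, ttl_seconds=5, clock=lambda: fake_now[0])
cache.get()
cache.get()          # served from the cache, no second build
fake_now[0] = 6.0    # the cached ring has expired
cache.get()          # triggers a rebuild
print(builds)
```

The trade-off is that a cached ring can lag behind membership changes by up to the cache lifetime, so the lifetime should stay well below the heartbeat timeout.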