https://blueprints.launchpad.net/octavia/+spec/activepassiveamphora
This blueprint describes how Octavia implements its Active/Standby solution. It describes the high-level topology and the proposed code changes, relative to the currently supported Single topology, needed to realize the high availability loadbalancer scenario.
A tenant should be able to start high availability loadbalancer(s) for the tenant’s backend services as follows:
asciiflow:
+--------+
| Tenant |
|Service |
| (1) |
+--------+ +-----------+
| +--------+ +----+ Master +----+
| | Tenant | |VIP | Amphora |IP1 |
| |Service | +--+-+-----+-----+-+--+
| | (M) | | |MGMT |VRRP | |
| +--------+ | | IP | IP1 | |
| | Tenant | +--+--++----+ |
| | Network | | | | +-----------------+ Floating +---------+
v-v-------------^----+---v-^----v-^-+ Router | IP | |
^---------------+----v-^---+------+-+Floating <-> VIP <----------+ Internet|
| Management | | | | | | | |
| (MGMT) | | | | +-----------------+ +---------+
| Network | +--+--++----+ |
| Paired |MGMT |VRRP | |
| | | IP | IP2 | |
+-----------+ | +-----+-----+ |
| Octavia | ++---+ Backup +-+--+
|Controller | |VIP | Amphora |IP2 |
| (s) | +----+-----------+----+
+-----------+
The Active/Standby loadbalancers require the following high level changes:
Heartbeats are an alternative to VRRP and are also widely adopted. However, heartbeats are better suited to redundant file servers, filesystems, and databases than to network services such as routers, firewalls, and loadbalancers. Willy Tarreau, the creator of HAProxy, gives a detailed view of the major differences between heartbeats and VRRP in [5].
The data model of the Octavia database shall be impacted as follows:
** Changes to amphora API: see [11] **
PUT /listeners/{amphora_id}/{listener_id}/haproxy
PUT /vrrp/upload
PUT /vrrp/{action}
GET /interface/{ip_addr}
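The four amphora API routes listed above can be captured in a small lookup table. The following is a minimal sketch, not the amphora driver's actual code: the route names, the `render_route` helper, and the example action value `start` are assumptions made for illustration; only the HTTP methods and path templates come from this spec.

```python
from urllib.parse import quote

# Methods and path templates are taken from the list above.
# The dictionary keys are hypothetical names chosen for this sketch.
AMPHORA_ROUTES = {
    "upload_haproxy_config": ("PUT", "/listeners/{amphora_id}/{listener_id}/haproxy"),
    "upload_vrrp_config": ("PUT", "/vrrp/upload"),
    "vrrp_action": ("PUT", "/vrrp/{action}"),
    "get_interface": ("GET", "/interface/{ip_addr}"),
}


def render_route(name, **params):
    """Return (method, path) for a named route, URL-quoting each parameter."""
    method, template = AMPHORA_ROUTES[name]
    quoted = {key: quote(str(value), safe="") for key, value in params.items()}
    return method, template.format(**quoted)


if __name__ == "__main__":
    # "start" is an assumed action value; the spec does not enumerate actions.
    print(render_route("vrrp_action", action="start"))
    print(render_route("get_interface", ip_addr="10.0.0.5"))
```

Keeping the templates in one table makes it easy to compare a driver implementation against the routes defined in [11].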
** Changes to operator API: see [10] **
POST /loadbalancers
* Successful Status Code - 202
* JSON Request Body Attributes
** vip - another JSON object with one required attribute from the following
*** net_port_id - uuid
*** subnet_id - uuid
*** floating_ip_id - uuid
*** floating_ip_network_id - uuid
** tenant_id - string - optional - default “0” * 36 (for now)
** name - string - optional - default null
** description - string - optional - default null
** enabled - boolean - optional - default true
* JSON Response Body Attributes
** id - uuid
** vip - another JSON object
*** net_port_id - uuid
*** subnet_id - uuid
*** floating_ip_id - uuid
*** floating_ip_network_id - uuid
** tenant_id - string
** name - string
** description - string
** enabled - boolean
** provisioning_status - string enum - (ACTIVE, PENDING_CREATE, PENDING_UPDATE, PENDING_DELETE, DELETED, ERROR)
** operating_status - string enum - (ONLINE, OFFLINE, DEGRADED, ERROR)
** topology - string enum - (SINGLE, ACTIVE_STANDBY)
PUT /loadbalancers/{lb_id}
* Successful Status Code - 202
* JSON Request Body Attributes
** name - string
** description - string
** enabled - boolean
* JSON Response Body Attributes
** id - uuid
** vip - another JSON object
*** net_port_id - uuid
*** subnet_id - uuid
*** floating_ip_id - uuid
*** floating_ip_network_id - uuid
** tenant_id - string
** name - string
** description - string
** enabled - boolean
** provisioning_status - string enum - (ACTIVE, PENDING_CREATE, PENDING_UPDATE, PENDING_DELETE, DELETED, ERROR)
** operating_status - string enum - (ONLINE, OFFLINE, DEGRADED, ERROR)
** topology - string enum - (SINGLE, ACTIVE_STANDBY)
GET /loadbalancers/{lb_id}
* Successful Status Code - 200
* JSON Response Body Attributes
** id - uuid
** vip - another JSON object
*** net_port_id - uuid
*** subnet_id - uuid
*** floating_ip_id - uuid
*** floating_ip_network_id - uuid
** tenant_id - string
** name - string
** description - string
** enabled - boolean
** provisioning_status - string enum - (ACTIVE, PENDING_CREATE, PENDING_UPDATE, PENDING_DELETE, DELETED, ERROR)
** operating_status - string enum - (ONLINE, OFFLINE, DEGRADED, ERROR)
** topology - string enum - (SINGLE, ACTIVE_STANDBY)
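As an illustration of the POST /loadbalancers request body described above, the sketch below builds a body with the documented defaults and checks the vip object. The function name `build_create_request` is hypothetical, and reading "one required attribute" as "exactly one" is an assumption of this sketch; it only constructs the JSON body and does not contact any API.

```python
import json

# Attribute names come from the request body description above.
VIP_KEYS = ("net_port_id", "subnet_id", "floating_ip_id", "floating_ip_network_id")

# Enum values of the new response attribute, as listed above.
TOPOLOGIES = ("SINGLE", "ACTIVE_STANDBY")


def build_create_request(vip, tenant_id="0" * 36, name=None,
                         description=None, enabled=True):
    """Build a POST /loadbalancers body (hypothetical helper).

    Assumption: the vip object must carry exactly one of VIP_KEYS.
    """
    unknown = set(vip) - set(VIP_KEYS)
    supplied = [key for key in VIP_KEYS if vip.get(key) is not None]
    if unknown or len(supplied) != 1:
        raise ValueError("vip needs exactly one of: " + ", ".join(VIP_KEYS))
    return {
        "vip": {supplied[0]: vip[supplied[0]]},
        "tenant_id": tenant_id,
        "name": name,
        "description": description,
        "enabled": enabled,
    }


if __name__ == "__main__":
    # The UUID below is a made-up placeholder value.
    body = build_create_request(
        {"subnet_id": "11111111-2222-3333-4444-555555555555"}, name="lb1")
    print(json.dumps(body, indent=2))
```

Note that `topology` appears only in the response bodies listed above, so the sketch does not place it in the request.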
None.
The Active/Standby topology can consume up to twice the resources (storage, network, compute) required by the Single topology. Nevertheless, only one amphora shall be active (i.e. serving end users) at any point in time. While the Master amphora is healthy, the Backup amphora shall remain idle; it takes over only when it stops receiving VRRP advertisements from the Master.
VRRP requires executing health checks in the amphorae at fine-grained intervals. The health checks shall be as lightweight as possible so that VRRP can execute all check scripts within a predefined interval. If the check scripts fail to complete within this interval, VRRP may become unstable and may incorrectly alternate the amphorae roles between MASTER and BACKUP.
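The timing constraint above can be illustrated with a small check runner that treats an overrun of the interval as a failure. This is a sketch under stated assumptions, not the check script shipped in the amphora image: `run_checks` and the budget value are hypothetical, and the runner only detects an overrun after a check returns; it does not interrupt a check that is still running.

```python
import time


def run_checks(checks, budget_seconds):
    """Run zero-argument checks in order within a time budget.

    Returns True only if every check passes and the total elapsed time
    stays within budget_seconds. A slow check is not interrupted; the
    overrun is noticed once that check returns.
    """
    deadline = time.monotonic() + budget_seconds
    for check in checks:
        if not check():
            return False  # a failed check reports the amphora unhealthy
        if time.monotonic() > deadline:
            return False  # too slow: treat as unhealthy rather than stall
    return True


if __name__ == "__main__":
    def haproxy_ok():
        # Placeholder for a cheap local probe; always passes in this sketch.
        return True

    print(run_checks([haproxy_ok], budget_seconds=1.0))
```

Keeping each probe cheap and local (for example, checking a process or a socket rather than making remote calls) is what keeps the total run time well under the interval.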
None.
Sherif Abdelwahab (abdelwas)
The Keepalived version deployed in the amphora image must be newer than 1.2.8 to support unicast VRRP mode.
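An image build or deployment step could guard this dependency with a simple version comparison. The sketch below assumes the version is already available as a dotted numeric string such as "1.2.13"; how that string is obtained from the image is outside its scope, and the function names are hypothetical.

```python
def parse_version(text):
    """Convert a dotted numeric version such as '1.2.13' to a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_unicast_vrrp(version, minimum_exclusive="1.2.8"):
    """Return True if version is strictly newer than minimum_exclusive.

    The strict comparison follows the requirement as stated above
    ("newer than 1.2.8").
    """
    return parse_version(version) > parse_version(minimum_exclusive)


if __name__ == "__main__":
    for candidate in ("1.2.7", "1.2.8", "1.2.13"):
        print(candidate, supports_unicast_vrrp(candidate))
```

Comparing integer tuples rather than raw strings avoids the ordering mistake where "1.2.13" sorts before "1.2.8" lexicographically.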
[1] Implementing High Availability Instances with Neutron using VRRP http://goo.gl/eP71g7
[2] RFC3768 Virtual Router Redundancy Protocol (VRRP)
[3] https://review.openstack.org/#/c/38230/
[4] http://www.keepalived.org/LVS-NAT-Keepalived-HOWTO.html
[5] http://www.formilux.org/archives/haproxy/1003/3259.html
[6] https://blueprints.launchpad.net/octavia/+spec/base-image
[7] https://blueprints.launchpad.net/octavia/+spec/controller-worker
[8] https://blueprints.launchpad.net/octavia/+spec/amphora-driver-interface
[9] https://blueprints.launchpad.net/octavia/+spec/controller
[10] https://blueprints.launchpad.net/octavia/+spec/operator-api
[11] doc/main/api/haproxy-amphora-api.rst