Deployment

Congress has two deployment modes: single-process and multi-process. If you are interested in test-driving Congress or are not concerned about high availability, the single-process deployment is best because it is the easiest to set up. If you are interested in making Congress highly available, you want the multi-process deployment.

In the single-process version, you run Congress as a single operating-system process on one node (i.e., a container, VM, or physical machine).

In the multi-process version, you start with the three components of Congress (the API, the policy engine, and the datasource drivers). You choose how many copies of each component to run, how to distribute those components across processes, and how to distribute those processes across nodes.

The Configuration Options section describes the configuration options common to both single-process and multi-process deployments. After that, HA Overview and HA Deployment describe how to set up the multi-process deployment.

Configuration Options

In this section we highlight the configuration options that are specific to Congress. To generate a sample configuration file that lists all available options along with descriptions, run the following commands:

$ cd /path/to/congress
$ tox -egenconfig

The tox command creates the file etc/congress.conf.sample, which contains a comprehensive list of options. All options have default values, which means that Congress will run even if you specify no options.
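
For example, once a configuration file is in place, a single-process Congress server can be started directly; the install path below matches the deployment commands used later in this guide:

$ python /usr/local/bin/congress-server --config-file /etc/congress/congress.conf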

The options most important to Congress are described below, all of which appear under the [DEFAULT] section of the configuration file.

drivers
The list of permitted datasource drivers, given as a comma-separated list of Python class paths. Default is the empty list. For example: drivers = congress.datasources.neutronv2_driver.NeutronV2Driver,congress.datasources.glancev2_driver.GlanceV2Driver
datasource_sync_period
The number of seconds to wait between synchronizing datasource config from the database. Default is 0.
enable_execute_action
Whether or not Congress will execute actions. If false, Congress will never execute any actions to do manual reactive enforcement, even if there are policy statements that say actions should be executed and the conditions of those actions become true. Default is True.
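
For example, a minimal [DEFAULT] section setting all three options might look as follows (the sync period value here is illustrative):

[DEFAULT]
drivers = congress.datasources.neutronv2_driver.NeutronV2Driver,congress.datasources.glancev2_driver.GlanceV2Driver
datasource_sync_period = 60
enable_execute_action = True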

One of Congress’s new experimental features is distributing its services across multiple processes and even hosts. Here are the options for using that feature.

bus_id
Unique ID of the DSE bus. Can be any string. Defaults to ‘bus’. The ID should be the same across all processes of a single Congress instance and unique across different Congress instances sharing the same RabbitMQ cluster. It can be ignored if only one Congress instance is deployed as a single process. Appears in the [dse] section.
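
For example, if two independent Congress instances share one RabbitMQ cluster, each instance sets its own bus ID in every one of its processes (the value is illustrative):

[dse]
bus_id = congress-instance-1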

Here are the most often-used, but standard OpenStack options. These are specified in the [DEFAULT] section of the configuration file.

auth_strategy
Method for authenticating Congress users. Can be set to either ‘keystone’, meaning that the user must provide Keystone credentials, or ‘noauth’, meaning that no authentication is required. Default is ‘keystone’.
verbose
Controls whether INFO-level logging is enabled. If false, the logging level is set to WARNING. Default is true. Deprecated.
debug
Whether or not the DEBUG-level of logging is enabled. Default is false.
transport_url
URL of the shared messaging service. It is not needed in a single-process Congress deployment, but must be specified in a multi-process Congress deployment. For example:
[DEFAULT]
transport_url = rabbit://<rabbit-userid>:<rabbit-password>@<rabbit-host-address>:<port>

HA Overview

Some applications require Congress to be highly available. Some applications require a Congress Policy Engine (PE) to handle a high volume of queries. This guide describes Congress support for a High Availability (HA), High Throughput (HT) deployment.

Please see the OpenStack High Availability Guide for details on how to install and configure OpenStack for High Availability.

HA Types

Warm Standby

Warm Standby is when a software component is installed and available on the secondary node. The secondary node is up and running. In the case of a failure on the primary node, the software component is started on the secondary node. This process is usually automated using a cluster manager. Data is regularly mirrored to the secondary system using disk-based replication or shared disk. This generally provides a recovery time of a few minutes.

Active-Active (Load-Balanced)

In this method, both the primary and secondary systems are active and processing requests in parallel. Data replication happens through software capabilities and is bi-directional. This generally provides an instantaneous recovery time.

Congress HAHT

Congress provides Active-Active for the Policy Engine and Warm Standby for the Datasource Drivers.

Run N instances of the Congress Policy Engine in active-active configuration, so both the primary and secondary systems are active and processing requests in parallel.

One Datasource Driver (DSD) per physical datasource, publishing data on oslo-messaging to all policy engines.

+-------------------------------------+      +--------------+
|       Load Balancer (eg. HAProxy)   | <----+ Push client  |
+----+-------------+-------------+----+      +--------------+
     |             |             |
PE   |        PE   |        PE   |        all+DSDs node
+---------+   +---------+   +---------+   +-----------------+
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | API | |   | | API | |   | | API | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
| | PE  | |   | | PE  | |   | | PE  | |   | | DSD | | DSD | |
| +-----+ |   | +-----+ |   | +-----+ |   | +-----+ +-----+ |
+---------+   +---------+   +---------+   +--------+--------+
     |             |             |                 |
     |             |             |                 |
     +--+----------+-------------+--------+--------+
        |                                 |
        |                                 |
+-------+----+   +------------------------+-----------------+
|  Oslo Msg  |   | DBs (policy, config, push data, exec log)|
+------------+   +------------------------------------------+

Details

  • Datasource Drivers (DSDs):
    • One datasource driver per physical datasource
    • All DSDs run in a single DSE node (process)
    • Push DSDs: optionally persist data in push data DB, so a new snapshot can be obtained whenever needed
    • Warm Standby:
      • Only one set of DSDs running at a given time; backup instances ready to launch
      • For pull DSDs, warm standby is most appropriate because warm startup time is low (seconds) relative to frequency of data pulls
      • For push DSDs, warm standby is generally sufficient except for use cases that demand sub-second latency even during a failover
  • Policy Engine (PE):
    • Replicate policy engine in active-active configuration.
    • Policy synchronized across PE instances via Policy DB
    • Every instance subscribes to the same data on oslo-messaging
    • Reactive Enforcement: All PE instances initiate reactive policy actions, but each DSD locally selects a leader to listen to. The DSD ignores execution requests initiated by all other PE instances.
      • Every PE instance computes the required reactive enforcement actions and initiates the corresponding execution requests over oslo-messaging
      • Each DSD locally picks a PE instance as leader (say the first instance the DSD hears from in the asymmetric node deployment, or the PE instance on the same node as the DSD in a symmetric node deployment) and executes only requests from that PE
      • If heartbeat contact is lost with the leader, the DSD selects a new leader
      • Each PE instance is unaware of whether it is a leader
    • Node Configurations:
      • Congress supports the two node-types configuration (API+PE nodes plus one all-DSDs node) because it gives reasonable support for high-load DSDs while keeping deployment complexity low.
    • Local Leader for Action Execution:
      • Local Leader: every PE instance sends action-execution requests, but each receiving DSD locally picks a “leader” to listen to
      • Because there is a single active DSD for a given data source, it is a natural spot to locally choose a “leader” among the PE instances sending reactive enforcement action execution requests. Congress supports the local leader style because it avoids the deployment complexities associated with global leader election. Furthermore, because all PE instances perform reactive enforcement and send action execution requests, the redundancy opens up the possibility for zero disruption to reactive enforcement when a PE instance fails.
  • API:
    • Each node has an active API service
    • Each API service routes requests for the PE to its associated intranode PE
    • Requests for any other service (e.g., get datasource status) are routed to the Datasource and/or Policy Engine service and are fielded by some active instance of that service on some node
  • Load balancer:
    • Layer 7 load balancer (e.g. HAProxy) distributes incoming API calls among the nodes (each running an API service).
    • Load balancer optionally configured to use sticky sessions to pin each API caller to a particular node. This configuration avoids the experience of “going back in time” when consecutive requests are served by out-of-sync PE instances.
  • External components (load balancer, DBs, and oslo messaging bus) can be made highly available using standard solutions (e.g. clustered LB, HA rabbitMQ)

Performance Impact

  • Increased latency due to network communication required by multi-node deployment
  • Increased reactive enforcement latency if action executions are persistently logged to facilitate smoother failover
  • PE replication can achieve greater query throughput

Cautions and Limitations

  • Replicated PE deployment is new in the Newton release and a major departure from the previous model. As a result, the deployer may be more likely to experience unexpected issues.
  • In the Newton release, creating a new policy requires locking a database table. As a result, it should not be deployed with a database backend that does not support table locking (e.g., Galera). The limitation is expected to be removed in the Ocata release.
  • Different PE instances may be out-of-sync in their data and policies (eventual consistency). The issue is generally made transparent to the end user by configuring the load balancer to make each user sticky to a particular PE instance. But if a user reaches a different PE instance (say because of load balancer configuration or because the original instance went down), the user may experience out-of-sync artifacts.

HA Deployment

Overview

This section shows how to deploy Congress with High Availability (HA). For an architectural overview, please see the HA Overview.

An HA deployment of Congress involves five main steps.

  1. Deploy messaging and database infrastructure to be shared by all the Congress nodes.
  2. Prepare the hosts to run Congress nodes.
  3. Deploy N (at least 2) policy-engine nodes.
  4. Deploy one datasource-drivers node.
  5. Deploy a load-balancer to load-balance between the N policy-engine nodes.

The following sections describe each step in more detail.

Shared Services

All the Congress nodes share a database backend. To set up a database backend for Congress, please follow the database portion of the separate install instructions.

Various solutions exist to avoid creating a single point of failure with the database backend.

Note: If a replicated database solution is used, it must support table locking. Galera, for example, would not work. This limitation is expected to be removed in the Ocata release.

A shared messaging service is also required. Refer to Shared Messaging for instructions on installing and configuring RabbitMQ.
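
As a minimal sketch, a dedicated RabbitMQ account for Congress can be created with the standard rabbitmqctl commands; the user name and password below are placeholders:

$ rabbitmqctl add_user congress <rabbit-password>
$ rabbitmqctl set_permissions congress ".*" ".*" ".*"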

Hosts Preparation

Congress should be installed on each host expected to run a Congress node. Please follow the directions in the separate install instructions to install Congress on each host, skipping the local database portion.

In the configuration file, a transport_url should be specified to use the RabbitMQ messaging service configured in step 1.

For example:

[DEFAULT]
transport_url = rabbit://<rabbit-userid>:<rabbit-password>@<rabbit-host-address>:5672

All hosts should be configured with a database connection that points to the shared database deployed in step 1, not the local address shown in separate install instructions.

For example:

[database]
connection = mysql+pymysql://root:<database-password>@<shared-database-ip-address>/congress?charset=utf8
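
For a fresh deployment, the shared database can be created once and its schema populated from any prepared host. A sketch assuming a MySQL backend and the congress-db-manage utility shipped with Congress:

$ mysql -u root -p -e "CREATE DATABASE congress;"
$ congress-db-manage --config-file /etc/congress/congress.conf upgrade head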

Datasource Drivers Node

In this step, we deploy a single datasource-drivers node in warm-standby style.

The datasource-drivers node can be started directly with the following command:

$ python /usr/local/bin/congress-server --datasources --node-id=<unique_node_id>

A unique node-id (distinct from those of all the policy-engine nodes) must be specified.

For warm-standby deployment, an external manager is used to launch and manage the datasource-drivers node. In this document, we sketch how to deploy the datasource-drivers node with Pacemaker.

See the OpenStack High Availability Guide for general usage of Pacemaker and how to deploy Pacemaker cluster stack. The guide also has some HA configuration guidance for other OpenStack projects.

Prepare OCF resource agent

You need a custom Resource Agent (RA) for DataSource Node HA. The custom RA is located in the Congress repository at /path/to/congress/script/ocf/congress-datasource. Install the RA with the following steps:

$ cd /usr/lib/ocf/resource.d
$ mkdir openstack
$ cd openstack
$ cp /path/to/congress/script/ocf/congress-datasource ./congress-datasource
$ chmod a+rx congress-datasource

Configuring the Resource Agent

You can now add the Pacemaker configuration for the Congress DataSource Node resource. Connect to the Pacemaker cluster with the crm configure command and add the following cluster resource. After adding the resource, make sure to commit the change.

primitive ds-node ocf:openstack:congress-datasource \
   params config="/etc/congress/congress.conf" \
   node_id="datasource-node" \
   op monitor interval="30s" timeout="30s"

Make sure that all nodes in the cluster have the same config file, with the same name and path, since the DataSource Node resource, ds-node, uses the config file defined by the config parameter to launch the resource.
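
A sketch of that session, assuming the crm shell is available on a cluster node (the config path matches the example above):

$ crm configure
crm(live)configure# primitive ds-node ocf:openstack:congress-datasource \
   params config="/etc/congress/congress.conf" \
   node_id="datasource-node" \
   op monitor interval="30s" timeout="30s"
crm(live)configure# commit
crm(live)configure# quit

After the commit, crm status should show the ds-node resource started on one of the cluster nodes.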

The RA has the following configurable parameters:

  • config: the path of Congress’s config file
  • node_id (optional): the node id of the datasource node. Default is “datasource-node”.
  • binary (optional): the path of the Congress binary. Default is “/usr/local/bin/congress-server”.
  • additional_parameters (optional): additional parameters to pass to congress-server

Policy Engine Nodes

In this step, we deploy N (at least 2) policy-engine nodes, each with an associated API server. This step should be done only after the Datasource Drivers Node is deployed. Each node can be started as follows:

$ python /usr/local/bin/congress-server --api --policy-engine --node-id=<unique_node_id>

Each node must have a unique node-id specified as a command-line option.

For high availability, each node is usually deployed on a different host. If multiple nodes are to be deployed on the same host, each node must have a different port specified using the bind_port configuration option in the Congress configuration file.
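
For example, a second node on the same host could use its own configuration file that overrides the default API port (1789); the file name, port, and node id below are placeholders:

[DEFAULT]
bind_port = 1790

$ python /usr/local/bin/congress-server --api --policy-engine --node-id=pe-node-2 --config-file /etc/congress/congress-node2.conf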

Load-balancer

A load-balancer should be used to distribute incoming API requests to the N policy-engine (and API service) nodes deployed in step 3. It is recommended that a sticky configuration be used to avoid exposing a user to out-of-sync artifacts when the user hits different policy-engine nodes.

HAProxy is a popular load-balancer for this purpose. The HAProxy section of the OpenStack High Availability Guide has instructions for deploying HAProxy for high availability.
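
As a sketch only, the following HAProxy configuration fragment balances three policy-engine nodes with cookie-based sticky sessions; the server addresses are placeholders, and the port assumes Congress’s default API port of 1789:

frontend congress-api
    bind *:1789
    mode http
    default_backend congress-nodes

backend congress-nodes
    mode http
    balance roundrobin
    cookie CONGRESS_NODE insert indirect nocache
    server node1 <node1-ip>:1789 check cookie node1
    server node2 <node2-ip>:1789 check cookie node2
    server node3 <node3-ip>:1789 check cookie node3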