display name: workload_stabilization
goal: workload_balancing
Workload Stabilization control using live migration
This is a workload stabilization strategy based on a standard deviation algorithm. Its goal is to determine whether a cluster is overloaded and, if so, to respond by migrating VMs to stabilize the cluster.
It assumes that live migrations are possible in your cluster.
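The idea of detecting overload through standard deviation can be sketched as follows. This is a simplified illustration, not Watcher's actual implementation: the function names and the sample loads are hypothetical, and host loads are assumed to be already normalized to the range 0 to 1. A metric is treated as unbalanced when the spread of its load across hosts exceeds the configured threshold, and the per-metric deviations are combined using the `_weight` entries.

```python
import statistics


def metric_sd(host_loads):
    """Population standard deviation of one metric across all hosts."""
    return statistics.pstdev(host_loads)


def overloaded_metrics(loads_by_metric, thresholds):
    """Return the metrics whose spread across hosts exceeds their threshold."""
    return [
        metric
        for metric, loads in loads_by_metric.items()
        if metric_sd(loads) > thresholds[metric]
    ]


def weighted_sd(loads_by_metric, weights):
    """Combine per-metric deviations using the '<metric>_weight' entries."""
    return sum(
        metric_sd(loads) * weights[metric + "_weight"]
        for metric, loads in loads_by_metric.items()
    )


# Hypothetical normalized loads for three hosts (assumed values).
loads = {
    "cpu_util": [0.9, 0.1, 0.2],
    "memory.resident": [0.5, 0.5, 0.5],
}
thresholds = {"cpu_util": 0.2, "memory.resident": 0.2}
weights = {"cpu_util_weight": 1.0, "memory.resident_weight": 1.0}

print(overloaded_metrics(loads, thresholds))
print(weighted_sd(loads, weights))
```

In this sketch only cpu_util is flagged, because memory usage is identical on every host and so has zero deviation.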
The workload_stabilization strategy requires the following metrics:
metric | service name | plugins | comment |
---|---|---|---|
compute.node.cpu.percent | ceilometer | none | |
hardware.memory.used | ceilometer | SNMP | |
cpu_util | ceilometer | none | |
memory.resident | ceilometer | none | |
Default Watcher’s Compute cluster data model:
Nova cluster data model collector
The Nova cluster data model collector creates an in-memory representation of the resources exposed by the compute service.
Default Watcher’s actions:
action description migration
Migrates a server to a destination nova-compute host
This action allows you to migrate a server to another destination compute host. Migration type ‘live’ can only be used for migrating active VMs. Migration type ‘cold’ can be used for migrating non-active VMs as well as active VMs, which will be shut down while migrating.
The action schema is:
schema = Schema({
    'resource_id': str,  # should be a UUID
    'migration_type': str,  # choices -> "live", "cold"
    'destination_node': str,
    'source_node': str,
})
The resource_id is the UUID of the server to migrate. The source_node and destination_node parameters are, respectively, the source and destination compute hostnames. The list of available compute hosts is returned by this command:
nova service-list --binary nova-compute
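To make the constraints noted in the schema comments concrete, here is a minimal sketch of a stricter check written in plain Python. It is not part of Watcher: `validate_migration_params` and the sample values are hypothetical, and it only enforces what the schema comments describe (all four fields are strings, resource_id parses as a UUID, migration_type is "live" or "cold").

```python
import uuid

MIGRATION_TYPES = ("live", "cold")
REQUIRED_KEYS = ("resource_id", "migration_type", "destination_node", "source_node")


def validate_migration_params(params):
    """Raise ValueError if params do not match the migration schema comments."""
    for key in REQUIRED_KEYS:
        if not isinstance(params.get(key), str):
            raise ValueError(f"{key} must be a string")
    # uuid.UUID raises ValueError when the string is not a well-formed UUID.
    uuid.UUID(params["resource_id"])
    if params["migration_type"] not in MIGRATION_TYPES:
        raise ValueError("migration_type must be 'live' or 'cold'")
    return params


# Hypothetical input values, for illustration only.
example = {
    "resource_id": "9f8e7d6c-5b4a-4c3d-8e2f-1a0b9c8d7e6f",
    "migration_type": "live",
    "source_node": "compute-1",
    "destination_node": "compute-2",
}
validate_migration_params(example)
```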
Strategy parameters are:
parameter | type | default value | description |
---|---|---|---|
metrics | array | ["cpu_util", "memory.resident"] | Metrics used as rates of cluster loads. |
thresholds | object | {"cpu_util": 0.2, "memory.resident": 0.2} | Dict where the key is a metric and the value is a trigger value. |
weights | object | {"cpu_util_weight": 1.0, "memory.resident_weight": 1.0} | These weights are used to calculate the common standard deviation. The name of each weight consists of the meter name and the _weight suffix. |
instance_metrics | object | {"cpu_util": "compute.node.cpu.percent", "memory.resident": "hardware.memory.used"} | Mapping to get hardware statistics using instance metrics. |
host_choice | string | retry | Method of host selection. There are cycle, retry and fullsearch methods. Cycle iterates over hosts in a cycle. Retry picks some hosts at random (count defined in the retry_count option). Fullsearch returns each host from the list. |
retry_count | number | 1 | Count of random returned hosts. |
periods | object | {"instance": 720, "node": 600} | These periods are used to get statistic aggregation for instance and host metrics. The period is simply a repeating interval of time into which the samples are grouped for aggregation. Watcher uses only the last period of all received ones. |
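The three host_choice methods can be illustrated with a short sketch. This is one possible reading of the descriptions in the table, not Watcher's actual code: `choose_hosts`, its `start` argument, and the host names are hypothetical, and "cycle" is interpreted here as walking the host list in order from a starting position and wrapping around.

```python
import random


def choose_hosts(hosts, host_choice="retry", retry_count=1, start=0):
    """Return candidate destination hosts according to host_choice."""
    hosts = list(hosts)
    if not hosts:
        return []
    if host_choice == "fullsearch":
        # Every host in the list is a candidate.
        return hosts
    if host_choice == "retry":
        # Pick retry_count hosts at random, without repetition.
        return random.sample(hosts, min(retry_count, len(hosts)))
    if host_choice == "cycle":
        # Walk the list in order from `start`, wrapping around (assumed reading).
        index = start % len(hosts)
        return hosts[index:] + hosts[:index]
    raise ValueError(f"unknown host_choice: {host_choice}")


# Hypothetical host names, for illustration only.
nodes = ["compute-1", "compute-2", "compute-3"]
print(choose_hosts(nodes, "fullsearch"))
print(choose_hosts(nodes, "retry", retry_count=2))
print(choose_hosts(nodes, "cycle", start=1))
```

With the default retry method and retry_count of 1, only a single randomly chosen host is considered per attempt, which is cheaper than fullsearch but may miss a better destination.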
{'value': 0, 'name': 'released_nodes_ratio', 'unit': '%', 'description': u'Ratio of released compute nodes divided by the total number of enabled compute nodes.'}
You can find a description of the overload algorithm and the role of standard deviation here: https://specs.openstack.org/openstack/watcher-specs/specs/newton/implemented/sd-strategy.html
$ openstack optimize audittemplate create \
at1 workload_balancing --strategy workload_stabilization
$ openstack optimize audit create -a at1 \
-p thresholds='{"memory.resident": 0.05}' \
-p metrics='["memory.resident"]'
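When strategy parameters are generated by a script, quoting the JSON values by hand is error-prone. The sketch below builds the same audit command as an argument list, assuming only the `-a` and `-p name=value` options shown above. The helper `audit_create_command` is hypothetical and not part of the Watcher client.

```python
import json
import shlex


def audit_create_command(template, **params):
    """Build the argument list for 'openstack optimize audit create'."""
    cmd = ["openstack", "optimize", "audit", "create", "-a", template]
    for name, value in params.items():
        # Each strategy parameter is passed as -p name=<JSON value>.
        cmd += ["-p", f"{name}={json.dumps(value)}"]
    return cmd


cmd = audit_create_command(
    "at1",
    thresholds={"memory.resident": 0.05},
    metrics=["memory.resident"],
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can also be passed directly to `subprocess.run(cmd)`, which avoids shell quoting altogether.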
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License.