System Architecture¶
High-Level Architecture¶
Each of Aodh’s services is designed to scale horizontally. Additional workers and nodes can be added depending on the expected load. Aodh provides daemons that evaluate alarms and send notifications based on defined alarming rules.
Evaluating the data¶
Alarming Service¶
The alarming component of Aodh, first delivered in the Ceilometer service during the Havana development cycle and then split out into this independent project during the Liberty development cycle, allows you to set alarms based on threshold evaluation for a collection of samples or on a dedicated event. An alarm can be set on a single meter or on a combination of meters. For example, you may want to trigger an alarm when memory consumption reaches 70% on a given instance, provided the instance has been up for more than 10 minutes. To set up an alarm, you call Aodh’s API server, specifying the alarm conditions and an action to take.
Of course, if you are not an administrator of the cloud itself, you can only set alarms on meters for your own components.
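As a rough illustration of threshold evaluation over a collection of samples, the sketch below checks whether the most recent samples all breach a limit. It is not Aodh's evaluator code: the `ThresholdRule` class, the `evaluate` function, and the simplified state handling are assumptions made for this example.

```python
from dataclasses import dataclass
import operator

# Illustrative mapping of operator names to comparison functions.
COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


@dataclass
class ThresholdRule:
    """Hypothetical rule: a limit, how to compare, and how many periods to check."""
    threshold: float
    comparison_operator: str = "ge"
    evaluation_periods: int = 1


def evaluate(rule, samples):
    """Return 'alarm', 'ok', or 'insufficient data' for the latest samples."""
    recent = samples[-rule.evaluation_periods:]
    if len(recent) < rule.evaluation_periods:
        return "insufficient data"
    compare = COMPARATORS[rule.comparison_operator]
    # Simplification: alarm only when every recent sample breaches the limit.
    breached = all(compare(value, rule.threshold) for value in recent)
    return "alarm" if breached else "ok"


# Memory consumption (percent) sampled over three periods.
memory_rule = ThresholdRule(threshold=70.0, comparison_operator="ge",
                            evaluation_periods=2)
state = evaluate(memory_rule, [55.0, 72.5, 81.0])
```

Requiring several consecutive breaching periods is one way to avoid firing on a single spike; a real evaluator may treat mixed results differently.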
Several forms of action are possible, but only a few have been implemented so far:
HTTP callback: you provide a URL to be called whenever the alarm has been set off. The payload of the request contains all the details of why the alarm was triggered.
log: mostly useful for debugging, stores alarms in a log file.
zaqar: sends a notification to the messaging service via the Zaqar API.
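For the HTTP callback action, the receiving endpoint has to decode the request payload. The following sketch shows one way a receiver might summarize such a payload. The field names used here (`alarm_name`, `previous`, `current`, `reason`) are assumptions for illustration and should be checked against the payload your deployment actually sends; `summarize_alarm_callback` is a hypothetical helper.

```python
import json


def summarize_alarm_callback(body: bytes) -> str:
    """Build a one-line summary from a JSON callback body.

    Field names are assumed, so every lookup falls back to a placeholder.
    """
    payload = json.loads(body)
    name = payload.get("alarm_name", "<unknown>")
    previous = payload.get("previous", "?")
    current = payload.get("current", "?")
    reason = payload.get("reason", "no reason given")
    return f"{name}: {previous} -> {current} ({reason})"


# Hypothetical request body, as a receiver might see it.
example_body = json.dumps({
    "alarm_name": "memory_high",
    "previous": "ok",
    "current": "alarm",
    "reason": "memory above 70%",
}).encode("utf-8")

summary = summarize_alarm_callback(example_body)
```

Using `.get` with defaults keeps the receiver tolerant of payloads that omit a field.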
Alarm Rules¶
composite¶
Composite alarm rule.
A simple dict type used to represent a composite rule.
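To make the idea of a dict-based composite rule concrete, here is a minimal sketch that combines sub-rule results with nested `and`/`or` keys. The nesting shape, the `name` key on leaf rules, and the `evaluate_composite` function are assumptions for illustration, not Aodh's actual schema or evaluator.

```python
def evaluate_composite(rule, leaf_states):
    """Recursively combine leaf results using 'and' / 'or' nodes.

    rule        -- nested dict such as {"or": [leaf, {"and": [leaf, leaf]}]}
    leaf_states -- maps a (hypothetical) leaf name to True when it is breached
    """
    if "and" in rule:
        return all(evaluate_composite(sub, leaf_states) for sub in rule["and"])
    if "or" in rule:
        return any(evaluate_composite(sub, leaf_states) for sub in rule["or"])
    # Leaf rule: look up its precomputed result by name.
    return leaf_states[rule["name"]]


composite = {
    "or": [
        {"name": "cpu_high"},
        {"and": [{"name": "mem_high"}, {"name": "disk_high"}]},
    ]
}

fires = evaluate_composite(
    composite, {"cpu_high": False, "mem_high": True, "disk_high": True}
)
```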
event¶
Alarm Event Rule.
Describes when to trigger the alarm based on an event.
gnocchi_aggregation_by_metrics_threshold¶
Base class Alarm Rule extension and wsme.types.
gnocchi_aggregation_by_resources_threshold¶
Base class Alarm Rule extension and wsme.types.
gnocchi_resources_threshold¶
Base class Alarm Rule extension and wsme.types.
loadbalancer_member_health¶
Base class Alarm Rule extension and wsme.types.
prometheus¶
Base class Alarm Rule extension and wsme.types.
Alarm Evaluators¶
composite¶
Base class for alarm rule evaluator plugins.
gnocchi_aggregation_by_metrics_threshold¶
Base class for alarm rule evaluator plugins.
gnocchi_aggregation_by_resources_threshold¶
Base class for alarm rule evaluator plugins.
gnocchi_resources_threshold¶
Base class for alarm rule evaluator plugins.
loadbalancer_member_health¶
Base class for alarm rule evaluator plugins.
prometheus¶
Base class for alarm rule evaluator plugins.
Alarm Notifiers¶
http¶
REST alarm notifier.
https¶
REST alarm notifier.
log¶
Log alarm notifier.
test¶
Test alarm notifier.
trust+heat¶
Heat autohealing notifier.
The auto-healing notifier works together with loadbalancer_member_health evaluator.
The assumed setup is that the end user defines a Heat template containing an autoscaling group whose members all join an Octavia load balancer to expose the service externally. When the stack scales up or down, Heat makes sure that new members join the load balancer automatically and that old members are removed.
This notifier, however, handles the case where a member fails: the stack can be recovered by marking the failed autoscaling group member unhealthy and then updating the Heat stack in place. To do that, the notifier needs to know:
Heat top/root stack ID.
Heat autoscaling group ID.
The failed Octavia pool members.
trust+http¶
Notifier supporting keystone trust authentication.
This alarm notifier is intended to be used to call an endpoint using keystone authentication. It uses the aodh service user to authenticate using the trust ID provided.
The URL must be in the form trust+http://host/action.
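The `trust+` prefix in such a URL can be separated from the underlying scheme with the standard library. The sketch below is only an illustration of that parsing step, not the notifier's implementation; `split_trust_url` and `TRUST_PREFIX` are hypothetical names.

```python
from urllib.parse import urlsplit

TRUST_PREFIX = "trust+"


def split_trust_url(url):
    """Return (inner_scheme, plain_url) for a 'trust+<scheme>://...' URL."""
    parts = urlsplit(url)
    if not parts.scheme.startswith(TRUST_PREFIX):
        raise ValueError(f"not a trust URL: {url!r}")
    inner_scheme = parts.scheme[len(TRUST_PREFIX):]
    # Rebuild the URL with the prefix removed from the scheme.
    plain_url = parts._replace(scheme=inner_scheme).geturl()
    return inner_scheme, plain_url


scheme, target = split_trust_url("trust+http://host/action")
```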
trust+https¶
Notifier supporting keystone trust authentication.
This alarm notifier is intended to be used to call an endpoint using keystone authentication. It uses the aodh service user to authenticate using the trust ID provided.
The URL must be in the form trust+http://host/action.
trust+zaqar¶
Zaqar notifier using a Keystone trust to post to user-defined queues.
The URL must be in the form trust+zaqar://?queue_name=example.
zaqar¶
Zaqar notifier.
This notifier posts alarm notifications either to a Zaqar subscription or to an existing Zaqar queue with a pre-signed URL.
To create a new subscription in the service project, use a notification URL of the form:
zaqar://?topic=example&subscriber=mailto%3A//test%40example.com&ttl=3600
Multiple subscribers are allowed. ttl is the time to live of the subscription. The queue will be created automatically, in the service project, with a name based on the topic and the alarm ID.
To use a pre-signed URL for an existing queue, use a notification URL with the scheme zaqar:// and the pre-signing data from Zaqar in the query string:
zaqar://?queue_name=example&project_id=foo&paths=/messages&methods=POST&expires=1970-01-01T00:00Z&signature=abcdefg
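Both kinds of Zaqar notification URL carry their parameters in the query string, so they can be inspected with the standard library. The following is a minimal sketch of that decoding, assuming the presence of `queue_name` is what distinguishes the two forms; `parse_zaqar_url` is a hypothetical helper, not part of Aodh.

```python
from urllib.parse import parse_qs, urlsplit


def parse_zaqar_url(url):
    """Return (mode, params) for a zaqar:// notification URL.

    params maps each query key to a list of decoded values, so repeated
    keys such as multiple subscribers are preserved.
    """
    parts = urlsplit(url)
    if parts.scheme != "zaqar":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    params = parse_qs(parts.query)
    # Assumption: a queue_name parameter indicates the pre-signed form.
    mode = "pre-signed queue" if "queue_name" in params else "subscription"
    return mode, params


sub_mode, sub_params = parse_zaqar_url(
    "zaqar://?topic=example&subscriber=mailto%3A//test%40example.com&ttl=3600"
)
signed_mode, signed_params = parse_zaqar_url(
    "zaqar://?queue_name=example&project_id=foo&paths=/messages"
    "&methods=POST&expires=1970-01-01T00:00Z&signature=abcdefg"
)
```

Note that `parse_qs` percent-decodes values, so the encoded subscriber address comes back in its readable form.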
Alarm Storage¶
log¶
Log the data.
mysql¶
Put the data into a SQLAlchemy database.
mysql+pymysql¶
Put the data into a SQLAlchemy database.
postgresql¶
Put the data into a SQLAlchemy database.
sqlite¶
Put the data into a SQLAlchemy database.