Collector¶
Data format¶
Internally, CloudKitty’s data format is a bit more detailed than what is described in the architecture documentation.
The internal data format is the following:
{
    "bananas": [
        {
            "vol": {
                "unit": "banana",
                "qty": 1
            },
            "rating": {
                "price": 1
            },
            "groupby": {
                "xxx_id": "hello",
                "yyy_id": "bye"
            },
            "metadata": {
                "flavor": "chocolate",
                "eaten_by": "gorilla"
            }
        }
    ]
}
However, developers implementing a collector don’t need to format the data themselves, as there are helper functions for these matters.
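To make the mapping concrete, here is a minimal sketch of how a list of data points could be turned into the internal format shown above. SketchPoint and to_internal_format are hypothetical names written for this illustration only. They are not CloudKitty's helper functions, which should be preferred in a real collector.

```python
import collections

# Hypothetical stand-in for a collected data point (not a CloudKitty class).
SketchPoint = collections.namedtuple(
    "SketchPoint", ["unit", "qty", "price", "groupby", "metadata"])


def to_internal_format(metric_name, points):
    """Build the nested dict described above from a list of points."""
    return {
        metric_name: [
            {
                "vol": {"unit": p.unit, "qty": p.qty},
                "rating": {"price": p.price},
                "groupby": dict(p.groupby),
                "metadata": dict(p.metadata),
            }
            for p in points
        ]
    }


point = SketchPoint(
    unit="banana",
    qty=1,
    price=1,
    groupby={"xxx_id": "hello", "yyy_id": "bye"},
    metadata={"flavor": "chocolate", "eaten_by": "gorilla"},
)
internal = to_internal_format("bananas", [point])
```

Each metric name maps to a list, with one entry per data point.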
Implementation¶
Each collector must implement the following class:
- class cloudkitty.collector.BaseCollector(**kwargs)
  - static check_configuration(conf)
    Checks and validates metric configuration.
    Collectors requiring extra parameters for metric collection should implement this method, call the method of the parent class, extend the extra_args key in METRIC_BASE_SCHEMA and validate the metric configuration against the new schema.
  - abstract fetch_all(metric_name, start, end, project_id=None, q_filter=None)
    Fetches information about a specific metric for a given period.
    This method must respect the groupby and metadata arguments provided in the metric conf at initialization (available in self.conf['groupby'] and self.conf['metadata']).
    Returns a list of cloudkitty.dataframe.DataPoint objects.
    - Parameters
      metric_name (str) – Name of the metric to fetch
      start (datetime.datetime) – start of the period
      end (datetime.datetime) – end of the period
      project_id (str) – ID of the scope for which data should be collected
      q_filter (dict) – Optional filters
The retrieve method of the BaseCollector class is called by the orchestrator. This method calls the fetch_all method of the child class.
To create a collector, you need to implement at least the fetch_all method.
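The delegation described above can be sketched with a small abstract base class. SketchBaseCollector and StaticCollector are hypothetical stand-ins, not CloudKitty code, and the real retrieve method may do more than forward the call. The sketch only shows why implementing fetch_all is enough for the caller to obtain data.

```python
import abc
import datetime


class SketchBaseCollector(abc.ABC):
    """Hypothetical stand-in for the base class (not CloudKitty code)."""

    def retrieve(self, metric_name, start, end,
                 project_id=None, q_filter=None):
        # Entry point used by the caller (the orchestrator in CloudKitty).
        # It delegates the actual collection to the child class.
        return self.fetch_all(metric_name, start, end,
                              project_id=project_id, q_filter=q_filter)

    @abc.abstractmethod
    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        """Return the data collected for one metric and one period."""


class StaticCollector(SketchBaseCollector):
    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        # A real collector would query a backend here.
        return [{"metric": metric_name, "scope": project_id, "qty": 1}]


start = datetime.datetime(2024, 1, 1)
end = start + datetime.timedelta(hours=1)
result = StaticCollector().retrieve(
    "bananas", start, end, project_id="scope-1")
```

Because fetch_all is declared abstract, a subclass that does not implement it cannot be instantiated.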
Data collection¶
Collectors must implement a fetch_all method. This method is called for each metric type, for each scope, for each collect period. It has the following prototype:
- class cloudkitty.collector.BaseCollector(**kwargs)
  - abstract fetch_all(metric_name, start, end, project_id=None, q_filter=None)
    Fetches information about a specific metric for a given period.
    This method must respect the groupby and metadata arguments provided in the metric conf at initialization (available in self.conf['groupby'] and self.conf['metadata']).
    Returns a list of cloudkitty.dataframe.DataPoint objects.
    - Parameters
      metric_name (str) – Name of the metric to fetch
      start (datetime.datetime) – start of the period
      end (datetime.datetime) – end of the period
      project_id (str) – ID of the scope for which data should be collected
      q_filter (dict) – Optional filters
This method is expected to return a list of cloudkitty.dataframe.DataPoint objects.
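As a rough illustration of the call pattern (one call per metric type, per scope, per collect period), the following sketch enumerates the arguments fetch_all would receive. The function name collect_calls, the sample metric and scope names, and the iteration order are assumptions made for this example. The order in which the orchestrator actually schedules calls is not specified here.

```python
import datetime
import itertools


def collect_calls(metric_names, scope_ids, periods):
    """List the (metric, start, end, scope) tuples fetch_all would get."""
    return [
        (metric, start, end, scope)
        for (start, end), scope, metric in itertools.product(
            periods, scope_ids, metric_names)
    ]


hour = datetime.timedelta(hours=1)
t0 = datetime.datetime(2024, 1, 1)
periods = [(t0, t0 + hour), (t0 + hour, t0 + 2 * hour)]

# 2 metrics x 2 scopes x 2 periods: one call per combination.
calls = collect_calls(["cpu", "volume.size"], ["scope-a", "scope-b"], periods)
```

The number of calls grows with the product of the three dimensions, so fetch_all should keep each backend query limited to the requested metric, scope and period.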
Example code of a basic collector:
from cloudkitty import dataframe
from cloudkitty.collector import BaseCollector


class MyCollector(BaseCollector):
    def __init__(self, **kwargs):
        super(MyCollector, self).__init__(**kwargs)

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        data = []
        # Placeholder: replace with the records fetched from your backend,
        # each providing a unit, a quantity, groupby and metadata.
        records = []
        for unit, qty, groupby, metadata in records:
            data.append(dataframe.DataPoint(
                unit,
                qty,  # int, float, decimal.Decimal or str
                0,  # price
                groupby,  # dict
                metadata,  # dict
            ))
        return data
project_id can be misleading, as it is a legacy name. It contains the ID of the current scope. The attribute corresponding to the scope is specified in the configuration, under [collect]/scope_key. Thus, all queries should filter based on this attribute. Example:
from oslo_config import cfg

from cloudkitty.collector import BaseCollector

CONF = cfg.CONF


class MyCollector(BaseCollector):
    def __init__(self, **kwargs):
        super(MyCollector, self).__init__(**kwargs)

    def fetch_all(self, metric_name, start, end,
                  project_id=None, q_filter=None):
        scope_key = CONF.collect.scope_key
        # Restrict the query to the collect period and the current scope.
        filters = {'start': start, 'stop': end, scope_key: project_id}
        # self.client stands for your own backend client, assumed to be
        # set up by your collector.
        data = self.client.query(
            filters=filters,
            groupby=self.conf[metric_name]['groupby'])
        # Format data into a list of DataPoint objects (omitted here).
        output = []
        return output
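The filter construction can also be tried in isolation. The sketch below assumes a hypothetical helper, build_filters, and reuses the 'start' and 'stop' keys from the example above. Merging q_filter into the same dict, and letting the scope and time bounds take precedence over it, are choices made for this sketch rather than documented CloudKitty behaviour.

```python
import datetime


def build_filters(scope_key, scope_id, start, end, q_filter=None):
    """Merge optional extra filters with the time bounds and scope filter."""
    filters = dict(q_filter or {})
    # Applied last so that extra filters cannot override the scope or
    # the collect period (a choice of this sketch).
    filters.update({"start": start, "stop": end, scope_key: scope_id})
    return filters


start = datetime.datetime(2024, 1, 1)
end = start + datetime.timedelta(hours=1)

# The same scope ID is filtered on a different attribute depending on
# the configured scope key.
by_project = build_filters(
    "project_id", "abc123", start, end, {"flavor": "chocolate"})
by_namespace = build_filters("namespace", "abc123", start, end)
```

Only the key changes between the two results, which is why the scope attribute should be read from the configuration rather than hard-coded as project_id.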
Additional configuration¶
If you need to extend the metric configuration (add parameters to the extra_args section of metrics.yml), you can override the check_configuration method of the base collector:
- class cloudkitty.collector.BaseCollector(**kwargs)
  - static check_configuration(conf)
    Checks and validates metric configuration.
    Collectors requiring extra parameters for metric collection should implement this method, call the method of the parent class, extend the extra_args key in METRIC_BASE_SCHEMA and validate the metric configuration against the new schema.
This method uses voluptuous for data validation. The base schema for each metric can be found in cloudkitty.collector.METRIC_BASE_SCHEMA. This schema is meant to be extended by other collectors. Example taken from the gnocchi collector code:
from voluptuous import All, In, Length, Required, Schema

from cloudkitty import collector

GNOCCHI_EXTRA_SCHEMA = {
    Required('extra_args'): {
        Required('resource_type'): All(str, Length(min=1)),
        # Due to Gnocchi model, metric are grouped by resource.
        # This parameter allows to adapt the key of the resource identifier
        Required('resource_key', default='id'): All(str, Length(min=1)),
        Required('aggregation_method', default='max'):
            In(['max', 'mean', 'min']),
    },
}


class GnocchiCollector(collector.BaseCollector):
    collector_name = 'gnocchi'

    @staticmethod
    def check_configuration(conf):
        # Validate the common keys first, then the Gnocchi-specific ones.
        conf = collector.BaseCollector.check_configuration(conf)
        metric_schema = Schema(collector.METRIC_BASE_SCHEMA).extend(
            GNOCCHI_EXTRA_SCHEMA)

        output = {}
        for metric_name, metric in conf.items():
            met = output[metric_name] = metric_schema(metric)
            # Always group by the resource identifier.
            if met['extra_args']['resource_key'] not in met['groupby']:
                met['groupby'].append(met['extra_args']['resource_key'])
        return output
If your collector does not need any extra_args, you do not have to override the check_configuration method.
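To see what such a check does to a metric configuration, here is a minimal sketch that uses plain Python checks in place of voluptuous so that it runs without extra dependencies. The function check_extra_args, the constants and the sample configuration are hypothetical. The defaults and allowed values are copied from the Gnocchi example above.

```python
import copy

# Hypothetical defaults mirroring the Gnocchi example above.
EXTRA_ARGS_DEFAULTS = {"resource_key": "id", "aggregation_method": "max"}
ALLOWED_AGGREGATIONS = ("max", "mean", "min")


def check_extra_args(conf):
    """Validate extra_args for each metric and return a completed copy."""
    output = {}
    for metric_name, metric in conf.items():
        met = copy.deepcopy(metric)
        # Fill in defaults for the keys the user did not set.
        extra = dict(EXTRA_ARGS_DEFAULTS, **met.get("extra_args", {}))
        if not extra.get("resource_type"):
            raise ValueError(
                "%s: extra_args.resource_type is required" % metric_name)
        if extra["aggregation_method"] not in ALLOWED_AGGREGATIONS:
            raise ValueError(
                "%s: unsupported aggregation_method" % metric_name)
        met["extra_args"] = extra
        # The resource identifier must always be part of the groupby keys.
        groupby = met.setdefault("groupby", [])
        if extra["resource_key"] not in groupby:
            groupby.append(extra["resource_key"])
        output[metric_name] = met
    return output


conf = {
    "cpu": {
        "unit": "instance",
        "groupby": ["project_id"],
        "extra_args": {"resource_type": "instance"},
    },
}
checked = check_extra_args(conf)
```

After the check, the metric carries the default resource_key and aggregation_method, and the resource key has been appended to its groupby list. The input configuration is left unchanged.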