5.25. Telemetry Services resource consumption/scalability testing¶
- status
ready
- version
1.0
- Abstract
This document describes how scalability and performance testing is conducted on an OpenStack Cloud with a focus on OpenStack Telemetry Services. Currently this focuses on Telemetry Services collection/processing of metrics, further test cases can be added to scale and performance test other aspects of the OpenStack Telemetry Services.
5.25.1. Test Plan¶
Characterize the resource consumption and application performance of OpenStack Telemetry Services on an OpenStack Cloud as a workload increases over time. As the workload is increased, measure System Performance Metrics (CPU, Memory, Disk, IO) and Application Performance Metrics (responsiveness, health, utilization, functionality) until desired load is reached or system/application failures.
5.25.1.1. Test Environment¶
5.25.1.1.1. Preparation¶
Ideally this is run on a newly deployed cloud each time for repeatability purposes. Cloud deployment should be documented for each test case / run as deployment will set many configuration values which will impact performance.
5.25.1.1.2. Environment description¶
The environment description includes hardware specs, software versions, tunings and configuration of the OpenStack Cloud under test.
5.25.1.1.2.1. Hardware¶
List details of hardware for each node type here.
Deployment node (Undercloud)
Parameter |
Value |
model |
ex. Dell PowerEdge r610 |
CPU |
ex. 2xIntel(R) Xeon(R) X5650 @ 2.67GHz (12Cores/24Threads) |
Memory |
ex. 64GiB (@1333MHz) |
Disk |
ex. 4 x 146GiB 15K SAS Drives in RAID 0 |
Network |
ex. 2x1Gb/s Broadcom, 2x10Gb/s Intel X520 |
Controller
Parameter |
Value |
model |
ex. Dell PowerEdge r610 |
CPU |
ex. 2xIntel(R) Xeon(R) X5650 @ 2.67GHz (12Cores/24Threads) |
Memory |
ex. 64GiB (@1333MHz) |
Disk |
ex. 4 x 146GiB 15K SAS Drives in RAID 0 |
Network |
ex. 2x1Gb/s Broadcom, 2x10Gb/s Intel X520 |
Compute
Parameter |
Value |
model |
ex. Dell PowerEdge r610 |
CPU |
ex. 2xIntel(R) Xeon(R) X5650 @ 2.67GHz (12Cores/24Threads) |
Memory |
ex. 64GiB (@1333MHz) |
Disk |
ex. 4 x 146GiB 15K SAS Drives in RAID 0 |
Network |
ex. 2x1Gb/s Broadcom, 2x10Gb/s Intel X520 |
Additional Hardware for testing/monitoring/results
Performance Monitoring Host (Carbon/Graphite/Grafana)
Performance Results Host (ElasticSearch/Kibana)
5.25.1.1.2.2. Software¶
Record versions of Linux kernel, Base Operating System (ex. Centos 7.3), OpenStack version (ex. Newton), OpenStack Packages, testing harness/framework and any other pertinent software.
5.25.1.1.2.3. Tuning/Configuration¶
Record deployed configuration, including the following but not limited to
# of Gnocchi-metricd processes
# api processes/threads
api deployed in httpd? (If so include httpd configuration options)
Backend (file, swift, ceph)
Ceilometer polling interval
Other Services worker/process counts (Nova, Neutron, …)
5.25.1.1.2.4. System Performance Monitoring¶
Record System performance metrics into a separate metrics collection/storage/analysis system. Suggested system would be a separate machine with Carbon, Graphite, and Grafana with dashboards for monitoring system resource utilization. To push metrics into the TSDB, collectd can/should be installed on all monitored machines. (Deployment, Controllers, and Computes)
5.25.1.1.2.5. Test Diagram¶
Attach test diagram to display test topology.
5.25.1.2. Test Case 1¶
5.25.1.2.1. Description¶
Boot 50 persisting instances every 1200 seconds until 1000 instances booted and running in OpenStack cloud.
Parameters
Amount of Instances to boot per period (ex. 50)
Amount of time to wait between booting periods (ex. 1200 seconds)
Maximum number of instances desired for test (ex. 1000)
Depending upon available hardware, the above parameters will need to adjusted
Stopping/Failure Conditions
Max number of instances achieved
Failure to boot instances
Failure for Telemetry Services to consume metrics
Other service failures/errors
System out of Resources (ex. CPU 100% utilized)
5.25.1.2.2. Setup¶
Deploy OpenStack Cloud
Install testing and monitoring tooling
Gather metadata on Cloud
Run test
5.25.1.2.3. Analysis¶
Review System performance metrics graphs during test duration to observe for stopping/failure conditions. Review testing harness output for test failure conditions.
5.25.1.2.4. List of performance metrics¶
Performance
CPU utilization
Memory utilization
Disk IO utilization
Per-Process CPU/Memory/IO (Gnocchi, Ceilometer, Nova, Swift, Ceph …)
Time required to Boot Instances
Responsiveness of Gnocchi/Ceilometer or services
Failure Conditions
Errors in log files (Gnocchi, Ceilometer, Nova, Swift, …)
5.25.2. Reports¶
- Test plan execution reports: