3. Results of measuring performance of Kargo

Abstract:

This document presents performance test results for Kargo as a Kubernetes deployment solution. All tests were performed according to the Measuring performance of Kargo test plan.

Kargo sets up Kubernetes in the following way:

  • master: Calico, Kubernetes API services
  • minion: Calico, Kubernetes minion services
  • etcd: etcd service

Kargo deploys a Kubernetes cluster with the following mapping of hostnames to roles:

  • node1: minion+master+etcd
  • node2: minion+master+etcd
  • node3: minion+etcd
  • all other nodes: minion
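
The mapping above can be expressed as a small lookup. This is only an illustrative sketch: the function name roles_for_node is hypothetical, and it is not how Kargo itself assigns roles.

```shell
#!/bin/bash
# Hypothetical helper: print the roles of the node with the given 1-based
# index, following the hostname-to-role mapping listed above.
roles_for_node() {
    local index="$1"
    case "$index" in
        1|2) echo "minion+master+etcd" ;;
        3)   echo "minion+etcd" ;;
        *)   echo "minion" ;;
    esac
}

# Print the mapping for the first five nodes.
for i in 1 2 3 4 5; do
    echo "node${i}: $(roles_for_node "$i")"
done
```

Whatever the cluster size, only the first three nodes carry master or etcd roles, so every additional node is a plain minion.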

3.1. Environment description

3.1.1. Hardware configuration of each server

Description of servers hardware

server   name              node-{1..500}             node-{1..500}
         role              kubernetes cluster        kubernetes cluster
         vendor,model      Dell, R630                Lenovo, RD550-1U
         operating_system  4.4.0-36-generic          4.4.0-36-generic
                           Ubuntu-xenial             Ubuntu-xenial
                           x86_64                    x86_64
CPU      vendor,model      Intel, E5-2680 v3         Intel, E5-2680 v3
         processor_count   2                         2
         core_count        12                        12
         frequency_MHz     2500                      2500
RAM      vendor,model      Hynix HMA42GR7MFR4N-TF    Samsung M393A2G40DB0-CPB
         amount_MB         262144                    262144
NETWORK  interface_name    bond0                     bond0
         vendor,model      Intel, X710 Dual Port     Intel, X710 Dual Port
         interfaces_count  2                         2
         bandwidth         10G                       10G
STORAGE  dev_name          /dev/sda                  /dev/sda
         vendor,model      raid1 PERC H730P Mini,    raid1 MegaRAID 3108,
                           2 disks Intel S3610       2 disks Intel S3610
         SSD/HDD           SSD                       SSD
         size              800GB                     800GB

3.1.2. Network scheme and part of configuration of hardware network switches

Network scheme of the environment:

[Figure: network scheme of the environment]

The following is the part of the switch configuration applied to each switch port that belongs to the bond0 interface of a server:

    show run int et1
    interface Ethernet1
       description - r02r13c33
       switchport trunk native vlan 4
       switchport trunk allowed vlan 4
       switchport mode trunk
       channel-group 133 mode active
       lacp port-priority 16384
       spanning-tree portfast

    show run int po1
    interface Port-Channel1
       description osscr02r13c21
       switchport trunk native vlan 131
       switchport trunk allowed vlan 130-159
       switchport mode trunk
       port-channel lacp fallback static
       port-channel lacp fallback timeout 30
       mlag 1

3.1.3. Software configuration of Kargo

3.1.3.1. Setting up Kargo:

Kargo was installed on bare-metal Ubuntu Xenial servers. Kargo requires a dedicated non-root user to exist on the target nodes. The script from the Launcher script section was used to configure and launch Kargo.

Versions of some software

Software            Version
Ubuntu              Ubuntu 16.04.1 LTS
fuel-ccp-installer  6b26170f70e523fb04bda8d6f15077d461fba9de
kargo               016b7893c64fede07269c01cac31e96c8ee0d257

3.1.3.2. Test tool:

We used the dstat utility as the main tool for collecting timing and system performance data during the tests. After the installation tests, the collected metrics were parsed with the Script for parsing collected metrics.

3.1.3.3. Operating system configuration:

The contents of the /etc folder from one of the target servers where the K8S cluster was deployed are available here: etc_tarball_of_node1

3.2. Testing process

  1. The Kargo launcher script was set up and executed on the node1 server as described in the Setting up Kargo section.
  2. During the Kargo run, the dstat tool was launched on node1 with the following options:
     root@node1:~# dstat --nocolor --time --cpu --mem --net -N bond0 --io --output /root/dstat.csv
  3. After the Kargo run finished, we parsed the resulting “dstat.csv” files with the Script for parsing collected metrics.

The above steps were repeated with the following numbers of nodes: 50, 150, and 350.

The test runs produced the following CSV files:

METRICS(NUMBER_OF_NODES=50)

METRICS(NUMBER_OF_NODES=150)

METRICS(NUMBER_OF_NODES=350)
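
The testing process above (start the sampler, run the deployment, stop the sampler) can be sketched as a small wrapper. This is a minimal sketch under stated assumptions, not the procedure actually used: the function name run_monitored and the MONITOR array are hypothetical, and the source does not say how the deployment time was measured, so the wall-clock timing here is only one possible approach. The dstat options are the ones listed in step 2.

```shell
#!/bin/bash
# Sampling command started in the background; options taken from step 2 above.
MONITOR=(dstat --nocolor --time --cpu --mem --net -N bond0 --io --output /root/dstat.csv)

# Hypothetical wrapper: run the given command while the sampler is active,
# print the elapsed wall-clock time in seconds, and return the command's status.
run_monitored() {
    local pid start end rc=0
    # Detach the sampler's output so that callers using $(...) do not block on it.
    "${MONITOR[@]}" > /dev/null 2>&1 &
    pid=$!
    start=$(date +%s)
    "$@" || rc=$?
    end=$(date +%s)
    # Stop the sampler and reap it; ignore errors if it has already exited.
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
    echo "$((end - start))"
    return "$rc"
}

# Example (assumes the launcher script is saved as ./launcher.sh):
#   elapsed=$(run_monitored ./launcher.sh)
```

Keeping the sampler tied to the lifetime of the deployment command ensures that the CSV file covers exactly one run, which makes the three node counts easier to compare.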

3.3. Results

After simple processing of the results, the following plots were created. They show the performance metrics collected during provisioning of the nodes as a function of time:

Number of nodes  Plot CPU(TIME)                           Plot RAM(TIME)
50               CPU_USAGE(TIME, NODES=50)                RAM_USAGE(TIME, NODES=50)
150              ../../../../_images/150_nodes_-_CPU.png  ../../../../_images/150_nodes_-_RAM.png
350              ../../../../_images/350_nodes_-_CPU.png  ../../../../_images/350_nodes_-_RAM.png

Number of nodes  Plot NET(TIME)                           Plot DISK(TIME)
50               CPU_USAGE(TIME, NODES=50)                RAM_USAGE(TIME, NODES=50)
150              ../../../../_images/150_nodes_-_net.png  ../../../../_images/150_nodes_-_disk.png
350              ../../../../_images/350_nodes_-_net.png  ../../../../_images/350_nodes_-_disk.png

The following table shows how the performance metrics and the deployment time depend on the number of nodes.

number of nodes             50          150          350
deployment time             2049.00     3922.00      13065.00
cpu_usage_max               99.0210     99.56        99.06
cpu_usage_min               0           0            0
cpu_usage_average           7.2920      10.03        12.63
cpu_usage_percentile 90%    19.6495     24.92        29.12
ram_usage_max               4466.10     13859.56     112079.57
ram_usage_min               1061.51     1033.32      1075.16
ram_usage_average           2121.20     4335.69      31288.94
ram_usage_percentile 90%    2876.33     8570.32      79915.96
net_all_max                 3864760.75  20996615.75  60130883.88
net_all_min                 0           0            0
net_all_average             70602.55    102913.32    177943.40
net_all_percentile 90%      253590.90   263933.25    180409.81
dsk_io_all_max              3503        3196         3470
dsk_io_all_min              0           0            0
dsk_io_all_average          26          37           56
dsk_io_all_percentile 90%   58          14           8
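
The summary rows above (max, min, average, 90th percentile) can be derived from one column of a parsed CSV file. The source does not say which tool or percentile definition was used, so the following is a sketch under assumptions: the function name column_stats is hypothetical, and the percentile uses the nearest-rank method. Column numbers follow the header written by the parsing script (time, cpu_usage, ram_usage, ...), so cpu_usage is column 2.

```shell
#!/bin/bash
# Hypothetical helper: print max, min, average and 90th percentile
# (nearest-rank method, an assumption) of one column of a parsed CSV file.
# $1 = CSV path, $2 = 1-based column number.
column_stats() {
    # Drop the header line, keep one column, sort numerically in ascending order.
    tail -n +2 "$1" | cut -d, -f"$2" | LC_ALL=C sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1
            # Nearest rank: ceil(0.9 * NR), computed with integer arithmetic.
            idx = int((NR * 9 + 9) / 10)
            printf "max=%.2f min=%.2f average=%.2f percentile90=%.2f\n",
                   v[NR], v[1], sum / NR, v[idx]
        }'
}

# Example (the file name is hypothetical):
#   column_stats ./dstat.csv 2
```

Other percentile definitions, for example linear interpolation between ranks, give slightly different values on small samples, so results from this sketch are not guaranteed to reproduce the table exactly.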

3.4. Issues that have been found during the tests

During testing we found several issues that prevented us from obtaining test results at scale:

Issue                                    Link
etcd list sometimes hangs                https://github.com/kubespray/kargo/pull/448
K8S DNS services not working correctly   https://github.com/kubespray/kargo/pull/458
Calico creates extra pool during run     https://github.com/kubespray/kargo/pull/462
Timeout to quay.io to fetch etcd image   https://github.com/kubespray/kargo/pull/481
Downloading images doesn’t scale well    https://github.com/kubespray/kargo/pull/488
Kargo is too slow on scale               https://github.com/kubespray/kargo/issues/478

3.5. Applications

3.5.1. Launcher script

#!/bin/bash -xe

if [[ -d ./fuel-ccp-installer ]] ; then
    rm -rf ./fuel-ccp-installer
fi

git clone https://review.openstack.org/openstack/fuel-ccp-installer

export ENV_NAME="kargo-test"
export DEPLOY_METHOD="kargo"
# Use $HOME here: a tilde inside quotes is not expanded by bash.
export WORKSPACE="${HOME}/workspace"
export ADMIN_USER="vagrant"
export ADMIN_PASSWORD="kargo"

# Uncomment exactly one of the three blocks below, depending on the cluster size.
# for 50 nodes:
#export SLAVES_COUNT=50
#export ADMIN_IP="10.3.58.66"
#export SLAVE_IPS="10.3.58.66 10.3.58.33 10.3.58.30 10.3.58.27 10.3.58.32 10.3.58.28 10.3.58.34 10.3.58.35 10.3.58.29 10.3.58.31 10.3.58.51 10.3.58.41 10.3.58.43 10.3.58.53 10.3.58.45 10.3.58.54 10.3.58.55 10.3.58.38 10.3.58.40 10.3.58.48 10.3.58.42 10.3.58.46 10.3.58.36 10.3.58.37 10.3.58.52 10.3.58.50 10.3.58.39 10.3.58.10 10.3.58.58 10.3.58.7 10.3.57.254 10.3.58.4 10.3.57.255 10.3.58.1 10.3.58.3 10.3.58.57 10.3.58.23 10.3.58.13 10.3.58.12 10.3.58.21 10.3.58.5 10.3.58.22 10.3.58.9 10.3.58.24 10.3.58.15 10.3.58.19 10.3.58.16 10.3.56.6 10.3.56.7 10.3.56.83"

# for 150 nodes:
#export SLAVES_COUNT=150
#export ADMIN_IP="10.3.56.3"
#export SLAVE_IPS="10.3.56.3 10.3.56.254 10.3.56.4 10.3.56.6 10.3.56.7 10.3.56.83 10.3.56.82 10.3.56.84 10.3.56.86 10.3.56.87 10.3.56.89 10.3.56.12 10.3.56.11 10.3.56.13 10.3.56.15 10.3.56.16 10.3.56.17 10.3.56.18 10.3.56.20 10.3.56.21 10.3.56.22 10.3.56.23 10.3.56.25 10.3.56.26 10.3.56.27 10.3.56.29 10.3.56.30 10.3.56.31 10.3.56.32 10.3.56.34 10.3.56.33 10.3.56.37 10.3.56.38 10.3.56.39 10.3.56.41 10.3.56.43 10.3.56.45 10.3.56.46 10.3.56.47 10.3.56.48 10.3.56.50 10.3.56.49 10.3.56.51 10.3.56.52 10.3.56.133 10.3.56.135 10.3.56.137 10.3.56.136 10.3.56.113 10.3.56.139 10.3.56.141 10.3.56.148 10.3.56.142 10.3.56.117 10.3.56.143 10.3.56.145 10.3.56.123 10.3.56.122 10.3.56.128 10.3.56.144 10.3.56.250 10.3.56.251 10.3.56.126 10.3.56.180 10.3.56.181 10.3.56.184 10.3.56.182 10.3.56.185 10.3.56.183 10.3.56.188 10.3.56.191 10.3.56.192 10.3.56.187 10.3.56.195 10.3.56.190 10.3.56.199 10.3.56.193 10.3.56.204 10.3.56.207 10.3.56.205 10.3.56.206 10.3.56.201 10.3.56.202 10.3.56.208 10.3.56.217 10.3.56.216 10.3.56.209 10.3.56.210 10.3.56.215 10.3.56.218 10.3.56.212 10.3.56.213 10.3.56.214 10.3.56.211 10.3.56.221 10.3.56.224 10.3.56.227 10.3.56.149 10.3.56.219 10.3.56.223 10.3.56.231 10.3.56.228 10.3.56.235 10.3.56.236 10.3.56.230 10.3.56.233 10.3.56.229 10.3.56.232 10.3.56.234 10.3.59.95 10.3.59.92 10.3.59.88 10.3.59.96 10.3.59.111 10.3.59.115 10.3.59.116 10.3.56.146 10.3.59.119 10.3.59.117 10.3.59.112 10.3.59.110 10.3.59.109 10.3.59.120 10.3.59.137 10.3.59.136 10.3.59.133 10.3.59.132 10.3.59.138 10.3.59.134 10.3.59.135 10.3.59.139 10.3.59.131 10.3.59.130 10.3.59.74 10.3.59.80 10.3.59.73 10.3.59.77 10.3.59.84 10.3.59.105 10.3.59.82 10.3.59.83 10.3.59.81 10.3.59.98 10.3.59.108 10.3.59.106 10.3.59.102 10.3.59.107 10.3.59.86 10.3.58.66 10.3.58.33"

# for 350 nodes:
#export SLAVES_COUNT=350
#export ADMIN_IP="10.3.56.3"
#export SLAVE_IPS="10.3.56.3 10.3.56.254 10.3.56.4 10.3.56.6 10.3.56.7 10.3.56.83 10.3.56.82 10.3.56.84 10.3.56.86 10.3.56.87 10.3.56.89 10.3.56.12 10.3.56.11 10.3.56.13 10.3.56.15 10.3.56.16 10.3.56.17 10.3.56.18 10.3.56.20 10.3.56.21 10.3.56.22 10.3.56.23 10.3.56.25 10.3.56.26 10.3.56.27 10.3.56.29 10.3.56.30 10.3.56.31 10.3.56.32 10.3.56.34 10.3.56.33 10.3.56.37 10.3.56.38 10.3.56.39 10.3.56.41 10.3.56.43 10.3.56.45 10.3.56.46 10.3.56.47 10.3.56.48 10.3.56.50 10.3.56.49 10.3.56.51 10.3.56.52 10.3.56.133 10.3.56.135 10.3.56.137 10.3.56.136 10.3.56.113 10.3.56.139 10.3.56.141 10.3.56.148 10.3.56.142 10.3.56.117 10.3.56.143 10.3.56.145 10.3.56.123 10.3.56.122 10.3.56.128 10.3.56.144 10.3.56.250 10.3.56.251 10.3.56.126 10.3.56.180 10.3.56.181 10.3.56.184 10.3.56.182 10.3.56.185 10.3.56.183 10.3.56.188 10.3.56.191 10.3.56.192 10.3.56.187 10.3.56.195 10.3.56.190 10.3.56.199 10.3.56.193 10.3.56.204 10.3.56.207 10.3.56.205 10.3.56.206 10.3.56.201 10.3.56.202 10.3.56.208 10.3.56.217 10.3.56.216 10.3.56.209 10.3.56.210 10.3.56.215 10.3.56.218 10.3.56.212 10.3.56.213 10.3.56.214 10.3.56.211 10.3.56.221 10.3.56.224 10.3.56.227 10.3.56.149 10.3.56.219 10.3.56.223 10.3.56.231 10.3.56.228 10.3.56.235 10.3.56.236 10.3.56.230 10.3.56.233 10.3.56.229 10.3.56.232 10.3.56.234 10.3.59.95 10.3.59.92 10.3.59.88 10.3.59.96 10.3.59.111 10.3.59.115 10.3.59.116 10.3.56.146 10.3.59.119 10.3.59.117 10.3.59.112 10.3.59.110 10.3.59.109 10.3.59.120 10.3.59.137 10.3.59.136 10.3.59.133 10.3.59.132 10.3.59.138 10.3.59.134 10.3.59.135 10.3.59.139 10.3.59.131 10.3.59.130 10.3.59.74 10.3.59.80 10.3.59.73 10.3.59.77 10.3.59.84 10.3.59.105 10.3.59.82 10.3.59.83 10.3.59.81 10.3.59.98 10.3.59.108 10.3.59.106 10.3.59.102 10.3.59.107 10.3.59.86 10.3.59.93 10.3.59.100 10.3.59.87 10.3.59.99 10.3.59.97 10.3.59.89 10.3.59.46 10.3.59.35 10.3.59.40 10.3.59.47 10.3.59.55 10.3.59.51 10.3.59.48 10.3.59.63 10.3.59.56 10.3.59.68 10.3.59.32 10.3.59.43 10.3.59.36 10.3.59.54 10.3.59.53 10.3.59.71 10.3.59.57 10.3.59.62 
#10.3.59.69 10.3.59.65 10.3.59.70 10.3.59.72 10.3.59.66 10.3.59.76 10.3.59.75 10.3.59.79 10.3.59.78 10.3.59.64 10.3.59.25 10.3.59.22 10.3.59.16 10.3.59.24 10.3.59.15 10.3.59.11 10.3.59.10 10.3.58.241 10.3.59.12 10.3.59.42 10.3.59.31 10.3.59.28 10.3.59.34 10.3.59.37 10.3.59.27 10.3.59.30 10.3.59.29 10.3.59.58 10.3.59.52 10.3.59.38 10.3.59.61 10.3.59.59 10.3.59.49 10.3.59.39 10.3.58.176 10.3.58.178 10.3.58.251 10.3.58.179 10.3.58.188 10.3.58.184 10.3.58.181 10.3.58.194 10.3.58.196 10.3.58.205 10.3.58.201 10.3.58.192 10.3.58.197 10.3.58.193 10.3.58.254 10.3.58.186 10.3.58.180 10.3.58.198 10.3.58.252 10.3.58.189 10.3.58.253 10.3.58.195 10.3.58.200 10.3.58.210 10.3.58.183 10.3.58.199 10.3.58.182 10.3.58.208 10.3.58.209 10.3.58.100 10.3.58.127 10.3.58.146 10.3.58.136 10.3.58.118 10.3.58.132 10.3.58.142 10.3.58.131 10.3.58.144 10.3.58.121 10.3.58.123 10.3.58.134 10.3.58.120 10.3.58.129 10.3.58.135 10.3.58.137 10.3.58.117 10.3.58.125 10.3.58.155 10.3.58.162 10.3.58.154 10.3.58.153 10.3.58.148 10.3.58.159 10.3.58.171 10.3.58.167 10.3.58.166 10.3.58.165 10.3.58.164 10.3.58.156 10.3.58.147 10.3.58.170 10.3.58.149 10.3.58.168 10.3.58.160 10.3.58.172 10.3.58.157 10.3.58.71 10.3.58.59 10.3.58.70 10.3.58.67 10.3.58.69 10.3.58.79 10.3.58.64 10.3.58.73 10.3.58.77 10.3.58.65 10.3.58.86 10.3.58.63 10.3.58.80 10.3.58.75 10.3.58.62 10.3.58.84 10.3.58.74 10.3.58.76 10.3.58.85 10.3.58.78 10.3.58.60 10.3.58.72 10.3.58.81 10.3.58.61 10.3.58.82 10.3.58.87 10.3.58.66 10.3.58.33 10.3.58.30 10.3.58.27 10.3.58.32 10.3.58.28 10.3.58.34 10.3.58.35 10.3.58.29 10.3.58.31 10.3.58.51 10.3.58.41 10.3.58.43 10.3.58.53 10.3.58.45 10.3.58.54 10.3.58.55 10.3.58.38 10.3.58.40 10.3.58.48 10.3.58.42 10.3.58.46 10.3.58.36 10.3.58.37 10.3.58.52 10.3.58.50 10.3.58.39 10.3.58.10 10.3.58.58 10.3.58.7 10.3.57.254 10.3.58.4 10.3.57.255 10.3.58.1 10.3.58.3 10.3.58.57 10.3.58.23 10.3.58.13 10.3.58.12 10.3.58.21 10.3.58.5 10.3.58.22 10.3.58.9 10.3.58.24 10.3.58.15 10.3.58.19 10.3.58.16"

export CUSTOM_YAML='docker_version: 1.12
hyperkube_image_repo: "quay.io/coreos/hyperkube"
hyperkube_image_tag: "v1.3.5_coreos.0"
etcd_image_repo: "quay.io/coreos/etcd"
etcd_image_tag: "v3.0.1"
calicoctl_image_repo: "calico/ctl"
#calico_node_image_repo: "calico/node"
calico_node_image_repo: "l23network/node"
calico_node_image_tag: "v0.20.0"
calicoctl_image_tag: "v0.20.0"
kube_apiserver_insecure_bind_address: "0.0.0.0"'

mkdir -p "$WORKSPACE"
echo "Running on $NODE_NAME: $ENV_NAME"
cd ./fuel-ccp-installer

bash -xe "./utils/jenkins/run_k8s_deploy_test.sh"
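
Because the address lists in the launcher script are long and maintained by hand, a quick sanity check before launching can catch a mismatch between SLAVES_COUNT and SLAVE_IPS. The following is a hedged sketch: the function name check_slave_ips is hypothetical and is not part of fuel-ccp-installer or Kargo.

```shell
#!/bin/bash
# Hypothetical pre-flight check: verify that a whitespace-separated address
# list contains the expected number of entries and no duplicates.
# $1 = expected count (e.g. "$SLAVES_COUNT"), $2 = list (e.g. "$SLAVE_IPS").
check_slave_ips() {
    local expected="$1" ips="$2"
    local actual dupes
    actual=$(echo "$ips" | wc -w)
    # One address per line, then count the addresses that occur more than once.
    dupes=$(echo "$ips" | tr -s '[:space:]' '\n' | sort | uniq -d | wc -l)
    # $((...)) strips the padding that some wc implementations add.
    if [ "$((actual))" -ne "$expected" ]; then
        echo "expected $expected addresses, found $((actual))" >&2
        return 1
    fi
    if [ "$((dupes))" -ne 0 ]; then
        echo "found $((dupes)) duplicated address(es)" >&2
        return 1
    fi
    echo "ok: $((actual)) unique addresses"
}

# Example, after uncommenting one of the blocks in the launcher script:
#   check_slave_ips "$SLAVES_COUNT" "$SLAVE_IPS" || exit 1
```

Running the check before run_k8s_deploy_test.sh keeps a typo in the list from surfacing only after a long deployment has started.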

3.5.2. Script for parsing collected metrics

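#!/bin/bash -e
# Convert a raw dstat CSV file into a reduced CSV file with derived metrics.
# Arguments: $1 = kargo_env_name, $2 = csv file path

if [[ -z "$1" ]] || [[ -z "$2" ]] ; then
    echo "\$1 = kargo_env_name, \$2 = csv file path"
    exit 1
fi

# Use $HOME here: a tilde inside quotes is not expanded by bash.
WORKDIR="${HOME}/worked_up_results/"
cur_dir="${WORKDIR}kargo_${1}"
csv_name=$(basename "$2")
mkdir -p "$cur_dir"

# The BEGIN block skips the 7 header lines of the dstat CSV output and prints the new header.
# cpu_usage is 100 minus field 4, ram_usage is field 8 divided by 1048576,
# net_* are fields 12 and 13 divided by 8, dsk_* are fields 14 and 15.
awk -F "," 'BEGIN {getline;getline;getline;getline;getline;getline;getline;
    print "time,cpu_usage,ram_usage,net_recv,net_send,net_all,dsk_io_read,dsk_io_writ,dsk_all"}
    {printf "%s,%0.3f,%0.3f,%0.3f,%0.3f,%0.3f,%d,%d,%d\n", $1,100-$4,$8/1048576,$12/8,$13/8,($12+$13)/8,$14,$15,$14+$15 }' "$2" > "$cur_dir/${csv_name}"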
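
To make the per-row conversion concrete, the sketch below applies the same awk expression to a single data row. The function name convert_row is hypothetical and the sample row is synthetic, not taken from the measurements. It assumes the 15-field layout implied by the script, where field 4 is the CPU idle percentage and field 8 is used memory.

```shell
#!/bin/bash
# Hypothetical helper: apply the per-row conversion of the parsing script
# to dstat CSV data rows read from standard input (no header handling).
convert_row() {
    awk -F "," '{printf "%s,%0.3f,%0.3f,%0.3f,%0.3f,%0.3f,%d,%d,%d\n",
        $1, 100-$4, $8/1048576, $12/8, $13/8, ($12+$13)/8, $14, $15, $14+$15 }'
}

# Synthetic sample row: field 4 = 96.5, field 8 = 2147483648,
# fields 12 and 13 = 8000 and 16000, fields 14 and 15 = 3 and 5.
echo '01-01 00:00:01,2.0,1.0,96.5,0.5,0,0,2147483648,0,0,0,8000,16000,3,5' | convert_row
# prints: 01-01 00:00:01,3.500,2048.000,1000.000,2000.000,3000.000,3,5,8
```

Each output row therefore carries the timestamp, the non-idle CPU share, the scaled memory value, the scaled network values with their sum, and the disk I/O values with their sum.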