As of Newton, Sahara supports the creation of image generation and image validation tooling as part of the plugin. If implemented properly, this feature will enable your plugin to:
All of these features can use the same image declaration, meaning that logic for these three use cases can be maintained in one place.
This guide will explain how to enable this feature for your plugin, as well as how to write or modify the image generation manifests that this feature uses.
The key user-facing interface to this feature is the CLI script sahara-image-pack. This script will be installed with all other Sahara binaries.
The script sahara-image-pack takes the following primary arguments:
  --config-file PATH    Path to a config file to use. Multiple config files
                        can be specified, with values in later files taking
                        precedence. Defaults to None.
  --image IMAGE         The path to an image to modify. This image will be
                        modified in-place: be sure to target a copy if you
                        wish to maintain a clean master image.
  --root-filesystem ROOT_FS
                        The filesystem to mount as the root volume on the
                        image. No value is required if only one filesystem is
                        detected.
  --test-only           If this flag is set, no changes will be made to the
                        image; instead, the script will fail if discrepancies
                        are found between the image and the intended state.
After these arguments, the script takes PLUGIN and VERSION arguments. It accepts any plugin and version combination that supports the image packing feature. Plugins may require their own arguments at specific versions; use the --help feature with PLUGIN and VERSION to see the appropriate argument structure.
A plausible command-line invocation would be:
sahara-image-pack --image CentOS.qcow2 \
    --config-file etc/sahara/sahara.conf \
    cdh 5.7.0 [cdh 5.7.0 specific arguments, if any]
This script will modify the target image in-place. Please copy your image if you want a backup or if you wish to create multiple images from a single base image.
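Because the script modifies its target in place, it can be convenient to wrap the copy and the invocation together. The following is a minimal sketch, not part of Sahara: `build_pack_command` and `pack_copy` are hypothetical helper names, and only the flags listed in the help text above are used.

```python
import shutil
import subprocess


def build_pack_command(image, plugin, version, config_file=None,
                       test_only=False, plugin_args=()):
    """Build an argv list for sahara-image-pack (hypothetical helper)."""
    cmd = ["sahara-image-pack", "--image", image]
    if config_file:
        cmd += ["--config-file", config_file]
    if test_only:
        cmd.append("--test-only")
    # PLUGIN and VERSION follow the general options; any plugin-specific
    # arguments come last.
    cmd += [plugin, version, *plugin_args]
    return cmd


def pack_copy(base_image, target_image, plugin, version, **kwargs):
    """Copy the base image, then pack the copy so the base stays clean."""
    shutil.copy2(base_image, target_image)
    subprocess.run(
        build_pack_command(target_image, plugin, version, **kwargs),
        check=True,
    )
```

Calling `pack_copy("CentOS.qcow2", "CentOS-cdh.qcow2", "cdh", "5.7.0")` would leave the base image untouched, so several images can be built from the same base.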
This CLI will automatically populate the set of available plugins and versions from the plugin set loaded in Sahara, and will show any plugin for which the image packing feature is available. The next sections of this guide will first describe how to modify an image packing specification for one of the plugins, and second, how to enable the image packing feature for new or existing plugins.
The script depends on a Python library (the libguestfs bindings) which cannot be installed with pip, but is available through yum, dnf, and apt. If you have installed Sahara through yum, dnf, or apt, you should already have the appropriate dependencies. If you are working with Sahara from source and wish to use the script, run whichever of the following is appropriate to your OS:
sudo yum install -y libguestfs python-libguestfs libguestfs-tools
sudo dnf install -y libguestfs python-libguestfs libguestfs-tools
sudo apt-get install -y libguestfs python-libguestfs libguestfs-tools
If you are using tox to create virtual environments for your Sahara work, please use the images environment to run sahara-image-pack. This environment is configured to use system site packages, and will thus be able to find its dependency on python-libguestfs.
As you’ll read in the next section, Sahara’s image packing tools allow plugin authors to use any toolchain they choose. However, Sahara does provide a built-in image packing framework which is uniquely suited to OpenStack use cases, as it is designed to run the same logic while pre-packing an image or while preparing an instance to launch a cluster after it is spawned in OpenStack.
By convention, the image specification, and all the scripts that it calls, should be located in the plugin’s resources directory under a subdirectory named “images”.
A sample specification is below; the example is reasonably silly in practice, and is only designed to highlight the use of the currently available validator types. We’ll go through each piece of this specification, but the full sample is presented for context.
arguments:
  java-distro:
    description: The java distribution.
    default: openjdk
    required: false
    choices:
      - oracle-java
      - openjdk
validators:
  - os_case:
      - redhat:
          - package: nfs-utils
      - debian:
          - package: nfs-common
  - argument_case:
      argument_name: java-distro
      cases:
        openjdk:
          - any:
              - all:
                  - package: java-1.8.0-openjdk-devel
                  - argument_set:
                      argument_name: java-version
                      value: 1.8.0
              - all:
                  - package: java-1.7.0-openjdk-devel
                  - argument_set:
                      argument_name: java-version
                      value: 1.7.0
        oracle-java:
          - script: install_oracle_java.sh
  - script: setup_java.sh
  - package:
      - hadoop
      - hadoop-libhdfs
      - hadoop-native
      - hadoop-pipes
      - hadoop-sbin
      - hadoop-lzo
      - lzo
      - lzo-devel
      - hadoop-lzo-native
First, the image specification should describe any arguments that may be used to adjust properties of the image:
arguments: # The section header
  - java-distro: # The friendly name of the argument, and the name of the variable passed to scripts
      description: The java distribution. # A friendly description to be used in help text
      default: openjdk # A default value for the argument
      required: false # Whether or not the argument is required
      choices: # The argument value must match an element of this list
        - oracle-java
        - openjdk
Specifications may contain any number of arguments, as declared above, by adding more members to the list under the arguments key.
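To make the meaning of the default, required, and choices fields concrete, here is a minimal sketch of how a single argument could be resolved from user input. The function `resolve_argument` is a hypothetical illustration of the fields described above, not the code Sahara uses (that lives in sahara.plugins.images).

```python
def resolve_argument(name, spec, supplied):
    """Return the effective value of one argument, or raise ValueError."""
    # A user-supplied value wins; otherwise fall back to the default.
    value = supplied.get(name, spec.get("default"))
    if value is None:
        if spec.get("required", False):
            raise ValueError("argument %s is required" % name)
        return None
    choices = spec.get("choices")
    if choices and value not in choices:
        raise ValueError("%s must be one of: %s" % (name, ", ".join(choices)))
    return value


# The java-distro argument from the specification above, as a Python dict.
java_distro_spec = {
    "description": "The java distribution.",
    "default": "openjdk",
    "required": False,
    "choices": ["oracle-java", "openjdk"],
}
```

With this sketch, omitting java-distro yields the default, while a value outside the choices list is rejected before any validator runs.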
This is where the logical flow of the image packing and validation process is declared. A tiny example validator list is specified below.
validators:
  - package: nfs-utils
  - script: setup_java.sh
This is fairly straightforward: this specification will install the nfs-utils package (or check that it’s present) and then run the setup_java.sh script.
All validators may be run in two modes: reconcile mode (reconcile == true) and test-only mode (reconcile == false). In reconcile mode, any image or instance state which is not already true will be updated, if possible. In test-only mode, validators only test the image or instance, and raise an error if the test fails.
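The two modes can be illustrated with a toy validator that operates on a plain set of package names instead of a real image. This is only a sketch of the behaviour described above; `ToyPackageValidator` and `ValidationError` are hypothetical names, not Sahara classes.

```python
class ValidationError(Exception):
    """Raised when the image or instance is not in the intended state."""


class ToyPackageValidator:
    """Checks for, or 'installs', one package in a set of package names."""

    def __init__(self, package):
        self.package = package

    def validate(self, installed, reconcile=True):
        if self.package in installed:
            return  # state is already true: nothing to do in either mode
        if not reconcile:
            # Test-only mode: report the discrepancy, change nothing.
            raise ValidationError("%s is not installed" % self.package)
        # Reconcile mode: update the state (stands in for a real install).
        installed.add(self.package)
```

The same `validate` call therefore serves both image packing (reconcile) and image checking (test-only), which is the property the framework relies on.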
We’ll now go over the types of validators that are currently available in Sahara. This framework is made to easily allow new validators to be created and old ones to be extended: if there’s something you need, please do file a wishlist bug or write and propose your own!
These validators take specific, concrete actions to assess or modify your image or instance.
This validator type will install a package on the image, or validate that a package is installed on the image. It can take several formats, as below:
validators:
  - package: hadoop
  - package:
      - hadoop-libhdfs
      - nfs-utils:
          version: 1.3.3-8
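As you can see, a package declaration can consist of a single package name given as a string, or a list of packages, each of which is either a name string or a mapping from the package name to its properties (such as version).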
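The formats shown in the example above can all be reduced to one uniform shape. The sketch below is an illustration only: `normalize_packages` is a hypothetical helper, and it assumes version is the only property of interest.

```python
def normalize_packages(declaration):
    """Return a list of (name, version) pairs; version is None if unpinned."""
    # A bare string is treated as a one-element list.
    entries = declaration if isinstance(declaration, list) else [declaration]
    packages = []
    for entry in entries:
        if isinstance(entry, str):
            packages.append((entry, None))
        else:
            # A mapping with a single key: the package name.
            name, props = next(iter(entry.items()))
            packages.append((name, (props or {}).get("version")))
    return packages
```

For the example above, the second declaration would normalize to `[("hadoop-libhdfs", None), ("nfs-utils", "1.3.3-8")]`.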
This validator will run a script on the image. It can take several formats as well:
validators:
  - script: simple_script.sh # Runs this file
  - script:
      set_java_home: # The name of a script file
        arguments: # Only the named environment arguments are passed, for clarity
          - jdk-home
          - jre-home
        output: OUTPUT_VAR
  - script:
      store_nfs_version: # Because inline is set, this is just a friendly name
        - inline: rpm -q nfs-utils # Runs this text directly, rather than reading a file
        - output: nfs-version # Places the stdout of this script into an argument
                              # for future scripts to consume; if none exists, the
                              # argument is created
Two variables are always available to scripts run under this framework:
These validators are used to build more complex logic into your specifications explicitly in the yaml layer, rather than by deferring too much logic to scripts.
This validator runs different logic depending on which distribution of Linux is being used in the guest.
validators:
  - os_case: # The contents are expressed as a list, not a dict, to preserve order
      - fedora: # Only the first match runs, so put distros before families
          - package: nfs-utils # The content of each case is a list of validators
      - redhat: # Red Hat distros include fedora, centos, and rhel
          - package: nfs-utils
      - debian: # The major supported Debian distro in Sahara is ubuntu
          - package: nfs-common
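The first-match rule in the comments above can be sketched as follows. This is a simplified illustration: `select_os_case` is a hypothetical function, the `FAMILIES` table is an assumption derived only from the comments in the example, and returning an empty list when nothing matches is also an assumption.

```python
# Assumed distro-to-family mapping, taken from the comments in the example.
FAMILIES = {
    "fedora": "redhat",
    "centos": "redhat",
    "rhel": "redhat",
    "ubuntu": "debian",
}


def select_os_case(cases, distro):
    """Return the validator list of the first case matching the distro."""
    names = {distro, FAMILIES.get(distro, distro)}
    for case in cases:  # a list of single-key dicts, so order is preserved
        name, validators = next(iter(case.items()))
        if name in names:
            return validators  # only the first match runs
    return []


cases = [
    {"fedora": [{"package": "nfs-utils"}]},
    {"redhat": [{"package": "nfs-utils"}]},
    {"debian": [{"package": "nfs-common"}]},
]
```

Because the cases are scanned in order, a fedora guest matches the fedora case before the redhat family case is ever considered, which is why specific distros should be listed first.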
This validator runs different logic depending on the value of an argument.
validators:
  - argument_case:
      argument_name: java-distro # The name of the argument
      cases: # The cases are expressed as a dict, as only one can equal the argument's value
        openjdk:
          - script: setup-openjdk # The content of each case is a list of validators
        oracle-java:
          - script: setup-oracle-java
This validator runs all the validators within it, as one logical block. If any validators within it fail to validate or modify the image or instance, it will fail.
validators:
  - all:
      - package: nfs-utils
      - script: setup-nfs.sh
This validator attempts to run each validator within it, until one succeeds, and will report success if any do. If this is run in reconcile mode, it will first try each validator in test-only mode, and will succeed without making changes if any succeed. In the case below, if openjdk 1.7.0 were already installed, the validator would succeed and would not install 1.8.0.
validators:
  - any: # This validator will try to install openjdk-1.8.0, but it will settle for 1.7.0 if that fails
      - package: java-1.8.0-openjdk-devel
      - package: java-1.7.0-openjdk-devel
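The two-pass behaviour of the any validator can be sketched with toy validators acting on a set of package names. All names here (`ToyPackage`, `validate_any`, `ValidationError`) are hypothetical, and the `installable` flag merely simulates an install that fails.

```python
class ValidationError(Exception):
    """Raised when a validator cannot confirm or reach the intended state."""


class ToyPackage:
    def __init__(self, name, installable=True):
        self.name = name
        self.installable = installable  # False simulates a failing install

    def validate(self, installed, reconcile=True):
        if self.name in installed:
            return
        if not reconcile or not self.installable:
            raise ValidationError(self.name)
        installed.add(self.name)


def validate_any(validators, installed, reconcile=True):
    """Succeed if any validator succeeds, preferring a no-change success."""
    # First pass: test only, so an already-satisfied option wins untouched.
    for validator in validators:
        try:
            validator.validate(installed, reconcile=False)
            return
        except ValidationError:
            pass
    if reconcile:
        # Second pass: try to make each option true, in declaration order.
        for validator in validators:
            try:
                validator.validate(installed, reconcile=True)
                return
            except ValidationError:
                pass
    raise ValidationError("no validator in 'any' succeeded")
```

With 1.7.0 already present the first pass succeeds and nothing is installed; on an empty image the second pass installs 1.8.0, falling back to 1.7.0 only if that install fails.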
You may find that you wish to store state in one place in the specification for use in another. In this case, you can use this validator to set an argument for future use.
validators:
  - argument_set:
      argument_name: java-version
      value: 1.7.0
In order to make this feature available for your plugin, you must implement the following optional plugin SPI methods.
When implementing these, you may use a framework of your own choosing (Packer for image packing, for example). If you do, you can ignore the entire framework and specification language described above. However, you may instead wish to use the abstraction we’ve provided, since its ability to keep logic in one place for both image packing and cluster validation is useful in the OpenStack context. We will, of course, focus on that framework here.
def get_image_arguments(self, hadoop_version):
    """Gets the argument set taken by the plugin's image generator"""

def pack_image(self, hadoop_version, remote,
               reconcile=True, image_arguments=None):
    """Packs an image for registration in Glance and use by Sahara"""

def validate_images(self, cluster, reconcile=True, image_arguments=None):
    """Validates the image to be used by a cluster"""
The validate_images method is called after Heat provisioning of your cluster, but before cluster configuration. If the reconcile keyword of this method is set to False, the method should only test the instances without modification. If it is set to True, the method should make any necessary changes; this allows clusters to be spun up from clean, OS-only images. This method is expected to use an ssh remote to communicate with instances, as is usual in Sahara.
The pack_image method can be used to modify an image file (it is called by the CLI above). This method expects an ImageRemote, which is essentially a libguestfs handle to the disk image file, allowing commands to be run on the image directly (though it could be any concretion that allows commands to be run against the image.)
By this means, the validators described above can execute the same logic in the image packing, instance validation, and instance preparation cases with the same degree of interactivity and logical control.
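The idea of sharing one code path between the two entry points can be sketched without any Sahara imports. Everything below is a hypothetical stand-in: `RecordingRemote` only records commands instead of running them, the command strings are placeholders, and treating `cluster` as an iterable of remotes is a simplifying assumption (a real implementation would obtain remotes from the cluster's instances).

```python
class RecordingRemote:
    """Stand-in for an ssh remote or ImageRemote; it only records commands."""

    def __init__(self):
        self.commands = []

    def run(self, command):
        self.commands.append(command)


class SketchPlugin:
    """Hypothetical plugin: both SPI entry points share one code path."""

    def get_image_arguments(self, hadoop_version):
        return {"java-distro": {"default": "openjdk", "required": False}}

    def _validate(self, remote, reconcile, image_arguments):
        args = image_arguments or {}
        distro = args.get("java-distro", "openjdk")
        # Placeholder commands: modify in reconcile mode, otherwise only test.
        mode = "install" if reconcile else "check"
        remote.run("%s %s" % (mode, distro))

    def pack_image(self, hadoop_version, remote,
                   reconcile=True, image_arguments=None):
        self._validate(remote, reconcile, image_arguments)

    def validate_images(self, cluster, reconcile=True, image_arguments=None):
        for remote in cluster:  # assumption: cluster yields one remote each
            self._validate(remote, reconcile, image_arguments)
```

The design point is that `_validate` neither knows nor cares whether the remote wraps a disk image or a running instance, so the same logic serves packing, validation, and preparation.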
In order to future-proof this document against possible changes, the docstrings of these methods will not be reproduced here, but the methods are thoroughly documented in the sahara.plugins.provisioning abstraction.
These abstractions can be found in the module sahara.plugins.images. You will find that the framework has been built with extensibility and abstraction in mind: you can override validator types, add your own without modifying any core sahara modules, declare hierarchies of resource locations for shared resources, and more. These features are documented in the sahara.plugins.images module itself (which has copious docstrings), and we encourage you to explore and ask questions of the community if you are curious or wish to build your own image generation tooling.