The Cloudera plugin is a Sahara plugin which allows the user to deploy and operate a cluster with Cloudera Manager.
The Cloudera plugin is enabled in Sahara by default. To explicitly enable or disable it, edit the “plugins” line in the Sahara configuration file (default /etc/sahara/sahara.conf).
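As a rough illustration of checking the “plugins” line, the sketch below parses a sample configuration with Python's standard configparser module. It assumes that the option lives in the [DEFAULT] section and that the Cloudera plugin is registered under the name "cdh"; the sample file content and the other plugin names are hypothetical, and a real sahara.conf may use syntax that configparser does not fully handle.

```python
import configparser

# Hypothetical sahara.conf content; the plugin list is illustrative only.
SAMPLE_CONF = """
[DEFAULT]
plugins = vanilla,spark,cdh
"""


def enabled_plugins(conf_text):
    """Return the names on the 'plugins' line, or None if the line is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.defaults().get("plugins")
    if raw is None:
        return None
    return [name.strip() for name in raw.split(",") if name.strip()]


def is_cdh_enabled(conf_text, plugin_name="cdh"):
    """Check whether the Cloudera plugin is enabled.

    Assumption: the plugin is registered as "cdh". When no 'plugins' line
    exists, the plugin is treated as enabled, since it is on by default.
    """
    plugins = enabled_plugins(conf_text)
    if plugins is None:
        return True
    return plugin_name in plugins


print(is_cdh_enabled(SAMPLE_CONF))
```

To check a real deployment, the file content could be read from /etc/sahara/sahara.conf and passed to is_cdh_enabled in place of the sample string.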
To provision a cluster, either build images as described in Building Images for Cloudera Plugin or download prepared images from http://sahara-files.mirantis.com/images/upstream/. The prepared images already have Cloudera Express installed (versions 5.0.0, 5.3.0, 5.4.0, 5.5.0, 5.7.x and 5.9.x).
The Cloudera plugin requires the image to be tagged in the Sahara Image Registry with two tags: ‘cdh’ and ‘<cloudera version>’ (e.g. ‘5’, ‘5.3.0’, ‘5.4.0’, ‘5.5.0’, ‘5.7.0’, ‘5.9.0’ or ‘5.9.1’, where ‘5’ stands for ‘5.0.0’).
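The tagging requirement can be expressed as a small pre-flight check. The following is a minimal sketch, not part of Sahara: the function name is hypothetical, and the set of version tags contains only the examples listed above, so it may be incomplete.

```python
# Version tags taken from the examples in the text; '5' stands for 5.0.0.
KNOWN_VERSION_TAGS = {"5", "5.3.0", "5.4.0", "5.5.0", "5.7.0", "5.9.0", "5.9.1"}


def missing_cdh_tags(image_tags):
    """Return a list of problems with an image's tags (empty if none found).

    Hypothetical helper: it only checks for the 'cdh' tag and for at least
    one of the version tags named in the documentation.
    """
    tags = set(image_tags)
    problems = []
    if "cdh" not in tags:
        problems.append("missing 'cdh' tag")
    if not tags & KNOWN_VERSION_TAGS:
        problems.append("missing Cloudera version tag")
    return problems


print(missing_cdh_tags(["cdh", "5.9.0"]))
print(missing_cdh_tags(["5.9.0"]))
```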
The default username specified for these images depends on the distribution:
For versions 5.0.0, 5.3.0 and 5.4.0:
OS | username |
---|---|
Ubuntu 12.04 | ubuntu |
CentOS 6.6 | cloud-user |
For version 5.5.0 and higher:
OS | username |
---|---|
Ubuntu 14.04 | ubuntu |
CentOS 6.6 | cloud-user |
CentOS 7 | centos |
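The two tables above can be captured as a simple lookup. This is a sketch with hypothetical function and variable names; it expects a concrete version string such as "5.4.0" and the OS names exactly as they appear in the tables.

```python
# Usernames for versions 5.0.0, 5.3.0 and 5.4.0 (first table).
USERNAMES_BEFORE_5_5 = {
    "Ubuntu 12.04": "ubuntu",
    "CentOS 6.6": "cloud-user",
}

# Usernames for version 5.5.0 and higher (second table).
USERNAMES_FROM_5_5 = {
    "Ubuntu 14.04": "ubuntu",
    "CentOS 6.6": "cloud-user",
    "CentOS 7": "centos",
}


def default_username(cdh_version, os_name):
    """Return the default image username, or None if the pair is not listed."""
    version = tuple(int(part) for part in cdh_version.split("."))
    table = USERNAMES_FROM_5_5 if version >= (5, 5) else USERNAMES_BEFORE_5_5
    return table.get(os_name)


print(default_username("5.4.0", "Ubuntu 12.04"))
print(default_username("5.9.1", "CentOS 7"))
```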
The following services are currently supported in all versions of the Cloudera plugin: HDFS, Oozie, YARN, Spark, Zookeeper, Hive, Hue and HBase. Version 5.3.0 of the plugin added support for Impala, Flume, Solr, Sqoop and Key-value Store Indexer. Version 5.4.0 builds on 5.3.0 and adds support for the KMS service. Kafka 2.0.2 is available for CDH 5.5 and higher.
Note
The Sentry service is enabled in the Cloudera plugin. However, Sentry requires Kerberos authentication, which is not enabled in the cluster for CDH versions earlier than 5.5. On those versions, using the Sentry service has no real effect, and other services that depend on Sentry do not perform any authentication either.
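The version-to-service mapping described above can be sketched as a cumulative lookup. This reading assumes that each later plugin version keeps the services introduced by earlier ones, which the text implies but does not state outright. The names are hypothetical, Sentry is left out because the text does not say which versions introduced it, and the function expects a concrete version such as "5.7.0" rather than "5.7.x".

```python
# Services listed as supported in every plugin version.
BASE_SERVICES = ["HDFS", "Oozie", "YARN", "Spark", "Zookeeper", "Hive", "Hue", "HBase"]

# (minimum version, services added at that version), in ascending order.
ADDED_SERVICES = [
    ((5, 3), ["Impala", "Flume", "Solr", "Sqoop", "Key-value Store Indexer"]),
    ((5, 4), ["KMS"]),
    ((5, 5), ["Kafka"]),
]


def supported_services(cdh_version):
    """Return the services available for a concrete plugin version string."""
    version = tuple(int(part) for part in cdh_version.split("."))
    services = list(BASE_SERVICES)
    for minimum, extra in ADDED_SERVICES:
        # Assumption: later versions retain everything added earlier.
        if version >= minimum:
            services.extend(extra)
    return services


print(supported_services("5.4.0"))
```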
HDFS NameNode High Availability is supported beginning with Cloudera version 5.4.0. Refer to Features Overview for details.
YARN ResourceManager High Availability is supported beginning with Cloudera version 5.4.0. This feature adds redundancy in the form of an Active/Standby ResourceManager pair so that a single RM is not a point of failure. Upon failover, the Standby RM becomes Active so that applications can resume from their last checkpointed state.
When the user performs an operation on the cluster using the Cloudera plugin, the cluster topology requested by the user is verified for consistency.
The following limitations apply to the cluster topology for all Cloudera plugin versions:
dfs_replication datanodes.
For versions 5.3.0, 5.4.0, 5.5.0, 5.7.x and 5.9.x of the Cloudera plugin, there are a few extra limitations on the cluster topology:
For versions 5.5.0, 5.7.x and 5.9.x of the Cloudera plugin, additional services are available in the cluster topology:
To protect your clusters with MIT Kerberos security, complete the steps below.
To create a cluster protected by Kerberos security, enable Kerberos with the checkbox in the General Parameters section of the cluster configuration. If you prefer to use the OpenStack CLI for cluster creation, put the data below in the cluster_configs section:
"cluster_configs": {
    "Enable Kerberos Security": true
}
In this case Sahara prepares the KDC server and creates principals along with keytabs to enable authentication for Hadoop services.
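When cluster definitions are generated by a script rather than written by hand, the Kerberos flag can be merged into an existing request body before it is saved as JSON. The sketch below is a minimal example under stated assumptions: the helper name is hypothetical, and the name, plugin_name and hadoop_version values in the sample body are placeholders rather than values taken from this guide. Only the "Enable Kerberos Security" key comes from the fragment above.

```python
import json


def with_kerberos(cluster_body):
    """Return a copy of the body with Kerberos enabled in cluster_configs."""
    body = dict(cluster_body)
    # Copy the nested dict so the caller's data is not modified.
    cluster_configs = dict(body.get("cluster_configs", {}))
    cluster_configs["Enable Kerberos Security"] = True
    body["cluster_configs"] = cluster_configs
    return body


# Placeholder body; these field values are illustrative assumptions.
base_body = {
    "name": "cdh-secure",
    "plugin_name": "cdh",
    "hadoop_version": "5.9.0",
}

# json.dumps renders Python's True as JSON true, without a trailing comma.
print(json.dumps(with_kerberos(base_body), indent=4))
```

Generating the JSON this way avoids hand-editing mistakes such as trailing commas, which strict JSON parsers reject.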
Ensure that you have the latest hadoop-openstack jar file distributed on your cluster nodes. You can download one at https://tarballs.openstack.org/sahara-extra/dist/
Sahara will create principals along with keytabs for system users like hdfs and spark so that you will not have to perform additional auth operations to execute your jobs on top of the cluster.
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License.