Welcome!
Create HDP Application on BlueData App Workbench
Hello. Today we will be teaching you how to create an image on the BlueData EPIC platform. We will create an image consisting of HDP using the BlueData EPIC Application Workbench on a CentOS base image container.
Prerequisites:
-Basic knowledge of containers
-Linux administration
-Git
-HDP

Steps
Create HDP Application on BlueData App Workbench
Step 1 - Preparing the Environment
To create the image on the BlueData EPIC platform, you need to install the BlueData EPIC App Workbench.
We have already installed the BlueData EPIC App Workbench; for more information, please click here.
To check the version of the BlueData App Workbench, execute the following command:
bdwb --version
Step 2 - Getting Started
To begin the application development, we will first need to create a directory called “HDP” or any directory name of your choice. This directory will house all the files and components necessary to create the application image.
To create a directory, execute the following command:
mkdir ~/HDP
Navigate to the newly created directory:
cd ~/HDP
Next, we need to create a skeleton file structure. To do so, execute the following command:
mkdir image
Now, we will create a folder called “centos” inside the empty image directory. Execute the following command to do so:
mkdir ~/HDP/image/centos
Execute the below command to list all the files and folders created:
ls -R
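Assuming only the directories created above exist under ~/HDP, the output should look roughly like this:

.:
image

./image:
centos

./image/centos: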
Step 3 - Create the Dockerfile
The next step is to create a Dockerfile.
"A docker file is a text file that the Docker engine understands to automatically build an image by reading the file. The Dockerfile consists of all the commands a user would call to assemble the desired image."
Let’s go ahead and create a Dockerfile inside the newly created centos folder.
For your reference, we have already created a ready Dockerfile in the ~/test directory. We will copy that file into the centos folder using the following command:
cp ~/test/Dockerfile ~/HDP/image/centos
To view the contents of the Dockerfile, you can use vi, vim, or cat. To view the contents in the terminal console, execute the following command:
cat ~/HDP/image/centos/Dockerfile
You will now see many commands populate your terminal. These are the commands you would use if you were to install your application manually on a host.
The first line of the Dockerfile specifies the “base” image on which you will install your application.
BlueData provides its own base image, which you can use by simply adding the following command at the top of your Dockerfile:
FROM bluedata/centos7:latest
You also have the ability to create your own base image.
e.g., in FROM ubuntu:12.04, ubuntu:12.04 is the base image being used.
All the commands following the base image are the commands used to set up the application.
These files and commands are layered on top of the base image from BlueData and will eventually compile into a .bin file for use on the EPIC platform.
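For illustration only, here is a minimal sketch of what such a Dockerfile might look like; the packages and files shown are hypothetical, and the ready-made Dockerfile in ~/test will differ:

FROM bluedata/centos7:latest

# Install OS packages the application depends on (illustrative list)
RUN yum install -y wget unzip

# Copy a hypothetical setup script into the image and make it executable
COPY setup.sh /opt/setup.sh
RUN chmod +x /opt/setup.sh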
Step 4 - Creating Configuration Files
Set up the additional dependencies needed for HDP.
In this step, we will be showing you the additional steps needed to create a successful HDP image.
We have already referenced the files that are needed; all you need to do is copy them into the appropriate location.
It is always good to check which version of HDP you are using, so that you understand the dependencies you may need to set up in your base image.
We need to add additional configuration files under the HDP directory. We have already made these files for you; to add them, please execute the following commands:
Execute the below command:
yum install wget -y
Add the configuration files using the below command (due to space constraints in Katacoda, we have uploaded the required appconfig files to external storage):
wget https://bluedata-srujan.s3.amazonaws.com/dev/bins/ambari-26-setup.zip
yum install unzip -y
Unzip the file:
unzip ambari-26-setup.zip
Check the files under the ambari-26-setup directory:
ls ambari-26-setup
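Based on the descriptions below, the listing should include files along these lines (exact names and casing may vary):

add_remove_node.py  appjob  enable_kerberos.py  logging.sh  modify_host.py  setup_cluster.py  startscript  utils.sh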
- startscript is a script file that contains the code to start all service(s).
- appjob provides information on the type of job to be launched; we can also add application-specific jobs.
- logging.sh provides the logging facilities for a catalog configuration bundle.
- utils.sh contains utility functions that provide information on the Docker ID, CPU share, memory status, and FQDN of the current container.
- modify_host.py is executed during scale-up/scale-down and is used to add or remove nodes in the HDP cluster.
- setup_cluster.py is responsible for setting up the HDP cluster.
- add_remove_node.py is responsible for scaling the nodes in the HDP cluster up or down.
- enable_kerberos.py enables Kerberos in the HDP cluster.
Let's look into the startscript.
The startscript contains sections such as the cluster creation metadata for the HDP image, the cluster config choice selections for the HDP image, and the tenant-level settings for HDP.
cat /root/HDP/ambari-26-setup/startscript
The startscript executes on each and every host. Once a host is created, the Ambari installation takes place: the Ambari server and agents register on each host, followed by the creation of the Hive and Oozie databases.
Check the link to learn more about the Ambari installation and setup process.
Cluster creation metadata section:
Here we use the bdvcli utility to obtain information regarding the node role, node FQDN, node distro_id, and node group ID.
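As a loose sketch, these lookups are typically key/value queries; the exact option and key names below are illustrative assumptions, so check the startscript itself for the real invocations:

bdvcli --get node.fqdn
bdvcli --get node.distro_id
bdvcli --get node.nodegroup_id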
Blueprint templates section:
Here the HDP and HDP HA templates generate the configuration structure for an HDP cluster and an HDP HA cluster, respectively.
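For context, an Ambari blueprint is a JSON document that names the stack and maps components to host groups. Below is a minimal illustrative sketch of the general blueprint shape, not the actual template shipped in this bundle:

{
  "Blueprints": {
    "blueprint_name": "hdp26-minimal",
    "stack_name": "HDP",
    "stack_version": "2.6"
  },
  "host_groups": [
    {
      "name": "controller",
      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCEMANAGER" } ],
      "cardinality": "1"
    }
  ]
}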
The main script in the files contains the code for cluster creation, license activation, and service deployment.
Remove the zip file from the folder
rm -rf ambari-26-setup.zip
Step 5 - Copying the Image File to HDP
Let's copy the image file to our working directory.
When our image is ready to deploy to the EPIC Application Catalog, we need to include a picture that represents it. For your reference, we have already created a .png file for your use.
cp ~/test/Logo_HortonWorks.png ~/HDP
The logo file (a 400px x 200px .png) is used to visually identify each application in the App Store.
Step 6 - Create the JSON File for HDP
Now, we will look into our .json file.
Copy the JSON file from the test directory to HDP:
cp ~/test/hdp26-ambari26.json ~/HDP
To view the content of the file, execute the following:
cat ~/HDP/hdp26-ambari26.json
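If you edit this file later, you can sanity-check the JSON syntax with Python's built-in json.tool module (any JSON validator works here):

python -m json.tool ~/HDP/hdp26-ambari26.json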
The JSON file contains the application registration and deployment information.
The following configuration is done in the JSON file:
- Setting the cardinality for the different roles (controller, standby, arbiter, worker, etc.)
- Exporting and defining the endpoint for a service
- Enabling the GUI service
- Providing the name, description, and distro_id for the HDP image
- Deploying the selected services in a particular role
The JSON file contains the below sections:
- Identification
- Components
- Services
- Node Roles
- Configuration
Below is the example snippet for identification:
"distro_id": "bluedata/hdp26-ambari26-7x-macys"
"label": {
"name": ""name": "HDP 2.6 on 7.x with Ambari 2.6",
"description": "HDP 2.6.4.0 on 7.x with Ambari 2.6.2.2 with YARN support. Includes Pig, Hive, Oozie and HBase"
},
"version": "1.0",
"epic_compatible_versions": ["3.4"],
"categories": [ "Hadoop", "HBase" ],
distro_id is a unique identifier for either a Catalog entry or a versioned set of Catalog entries.
label is a property that contains the following parameters:
name which is the "short name" of the Catalog entry. The Catalog API does not allow entries with different distro IDs to share the same name.
description which is a longer, more detailed blurb about the entry.
version is a discriminator between multiple Catalog entries that share the same distro ID.
epic_compatible_versions lists the EPIC platform versions where this Catalog entry may be used.
categories is a list of strings used by the EPIC interface to group Catalog entries during cluster creation.
Note: When upgrading the HDP image, the name and distro_id in the JSON file must be updated to match the new HDP image version, so that the upgraded image is reflected in the App Store.
Below is the example snippet for components:
"image": {
"checksum": "", "source_file": ""
},
"setup_package": {
"config_api_version": 7,
"checksum": "",
"source_file": ""
},
image is a property that identifies the location for the image used to launch virtual nodes for this Catalog entry.
setup_package is similar to the image property except for the configuration scripts package that runs inside the launched virtual node.
Below is the example snippet for services:
"services": [
{
"id": "hbase_master",
"exported_service": "hbase",
"label": {
"name": "HMaster"
},
"endpoint" : {
"url_scheme" : "http",
"port" : "60010",
"path" : "/",
"is_dashboard" : true
}
},
In this example, services is a list of service objects.
The defined services will be referenced by other elements of this JSON file to determine which services are active on which nodes within the cluster.
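Given the snippet above, EPIC would expose the HMaster web UI as a dashboard link of the form http://<node-fqdn>:60010/, following the url_scheme, port, and path defined in the endpoint object.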
Below is the example snippet for node roles:
"node_roles": [
{
"id": "controller",
"cardinality": "1",
"anti_affinity_group_id": "CM",
"min_cores": "4",
"min_memory": "12288"
},
In this example, node_roles is a list of objects describing the roles that may be deployed for this Catalog entry. Each role is a particular configuration instantiated from the entry's virtual node image and configured by the setup scripts.
- Selected Roles lists the IDs of the roles that will be deployed.
- Node Services lists the IDs of the services that will be present on nodes of a given role, if that role is deployed.
- Config Metadata lists string key/value pairs that can be referenced by the setup scripts.
- Config Choices lists both the choices available to the UI/API user and the possible selections for each choice.
Step 7 - Building the Bin File Using BlueData App Workbench
In this step, we will create the bin file using the BlueData App Workbench by executing the following commands.
The .wb file contains a series of App Workbench commands that control the creation of the Catalog image.
Please review the link about .wb files before proceeding: Link
Task 1:
For your reference, we have already created a ready .wb file in the ~/test directory. We will copy that file into the HDP directory using the following command:
cp ~/test/hdp26-ambari26.wb ~/HDP
To check the files in the HDP folder:
ls
Task 2:
To view the content of the file, execute the following:
cat ~/HDP/hdp26-ambari26.wb
The following configuration is done in the .wb file:
- loading the JSON file, generating the scripts (logging.sh, appjob, etc.) inside the ambari-26-setup directory, and adding the logo to the HDP image
- creating the Docker image with the Dockerfile under “image/centos” and packaging the Docker image
Task 3:
Execute the .wb file:
./hdp26-ambari26.wb
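Note: if the shell reports a "Permission denied" error, the .wb file may not have the executable bit set; make it executable and run it again:

chmod +x ~/HDP/hdp26-ambari26.wb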
After executing the above command, wait for some time until the bin file gets created.
Step 8 - Finalising the Build for HDP
Checking the bin file for HDP.
Let's see what you built.
The newly built application package (or bundle) is saved in the deliverables directory.
cd deliverables
ls
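You should see the newly generated bin file, for example:

bdcatalog-centos7-bluedata-hdp26-ambari26-1.0.bin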
To make the new image appear in the App Store:
Copy the bin file to /srv/bluedata/catalog using the following command:
cp bdcatalog-centos7-bluedata-hdp26-ambari26-1.0.bin /srv/bluedata/catalog
Make it executable using the below command:
chmod +x bdcatalog-centos7-bluedata-hdp26-ambari26-1.0.bin
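To confirm the file is in place with the executable bit set, list the catalog directory:

ls -l /srv/bluedata/catalog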
Go to the App Store in the EPIC GUI and click the Refresh button to bring the image into the App Store.
Once the image appears in the App Store, click the Install button to install the image.