Creating a Hive Cluster Job Using the Amazon EMR CLI

Following on from my previous post on Installing the Amazon EMR Command Line Interface for Windows, I will look at how to create an Elastic MapReduce job flow (a Hive cluster) from the CLI. This covers creating the instances, i.e. a master instance and slave instances, with an interactive Hive session.

Step 1: Create the cluster without specifying the number of instances

– Run the following command from the Amazon EMR CLI directory:

ruby elastic-mapreduce --create --alive --hive-interactive --name "Test_Job Flow" --instance-type m1.small

[Screenshot: emr-cli-instance01]

– The instance should now be up and running

[Screenshot: emr-cli-instance02]

– As seen, this created just one instance, which acts as the master instance

[Screenshot: emr-cli-instance03]

Step 2: Now add the number-of-instances switch (--num-instances) and see the result

ruby elastic-mapreduce --create --alive --hive-interactive --name "Test_Job Flow" --instance-type m1.small --num-instances 2

[Screenshot: emr-cli-instance04]

– The number of instances is now 2

[Screenshot: emr-cli-instance05]
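The commands in Step 1 and Step 2 differ only in the --num-instances switch. As a rough illustration, here is a minimal Python sketch that assembles the same switches as a list. The helper name build_create_args is hypothetical and not part of the EMR CLI; it only mirrors the switches used in this post.

```python
def build_create_args(name, instance_type, num_instances=None):
    """Assemble the switches for an interactive Hive job flow.

    Hypothetical helper (not part of the EMR CLI). It only mirrors
    the switches shown in Step 1 and Step 2 of this post.
    """
    args = [
        "--create",
        "--alive",
        "--hive-interactive",
        "--name", name,
        "--instance-type", instance_type,
    ]
    # Step 1 omits the switch entirely; Step 2 adds it with a count.
    if num_instances is not None:
        args += ["--num-instances", str(num_instances)]
    return args


step1 = build_create_args("Test_Job Flow", "m1.small")
step2 = build_create_args("Test_Job Flow", "m1.small", num_instances=2)
print(step1)
print(step2)
```

Because the name stays a single list element, the space in "Test_Job Flow" needs no extra quoting if the list is later handed to something like subprocess.run, prefixed with however the CLI is invoked on your machine.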

– Viewing the instance groups, one instance acts as MASTER and the second as CORE

[Screenshot: emr-cli-instance06]

The same holds in a 3-instance scenario: you will have 1 MASTER instance, and the remaining 2 will be CORE instances

[Screenshot: emr-cli-instance07]
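The pattern seen across the 1, 2 and 3 instance runs can be written down as a small Python sketch. It assumes, based only on the results in this post, that one instance is always MASTER and every additional instance is CORE; the function name instance_groups is hypothetical.

```python
def instance_groups(num_instances=1):
    """Return the MASTER/CORE split observed in this post.

    Assumption: one instance is always MASTER and the rest are CORE,
    as seen here for 1, 2 and 3 instances. The default of 1 matches
    Step 1, where no instance count was given.
    """
    if num_instances < 1:
        raise ValueError("a cluster needs at least one instance")
    return {"MASTER": 1, "CORE": num_instances - 1}


for n in (1, 2, 3):
    print(n, instance_groups(n))
```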

 