Louai Alarabi: February 2014

Tuesday, February 18, 2014

Pass parameter to Hadoop Mapper

A typical Hadoop word count example counts every word in a file. Let's assume that we have a file with the following contents:

Louai
Wael
Ahmed
Wael

If we run a typical word count example (many can be found online):

/bin/hadoop jar wordcount /input.txt /output

The output of the word count will be:
Louai 1
Ahmed 1
Wael 2
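
As a quick local sanity check (not a Hadoop job), the same counts can be reproduced with standard shell tools. This is only a sketch: it assumes the names are stored one per line, and the helper name count_words is made up for illustration.

```shell
# count_words FILE: print "word count" for every distinct word in FILE.
# Hypothetical helper for checking expected results locally; it does not use Hadoop.
count_words() {
    # awk tallies the first field of each non-empty line;
    # sort then orders the result alphabetically by word.
    awk 'NF { counts[$1]++ } END { for (w in counts) print w, counts[w] }' "$1" | sort
}
```

Running count_words input.txt on the file above prints the same three counts, ordered alphabetically.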

What if we would like to count only specific words in the input documents? For instance, we would like to count only the keyword wael. To accomplish this, we need to pass an argument to the Hadoop mapper, given here after -D on the command line, as in the following command:

/bin/hadoop jar wordcount /input.txt /output -D parameter wael

* First, we need to add this code before the definition of the Job:

//Set the configuration for the job
Configuration conf = getConf();
//args[4] is the fifth command-line argument, i.e. the value "wael" in the command above
conf.set("parameter", args[4]);
Job job = new Job(conf, "louai word count");

* Second, we need to read the passed parameter in the mapper:

String parameter = context.getConfiguration().get("parameter");
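
To preview what the filtered job should produce, the mapper-side check can be imitated locally. This is a shell sketch, not the code from the repository: count_keyword is a hypothetical helper, and matching case-insensitively (so that wael matches Wael) is an assumption about the intended behavior.

```shell
# count_keyword KEYWORD FILE: count the lines of FILE whose first word
# equals KEYWORD, ignoring case. The keyword is handed to awk with -v,
# much like the driver hands "parameter" to the mapper via the Configuration.
count_keyword() {
    awk -v kw="$1" 'tolower($1) == tolower(kw) { n++ } END { print kw, n + 0 }' "$2"
}
```

For example, count_keyword wael input.txt reports how many lines of the sample file match the keyword.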

The full code is on GitHub: https://github.umn.edu/alar0021/blogRepository/blob/master/wordCount/WordCount.java

Wednesday, February 12, 2014

Setup JAVA_HOME Ubuntu

There are several ways to set up JAVA_HOME:

1) Temporarily, for the current session only: open the terminal and enter these two commands:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin

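2) Permanently, by editing a startup file: ~/.bashrc applies to the current user only, while /etc/profile applies to all users (editing it requires root privileges). Open the terminal and enter one of these commands:

nano ~/.bashrc

sudo nano /etc/profile

3) At the beginning of the opened file, paste these two lines, then save the file and open a new terminal (or log in again) for the change to take effect:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin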
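
After setting the variables, it helps to confirm that JAVA_HOME really points at a Java installation. A minimal sketch, assuming a POSIX shell: check_java_home is a hypothetical helper name, and the only thing it tests is that an executable bin/java exists under the given directory.

```shell
# check_java_home DIR: succeed if DIR looks like a Java installation.
check_java_home() {
    if [ -z "$1" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$1/bin/java" ]; then
        echo "no executable found at $1/bin/java" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $1"
}
```

Call it as check_java_home "$JAVA_HOME" in a new terminal; a non-zero exit status means the export lines need another look.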

SSH without password

SSH Login without password:

Let's say that your machine is A and you want to access machine B without a password. Here are the steps:

1- On machine A, run the following command:

ssh-keygen -t rsa

2- Press Enter at every prompt; there is no need to type anything.

3- Copy the content of the following file:

less /home/A/.ssh/id_rsa.pub

4- Access machine B with ssh and your password:

ssh user@B....

5- Go to the .ssh directory on machine B (create it first with mkdir ~/.ssh if it does not exist):

cd .ssh

6- Create a new file (on machine B):

nano authorized_keys

7- Paste the content of id_rsa.pub into the authorized_keys file and save it. The file must have exactly this name, authorized_keys, for sshd to find it.
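
The steps on machine B can also be done without an editor. A minimal sketch, assuming the public key has already been copied to machine B as a file: install_pubkey is a hypothetical helper, and the restrictive permissions are there because sshd, with its default StrictModes setting, refuses key files that other users can write to.

```shell
# install_pubkey PUBKEY_FILE [SSH_DIR]: add a public key to authorized_keys.
# SSH_DIR defaults to ~/.ssh; the second argument exists mainly for testing.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    # Read the key first so a missing file fails before anything is created
    key=$(cat "$pubkey_file") || return 1

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # Append only if this exact key line is not already present
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

Running it twice with the same key leaves a single entry in the file. Where it is available, the ssh-copy-id user@B command run from machine A does a similar job in one step.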