Installing Hadoop on Windows / Cygwin

Published: 2011-01-10
Highly unnecessary... unless you're stuck with a Windows machine :-/

1) Install Cygwin

This is as straightforward as it gets, but don't forget to add your favorite editor: vim is not included in the default install (!).

The default directory is c:\cygwin; there is no reason to change it.

2) Install the Java Development Kit

Avoid spaces in the installation directory (the default location under c:\Program Files contains one): c:\jdk1.6 will do nicely.

3) Install Hadoop

Download the latest release (0.21.0 at the time of writing) and extract it to d:\work (or something similar).
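
From the Cygwin shell, the extraction could be sketched as below. The helper name extract_release and the tarball name hadoop-0.21.0.tar.gz are assumptions for illustration; adjust them to whatever you actually downloaded and wherever you saved it.

```shell
# Hypothetical helper: unpack a release tarball into a target directory.
# $1 = path to the .tar.gz file, $2 = destination directory
extract_release() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example call (assumed file name and location):
# extract_release hadoop-0.21.0.tar.gz /cygdrive/d/work
```

Note that d:\work is spelled /cygdrive/d/work inside Cygwin.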

4) Fix your environment variables

Start Cygwin and append the following lines to the end of your .bashrc file, then open a new Cygwin window (or run "source ~/.bashrc") so they take effect:

export JAVA_HOME=/cygdrive/c/jdk1.6
export HADOOP_INSTALL=/cygdrive/d/work/hadoop-0.21.0
export PATH=$PATH:$HADOOP_INSTALL/bin
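
A quick sanity check on these variables can save some head-scratching later. The snippet below is only a sketch: check_env_path is a hypothetical helper, not part of Hadoop or Cygwin, and it checks only for empty values and spaces.

```shell
# Hypothetical helper: report whether a variable's value is usable as a path.
# $1 = variable name (used in the message), $2 = its value
check_env_path() {
  if [ -z "$2" ]; then
    echo "$1 is not set"
    return 1
  fi
  case "$2" in
    *" "*)
      echo "$1 contains a space: $2"
      return 1
      ;;
  esac
  echo "$1 looks fine: $2"
}

# "|| true" keeps the shell going even if a check fails
check_env_path JAVA_HOME "${JAVA_HOME:-}" || true
check_env_path HADOOP_INSTALL "${HADOOP_INSTALL:-}" || true
```

Run it in a fresh Cygwin window so that the values come from .bashrc rather than from your current session.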

5) Fix the hadoop-config script

$ vi $HADOOP_INSTALL/bin/hadoop-config.sh

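Locate the section starting with "# cygwin path translation" and add the following line to it. The JVM is a native Windows program, so the class path has to be converted from Cygwin form to a Windows-style path list (-w for Windows form, -p for a path list):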

CLASSPATH=`cygpath -wp "$CLASSPATH"`

Save and exit.
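
To confirm the edit was saved, something along these lines could be used. has_classpath_fix is a hypothetical helper written for this post; it only looks for the exact line shown above and does not check that the line sits in the right section.

```shell
# Hypothetical helper: succeed if the file contains the cygpath line from this step.
# $1 = path to hadoop-config.sh
has_classpath_fix() {
  # -F matches the text literally, -q suppresses output
  grep -qF 'CLASSPATH=`cygpath -wp "$CLASSPATH"`' "$1"
}

# Example call (assumes HADOOP_INSTALL is set as in step 4):
# has_classpath_fix "$HADOOP_INSTALL/bin/hadoop-config.sh" && echo "fix present"
```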

6) Test your installation

$ hadoop version
Hadoop 0.21.0
etc etc.
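
If you want to check the version from a script rather than by eye, a minimal sketch follows. parse_hadoop_version is a hypothetical helper, and it assumes the first line of the output has the form "Hadoop <version>" as shown above.

```shell
# Hypothetical helper: read "hadoop version" output on stdin, print the version number.
parse_hadoop_version() {
  # Only the first line is inspected; it is expected to look like "Hadoop 0.21.0".
  # Nothing is printed if the first line does not match.
  sed -n '1s/^Hadoop \(.*\)$/\1/p'
}

# Example call (requires hadoop on the PATH):
# hadoop version | parse_hadoop_version
```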


That's it. Happy hadoop'ing :)