Installing Hadoop on Windows / Cygwin

Published: 2011-01-10
Highly unnecessary... unless you're stuck with a Windows machine :-/

1) Install Cygwin

This is as straightforward as it gets, but don't forget to add your favorite editor: vim is not included in the default install (!).

The default directory is c:\cygwin, and there is no reason to change it.

2) Install the Java Development Kit

Avoid spaces in the installation path: the default location under c:\Program Files contains one, which tends to trip up shell scripts. c:\jdk1.6 will do nicely.

3) Install Hadoop

Download the latest release (0.21.0 at the time of writing) and extract it to d:\work (or something similar).

4) Fix your environment variables

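Start Cygwin and append the following lines to the end of your .bashrc file (no leading "$" prompt, since they go into a file):

export JAVA_HOME=/cygdrive/c/jdk1.6
export HADOOP_INSTALL=/cygdrive/d/work/hadoop-0.21.0
export PATH=$PATH:$HADOOP_INSTALL/bin

Open a new Cygwin window (or run "source ~/.bashrc") so that the changes take effect.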
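Once the variables are in place, a quick sanity check can catch a mistyped path before Hadoop does. The sketch below is only an illustration: check_env_dir is a hypothetical helper name, not something shipped with Cygwin or Hadoop.

```shell
# Hypothetical helper (not part of Cygwin or Hadoop): report whether the
# variable whose NAME is passed as $1 is set and points to an existing directory.
check_env_dir() {
    name="$1"
    # Look up the value of the variable named by $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name OK: $value"
}
```

In a fresh Cygwin shell, "check_env_dir JAVA_HOME" and "check_env_dir HADOOP_INSTALL" should both report OK if the paths above match your machine.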

5) Fix the hadoop-config script

$ vi $HADOOP_INSTALL/bin/hadoop-config.sh

Locate the section starting with "# cygwin path translation" and add the following line to it:

CLASSPATH=`cygpath -wp "$CLASSPATH"`

Save and exit.
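Why this line: Java on Windows expects a Windows-style class path (backslashes, semicolon separators), and "cygpath -wp" converts a Cygwin path list into that form. To confirm the edit was actually saved, a check along these lines may help. It is a minimal sketch: has_classpath_fix is a hypothetical helper name, and it only looks for the exact line shown above.

```shell
# Hypothetical helper: succeed if the file given as $1 contains the
# cygpath CLASSPATH translation line, exactly as written in step 5.
has_classpath_fix() {
    # -F matches the pattern as a fixed string, so the backticks, quotes
    # and dollar sign are compared literally; -q suppresses output.
    grep -qF 'CLASSPATH=`cygpath -wp "$CLASSPATH"`' "$1"
}
```

For example: has_classpath_fix "$HADOOP_INSTALL/bin/hadoop-config.sh" && echo "fix present"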

6) Test your installation

$ hadoop version
Hadoop 0.21.0
etc etc.
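If the shell answers "command not found" instead, the PATH change from step 4 most likely did not take effect. A small check like the one below can tell the two cases apart; require_cmd is a hypothetical helper name used for illustration only.

```shell
# Hypothetical helper: print where the command named by $1 resolves from,
# or a hint (and a non-zero status) if it is not on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}
```

Running "require_cmd hadoop" and "require_cmd java" in a new Cygwin window shows whether each one is reachable.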


That's it. Happy hadoop'ing :)

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.