Thursday, May 9, 2013

Hive-0.10.0 Setup on Pseudo-distributed Hadoop-1.1.2 on Cygwin/Windows7

The Apache projects under the Hadoop umbrella that support distributed parallel computing are designed to run in a Unix-like environment, so they may need a little tweaking here and there to run properly on the Cygwin/Windows platform. These are some of my observations from setting up Hive on Hadoop on Cygwin/Windows 7.
 
  • Untar tarball hive-0.10.0.tar.gz
          $ tar -xzvf hive-0.10.0.tar.gz

  • Update C:/cygwin/home/{User}/.bash_profile, adding the following so that Hive is on the Cygwin PATH:
          export HIVE_HOME=/cygdrive/c/hadoop/hive-0.10.0
          export PATH=$HIVE_HOME/bin:$PATH

  • Update $HIVE_HOME/conf/hive-env.sh, add the following:
          HADOOP_HOME=/cygdrive/c/hadoop/hadoop-1.1.2

  • Update $HADOOP_HOME/conf/hadoop-env.sh, add the following:
          export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib

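Once the edits above are in place, a quick sanity check can catch a mistyped path before Hive is started. The following is a minimal sketch, not part of the original setup: check_dir is a hypothetical helper, and the checks assume the layout used above (a bin and a lib directory under $HIVE_HOME).

```shell
#!/bin/bash
# check_dir is a hypothetical helper (not part of Hive or Hadoop).
# It reports whether a directory exists; $1 = label, $2 = path.
check_dir() {
    if [ -d "$2" ]; then
        echo "ok: $1 -> $2"
    else
        echo "missing: $1 -> $2"
        return 1
    fi
}

# Run in a new Cygwin shell so that .bash_profile has been re-read.
# "|| true" keeps the script going so every check is reported.
check_dir "HIVE_HOME" "${HIVE_HOME:-<unset>}" || true
check_dir "Hive bin" "${HIVE_HOME:-<unset>}/bin" || true
check_dir "Hive lib" "${HIVE_HOME:-<unset>}/lib" || true

# Confirm that the hive launcher is found on PATH.
command -v hive || echo "hive is not on PATH"
```

Any line starting with "missing" points at the setting to recheck first.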

The setup above helps to avoid some of the common exceptions Hive might throw during configuration:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf


Exception in thread "main" java.lang.RuntimeException: Failed to load Hive builtin functions
Caused by: java.lang.ClassNotFoundException: org.apache.hive.builtins.BuiltinUtils

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/session/SessionState
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.session.SessionState
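
All three exceptions mean that a Hive class could not be found on the classpath. If one still appears after the changes above, it can help to confirm which jar under $HIVE_HOME/lib actually contains the missing class. A minimal sketch, assuming unzip is installed in Cygwin (jar tf from the JDK lists jar contents as well); class_to_entry and find_class_jar are hypothetical helper names, not Hive commands:

```shell
#!/bin/bash
# Hypothetical helpers, not part of Hive or Hadoop.

# Convert a dotted class name to its path inside a jar.
class_to_entry() {
    echo "$(echo "$1" | tr . /).class"
}

# Print every jar in directory $2 that contains class $1.
find_class_jar() {
    entry="$(class_to_entry "$1")"
    for jar in "$2"/*.jar; do
        [ -f "$jar" ] || continue   # skip when the glob matched nothing
        if unzip -l "$jar" 2>/dev/null | grep -qF "$entry"; then
            echo "$jar"
        fi
    done
}

# Example: look for the class named in the first exception above.
find_class_jar org.apache.hadoop.hive.conf.HiveConf "${HIVE_HOME:-.}/lib"
```

If no jar is printed, the class is not under that directory, which suggests that HIVE_HOME points at the wrong location or that the tarball was not fully extracted.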