Hive Development


Building Hive from source

Clone and build from source

Download

git clone https://git-wip-us.apache.org/repos/asf/hive.git
or 
git clone git@github.com:wankunde/hive.git

Build

git branch -va
# check out the branch you are interested in
git checkout -b branch-0.14 origin/branch-0.14
# or: git checkout --track origin/branch-0.14

# compile and build the distribution
mvn clean install -DskipTests -Phadoop-2 -Pdist

# generate protobuf code
cd ql
mvn clean install -DskipTests -Phadoop-2,protobuf

# generate Thrift code (run from the source root)
cd ..
mvn clean install -Phadoop-2,thriftif -DskipTests -Dthrift.home=/usr/local

Tips


  • By default, Maven downloads many dependency packages before compiling, and these downloads may fail with timeout exceptions. I use Nexus and added a <timeout>120000</timeout> setting to the <server> configuration item. (Not tested.)
  • See also the HiveDeveloperFAQ page on the Hive wiki.
  • If you want to compile and test Hive in Eclipse, continue with the following steps.
    • mvn eclipse:eclipse
    • Import the project
    • Configure - Convert to Maven Project
    • Some Maven plugins may not work well (you may need to set a local proxy host and port in Eclipse). Connect to the m2e marketplace and install the m2e connectors you need (including antlr and build-helper).

You can open the m2e marketplace from the preferences: Preferences > Maven > Discovery > Open Catalog. Installing the WTP integration solved most plugin issues for me.

Testing

Change the Hive log level

bin/hive -hiveconf hive.root.logger=DEBUG,console

Or copy the log4j properties template and change the log level in the copy.

cp conf/hive-log4j.properties.template conf/hive-log4j.properties
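The copied file still needs its logger line edited. A minimal sketch of that edit, assuming the copied file contains a line starting with `hive.root.logger=` (the helper name `set_hive_log_level` is hypothetical, not part of Hive):

```shell
# Hypothetical helper: rewrite the hive.root.logger line in a log4j properties file.
# Usage: set_hive_log_level conf/hive-log4j.properties DEBUG,console
set_hive_log_level() {
    file="$1"
    level="$2"
    # Write to a temporary file first so a failed sed does not truncate the original.
    tmp="${file}.tmp.$$"
    sed "s/^hive\.root\.logger=.*/hive.root.logger=${level}/" "$file" > "$tmp" \
        && mv "$tmp" "$file"
}
```

Lines other than `hive.root.logger=` are left untouched, so the rest of the template keeps its defaults.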

Connecting a Java debugger to Hive

Example: Java remote debugging

  • Run the remote Java program with a script
JVM_OPTS="-server -XX:+UseParNewGC -XX:+HeapDumpOnOutOfMemoryError"
DEBUG="-Xdebug -Xnoagent -Djava.compiler=NONE -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=2345"
JVM_OPTS="$JVM_OPTS $DEBUG"
$JAVA_HOME/bin/java $JVM_OPTS -cp tools-1.0.jar com.wankun.tools.hdfs.Test2
  • Add and run a new remote debug configuration in Eclipse
Debug As > Debug Configurations > Remote Java Application > Connect (configure the host and port using the parameters above)
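On Java 5 and later, the `-Xdebug -Xrunjdwp:...` pair used in the script can also be written as a single `-agentlib:jdwp=...` option. A small sketch that builds that option string; the helper name `jdwp_opts` and its defaults are my own assumptions:

```shell
# Hypothetical helper: print the JDWP agent option for a given port.
# $1 = port (default 2345), $2 = suspend flag, y or n (default y).
# suspend=y makes the JVM wait until a debugger attaches.
jdwp_opts() {
    port="${1:-2345}"
    suspend="${2:-y}"
    echo "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Example (not executed here):
#   JVM_OPTS="$JVM_OPTS $(jdwp_opts 2345)"
#   "$JAVA_HOME/bin/java" $JVM_OPTS -cp tools-1.0.jar com.wankun.tools.hdfs.Test2
```

The port printed here is the one to enter in the Eclipse Remote Java Application configuration.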

Start Hive in debug mode

Show the help for the debug options:
hive --help --debug

Run Hive without a Hadoop cluster

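# Build the value in steps: inside single quotes a backslash-newline is kept literally, so it cannot continue the line.
export HIVE_OPTS='--hiveconf mapred.job.tracker=local --hiveconf fs.default.name=file:///tmp'
HIVE_OPTS="$HIVE_OPTS --hiveconf hive.metastore.warehouse.dir=file:///tmp/warehouse"
HIVE_OPTS="$HIVE_OPTS --hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=/tmp/metastore_db;create=true"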
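If you want the scratch data somewhere other than /tmp, the same options can be generated from a base directory. This is only a sketch; `local_hive_opts` is a hypothetical helper, and the property names are the ones used in the settings above:

```shell
# Hypothetical helper: print --hiveconf options for a cluster-less Hive rooted at $1.
# $1 = absolute base directory (default /tmp).
local_hive_opts() {
    base="${1:-/tmp}"
    # The first three options are each followed by a space; the last ends the line.
    printf '%s ' \
        "--hiveconf mapred.job.tracker=local" \
        "--hiveconf fs.default.name=file://${base}" \
        "--hiveconf hive.metastore.warehouse.dir=file://${base}/warehouse"
    printf '%s\n' "--hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=${base}/metastore_db;create=true"
}

# Example (not executed here):
#   export HIVE_OPTS="$(local_hive_opts /tmp/hive-dev)"
```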

Hive unit tests

There are two kinds of unit tests.

  • Normal unit test
mvn test -Dtest=ClassName#methodName -Phadoop-2

For example, to run the single test case TestAbc:
mvn test -Dtest=TestAbc -Phadoop-2
Or, to run every test class matching a package pattern:
mvn test -Dtest='org.apache.hadoop.hive.ql.*' -Phadoop-2

Help link: Maven Surefire Plugin documentation

  • Query files

There are many query test scripts. (I have not run these successfully yet.)

$ ls ql/src/test/queries/
clientcompare  clientnegative  clientpositive  negative  positive

# run a unit test, using the ql module as an example
cd ql
mvn test -Dtest=TestCliDriver -Dqfile=groupby1.q -Phadoop-2

# Take src/test/queries/clientpositive/groupby1.q as an example.
# -Dtest.output.overwrite=true overwrites the expected output file with the actual results.
mvn test -Dmodule=ql -Phadoop-2 -Dtest=TestCliDriver -Dqfile=groupby1.q -Dtest.output.overwrite=true
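A mistyped query file name is easy to miss, so it can help to check that the .q file exists before starting a long Maven run. A minimal sketch, assuming the clientpositive layout shown in the listing above; `qfile_test_cmd` is a hypothetical helper that only prints the command:

```shell
# Hypothetical helper: print the Maven command for one clientpositive query file.
# $1 = query file name (e.g. groupby1.q), $2 = Hive source root (default: current directory).
qfile_test_cmd() {
    qfile="$1"
    root="${2:-.}"
    path="${root}/ql/src/test/queries/clientpositive/${qfile}"
    if [ ! -f "$path" ]; then
        echo "query file not found: $path" >&2
        return 1
    fi
    echo "mvn test -Phadoop-2 -Dtest=TestCliDriver -Dqfile=${qfile}"
}

# Example (not executed here), from the Hive source root:
#   qfile_test_cmd groupby1.q && cd ql
```

The helper returns a non-zero status when the file is missing, so it can guard the actual mvn call in a script.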

