Installing Thrift on Ubuntu to Connect to Hive


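      Thrift enables cross-language calls by using a code generation engine to produce the intermediate code for each language. This article covers:

                   1. How to install the Thrift compiler.

                   2. A small example showing how to call Hive from Python through Thrift.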

Prerequisites: 1. The JDK is installed and configured. 2. Hadoop and Hive are installed, running, and usable. 3. gcc and g++ are installed and usable.

Environment used in this article:

         jdk1.7

         hadoop-1.1.2

         hive-0.11.0

         Operating system: Ubuntu 12.04

I. Install Thrift

         1. Install the required tools and libraries

        

sudo apt-get install libboost-dev libboost-test-dev libboost-program-options-dev libevent-dev automake libtool flex bison pkg-config g++ libssl-dev 



         2. Download, extract, build, and install Thrift

wget https://archive.apache.org/dist/thrift/0.9.1/thrift-0.9.1.tar.gz
tar -zxvf thrift-0.9.1.tar.gz
cd thrift-0.9.1
./configure
sudo make && sudo make install
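
To confirm that the compiler was installed and is reachable, something like the following can help. It is only a minimal sketch: it assumes the binary is named thrift and is on the PATH, and the helper name thrift_version is made up for this illustration.

```python
import subprocess


def thrift_version(executable="thrift"):
    """Return the text printed by `thrift -version`, or None if it cannot be run."""
    try:
        output = subprocess.check_output([executable, "-version"],
                                         stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # OSError: the binary was not found; CalledProcessError: non-zero exit code
        return None
    return output.decode("utf-8", "replace").strip()


if __name__ == "__main__":
    version = thrift_version()
    print(version if version else "thrift compiler not found on PATH")
```

If the function returns None, check that the directory `make install` wrote to is on the PATH.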



II. Copy the files under $HIVE_HOME/lib/py to a directory on Python's module search path

cp -r $HIVE_HOME/lib/py/*  /usr/local/lib/python2.7/dist-packages/
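
If copying into dist-packages is not desirable, one possible alternative is to add the Hive Python directory to sys.path at runtime and then check that the packages can be imported. The sketch below assumes HIVE_HOME is set in the environment; add_hive_py_to_path and can_import are hypothetical helper names, not part of Hive or Thrift.

```python
import importlib
import os
import sys


def add_hive_py_to_path(hive_home=None):
    """Put <hive_home>/lib/py on sys.path; return that path, or None if it is missing."""
    hive_home = hive_home or os.environ.get("HIVE_HOME")
    if not hive_home:
        return None
    py_dir = os.path.join(hive_home, "lib", "py")
    if not os.path.isdir(py_dir):
        return None
    if py_dir not in sys.path:
        sys.path.insert(0, py_dir)
    return py_dir


def can_import(module_name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    add_hive_py_to_path()
    # These are the two top-level packages the test program imports.
    for name in ("thrift", "hive_service"):
        print("%s importable: %s" % (name, can_import(name)))
```

Both lines should report True before moving on to the test program.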


The environment is now set up.

III. Start hiveserver

hive --service hiveserver -p 10028 -v




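This command assumes that the HIVE_HOME/bin directory has been added to the PATH environment variable; otherwise, run it from within the HIVE_HOME/bin directory.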
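
Before running a client, it can be useful to check that the server is actually listening. The following is a rough sketch that only tests whether a TCP connection can be opened; it does not verify that the listener is really Hive. The helper name port_is_open is made up for this illustration, and the host and port in the usage section assume the server was started locally on port 10028.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or no answer within the timeout
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if port_is_open("127.0.0.1", 10028):
        print("hiveserver port is reachable")
    else:
        print("nothing is listening on 127.0.0.1:10028")
```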

IV. Test program

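The test program below ran successfully in the environment described above. In production, be sure to catch every exception that needs handling, so that all failure cases stay under your control.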

#!/usr/bin/env python
# -*- coding: UTF-8 -*-

#Author:JohnWang
#Date:2014-01-15
#Version:1.0

import sys
from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket,TTransport
from thrift.transport.TTransport import TTransportException
from thrift.protocol import TBinaryProtocol

class HiveClient(object):
    def __init__(self,conInfo):
        # Expected keys in conInfo: ip, port, user, passwd, db
        self.__dict__.update(conInfo)
        self.client = None
        self.transport = None
    def connect(self):
        transport = None
        client = None
        try:
            # Buffered socket transport carrying the Thrift binary protocol
            transport = TSocket.TSocket(self.ip,self.port)
            transport = TTransport.TBufferedTransport(transport)
            protocol = TBinaryProtocol.TBinaryProtocol(transport)
            client = ThriftHive.Client(protocol)
            transport.open()
        except TTransportException as error:
            print(error)
            return False
        self.client = client
        self.transport = transport
        # Switch to the configured database, if one was given
        if self.db:
            return self.execute('use %s' %self.db)
        return True
    def execute(self,sql):
        # Run a HiveQL statement; return False if the server reports an error
        try:
            self.client.execute(sql)
        except HiveServerException as error:
            print(error)
            return False
        return True
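    def getOne(self):
        # Fetch the next row of the last executed query, or None on error
        try:
            return self.client.fetchOne()
        except HiveServerException as error:
            print(error)
            return None
    def getAll(self):
        # Fetch all remaining rows of the last executed query, or None on error
        try:
            return self.client.fetchAll()
        except HiveServerException as error:
            print(error)
            return None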
    
    def close(self):
        if self.transport:
            self.transport.close()
            return True
        else:
            return False
if __name__ == '__main__':

    conInfo = {'ip':'127.0.0.1','port':10028,'user':'','passwd':'','db':'test_hive'}

    client = HiveClient(conInfo)
    if client.connect():
        print('connect hive success')
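
One way to follow the advice about keeping failure cases under control is to make sure the connection is always closed, even when a query raises. The sketch below is only an illustration of that idea: hive_session is a hypothetical helper, and FakeClient is a stand-in with the same connect/execute/close methods as HiveClient so that the example runs without a Hive server.

```python
from contextlib import contextmanager


@contextmanager
def hive_session(client):
    """Connect, hand the client to the caller, and always close it afterwards."""
    if not client.connect():
        raise RuntimeError("could not connect to hiveserver")
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises
        client.close()


class FakeClient(object):
    """Hypothetical stand-in for HiveClient; it never talks to a server."""
    def __init__(self):
        self.closed = False

    def connect(self):
        return True

    def execute(self, sql):
        return True

    def close(self):
        self.closed = True
        return True


if __name__ == "__main__":
    fake = FakeClient()
    with hive_session(fake) as session:
        session.execute("show tables")
    print("closed after use: %s" % fake.closed)
```

With a real server, an instance of HiveClient would be passed in place of FakeClient.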
 

