How to connect to Hive and Impala with the Python Impyla client
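1. Purpose of this document
The previous chapter covered installing Anaconda on a CDH cluster and setting up a private Python package repository. This chapter describes how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon of a CDH cluster and run SQL statements.
Contents
1. Installing the dependencies
2. Writing the code
3. Testing the code
Test environment
1. CM and CDH version 5.11.2
2. RedHat 7.2
Prerequisites
1. The CDH cluster is running normally
2. Anaconda is installed and its environment variables are configured
3. pip can install Python packages normally
4. Python 2.6+ or 3.3+
5. Non-secure cluster environment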
2. Installing the Impyla dependencies
Python packages that Impyla depends on:
six
bitarray
thrift (on Python 2.x) or thriftpy (on Python 3.x)
thrift_sasl
sasl
1. First install the Python packages that Impyla depends on
[root@ip-172-31-22-86 ~]# pip install bitarray
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl
Note: thrift must be version 0.9.3. pip installs 0.10.0 by default; if that version is already installed, remove it with pip uninstall thrift and then install 0.9.3.
2. Install the Impyla package
pip installs impyla 0.14.0 by default; if that version is already installed, uninstall it and install 0.13.8 instead. Note that the output captured here reports impyla-0.14.0, so it appears to come from a default install without the version pin.
[root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
  Downloading impyla-0.14.0.tar.gz (151kB)
    100% |████████████████████████████████| 153kB 1.0MB/s
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
  Running setup.py bdist_wheel for impyla ... done
  Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
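Because both thrift and impyla are pinned to specific versions in this section, it can help to confirm what actually ended up installed. The following is a minimal sketch, not part of the original article: the helper names installed_version and check_pins and the PINS dictionary are hypothetical, and the pins are simply the ones named above (thrift 0.9.3, impyla 0.13.8).

```python
def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        # Available on Python 3.8 and later.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for older interpreters that ship setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


# Pins taken from this article (hypothetical constant name).
PINS = {"thrift": "0.9.3", "impyla": "0.13.8"}


def check_pins(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every package that misses its pin."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in sorted(check_pins(PINS).items()):
        print("%s: expected %s, found %s" % (name, expected, found))
```

An empty result from check_pins(PINS) means both packages match the versions used here; a found value of None means the package is not installed at all.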
3. Writing the Python code
Connecting to Hive from Python (HiveTest.py)
from impala.dbapi import connect
# HiveServer2 listens on port 10000 by default; PLAIN is used here for the non-secure cluster
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000, database='default', auth_mechanism='PLAIN')
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test LIMIT 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
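The script prints cursor.description and the fetched rows separately. Under the Python DB API 2.0, each description entry is a 7-item sequence whose first item is the column name, so the two can be combined into one record per row. The sketch below assumes only that; the helper name rows_as_dicts is hypothetical, and the sample values are copied from the Hive test output shown in this article rather than fetched from a live cluster.

```python
def rows_as_dicts(description, rows):
    """Pair each row with the column names found in a DB API cursor.description."""
    # The first item of every description entry is the column name.
    names = [column[0] for column in description]
    return [dict(zip(names, row)) for row in rows]


# Sample values copied from this article's Hive test output.
description = [("test.s1", "STRING", None, None, None, None, None),
               ("test.s2", "STRING", None, None, None, None, None)]
rows = [("name1", "age1"), ("name2", "age2")]

for record in rows_as_dicts(description, rows):
    print(record)
```

In HiveTest.py the same helper could be called as rows_as_dicts(cursor.description, results) right after fetchall().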
Connecting to Impala from Python (ImpalaTest.py)
from impala.dbapi import connect
# 21050 is the default Impala Daemon port for the HiveServer2 protocol
conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)
cursor = conn.cursor()
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
cursor.execute('SELECT * FROM test LIMIT 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
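Both scripts repeat the same execute, description, fetchall sequence and never close the cursor or the connection. Since impyla follows the Python DB API 2.0, that sequence can be wrapped in one helper. The sketch below is an illustration under stated assumptions: run_query is a hypothetical name, and an in-memory sqlite3 database stands in for the cluster only so that the example runs without HiveServer2 or Impala. With impyla, conn would instead come from connect(...) as in the scripts above.

```python
import sqlite3
from contextlib import closing


def run_query(conn, sql):
    """Execute sql on a DB API 2.0 connection and return (column_names, rows)."""
    # closing() calls cursor.close() even if execute() raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        names = [column[0] for column in cursor.description]
        return names, cursor.fetchall()


# Stand-in connection (assumption): sqlite3 replaces the cluster for this demo.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE test (s1 TEXT, s2 TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?)",
                     [("name1", "age1"), ("name2", "age2")])
    names, rows = run_query(conn, "SELECT * FROM test LIMIT 10")
    print(names)
    print(rows)
```

Wrapping the connection in closing() as well releases the session on the server side once the block exits, which matters more for a long-running job than for these short test scripts.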
4. Testing the code
Run the Python scripts from the shell command line.
1. Test the Hive connection
[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
[('database_name', 'STRING', None, None, None, None, None)]
[('default',)]
[('test.s1', 'STRING', None, None, None, None, None), ('test.s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
2. Test the Impala connection
[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
[('name', 'STRING', None, None, None, None, None), ('comment', 'STRING', None, None, None, None, None)]
[('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')]
[('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#
5. Common problems
1. Error 1
building 'sasl.saslwrapper' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/sasl
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
unable to execute 'gcc': No such file or directory
error: command 'gcc' failed with exit status 1
----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/
Solution: gcc is not installed, so install the C and C++ compilers.
[root@ip-172-31-22-86 ec2-user]# yum -y install gcc
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++
2. Error 2
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
 #include <sasl/sasl.h>
                       ^
compilation terminated.
error: command 'gcc' failed with exit status 1
Solution: the header sasl/sasl.h is missing, so install the Python and Cyrus SASL development packages.
[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
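Both errors come from building the sasl extension without a compiler or without the SASL headers, so they can be caught before running pip install sasl. The following is a rough pre-flight sketch under stated assumptions: the function name missing_build_prerequisites is hypothetical, shutil.which requires Python 3.3 or later, and /usr/include/sasl/sasl.h is assumed to be where cyrus-sasl-devel places the header on this RedHat system.

```python
import os
import shutil


def missing_build_prerequisites(tools=("gcc", "g++"),
                                headers=("/usr/include/sasl/sasl.h",)):
    """Return the compiler names and header paths that cannot be found."""
    # shutil.which returns None when the executable is not on PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # The header path is an assumption about the cyrus-sasl-devel layout.
    missing += [path for path in headers if not os.path.isfile(path)]
    return missing


print(missing_build_prerequisites())
```

An empty list suggests the build of sasl has what it needs; anything listed maps back to one of the yum install commands in this section.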