PyHive with Kerberos throws an authentication error after a few calls


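I am trying to connect to Hive from Python (using the PyHive library) to read some data, and then hook that up to Flask to show it in a dashboard.

The first few calls to Hive work fine, but shortly afterwards I start getting the following error.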

Traceback (most recent call last):
  File "libs/hive.py", line 63, in <module>
    cur = h.connect().cursor()
  File "libs/hive.py", line 45, in connect
    kerberos_service_name='hive')
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 94, in connect
    return Connection(*args, **kwargs)
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 192, in __init__
    self._transport.open()
  File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_cdc995595290_51CD7j))

Here is my code:

from pyhive import hive
class Hive(object):
    def connect(self):
        return hive.connect(host='hive.hadoop-prod.abc.com',
                            port=10000,
                            database='temp',
                            username='gaurang.shah',
                            auth='KERBEROS',
                            kerberos_service_name='hive')


if __name__ == '__main__':

    h = Hive()
    cur = h.connect().cursor()
    cur.execute("select * from temp.migration limit 1")
    res = cur.fetchall()
    print res

Calling script

source .venv/bin/activate
for i in {1..50}
do
    python get_hive_data.py
    sleep 300
done

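Observation

While the script is working, klist shows a hive entry among the service principals. Once the error above appears, that entry is gone.

klist when it is working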

Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: [email protected]

Valid starting       Expires              Service principal
12/04/2018 14:37:28  12/05/2018 00:37:28  krbtgt/[email protected]
    renew until 12/05/2018 14:37:24
12/04/2018 14:39:06  12/05/2018 00:37:28  hive/[email protected]
    renew until 12/05/2018 14:37:24

klist when it is not working

Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: [email protected]

Valid starting       Expires              Service principal
12/04/2018 14:37:28  12/05/2018 00:37:28  krbtgt/[email protected]
    renew until 12/05/2018 14:37:24

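Update:

So I think the failure is not tied to a certain number of calls but to a certain amount of time (about an hour, I guess). I changed the sleep time to 3600 sec, and the error started appearing right after the first call.

This is strange, because the ticket for hive/[email protected] is valid for 7 days.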


python hive kerberos pyhive

1 answer

I know this is an old post. However, if you establish a fresh connection each time you make a call, that should solve the problem.
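
As a rough illustration of that suggestion, here is a minimal sketch that opens a new connection for every query and closes it afterwards. The helper names `fetch_rows` and `_pyhive_connect` are hypothetical; the connection parameters are copied from the question. It assumes the Kerberos credential cache is still valid when the call is made, since a new connection cannot help if the credentials themselves are missing.

```python
from contextlib import closing

# Connection settings taken from the question
HIVE_KWARGS = dict(
    host='hive.hadoop-prod.abc.com',
    port=10000,
    database='temp',
    username='gaurang.shah',
    auth='KERBEROS',
    kerberos_service_name='hive',
)


def _pyhive_connect():
    # Imported lazily so the module loads even where PyHive is absent
    from pyhive import hive
    return hive.connect(**HIVE_KWARGS)


def fetch_rows(query, connect=_pyhive_connect):
    """Run one query on a fresh connection and return all rows.

    `connect` is a hypothetical hook: any zero-argument callable that
    returns a DB-API style connection. The connection and cursor are
    closed even if the query raises.
    """
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(query)
            return cur.fetchall()
```

In a Flask view this would be called once per request, for example `rows = fetch_rows("select * from temp.migration limit 1")`, instead of keeping a single connection object alive for the lifetime of the application.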
