Impala/HDFS intro and setup

Getting started

First, make sure you can import ibis:

In [1]:
import ibis
import os
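
If you are not sure whether Ibis is installed in the active environment, you can check without triggering an ImportError. The sketch below uses only the standard library; the helper name is_importable is hypothetical and not part of Ibis.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when the module cannot be located.
    return importlib.util.find_spec(module_name) is not None


print("ibis importable:", is_importable("ibis"))
```

If this prints False, install Ibis into the environment that runs your notebook before continuing.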

If you have WebHDFS available, connect to HDFS with hdfs_connect according to your WebHDFS config. For kerberized or more complex HDFS clusters, consult the documentation of the underlying HDFS client library for info on connecting. You can use a connection from that library instead of using hdfs_connect.

In [2]:
# Environment variables are strings, so convert the port to an integer
hdfs_port = int(os.environ.get('IBIS_WEBHDFS_PORT', 50070))
hdfs = ibis.hdfs_connect(host='quickstart.cloudera', port=hdfs_port)
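
Values read from the environment are always strings, so it can help to gather the WebHDFS settings in one place and coerce the port there. The following is a minimal sketch, not part of Ibis: the helper webhdfs_settings and the IBIS_WEBHDFS_HOST variable are hypothetical, while IBIS_WEBHDFS_PORT and the defaults come from the cell above.

```python
import os


def webhdfs_settings(env=None):
    """Collect WebHDFS host and port, falling back to the defaults used above."""
    env = os.environ if env is None else env
    # IBIS_WEBHDFS_HOST is a hypothetical variable introduced for this sketch.
    host = env.get("IBIS_WEBHDFS_HOST", "quickstart.cloudera")
    # Environment values are strings, so coerce the port to an integer.
    port = int(env.get("IBIS_WEBHDFS_PORT", 50070))
    return {"host": host, "port": port}


print(webhdfs_settings({"IBIS_WEBHDFS_PORT": "14000"}))
```

The resulting dictionary could then be passed along as ibis.hdfs_connect(**webhdfs_settings()), assuming your cluster needs no further options.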

Finally, create the Ibis client:

In [3]:
con = ibis.impala.connect('quickstart.cloudera', hdfs_client=hdfs)
con
Out[3]:
<ibis.impala.client.ImpalaClient at 0x7fd5e2974da0>

Substitute the parameters that are appropriate for your environment (see the docstring for ibis.impala.connect). impala.connect uses the same parameters as Impyla’s DBAPI interface.
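
One way to keep environment-specific parameters out of the notebook is to build the keyword arguments separately and pass them to the connect call. This is only a sketch under stated assumptions: impala_connect_kwargs, IBIS_IMPALA_HOST, and IBIS_IMPALA_PORT are hypothetical names, and the port default of 21050 is assumed here, so check it and the keyword names against the ibis.impala.connect docstring and your cluster.

```python
import os


def impala_connect_kwargs(env=None, **overrides):
    """Build keyword arguments for the Impala connection (hypothetical helper)."""
    env = os.environ if env is None else env
    kwargs = {
        # IBIS_IMPALA_HOST and IBIS_IMPALA_PORT are hypothetical variables.
        "host": env.get("IBIS_IMPALA_HOST", "quickstart.cloudera"),
        # 21050 is an assumed default port; verify it for your cluster.
        "port": int(env.get("IBIS_IMPALA_PORT", 21050)),
    }
    # Explicit overrides (for example database="default") win over the environment.
    kwargs.update(overrides)
    return kwargs


print(impala_connect_kwargs({}, database="default"))
```

With a reachable cluster, the client could then be created with ibis.impala.connect(hdfs_client=hdfs, **impala_connect_kwargs()).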