ibis.impala.api.ImpalaClient.parquet_file

ImpalaClient.parquet_file(hdfs_dir, schema=None, name=None, database=None, external=True, like_file=None, like_table=None, persist=False)

Make the Parquet files in the indicated HDFS directory available as an Ibis table.

The table can optionally be named and persisted; otherwise a unique name is generated. Currently, Ibis attempts to drop any non-persistent external table it creates when the underlying object is garbage collected (or when the Python interpreter shuts down normally).
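
A minimal usage sketch, assuming `client` is an already-connected `ImpalaClient`. The helper name `open_parquet_dir` and the HDFS path in the usage comment are hypothetical and not part of Ibis. With no schema or like_* argument, the schema is inferred from the Parquet files, and the table is not persisted.

```python
def open_parquet_dir(client, hdfs_dir):
    """Expose a directory of Parquet files as a temporary Ibis table.

    `client` is assumed to be a connected ImpalaClient. This helper is
    illustrative only and is not part of Ibis.
    """
    # No schema, like_file, or like_table is passed, so the schema is
    # inferred from one of the Parquet files in hdfs_dir. persist keeps
    # its default of False, so Ibis will attempt to drop the table when
    # the returned object is garbage collected.
    return client.parquet_file(hdfs_dir)


# Usage, given a connected client (hypothetical path):
# table = open_parquet_dir(client, '/data/events_parquet')
```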

Parameters:

hdfs_dir : string

Path to the Parquet directory in HDFS

schema : ibis Schema, optional

If no schema is provided and neither of the like_* arguments is passed, one will be inferred from one of the Parquet files in the directory.

like_file : string, optional

Absolute path to a Parquet file in HDFS to use for the schema definition. An alternative to supplying an explicit schema.

like_table : string, optional

Fully scoped and escaped name of an Impala table whose schema will be used for the newly created table.

name : string, optional

Name of the table. If omitted, a random unique name is generated.

database : string, optional

Database to create the (possibly temporary) table in

external : boolean, default True

If a table is external, the referenced data will not be deleted when the table is dropped in Impala. Otherwise (external=False) Impala takes ownership of the Parquet file.

persist : boolean, default False

If True, do not drop the table when the Ibis object is garbage collected or the interpreter shuts down

Returns:

parquet_table : ImpalaTable
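
The parameters above can be combined to create a named table that outlives the Python session. The following is a sketch under stated assumptions: `client` is a connected `ImpalaClient`, and the helper name `persist_parquet_table` and all paths, table names, and database names in the usage comment are hypothetical.

```python
def persist_parquet_table(client, hdfs_dir, name, database, like_table=None):
    """Create a named, persistent external table over Parquet files.

    `client` is assumed to be a connected ImpalaClient. This helper is
    illustrative only and is not part of Ibis.
    """
    return client.parquet_file(
        hdfs_dir,
        name=name,              # explicit name instead of a generated one
        database=database,      # database in which the table is created
        like_table=like_table,  # reuse this table's schema; None leaves
                                # the schema to be inferred from the files
        external=True,          # dropping the table keeps the HDFS data
        persist=True,           # do not drop on GC / interpreter shutdown
    )


# Usage, given a connected client (all names hypothetical):
# table = persist_parquet_table(
#     client, '/data/events_parquet', 'events', 'analytics',
#     like_table='analytics.events_template',
# )
```

Keeping `external=True` here means that later dropping the table in Impala leaves the referenced Parquet data in place.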