ibis.impala.api.ImpalaClient.parquet_file

ImpalaClient.parquet_file(self, hdfs_dir, schema=None, name=None, database=None, external=True, like_file=None, like_table=None, persist=False)

Make the indicated Parquet file in HDFS available as an Ibis table.

The table can optionally be named and persisted; otherwise a unique name is generated. For now, Ibis attempts to drop any non-persistent external table it created when the underlying object is garbage collected (or when the Python interpreter shuts down normally).

Parameters
hdfs_dir : string

Path in HDFS to the directory containing the Parquet files.

schema : ibis Schema

If no schema is provided and neither of the like_* arguments is passed, one will be inferred from one of the Parquet files in the directory.

like_file : string

Absolute path to a Parquet file in HDFS to use for the schema definition. An alternative to supplying an explicit schema.

like_table : string

Fully scoped and escaped name of an Impala table whose schema will be used for the newly created table.

name : string, optional

Name for the table. If omitted, a random unique name is generated.

database : string, optional

Database in which to create the (possibly temporary) table.

external : boolean, default True

If a table is external, the referenced data will not be deleted when the table is dropped in Impala. Otherwise (external=False) Impala takes ownership of the Parquet file.

persist : boolean, default False

If True, do not drop the table upon Ibis garbage collection or interpreter shutdown.

Returns
parquet_table : ImpalaTable
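
Examples

The following is a minimal sketch of how the parameters above might be combined, not an excerpt from the Ibis source. The helper name `register_parquet_dir` is hypothetical, and `client` is assumed to be an already-connected ImpalaClient. Only keyword arguments listed in the signature above are used.

```python
def register_parquet_dir(client, hdfs_dir, name=None, database=None, persist=False):
    """Expose the Parquet data under ``hdfs_dir`` as an Ibis table.

    ``client`` is assumed to be an already-connected ImpalaClient;
    this wrapper is a hypothetical helper used only for illustration.
    """
    # No schema, like_file, or like_table is passed, so per the parameter
    # documentation the schema is inferred from one of the Parquet files
    # in the directory.
    return client.parquet_file(
        hdfs_dir,
        name=name,          # None lets Ibis generate a unique name
        database=database,  # None leaves the database choice to the client
        external=True,      # dropping the table leaves the HDFS data in place
        persist=persist,    # True keeps the table after garbage collection
    )
```

Calling `register_parquet_dir(client, "/data/events", name="events", persist=True)` (the path and table name are placeholders) would request a named external table that is not dropped when the returned object is garbage collected. Leaving `persist` at its default of False suits a temporary table that only needs to live for the current session.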