parquet_file(hdfs_dir, schema=None, name=None, database=None, external=True, like_file=None, like_table=None, persist=False)
Make the indicated Parquet file in HDFS available as an Ibis table.
The table can optionally be named and persisted; otherwise a unique name is generated. For now, Ibis attempts to drop any non-persistent external table it created when the underlying object is garbage collected (or when the Python interpreter shuts down normally).
- hdfs_dir : string
Path in HDFS to the directory containing the Parquet file(s)
- schema : ibis Schema
If no schema is provided and neither of the like_* arguments is passed, one will be inferred from one of the Parquet files in the directory.
- like_file : string
Absolute path to a Parquet file in HDFS to use for the schema definition. An alternative to supplying an explicit schema.
- like_table : string
Fully scoped and escaped name of an Impala table whose schema will be used for the newly created table.
- name : string, optional
Name for the table. If omitted, a random unique name is generated.
- database : string, optional
Database to create the (possibly temporary) table in
- external : boolean, default True
If a table is external, the referenced data will not be deleted when the table is dropped in Impala. Otherwise (external=False) Impala takes ownership of the Parquet file.
- persist : boolean, default False
If True, do not drop the table upon Ibis garbage collection / interpreter shutdown
- parquet_table : ImpalaTable
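As a minimal sketch of how the parameters above fit together, the hypothetical helper below forwards its arguments to parquet_file on any client object that exposes the signature shown at the top of this entry. The helper name register_parquet, its keep argument, and the check that at most one schema source is given are assumptions made for illustration; they are not part of Ibis, and the documentation above does not state how conflicting schema sources are resolved.

```python
def register_parquet(client, hdfs_dir, schema=None, like_file=None,
                     like_table=None, name=None, database=None, keep=False):
    """Hypothetical wrapper around client.parquet_file (not part of Ibis).

    `client` is assumed to be an object exposing parquet_file with the
    signature documented above.
    """
    # Assumption of this sketch: treat the three schema sources as mutually
    # exclusive so the caller's intent is unambiguous.
    sources = {"schema": schema, "like_file": like_file, "like_table": like_table}
    given = [key for key, value in sources.items() if value is not None]
    if len(given) > 1:
        raise ValueError("pass at most one of: " + ", ".join(given))

    # With no schema source given, the schema is inferred from one of the
    # Parquet files in hdfs_dir, as described above.
    return client.parquet_file(
        hdfs_dir,
        schema=schema,
        name=name,
        database=database,
        external=True,   # dropping the table leaves the referenced data in place
        like_file=like_file,
        like_table=like_table,
        persist=keep,    # True: do not drop the table on GC / interpreter shutdown
    )

# Usage with a connected client (the path and table name are placeholders):
# table = register_parquet(client, "/path/to/parquet_dir", name="events", keep=True)
```

Keeping external=True fixed in the sketch means dropping the resulting table never deletes the underlying Parquet data; pass external=False to parquet_file directly if Impala should take ownership of the file.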