Note: These release notes will only include notable or major bug fixes since most minor bug fixes tend to be esoteric and not generally interesting. Point (minor, e.g. 0.5.1) releases will generally not be found here and contain only bug fixes.
0.8 (May 19, 2016)
This release brings initial PostgreSQL backend support along with a number of critical bug fixes and usability improvements. As several correctness bugs with the SQL compiler were fixed, we recommend that all users upgrade from earlier versions of Ibis.
- Initial PostgreSQL backend contributed by Phillip Cloud.
- `groupby` as an alias for `group_by` to table expressions
- Fix an expression error when filtering based on a new field
- Fix Impala’s SQL compilation when using `OR` with compound filters
- Various fixes with the `having(...)` function in grouped table expressions
- Fix CTE (`WITH`) extraction inside
- `ImportError` on Python 2 when `mock` library not installed
- The deprecated `ibis.make_client` APIs have been removed
0.7 (March 16, 2016)
This release brings initial Kudu-Impala integration and improved Impala and SQLite support, along with several critical bug fixes.
- Apache Kudu (incubating) integration for Impala users. See the blog post for now; documentation will be added here when possible.
- `ibis.hdfs_connect` for WebHDFS connections in secure (Kerberized) clusters without SSL enabled.
- Correctly compile aggregate expressions involving multiple subqueries.
To explain this last point in more detail, suppose you had:

    table = ibis.table([('flag', 'string'), ('value', 'double')], 'tbl')
    flagged = table[table.flag == '1']
    unflagged = table[table.flag == '0']
    fv = flagged.value
    uv = unflagged.value
    expr = (fv.mean() / fv.sum()) - (uv.mean() / uv.sum())

The last expression now generates the correct Impala or SQLite SQL:

    SELECT t0.`tmp` - t1.`tmp` AS `tmp`
    FROM (
      SELECT avg(`value`) / sum(`value`) AS `tmp`
      FROM tbl
      WHERE `flag` = '1'
    ) t0
      CROSS JOIN (
        SELECT avg(`value`) / sum(`value`) AS `tmp`
        FROM tbl
        WHERE `flag` = '0'
      ) t1

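As a quick sanity check, the query above can be run against an in-memory SQLite database with Python's standard `sqlite3` module. This is only a sketch: the sample rows are made up for illustration, and Ibis itself is not involved. Because `avg(x) / sum(x)` equals `1 / n` for a group of `n` rows with a nonzero sum, the result is easy to verify by hand.

```python
import sqlite3

# Illustrative sample data (made up): 2 flagged rows and 3 unflagged rows.
rows = [("1", 1.0), ("1", 3.0), ("0", 2.0), ("0", 2.0), ("0", 4.0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (flag TEXT, value REAL)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

# The query from the release note; SQLite accepts backtick-quoted identifiers.
query = """
SELECT t0.`tmp` - t1.`tmp` AS `tmp`
FROM (
  SELECT avg(`value`) / sum(`value`) AS `tmp`
  FROM tbl
  WHERE `flag` = '1'
) t0
CROSS JOIN (
  SELECT avg(`value`) / sum(`value`) AS `tmp`
  FROM tbl
  WHERE `flag` = '0'
) t1
"""
(result,) = conn.execute(query).fetchone()


def mean_over_sum(values):
    """Compute mean(values) / sum(values) directly in Python."""
    return (sum(values) / len(values)) / sum(values)


# The same quantity computed without SQL, for comparison.
expected = mean_over_sum([1.0, 3.0]) - mean_over_sum([2.0, 2.0, 4.0])
print(result, expected)
```

Each subquery produces a single row, so the `CROSS JOIN` yields exactly one row holding the difference of the two ratios.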
- `VARCHAR(n)` Impala types now correctly map to Ibis string expressions
- Fix inappropriate projection-join-filter expression rewrites resulting in incorrect generated SQL.
- `STORED AS PARQUET` for
- Fixed several issues with Ibis dependencies (impyla, thriftpy, sasl, thrift_sasl), especially for secure clusters. Upgrading will pull in these new dependencies.
- Do not fail in `ibis.impala.connect` when trying to create the temporary Ibis database if no HDFS connection is passed.
- Fix join predicate evaluation bug when column names overlap with table attributes.
- Fix handling of fully-materialized joins (aka `select *` joins) in SQLAlchemy / SQLite.
Thank you to all who contributed patches to this release.

    $ git log v0.6.0..v0.7.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         21 Wes McKinney
          1 Uri Laserson
          1 Kristopher Overholt

0.6 (December 1, 2015)
This release brings expanded pandas and Impala integration, including support for managing partitioned tables in Impala. See the new Ibis for Impala Users guide for more on using Ibis with Impala.
The Ibis for SQL Programmers guide has also been written since the 0.5 release.
This release also includes bug fixes affecting generated SQL correctness. All users should upgrade as soon as possible.
- New integrated Impala functionality. See Ibis for Impala Users for more details on these things.
- Improved Impala-pandas integration. Create tables or insert into existing tables from pandas
- Partitioned table metadata management API. Add, drop, alter, and insert into table partitions.
- Added support for `LOAD DATA` DDL using the `load_data` function, also supporting partitioned tables.
- Modify table metadata (location, format, SerDe properties etc.) using
- Interrupting Impala expression execution with Control-C will attempt to cancel the running query with the server.
- Set the compression codec (e.g. snappy) used with
- Get and set query options for a client session with
- `ImpalaTable.metadata` method that parses the output of the `DESCRIBE FORMATTED` DDL to simplify table metadata inspection.
- `ImpalaTable.column_stats` to see computed table and partition statistics.
- `invalidate_metadata` DDL options and add `COMPUTE INCREMENTAL STATS`.
- `substitute` method for performing multiple value substitutions in an array or scalar expression.
- Division is true division by default, as in Python 3, for all numeric data. For SQL systems that use C-style division semantics, a `CAST` is automatically inserted in the generated SQL.
- Easier joins on tables with overlapping column names. See Ibis for SQL Programmers.
- Expressions like `string_expr[:3]` now work as expected.
- `coalesce` instance method to all value expressions.
- `execute` method on expressions disables any default row limits.
- `ImpalaTable.rename` no longer mutates the calling table expression.
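The difference between C-style and true division mentioned in the list above can be illustrated with Python's standard `sqlite3` module, since SQLite truncates the result when both operands are integers. This is a minimal sketch of the semantics only; the exact cast Ibis emits depends on the backend and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# C-style semantics: with two integer operands the result is truncated.
(c_style,) = conn.execute("SELECT 7 / 2").fetchone()

# Casting one operand to a floating-point type yields true division.
(true_div,) = conn.execute("SELECT CAST(7 AS REAL) / 2").fetchone()

# Python 3's "/" operator is always true division; "//" is floor division.
python_true = 7 / 2
python_floor = 7 // 2

print(c_style, true_div, python_true, python_floor)
```

With the cast in place, the SQL result matches what Python 3's `/` operator returns for the same operands.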

    $ git log v0.5.0..v0.6.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         46 Wes McKinney
          3 Uri Laserson
          1 Phillip Cloud
          1 mariusvniekerk
          1 Kristopher Overholt

0.5 (September 10, 2015)
Highlights in this release are SQLite, Python 3, and Impala UDA support, along with an asynchronous execution API. There are also many usability improvements, bug fixes, and other new features.
- SQLite client and built-in function support
- Ibis now supports Python 3.4 as well as 2.6 and 2.7
- Ibis can utilize Impala user-defined aggregate (UDA) functions
- SQLAlchemy-based translation toolchain, enabling support for more SQL engines that have SQLAlchemy dialects
- Many window function usability improvements (nested analytic functions and deferred binding conveniences)
- More convenient aggregation with keyword arguments in
- Built preliminary wrapper API for MADLib-on-Impala
- `std` aggregation methods and support in Impala
- `nullifzero` numeric method for all SQL engines
- `rename` method to Impala tables (for renaming tables in the Hive metastore)
- `ImpalaClient` for session cleanup (#533)
- `relabel` method to table expressions
- `insert` method to Impala tables
- `verify` methods to all expressions to test compilation and ability to compile (since many operations are unavailable in SQLite, for example)
- Impala Ibis client creation now uses only
- `ibis.make_client` has been deprecated
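For readers unfamiliar with the idea behind `nullifzero`, the sketch below assumes it corresponds to the SQL idiom `NULLIF(x, 0)`, which turns zeros into NULL (commonly used to guard a denominator). That mapping is an assumption, since these notes do not show the generated SQL, and the table `t` and its rows are made up. The example uses only Python's standard `sqlite3` module, not Ibis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (num REAL, denom REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(10.0, 2.0), (5.0, 0.0)])

# NULLIF(denom, 0) returns NULL where denom is zero, and denom otherwise.
rows = conn.execute(
    "SELECT nullif(denom, 0) FROM t ORDER BY num DESC"
).fetchall()

print(rows)
```

In Python, SQL NULL values come back as `None`, so the zero denominator is easy to spot in the result.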

    $ git log v0.4.0..v0.5.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         55 Wes McKinney
          9 Uri Laserson
          1 Kristopher Overholt

0.4 (August 14, 2015)
- Add tooling to use Impala C++ scalar UDFs within Ibis (#262, #195)
- Support and testing for Kerberos-enabled secure HDFS clusters
- Many table functions can now accept functions as parameters (invoked on the calling table) to enhance composability and emulate late-binding semantics of languages (like R) that have non-standard evaluation (#460)
- `notall` reductions on boolean arrays, as well as
- `topk` now produces an analytic expression that is executable (as an aggregation) but can also be used as a filter as before (#392, #91)
- Added experimental database object “usability layer”, see
- `compute_stats` API to table expressions referencing physical Impala tables
- `ImpalaClient` to show query plan for an expression
- `HDFS` interface for superusers
- `convert_base` method to strings and integer types
- Add option to `ImpalaClient.create_table` to create empty partitioned tables
- `ibis.cross_join` can now join more than 2 tables at once
- `ImpalaClient.raw_sql` method for running naked SQL queries
- `ImpalaClient.insert` now validates schemas locally prior to sending query to cluster, for better usability.
- Add conda installation recipes
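The idea of passing functions as parameters that are invoked on the calling table can be sketched in plain Python. `ToyTable` and its `mutate` method below are hypothetical stand-ins written for this illustration; they are not Ibis classes and do not reflect its implementation.

```python
class ToyTable:
    """Hypothetical stand-in for a table expression (not an Ibis class)."""

    def __init__(self, columns):
        self.columns = dict(columns)

    def __getitem__(self, name):
        return self.columns[name]

    def mutate(self, **new_columns):
        """Return a new table with extra columns added."""
        result = dict(self.columns)
        for name, value in new_columns.items():
            # A callable is invoked on the calling table (late binding);
            # anything else is taken as the column values directly.
            result[name] = value(self) if callable(value) else value
        return ToyTable(result)


t = ToyTable({"value": [1.0, 2.0, 3.0]})

# The second lambda refers to "doubled", which exists only on the
# intermediate table, so it could not be written against `t` directly.
out = (
    t.mutate(doubled=lambda tbl: [v * 2 for v in tbl["value"]])
     .mutate(total=lambda tbl: [a + b for a, b in zip(tbl["value"], tbl["doubled"])])
)

print(out["total"])
```

Because each function receives whichever table the method was called on, chained calls can refer to columns created earlier in the chain without naming the intermediate tables.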

    $ git log v0.3.0..v0.4.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         38 Wes McKinney
          9 Uri Laserson
          2 Meghana Vuyyuru
          2 Kristopher Overholt
          1 Marius van Niekerk

0.3 (July 20, 2015)
First public release. See http://ibis-project.org for more.
- Implement window / analytic function support
- Enable non-equijoins (join clauses with operations other than
- Add remaining string functions supported by Impala.
- `pipe` method to tables (hat-tip to the pandas dev team).
- `mutate` convenience method to tables.
- Fleshed out `WebHDFS` implementations: get/put directories, move files, etc. See the full HDFS API.
- `truncate` method for timestamp values
- `ImpalaClient` can execute scalar expressions not involving any table.
- Can also create internal Impala tables with a specific HDFS path.
- Make Ibis’s temporary Impala database and HDFS paths configurable (see
- `truncate_table` function to client (if the user’s Impala cluster supports it).
- Python 2.6 compatibility
- Enable Ibis to execute concurrent queries in multithreaded applications (earlier versions were not thread-safe).
- Test data load script in
- Add an internal operation type signature API to enhance developer productivity.

    $ git log v0.2.0..v0.3.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         59 Wes McKinney
         29 Uri Laserson
          4 Isaac Hodes
          2 Meghana Vuyyuru

0.2 (June 16, 2015)
- `insert` method on Ibis client for inserting data into existing tables.
- `avro_file` client methods for querying datasets not yet available in Impala
- `HDFS` client API for WebHDFS for writing files and directories to HDFS
- New timedelta API and improved timestamp data support
- `histogram` methods on numeric expressions
- `category` logical datatype for handling bucketed data, among other things
- `summary` API to numeric expressions
- `value_counts` convenience API to array expressions
- New string methods
- `contains` for fuzzy and regex searching
- `options.verbose` option and configurable `options.verbose_log` callback function for improved query logging and visibility
- Support for new SQL built-in functions
- `ibis.where` for conditional logic (see also
- `nullif` method on value expressions
- New aggregate functions:
- `where` argument in aggregate functions
- Added group-by convenience
- Add default expression names to most aggregate functions
- New Impala database client helper methods
- `list_tables` searching / listing method
- `sub`, and other explicit arithmetic methods to value expressions
- New Ibis client and Impala connection workflow. The client is now composed from an Impala connection and an optional HDFS connection
- Numerous expression API bug fixes and rough edges fixed

    $ git log v0.1.0..v0.2.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         71 Wes McKinney
          1 Juliet Hougland
          1 Isaac Hodes

0.1 (March 26, 2015)
First Ibis release.
- Expression DSL design and type system
- Expression to ImpalaSQL compiler toolchain
- Impala built-in function wrappers

    $ git log 84d0435..v0.1.0 --pretty=format:%aN | sort | uniq -c | sort -rn
         78 Wes McKinney
          1 srus
          1 Henry Robinson
