These release notes only include notable or major bug fixes, since most
minor bug fixes tend to be esoteric and not generally interesting. Point
releases (e.g. 0.5.1) will generally not be found here and contain
only bug fixes.
This release brings refactored, more composable core components and rule system to ibis.
- Allow keyword arguments in Node subclasses (#968)
- Splat args into Node subclasses instead of requiring a list (#969)
- Add support for UNION in the BigQuery backend (#1408, #1409)
- Support for writing UDFs in BigQuery (#1377). See the BigQuery UDF docs for more details.
- Support for cross-project expressions in the BigQuery backend. (#1428)
- The previous rule system, which was not publicly exposed, has been rewritten
- Input arguments for operations are now defined in a more readable fashion, replacing the previous input_type list.
v0.13.0 (March 30, 2018)
This release brings new backends, including support for executing against files and MySQL, as well as pandas user-defined scalar and aggregation functions, along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
- Support for Unsigned Integer Types (#1194)
- Support for Interval types and expressions with support for execution on the Impala and Clickhouse backends (#1243)
- Isnan, isinf operations for float and double values (#1261)
- Support for an interval with a quarter period (#1259)
- ibis.pandas.from_dataframe convenience function (#1155)
- Remove the restriction on ROW_NUMBER() requiring it to have an ORDER BY clause (#1371)
- .get() operation on a Map type (#1376)
- Allow visualization of custom defined expressions
- Add experimental support for pandas UDFs/UDAFs (#1277)
- Functions can be used as groupby keys (#1214, #1215)
- Generalize the use of the where parameter to reduction operations (#1220)
- Support for interval operations thanks to @kszucs (#1243, #1260, #1249)
- Support for the PARTITIONTIME column in the BigQuery backend (#1322)
- arbitrary() method for selecting the first non-null value in a column (#1230, #1309)
- MultiQuantile operation in the pandas backend thanks to @DiegoAlbertoTorres (#1343)
- Rules for validating table expressions thanks to @DiegoAlbertoTorres (#1298)
- Complete end-to-end testing framework for all supported backends (#1256)
- not contains now supported in the pandas backend (#1210, #1211)
- CI builds are now reproducible locally thanks to @kszucs (#1121, #1237, #1255, #1311)
- isinf operations thanks to @kszucs (#1261)
- Framework for generalized dtype and schema inference, and implicit casting thanks to @kszucs (#1221, #1269)
- Generic utilities for expression traversal thanks to @kszucs (#1336)
- day_of_week API (#306, #1047)
- Design documentation for ibis (#1351)
- Unbound parameters were failing in the simple case of a mutate() call with no operation (#1378)
- Fix parameterized subqueries (#1300, #1331, #1303, #1378)
- Fix subquery extraction, which wasn’t happening in topological order (#1342)
- Fix parenthesization if
- Calling drop after mutate did not work (#1296, #1299)
- SQLAlchemy backends were missing an implementation of
REGEX_EXTRACT in PostgreSQL 10 (#1276, #1278)
- Fixing #1378 required the removal of the name parameter to the param() function. Use the
v0.12.0 (October 28, 2017)
This release brings Clickhouse and BigQuery SQL support along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
- Add support for the Binary data type (#1183)
- Allow users of the BigQuery client to define their own API proxy classes (#1188)
- Add support for HAVING in the pandas backend (#1182)
- Add struct field tab completion (#1178)
- Add expressions for Map/Struct types and columns (#1166)
- Support Table.asof_join (#1162)
- Allow right side of arithmetic operations to take over (#1150)
- Add a data_preload step in pandas backend (#1142)
- expressions in join predicates in the pandas backend (#1138)
- Scalar parameters (#1075)
- Limited window function support for pandas (#1083)
- Implement Time datatype (#1105)
- Implement array ops for pandas (#1100)
- support for passing multiple quantiles in
- support for clip and quantile ops on DoubleColumns (#1090)
- Enable unary math operations for pandas, sqlite (#1071)
- Enable casting from strings to temporal types (#1076)
- Allow selection of whole tables in pandas joins (#1072)
- Implement comparison for string vs date and timestamp types (#1065)
- Implement isnull and notnull for pandas (#1066)
- Allow like operation to accept a list of conditions to match (#1061)
- Add a pre_execute step in pandas backend (#1189)
- Remove global expression caching to ensure repeatable code generation (#1179, #1181)
- ORDER BY generation without a GROUP BY (#1180, #1181)
- Ensure that DataType and subclasses hash properly (#1172)
- Ensure that the pandas backend can deal with unary operations in groupby
- Incorrect impala code generated for NOT with complex argument (#1176)
- BUG/CLN: Fix predicates on Selections on Joins (#1149)
- Don’t use SET LOCAL to allow redshift to work (#1163)
- Allow empty arrays as arguments (#1154)
- Fix column renaming in groupby keys (#1151)
- Ensure that we only cast if timezone is not None (#1147)
- Fix location of conftest.py (#1107)
- TST/Make sure we drop tables during postgres testing (#1101)
- Fix misleading join error message (#1086)
- BUG/TST: Make hdfs an optional dependency (#1082)
- Memoization should include expression name where available (#1080)
The following people contributed to the 0.12.0 release:
    $ git shortlog -sn --no-merges v0.11.2..v0.12.0
        63  Phillip Cloud
         8  Jeff Reback
         2  Krisztián Szűcs
         2  Tory Haavik
         1  Anirudh
         1  Szucs Krisztian
         1  dlovell
         1  kwangin
0.11.0 (June 28, 2017)
This release brings initial Pandas backend support along with a number of bug fixes and reliability enhancements. We recommend that all users upgrade from earlier versions of Ibis.
- Experimental pandas backend to allow execution of ibis expression against pandas DataFrames
- Graphviz visualization of ibis expressions. Implements _repr_png_ for Jupyter Notebook functionality
- Ability to create a partitioned table from an ibis expression
- Support for missing operations in the SQLite backend: sqrt, power, variance, and standard deviation, regular expression functions, and missing power support for PostgreSQL
- Support for schemas inside databases with the PostgreSQL backend
- Appveyor testing on core ibis across all supported Python versions
- Ability to sort, group by and project columns according to positional index rather than only by name
- Added a
ibis.literal to allow user specification of literal types
- Fix broken conda recipe
- Fix incorrectly typed fillna operation
- Fix postgres boolean summary operations
- Fix kudu support to reflect client API Changes
- Fix equality of nested types and construction of nested types when the value type is specified as a string
- Deprecate passing integer values to the ibis.timestamp literal constructor; this will be removed in 0.12.0
- Added the admin_timeout parameter to the kudu client
    $ git shortlog --summary --numbered v0.10.0..v0.11.0
        58  Phillip Cloud
         1  Greg Rahn
         1  Marius van Niekerk
         1  Tarun Gogineni
         1  Wes McKinney
0.8 (May 19, 2016)
This release brings initial PostgreSQL backend support along with a number of critical bug fixes and usability improvements. As several correctness bugs with the SQL compiler were fixed, we recommend that all users upgrade from earlier versions of Ibis.
- Initial PostgreSQL backend contributed by Phillip Cloud.
- groupby as an alias for group_by to table expressions
- Fix an expression error when filtering based on a new field
- Fix Impala’s SQL compilation of using OR with compound filters
- Various fixes with the having(...) function in grouped table expressions
- Fix CTE (WITH) extraction inside
- ImportError on Python 2 when mock library not installed
- The deprecated ibis.make_client APIs have been removed
0.7 (March 16, 2016)
This release brings initial Kudu-Impala integration and improved Impala and SQLite support, along with several critical bug fixes.
- Apache Kudu (incubating) integration for Impala users. See the blog post for now. Will add some documentation here when possible.
- ibis.hdfs_connect for WebHDFS connections in secure (Kerberized) clusters without SSL enabled.
- Correctly compile aggregate expressions involving multiple subqueries.
To explain this last point in more detail, suppose you had:
    import ibis

    table = ibis.table([('flag', 'string'), ('value', 'double')], 'tbl')
    flagged = table[table.flag == '1']
    unflagged = table[table.flag == '0']

    fv = flagged.value
    uv = unflagged.value

    expr = (fv.mean() / fv.sum()) - (uv.mean() / uv.sum())
The last expression now generates the correct Impala or SQLite SQL:
    SELECT t0.`tmp` - t1.`tmp` AS `tmp`
    FROM (
      SELECT avg(`value`) / sum(`value`) AS `tmp`
      FROM tbl
      WHERE `flag` = '1'
    ) t0
      CROSS JOIN (
        SELECT avg(`value`) / sum(`value`) AS `tmp`
        FROM tbl
        WHERE `flag` = '0'
      ) t1
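As a sanity check on what the query above should return, the same arithmetic can be worked by hand in plain Python. This is only a sketch: the sample rows and the helper name mean_over_sum are invented for illustration, and Ibis itself is not used.

```python
# Sample rows are invented for illustration: (flag, value)
rows = [
    ("1", 2.0), ("1", 6.0),
    ("0", 1.0), ("0", 1.0), ("0", 1.0), ("0", 1.0),
]


def mean_over_sum(values):
    """Compute avg(value) / sum(value) for one filtered subset."""
    total = sum(values)
    return (total / len(values)) / total


flagged = [value for flag, value in rows if flag == "1"]
unflagged = [value for flag, value in rows if flag == "0"]

# Each ratio is aggregated over its own filtered subset, which is why the
# generated SQL needs two separate subqueries (t0 and t1).
t0 = mean_over_sum(flagged)
t1 = mean_over_sum(unflagged)
result = t0 - t1
print(result)
```

Because each side of the subtraction aggregates over a different filter, the two ratios cannot be computed in a single SELECT over tbl.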
- VARCHAR(n) Impala types now correctly map to Ibis string expressions
- Fix inappropriate projection-join-filter expression rewrites resulting in incorrect generated SQL.
- STORED AS PARQUET for
- Fixed several issues with Ibis dependencies (impyla, thriftpy, sasl, thrift_sasl), especially for secure clusters. Upgrading will pull in these new dependencies.
- Do not fail in ibis.impala.connect when trying to create the temporary Ibis database if no HDFS connection passed.
- Fix join predicate evaluation bug when column names overlap with table attributes.
- Fix handling of fully-materialized joins (aka select * joins) in SQLAlchemy / SQLite.
Thank you to all who contributed patches to this release.
    $ git log v0.6.0..v0.7.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        21 Wes McKinney
         1 Uri Laserson
         1 Kristopher Overholt
0.6 (December 1, 2015)
This release brings expanded pandas and Impala integration, including support for managing partitioned tables in Impala. See the new Ibis for Impala Users guide for more on using Ibis with Impala.
The Ibis for SQL Programmers guide also was written since the 0.5 release.
This release also includes bug fixes affecting generated SQL correctness. All users should upgrade as soon as possible.
- New integrated Impala functionality. See Ibis for Impala Users for more details on these things.
- Improved Impala-pandas integration. Create tables or insert into existing tables from pandas
- Partitioned table metadata management API. Add, drop, alter, and insert into table partitions.
- Added support for LOAD DATA DDL using the load_data function, also supporting partitioned tables.
- Modify table metadata (location, format, SerDe properties etc.) using
- Interrupting Impala expression execution with Control-C will attempt to cancel the running query with the server.
- Set the compression codec (e.g. snappy) used with
- Get and set query options for a client session with
- ImpalaTable.metadata method that parses the output of the DESCRIBE FORMATTED DDL to simplify table metadata inspection.
- ImpalaTable.column_stats to see computed table and partition statistics.
- invalidate_metadata DDL options and add COMPUTE INCREMENTAL STATS.
- substitute method for performing multiple value substitutions in an array or scalar expression.
- Division is by default true division, like Python 3, for all numeric data. This means that for SQL systems that use C-style division semantics, a CAST will be automatically inserted in the generated SQL.
- Easier joins on tables with overlapping column names. See Ibis for SQL Programmers.
- Expressions like string_expr[:3] now work as expected.
- coalesce instance method to all value expressions.
- execute method on expressions disables any default row limits.
- ImpalaTable.rename no longer mutates the calling table expression.
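The difference between the two division semantics mentioned in this list can be seen in plain Python. This is a minimal sketch that does not use Ibis; the function names are hypothetical, and the float() call stands in for the CAST that the generated SQL would contain.

```python
def c_style_divide(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C does."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def true_divide(a: int, b: int) -> float:
    """Cast one operand to floating point first, then divide."""
    return float(a) / b


print(c_style_divide(7, 2))  # integer result, fractional part dropped
print(true_divide(7, 2))     # fractional part preserved
```

With integer operands, the cast is what keeps the fractional part of the result.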
    $ git log v0.5.0..v0.6.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        46 Wes McKinney
         3 Uri Laserson
         1 Phillip Cloud
         1 mariusvniekerk
         1 Kristopher Overholt
0.5 (September 10, 2015)
Highlights in this release are the SQLite, Python 3, Impala UDA support, and an asynchronous execution API. There are also many usability improvements, bug fixes, and other new features.
- SQLite client and built-in function support
- Ibis now supports Python 3.4 as well as 2.6 and 2.7
- Ibis can utilize Impala user-defined aggregate (UDA) functions
- SQLAlchemy-based translation toolchain to enable more SQL engines having SQLAlchemy dialects to be supported
- Many window function usability improvements (nested analytic functions and deferred binding conveniences)
- More convenient aggregation with keyword arguments in
- Built preliminary wrapper API for MADLib-on-Impala
- std aggregation methods and support in Impala
- nullifzero numeric method for all SQL engines
- rename method to Impala tables (for renaming tables in the Hive metastore)
- ImpalaClient for session cleanup (#533)
- relabel method to table expressions
- insert method to Impala tables
- verify methods to all expressions to test compilation and ability to compile (since many operations are unavailable in SQLite, for example)
- Impala Ibis client creation now uses only
ibis.make_client has been deprecated
    $ git log v0.4.0..v0.5.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        55 Wes McKinney
         9 Uri Laserson
         1 Kristopher Overholt
0.4 (August 14, 2015)
- Add tooling to use Impala C++ scalar UDFs within Ibis (#262, #195)
- Support and testing for Kerberos-enabled secure HDFS clusters
- Many table functions can now accept functions as parameters (invoked on the calling table) to enhance composability and emulate late-binding semantics of languages (like R) that have non-standard evaluation (#460)
- notall reductions on boolean arrays, as well as
- topk now produces an analytic expression that is executable (as an aggregation) but can also be used as a filter as before (#392, #91)
- Added experimental database object “usability layer”, see
- compute_stats API to table expressions referencing physical Impala tables
- ImpalaClient to show query plan for an expression
- HDFS interface for superusers
- convert_base method to strings and integer types
- Add option to ImpalaClient.create_table to create empty partitioned tables
- ibis.cross_join can now join more than 2 tables at once
- ImpalaClient.raw_sql method for running naked SQL queries
- ImpalaClient.insert now validates schemas locally prior to sending query to cluster, for better usability.
- Add conda installation recipes
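The note in this list about table functions accepting functions as parameters describes a general pattern: a callable argument is invoked on the calling table instead of being evaluated up front. The sketch below illustrates that pattern only. MiniTable and its methods are hypothetical stand-ins written for this example and are not part of the Ibis API.

```python
from typing import Callable, Dict, List, Union

Row = Dict[str, float]


class MiniTable:
    """Hypothetical stand-in for a table expression; not an Ibis class."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def column(self, name: str) -> List[float]:
        return [row[name] for row in self.rows]

    def filter(
        self, predicate: Union[List[bool], Callable[["MiniTable"], List[bool]]]
    ) -> "MiniTable":
        # A callable is invoked on the calling table, so it binds late:
        # it sees whatever table it is applied to, not one fixed up front.
        mask = predicate(self) if callable(predicate) else predicate
        return MiniTable([row for row, keep in zip(self.rows, mask) if keep])


def positive(t: MiniTable) -> List[bool]:
    return [v > 0 for v in t.column("value")]


table = MiniTable([{"value": -1.0}, {"value": 2.0}, {"value": 3.0}])

# The lambda runs against the already-filtered table, with no need to
# name that intermediate table.
result = table.filter(positive).filter(lambda t: [v < 3 for v in t.column("value")])
print(result.rows)
```

Accepting callables this way lets operations be chained without assigning every intermediate table to a variable.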
    $ git log v0.3.0..v0.4.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        38 Wes McKinney
         9 Uri Laserson
         2 Meghana Vuyyuru
         2 Kristopher Overholt
         1 Marius van Niekerk
0.3 (July 20, 2015)
First public release. See http://ibis-project.org for more.
- Implement window / analytic function support
- Enable non-equijoins (join clauses with operations other than
- Add remaining string functions supported by Impala.
- pipe method to tables (hat-tip to the pandas dev team).
- mutate convenience method to tables.
- Fleshed out WebHDFS implementations: get/put directories, move files, etc. See the full HDFS API.
- truncate method for timestamp values
- ImpalaClient can execute scalar expressions not involving any table.
- Can also create internal Impala tables with a specific HDFS path.
- Make Ibis’s temporary Impala database and HDFS paths configurable (see
- truncate_table function to client (if the user’s Impala cluster supports it).
- Python 2.6 compatibility
- Enable Ibis to execute concurrent queries in multithreaded applications (earlier versions were not thread-safe).
- Test data load script in
- Add an internal operation type signature API to enhance developer productivity.
    $ git log v0.2.0..v0.3.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        59 Wes McKinney
        29 Uri Laserson
         4 Isaac Hodes
         2 Meghana Vuyyuru
0.2 (June 16, 2015)
- insert method on Ibis client for inserting data into existing tables.
- avro_file client methods for querying datasets not yet available in Impala
- HDFS client API for WebHDFS for writing files and directories to HDFS
- New timedelta API and improved timestamp data support
- histogram methods on numeric expressions
- category logical datatype for handling bucketed data, among other things
- summary API to numeric expressions
- value_counts convenience API to array expressions
- New string methods
contains for fuzzy and regex searching
- options.verbose option and configurable options.verbose_log callback function for improved query logging and visibility
- Support for new SQL built-in functions
- ibis.where for conditional logic (see also
- nullif method on value expressions
- New aggregate functions:
- where argument in aggregate functions
- Added group-by convenience
- Add default expression names to most aggregate functions
- New Impala database client helper methods
- list_tables searching / listing method
- sub, and other explicit arithmetic methods to value expressions
- New Ibis client and Impala connection workflow. Client now combined from an Impala connection and an optional HDFS connection
- Numerous expression API bug fixes and rough edges fixed
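One way to think about the where argument in aggregate functions is as a per-row filter applied before aggregating, similar in spirit to SQL's SUM(CASE WHEN ... THEN ... END). The sketch below models that idea in plain Python. It is not Ibis code, the function name conditional_sum is hypothetical, and the exact SQL that Ibis generates may differ.

```python
from typing import List, Optional


def conditional_sum(
    values: List[Optional[float]], where: List[bool]
) -> Optional[float]:
    """Sum values only where the flag is true, ignoring NULLs (None)."""
    # Rows that fail the condition become NULL, as CASE WHEN without ELSE does
    kept = [v if keep else None for v, keep in zip(values, where)]
    non_null = [v for v in kept if v is not None]
    # Like SQL SUM, return NULL when no non-NULL rows remain
    return sum(non_null) if non_null else None


values = [1.0, 2.0, None, 4.0]
flags = [True, False, True, True]

total = conditional_sum(values, flags)
empty = conditional_sum(values, [False, False, False, False])
print(total, empty)
```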
    $ git log v0.1.0..v0.2.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        71 Wes McKinney
         1 Juliet Hougland
         1 Isaac Hodes
0.1 (March 26, 2015)
First Ibis release.
- Expression DSL design and type system
- Expression to ImpalaSQL compiler toolchain
- Impala built-in function wrappers
    $ git log 84d0435..v0.1.0 --pretty=format:%aN | sort | uniq -c | sort -rn
        78 Wes McKinney
         1 srus
         1 Henry Robinson