ibis.expr.api.TableExpr.projection

TableExpr.projection(exprs)

Compute a new table expression containing the indicated column expressions from this table.

Parameters:
exprs : column expression, string, or list of column expressions and strings
    If strings are passed, they must name columns that already exist in the table.

Returns:
projection : TableExpr

Notes

Passing an aggregate expression to this method broadcasts the aggregate’s value to every row of the table. See the examples section for more details.

Examples

Simple projection

>>> import ibis
>>> fields = [('a', 'int64'), ('b', 'double')]
>>> t = ibis.table(fields, name='t')
>>> proj = t.projection([t.a, (t.b + 1).name('b_plus_1')])
>>> proj  # doctest: +NORMALIZE_WHITESPACE
ref_0
UnboundTable[table]
  name: t
  schema:
    a : int64
    b : float64
<BLANKLINE>
Selection[table]
  table:
    Table: ref_0
  selections:
    a = Column[int64*] 'a' from table
      ref_0
    b_plus_1 = Add[float64*]
      left:
        b = Column[float64*] 'b' from table
          ref_0
      right:
        Literal[int8]
          1
>>> proj2 = t[t.a, (t.b + 1).name('b_plus_1')]
>>> proj.equals(proj2)
True

Aggregate projection

>>> agg_proj = t[t.a.sum().name('sum_a'), t.b.mean().name('mean_b')]
>>> agg_proj  # doctest: +NORMALIZE_WHITESPACE, +ELLIPSIS
ref_0
UnboundTable[table]
  name: t
  schema:
    a : int64
    b : float64
<BLANKLINE>
Selection[table]
  table:
    Table: ref_0
  selections:
    sum_a = WindowOp[int64*]
      sum_a = Sum[int64]
        a = Column[int64*] 'a' from table
          ref_0
        where:
          None
      <ibis.expr.window.Window object at 0x...>
    mean_b = WindowOp[float64*]
      mean_b = Mean[float64]
        b = Column[float64*] 'b' from table
          ref_0
        where:
          None
      <ibis.expr.window.Window object at 0x...>

Note the <ibis.expr.window.Window> objects here. Their presence means that the result of each aggregation is broadcast to every row of the input column. The purpose of this expression rewrite is to make it easy to write column/scalar-aggregate operations like

t[(t.a - t.a.mean()).name('demeaned_a')]
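
To make the broadcast semantics concrete, here is a minimal plain-Python sketch of the values a demeaned_a projection is expected to yield. It is an analogy only: it is not ibis code and does not reflect how ibis or any backend evaluates the expression. The helper name demean and the sample data are hypothetical and do not appear in the ibis API.

```python
from statistics import mean


def demean(values):
    """Subtract the column mean from every row (hypothetical helper)."""
    # The aggregate reduces the whole column to a single scalar ...
    col_mean = mean(values)
    # ... which is then repeated once per input row (the "broadcast" step).
    broadcast = [col_mean] * len(values)
    # Row-wise subtraction of the column and the broadcast aggregate.
    return [v - m for v, m in zip(values, broadcast)]


# Sample data standing in for column `a` (an assumption for illustration).
a = [1, 2, 3, 6]
demeaned_a = demean(a)
print(demeaned_a)
```

The result has one value per input row, which is why the aggregate has to be broadcast rather than returned as a single scalar.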