Column selectors

Choose Table columns based on dtype, regex, and other criteria

where

selectors.where(predicate)

Select columns that satisfy predicate.

Use this selector when none of the other selectors meets your needs.

Parameters

Name Type Description Default
predicate Callable[[ir.Value], bool] A callable that accepts an ibis value expression and returns a bool required

Examples

>>> import ibis
>>> import ibis.selectors as s
>>> t = ibis.table(dict(a="float32"), name="t")
>>> expr = t.select(s.where(lambda col: col.get_name() == "a"))
>>> expr.columns
['a']

numeric

selectors.numeric()

Return numeric columns.

Examples

>>> import ibis
>>> import ibis.selectors as s
>>> t = ibis.table(dict(a="int", b="string", c="array<string>"), name="t")
>>> t
UnboundTable: t
  a int64
  b string
  c array<string>
>>> expr = t.select(s.numeric())  # `a` has integer type, so it's numeric
>>> expr.columns
['a']

See Also

of_type

of_type

selectors.of_type(dtype)

Select columns of type dtype.

Parameters

Name Type Description Default
dtype dt.DataType | str | type[dt.DataType] DataType instance, str or DataType class required

Examples

Select according to a specific DataType instance

>>> import ibis
>>> import ibis.expr.datatypes as dt
>>> import ibis.selectors as s
>>> t = ibis.table(
...     dict(name="string", siblings="array<string>", parents="array<int64>")
... )
>>> expr = t.select(s.of_type(dt.Array(dt.string)))
>>> expr.columns
['siblings']

Strings are also accepted

>>> expr = t.select(s.of_type("array<string>"))
>>> expr.columns
['siblings']

Abstract/unparametrized types may also be specified by their string name (e.g. “integer” for any integer type), or by passing in a DataType class instead. The following options are equivalent.

>>> expr1 = t.select(s.of_type("array"))
>>> expr2 = t.select(s.of_type(dt.Array))
>>> expr1.equals(expr2)
True
>>> expr2.columns
['siblings', 'parents']

See Also

numeric

startswith

selectors.startswith(prefixes)

Select columns whose name starts with one of prefixes.

Parameters

Name Type Description Default
prefixes str | tuple[str, ...] Prefixes to compare column names against required

Examples

>>> import ibis
>>> import ibis.selectors as s
>>> t = ibis.table(dict(apples="int", oranges="float", bananas="bool"), name="t")
>>> expr = t.select(s.startswith(("a", "b")))
>>> expr.columns
['apples', 'bananas']

See Also

endswith

endswith

selectors.endswith(suffixes)

Select columns whose name ends with one of suffixes.

Parameters

Name Type Description Default
suffixes str | tuple[str, ...] Suffixes to compare column names against required
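
No example accompanies endswith, so here is a minimal plain-Python sketch of the suffix-matching rule. It is a model of the behaviour, not ibis's implementation: the helper select_endswith is hypothetical, and the column names are borrowed from the penguins dataset used elsewhere on this page. With an ibis table t, the corresponding call would be t.select(s.endswith(suffixes)).

```python
# Plain-Python model of suffix selection.
# `select_endswith` is a hypothetical helper, not part of ibis.
def select_endswith(columns, suffixes):
    # str.endswith accepts either a single str or a tuple of str,
    # which mirrors the documented `str | tuple[str, ...]` parameter.
    return [name for name in columns if name.endswith(suffixes)]


columns = ["bill_length_mm", "bill_depth_mm", "body_mass_g", "year"]

# A single suffix.
mm_columns = select_endswith(columns, "_mm")

# Several suffixes: a name is kept if it ends with any one of them.
measured_columns = select_endswith(columns, ("_mm", "_g"))

print(mm_columns)
print(measured_columns)
```

Passing a tuple keeps a column as soon as one suffix matches, which is the same "one of suffixes" wording used in the description.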

See Also

startswith

contains

selectors.contains(needles, how=any)

Return columns whose name contains needles.

Parameters

Name Type Description Default
needles str | tuple[str, ...] One or more strings to search for in column names required
how Callable[[Iterable[bool]], bool] A boolean reduction that combines the per-needle match results for each column name: any selects a column if at least one needle matches, all only if every needle matches. any

Examples

Select columns that contain either "a" or "b"

>>> import ibis
>>> import ibis.selectors as s
>>> t = ibis.table(
...     dict(
...         a="int64", b="string", c="float", d="array<int16>", ab="struct<x: int>"
...     )
... )
>>> expr = t.select(s.contains(("a", "b")))
>>> expr.columns
['a', 'b', 'ab']

Select columns that contain all of "a" and "b", that is, both "a" and "b" must be in each column’s name to match.

>>> expr = t.select(s.contains(("a", "b"), how=all))
>>> expr.columns
['ab']
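
To make the role of how concrete, the sketch below models the selection in plain Python. It is an illustration under the assumption that how is simply called on one boolean per needle; select_contains is a hypothetical helper and not the ibis implementation.

```python
# Plain-Python model of `contains`; `select_contains` is hypothetical.
def select_contains(columns, needles, how=any):
    # Accept a single string as well as a tuple of strings.
    if isinstance(needles, str):
        needles = (needles,)
    # `how` reduces one boolean per needle to a single keep/drop decision.
    return [
        name
        for name in columns
        if how(needle in name for needle in needles)
    ]


columns = ["a", "b", "c", "d", "ab"]

either = select_contains(columns, ("a", "b"))         # default: how=any
both = select_contains(columns, ("a", "b"), how=all)  # every needle must match

print(either)  # ['a', 'b', 'ab']
print(both)    # ['ab']
```

Because how is typed as Callable[[Iterable[bool]], bool], any reduction with that shape fits the signature, not only the built-in any and all.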

See Also

matches

matches

selectors.matches(regex)

Return columns whose name matches the regular expression regex.

Parameters

Name Type Description Default
regex str | re.Pattern A string or re.Pattern object required

Examples

>>> import ibis
>>> import ibis.selectors as s
>>> t = ibis.table(dict(ab="string", abd="int", be="array<string>"))
>>> expr = t.select(s.matches(r"ab+"))
>>> expr.columns
['ab', 'abd']
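
The output shows that 'abd' is selected by r"ab+", so the pattern does not have to consume the whole column name. The plain-Python sketch below models that behaviour with re.search; whether ibis uses search or match internally is an assumption here, and select_matches is a hypothetical helper.

```python
import re


# Plain-Python model of `matches`; `select_matches` is hypothetical.
def select_matches(columns, regex):
    # re.compile accepts both a str and an already compiled re.Pattern.
    pattern = re.compile(regex)
    # Assumption: a column is kept if the pattern is found anywhere in its name.
    return [name for name in columns if pattern.search(name) is not None]


columns = ["ab", "abd", "be"]

unanchored = select_matches(columns, r"ab+")
anchored = select_matches(columns, re.compile(r"^ab$"))

print(unanchored)  # ['ab', 'abd']
print(anchored)    # ['ab']
```

Anchoring the pattern with ^ and $ is the usual way to require that the entire name match.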

See Also

contains

any_of

selectors.any_of(*predicates)

Include columns satisfying any of predicates.

all_of

selectors.all_of(*predicates)

Include columns satisfying all of predicates.
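
The difference between any_of and all_of is the difference between a union and an intersection of predicates. The following plain-Python sketch models that with predicates over column names; model_any_of and model_all_of are hypothetical stand-ins, not the ibis functions, which operate on selectors rather than on bare callables.

```python
# Plain-Python model of combining predicates.
# `model_any_of` and `model_all_of` are hypothetical helpers.
def model_any_of(*predicates):
    # Keep a column if at least one predicate accepts it.
    return lambda name: any(pred(name) for pred in predicates)


def model_all_of(*predicates):
    # Keep a column only if every predicate accepts it.
    return lambda name: all(pred(name) for pred in predicates)


def starts_with_bill(name):
    return name.startswith("bill")


def ends_with_mm(name):
    return name.endswith("_mm")


columns = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]

union = [c for c in columns if model_any_of(starts_with_bill, ends_with_mm)(c)]
intersection = [c for c in columns if model_all_of(starts_with_bill, ends_with_mm)(c)]

print(union)         # ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm']
print(intersection)  # ['bill_length_mm', 'bill_depth_mm']
```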

c

selectors.c(*names)

Select specific column names.

across

selectors.across(selector, func, names=None)

Apply data transformations across multiple columns.

Parameters

Name Type Description Default
selector Selector | Iterable[str] | str An expression that selects columns on which the transformation function will be applied, an iterable of str column names or a single str column name. required
func Deferred | Callable[[ir.Value], ir.Value] | Mapping[str | None, Deferred | Callable[[ir.Value], ir.Value]] A function (or dictionary of functions) to use to transform the data. required
names str | Callable[[str, str | None], str] | None A lambda function or a format string to name the columns created by the transformation function. None

Returns

Type Description
Across An Across selector object
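
When names is a format string such as "{fn}_{col}", the output column names in the across example are consistent with filling {fn} with the key from the function dictionary and {col} with the original column name. The sketch below reproduces that naming with str.format in plain Python; it is a model under that assumption, not the ibis implementation.

```python
# Plain-Python model of how a `names` format string could be expanded.
template = "{fn}_{col}"

# Keys of the function mapping passed to across, e.g. dict(centered=...).
function_names = ["centered"]

# Columns picked by the selector.
selected_columns = ["bill_length_mm", "bill_depth_mm"]

# One output name per (column, function) pair.
new_names = [
    template.format(fn=fn, col=col)
    for col in selected_columns
    for fn in function_names
]

print(new_names)  # ['centered_bill_length_mm', 'centered_bill_depth_mm']
```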

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> from ibis import _, selectors as s
>>> t = ibis.examples.penguins.fetch()
>>> t.select(s.startswith("bill")).mutate(
...     s.across(s.numeric(), dict(centered=_ - _.mean()), names="{fn}_{col}")
... )
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ bill_length_mm ┃ bill_depth_mm ┃ centered_bill_length_mm ┃ centered_bill_depth_mm ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ float64        │ float64       │ float64                 │ float64                │
├────────────────┼───────────────┼─────────────────────────┼────────────────────────┤
│           39.1 │          18.7 │                -4.82193 │                1.54883 │
│           39.5 │          17.4 │                -4.42193 │                0.24883 │
│           40.3 │          18.0 │                -3.62193 │                0.84883 │
│            nan │           nan │                     nan │                    nan │
│           36.7 │          19.3 │                -7.22193 │                2.14883 │
│           39.3 │          20.6 │                -4.62193 │                3.44883 │
│           38.9 │          17.8 │                -5.02193 │                0.64883 │
│           39.2 │          19.6 │                -4.72193 │                2.44883 │
│           34.1 │          18.1 │                -9.82193 │                0.94883 │
│           42.0 │          20.2 │                -1.92193 │                3.04883 │
│              … │             … │                       … │                      … │
└────────────────┴───────────────┴─────────────────────────┴────────────────────────┘

if_any

selectors.if_any(selector, predicate)

Return the disjunction (logical OR) of predicate applied to every column chosen by selector.

Parameters

Name Type Description Default
selector Selector A column selector required
predicate Deferred | Callable A callable or deferred object defining a predicate to apply to each column from selector. required

Examples

>>> import ibis
>>> from ibis import selectors as s, _
>>> ibis.options.interactive = True
>>> penguins = ibis.examples.penguins.fetch()
>>> cols = s.across(s.endswith("_mm"), (_ - _.mean()) / _.std())
>>> expr = penguins.mutate(cols).filter(s.if_any(s.endswith("_mm"), _.abs() > 2))
>>> expr_by_hand = penguins.mutate(cols).filter(
...     (_.bill_length_mm.abs() > 2)
...     | (_.bill_depth_mm.abs() > 2)
...     | (_.flipper_length_mm.abs() > 2)
... )
>>> expr.equals(expr_by_hand)
True
>>> expr
┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string │ float64        │ float64       │ float64           │ int64       │ string │ int64 │
├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Biscoe │      -1.102918 │      0.733585 │         -2.056265 │        3150 │ female │  2007 │
│ Gentoo  │ Biscoe │       1.113200 │     -0.430972 │          2.068326 │        5700 │ male   │  2007 │
│ Gentoo  │ Biscoe │       2.871441 │     -0.076542 │          2.068326 │        6050 │ male   │  2007 │
│ Gentoo  │ Biscoe │       1.900745 │     -0.734769 │          2.139439 │        5650 │ male   │  2008 │
│ Gentoo  │ Biscoe │       1.076570 │     -0.177807 │          2.068326 │        5700 │ male   │  2008 │
│ Gentoo  │ Biscoe │       0.856789 │     -0.582871 │          2.068326 │        5800 │ male   │  2008 │
│ Gentoo  │ Biscoe │       1.497815 │     -0.076542 │          2.068326 │        5550 │ male   │  2009 │
│ Gentoo  │ Biscoe │       1.387925 │     -0.430972 │          2.068326 │        5500 │ male   │  2009 │
│ Gentoo  │ Biscoe │       2.047266 │     -0.582871 │          2.068326 │        5850 │ male   │  2009 │
│ Adelie  │ Dream  │      -2.165189 │     -0.836035 │         -0.918447 │        3050 │ female │  2009 │
│ …       │ …      │              … │             … │                 … │           … │ …      │     … │
└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

if_all

selectors.if_all(selector, predicate)

Return the conjunction (logical AND) of predicate applied to every column chosen by selector.

Parameters

Name Type Description Default
selector Selector A column selector required
predicate Deferred | Callable A callable or deferred object defining a predicate to apply to each column from selector. required
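
The contrast between if_all and if_any comes down to how the per-column results are folded together: with AND for a conjunction, with OR for a disjunction. The plain-Python sketch below models that row by row. The rows hold made-up standardized values chosen only for illustration, and model_if_any and model_if_all are hypothetical helpers rather than ibis functions.

```python
from functools import reduce
import operator

# Made-up standardized measurements, for illustration only.
rows = [
    {"bill_length_mm": -1.2, "bill_depth_mm": 1.1, "flipper_length_mm": -1.4},
    {"bill_length_mm": 2.9, "bill_depth_mm": -0.1, "flipper_length_mm": 0.5},
    {"bill_length_mm": 0.3, "bill_depth_mm": 0.2, "flipper_length_mm": -0.4},
]

# Columns a selector such as endswith("_mm") would pick from these rows.
selected = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm"]


def predicate(value):
    return abs(value) > 1


def model_if_any(row):
    # Disjunction: fold the per-column results with OR.
    return reduce(operator.or_, (predicate(row[col]) for col in selected))


def model_if_all(row):
    # Conjunction: fold the per-column results with AND.
    return reduce(operator.and_, (predicate(row[col]) for col in selected))


any_mask = [model_if_any(row) for row in rows]
all_mask = [model_if_all(row) for row in rows]

print(any_mask)  # [True, True, False]
print(all_mask)  # [True, False, False]
```

This mirrors the hand-written filters in the examples, where the per-column comparisons are chained with | for if_any and with & for if_all.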

Examples

>>> import ibis
>>> from ibis import selectors as s, _
>>> ibis.options.interactive = True
>>> penguins = ibis.examples.penguins.fetch()
>>> cols = s.across(s.endswith("_mm"), (_ - _.mean()) / _.std())
>>> expr = penguins.mutate(cols).filter(s.if_all(s.endswith("_mm"), _.abs() > 1))
>>> expr_by_hand = penguins.mutate(cols).filter(
...     (_.bill_length_mm.abs() > 1)
...     & (_.bill_depth_mm.abs() > 1)
...     & (_.flipper_length_mm.abs() > 1)
... )
>>> expr.equals(expr_by_hand)
True
>>> expr
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ float64           │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Dream     │      -1.157863 │      1.088015 │         -1.416243 │        3300 │ female │  2007 │
│ Adelie  │ Torgersen │      -1.231123 │      1.138648 │         -1.202902 │        3900 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.149830 │     -1.443630 │          1.214962 │        5700 │ male   │  2007 │
│ Gentoo  │ Biscoe    │       1.039940 │     -1.089200 │          1.072735 │        4750 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.131515 │     -1.089200 │          1.712757 │        5000 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.241405 │     -1.089200 │          1.570530 │        5550 │ male   │  2008 │
│ Gentoo  │ Biscoe    │       1.351295 │     -1.494263 │          1.214962 │        5300 │ male   │  2009 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

r

selectors.r

Ranges of columns.

first

selectors.first()

Return the first column of a table.

last

selectors.last()

Return the last column of a table.

all

selectors.all()

Return every column from a table.
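
The positional selectors first, last and all have no examples here. As a rough plain-Python analogy, and not a description of how ibis implements them, they correspond to taking the first element, the last element, or every element of a table's ordered list of column names. The column list below is illustrative.

```python
# Plain-Python analogy for positional selection over ordered column names.
columns = ["species", "island", "bill_length_mm", "year"]

first_column = columns[:1]    # analogous to selecting with first()
last_column = columns[-1:]    # analogous to selecting with last()
every_column = list(columns)  # analogous to selecting with all()

print(first_column)  # ['species']
print(last_column)   # ['year']
print(every_column)  # ['species', 'island', 'bill_length_mm', 'year']
```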
