Skip to main content
Version: 0.5

Time-Window Aggregation Functions Reference

Time-window aggregation functions are built-in functions that are used by defining an Aggregation object in a Batch Feature View or a Stream Feature View.

This page is a reference that contains the available time-window aggregation functions. The aggregation functions discussed on this page are either available exclusively under the tecton.aggregation_functions namespace or can only be specified through string representations. For specific examples of how to use these functions, please refer to the examples provided under each aggregation function.

count

An aggregation function that returns, for a materialization time window, the number of row values for a column, per entity value (such as a user_id value). Null values are excluded.

Supported Data Platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Tecton on Spark: All types
  • Tecton on Snowflake: All types

Output column types

  • Int64

Usage

To use this aggregation, define an Aggregation object, using function="count", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="transaction_id", function="count", time_window=timedelta(days=1))

last_distinct(n)

An aggregation function that returns, for a materialization time window, the last N distinct row values for a column, per entity value (such as a user_id value).

For example, if the last 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

note

The output sequence is in ascending order based on the timestamp.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)

Input column types

  • String

Output column type

  • Array[String]

Usage

Import this aggregation with from tecton.aggregation_functions import last_distinct.

Then, define an Aggregation object, using function=last_distinct(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=last_distinct(2), time_window=timedelta(days=1))

max

An aggregation function that returns, for a materialization time window, the maximum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="max", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="max", time_window=timedelta(days=1))

mean

An aggregation function that returns, for a materialization time window, the mean of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="mean", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="mean", time_window=timedelta(days=1))

min

An aggregation function that returns, for a materialization time window, the minimum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="min", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="min", time_window=timedelta(days=1))

stddev_pop

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the population mean, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_pop", time_window=timedelta(days=1))

stddev_samp

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the sample mean, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_samp", time_window=timedelta(days=1))

sum

An aggregation function that returns, for a materialization time window, the sum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Int64 or Float64

Usage

To use this aggregation, define an Aggregation object, using function="sum", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="sum", time_window=timedelta(days=1))

var_pop

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the population mean, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_pop", time_window=timedelta(days=1))

var_samp

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the sample mean, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_samp", time_window=timedelta(days=1))

Was this page helpful?

Happy React is loading...

🧠 Hi! Ask me anything about Tecton!

Floating button icon