Version: 0.6

Time-Window Aggregation Functions Reference

Time-window aggregation functions are built-in functions that you apply by defining an Aggregation object in a Batch Feature View or a Stream Feature View.

This page is a reference for the available time-window aggregation functions. Each function is specified in one of two ways: by importing it from the tecton.aggregation_functions namespace (for example, first_distinct) or by passing its name as a string (for example, function="count"). Each section below includes a usage example.

count​

An aggregation function that returns, for a materialization time window, the number of row values for a column, per entity value (such as a user_id value). Null values are excluded.

Input column types

  • Tecton on Spark: All types
  • Tecton on Snowflake: All types

Output column type

  • Int64

Usage

To use this aggregation, define an Aggregation object, using function="count", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="transaction_id", function="count", time_window=timedelta(days=1))
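The counting rule described above (per entity, within the window, nulls excluded) can be illustrated in plain Python. This is only a sketch of the semantics, not Tecton's implementation: the helper name count_in_window, the row layout, and the half-open window boundary are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def count_in_window(rows, entity, window_end, window):
    """Count non-null values for one entity in [window_end - window, window_end).

    rows is a list of (user_id, timestamp, value) tuples (hypothetical layout).
    The half-open boundary is an assumption of this sketch.
    """
    window_start = window_end - window
    return sum(
        1
        for user_id, ts, value in rows
        if user_id == entity
        and window_start <= ts < window_end
        and value is not None  # null values are excluded from the count
    )

rows = [
    ("u1", datetime(2023, 1, 1, 9), "t1"),
    ("u1", datetime(2023, 1, 1, 12), None),     # null value: not counted
    ("u1", datetime(2023, 1, 1, 18), "t2"),
    ("u2", datetime(2023, 1, 1, 10), "t3"),     # different entity
    ("u1", datetime(2022, 12, 30, 10), "t0"),   # outside the 1-day window
]

result = count_in_window(rows, "u1", datetime(2023, 1, 2), timedelta(days=1))
print(result)
```

Only the two non-null u1 rows inside the window contribute to the result.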

first_distinct(n)​

An aggregation function that returns, for a materialization time window, the first N distinct row values for a column, per entity value (such as a user_id value).

For example, if the first 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

note

The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]

Usage

Import this aggregation with from tecton.aggregation_functions import first_distinct.

Then, define an Aggregation object, using function=first_distinct(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=first_distinct(2), time_window=timedelta(days=1))
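To make the "first N distinct values, in ascending timestamp order" rule concrete, here is a plain-Python sketch of the semantics. It is not Tecton's implementation; the helper name first_distinct_sketch and the (timestamp, value) row layout are hypothetical, and integer timestamps are used for brevity.

```python
def first_distinct_sketch(rows, n):
    """Return the first n distinct values from (timestamp, value) rows.

    Rows are visited in ascending timestamp order and a value is kept
    only the first time it appears.
    """
    seen = set()
    result = []
    for _, value in sorted(rows, key=lambda row: row[0]):
        if value not in seen:
            seen.add(value)
            result.append(value)
            if len(result) == n:
                break
    return result

# Rows for a single entity within one time window, deliberately unsorted.
rows = [(3, "20"), (1, "10"), (2, "10"), (4, "30")]
first_two_distinct = first_distinct_sketch(rows, 2)
print(first_two_distinct)
```

The repeated "10" at timestamp 2 is skipped, so the second slot goes to "20".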

first(n)​

An aggregation function that returns, for a materialization time window, the first N row values for a column, per entity value (such as a user_id value).

For example, if the first 2 row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

note

The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]

Usage

Import this aggregation with from tecton.aggregation_functions import first.

Then, define an Aggregation object, using function=first(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=first(2), time_window=timedelta(days=1))
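Unlike first_distinct(n), first(n) keeps duplicates. The following plain-Python sketch illustrates that behavior; it is not Tecton's implementation, and the helper name first_n_sketch and the (timestamp, value) row layout are hypothetical.

```python
def first_n_sketch(rows, n):
    """Return the first n values from (timestamp, value) rows, duplicates included."""
    ordered = sorted(rows, key=lambda row: row[0])  # ascending timestamp order
    return [value for _, value in ordered[:n]]

# Rows for a single entity within one time window, deliberately unsorted.
rows = [(3, "20"), (1, "10"), (2, "10"), (4, "30")]
first_two = first_n_sketch(rows, 2)
print(first_two)
```

Both of the earliest rows carry the value "10", so the value appears twice in the output.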

last_distinct(n)​

An aggregation function that returns, for a materialization time window, the last N distinct row values for a column, per entity value (such as a user_id value).

For example, if the last 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

note

The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]

Usage

Import this aggregation with from tecton.aggregation_functions import last_distinct.

Then, define an Aggregation object, using function=last_distinct(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=last_distinct(2), time_window=timedelta(days=1))
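The following plain-Python sketch illustrates one reading of "last N distinct values, in ascending timestamp order". It is not Tecton's implementation. It assumes that a repeated value is placed at its most recent occurrence, which this page does not state explicitly; the helper name last_distinct_sketch and the (timestamp, value) row layout are also hypothetical.

```python
def last_distinct_sketch(rows, n):
    """Return the last n distinct values from (timestamp, value) rows.

    Assumption of this sketch: a repeated value is positioned at its
    latest occurrence. The output is in ascending timestamp order.
    """
    seen = set()
    newest_first = []
    # Walk from the newest row to the oldest, keeping each value once.
    for _, value in sorted(rows, key=lambda row: row[0], reverse=True):
        if value not in seen:
            seen.add(value)
            newest_first.append(value)
            if len(newest_first) == n:
                break
    return newest_first[::-1]  # restore ascending timestamp order

# Rows for a single entity within one time window.
rows = [(1, "10"), (2, "20"), (3, "30"), (4, "30")]
last_two_distinct = last_distinct_sketch(rows, 2)
print(last_two_distinct)
```

The two most recent rows share the value "30", so the second distinct value comes from the row at timestamp 2.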

last​

An aggregation function that returns, for a materialization time window, the last row value for a column, per entity value (such as a user_id value).

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64, Bool, String

Output column type

  • Int64, Float64, Bool, String

Usage

To use this aggregation, define an Aggregation object, using function="last", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="last", time_window=timedelta(days=1))

last(n)​

An aggregation function that returns, for a materialization time window, the last N row values for a column, per entity value (such as a user_id value).

For example, if the last 2 row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].

note

The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]

Usage

Import this aggregation with from tecton.aggregation_functions import last.

Then, define an Aggregation object using function=last(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function=last(2), time_window=timedelta(days=1))
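Unlike last_distinct(n), last(n) keeps duplicates. The following plain-Python sketch illustrates that behavior; it is not Tecton's implementation, and the helper name last_n_sketch and the (timestamp, value) row layout are hypothetical.

```python
def last_n_sketch(rows, n):
    """Return the last n values from (timestamp, value) rows, duplicates included.

    The output stays in ascending timestamp order.
    """
    ordered = sorted(rows, key=lambda row: row[0])  # ascending timestamp order
    return [value for _, value in ordered[-n:]]

# Rows for a single entity within one time window.
rows = [(1, "10"), (2, "20"), (3, "30"), (4, "30")]
last_two = last_n_sketch(rows, 2)
print(last_two)
```

Both of the most recent rows carry the value "30", so the value appears twice in the output.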

max​

An aggregation function that returns, for a materialization time window, the maximum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="max", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="max", time_window=timedelta(days=1))

mean​

An aggregation function that returns, for a materialization time window, the mean of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="mean", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="mean", time_window=timedelta(days=1))

min​

An aggregation function that returns, for a materialization time window, the minimum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="min", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="min", time_window=timedelta(days=1))

stddev_pop​

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the population mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_pop", time_window=timedelta(days=1))

stddev_samp​

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the sample mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_samp", time_window=timedelta(days=1))
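The difference between stddev_pop and stddev_samp is the divisor: the population form divides the sum of squared deviations by the number of rows, the sample form by the number of rows minus one. The sketch below uses Python's standard statistics module on a hypothetical list of amounts to show the two formulas side by side. It illustrates the math only and is not Tecton's implementation.

```python
import math
import statistics

# Hypothetical "amt" values for one entity within one time window.
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation: sqrt(sum((x - mean)^2) / n)
pop = statistics.pstdev(amounts)

# Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))
samp = statistics.stdev(amounts)

print(pop, samp)
```

For the same data, the sample form is always at least as large as the population form, and the gap shrinks as the number of rows grows.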

sum​

An aggregation function that returns, for a materialization time window, the sum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64

Output column type

  • Int64 or Float64

Usage

To use this aggregation, define an Aggregation object, using function="sum", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="sum", time_window=timedelta(days=1))

var_pop​

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the population mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_pop", time_window=timedelta(days=1))

var_samp​

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the sample mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_samp", time_window=timedelta(days=1))
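As with the standard deviation functions, var_pop and var_samp differ only in the divisor. The plain-Python sketch below writes out both formulas; the helper names var_pop_sketch and var_samp_sketch and the sample data are hypothetical, and the code illustrates the math rather than Tecton's implementation.

```python
def var_pop_sketch(values):
    """Population variance: sum of squared deviations divided by n."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def var_samp_sketch(values):
    """Sample variance: sum of squared deviations divided by n - 1.

    Requires at least two values, otherwise the divisor is zero.
    """
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

# Hypothetical "amt" values for one entity within one time window.
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
pop_variance = var_pop_sketch(amounts)
samp_variance = var_samp_sketch(amounts)
print(pop_variance, samp_variance)
```

Taking the square root of either result gives the corresponding standard deviation.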
