Version: 0.6

Time-Window Aggregation Functions Reference

Time-window aggregation functions are built-in functions. To use one, define an Aggregation object in a Batch Feature View or a Stream Feature View.

This page is a reference for the available time-window aggregation functions. Each function is specified in one of two ways: as a function imported from the tecton.aggregation_functions namespace, or as a string (for example, function="count"). A usage example is provided under each aggregation function.


count

An aggregation function that returns, for a materialization time window, the number of row values for a column, per entity value (such as a user_id value). Null values are excluded.

Input column types

  • Tecton on Spark: All types
  • Tecton on Snowflake: All types

Output column type

  • Int64


To use this aggregation, define an Aggregation object, using function="count", in a Batch Feature View or a Stream Feature View.


Aggregation(column="transaction_id", function="count", time_window=timedelta(days=1))
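The counting rule described above can be illustrated without Tecton. The following is a minimal plain-Python sketch, not Tecton's implementation: the row layout, the count_in_window helper, and the half-open window boundaries are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_in_window(rows, window_end, time_window):
    """Count non-null values per entity for timestamps in
    [window_end - time_window, window_end). Hypothetical helper."""
    window_start = window_end - time_window
    counts = Counter()
    for entity, ts, value in rows:
        # Skip rows outside the window and rows whose value is null.
        if window_start <= ts < window_end and value is not None:
            counts[entity] += 1
    return dict(counts)


# Assumed row layout: (entity value, timestamp, column value).
rows = [
    ("user_1", datetime(2023, 1, 1, 9), "txn_a"),
    ("user_1", datetime(2023, 1, 1, 12), None),       # null: excluded
    ("user_1", datetime(2023, 1, 1, 18), "txn_b"),
    ("user_2", datetime(2023, 1, 1, 10), "txn_c"),
    ("user_2", datetime(2022, 12, 30, 10), "txn_d"),  # outside the window
]

result = count_in_window(rows, datetime(2023, 1, 2), timedelta(days=1))
print(result)  # {'user_1': 2, 'user_2': 1}
```

Here user_1 has three rows in the window but only two non-null values, so the count is 2.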


first_distinct(n)

An aggregation function that returns, for a materialization time window, the first N distinct row values for a column, per entity value (such as a user_id value).

For example, if the first 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].


The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]

Import this aggregation with from tecton.aggregation_functions import first_distinct.

Then, define an Aggregation object, using function=first_distinct(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function=first_distinct(2), time_window=timedelta(days=1))
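As a rough illustration of the "first N distinct values, in ascending timestamp order" behavior, here is a plain-Python sketch for a single entity and a single window. It is not Tecton code: the first_n_distinct helper and the (timestamp, value) event layout with integer timestamps are hypothetical.

```python
def first_n_distinct(events, n):
    """Return the first n distinct values, ordered by ascending timestamp.

    Hypothetical helper; events is a list of (timestamp, value) pairs.
    """
    seen = set()
    out = []
    for _, value in sorted(events, key=lambda e: e[0]):
        if value not in seen:
            seen.add(value)
            out.append(value)
            if len(out) == n:
                break
    return out


# Values are strings, matching the String input column type.
events = [(3, "20"), (1, "10"), (2, "10"), (4, "30")]

print(first_n_distinct(events, 2))  # ['10', '20']
```

The second "10" is skipped because it was already seen, so the next distinct value, "20", fills the second slot.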


first(n)

An aggregation function that returns, for a materialization time window, the first N row values for a column, per entity value (such as a user_id value).

For example, if the first 2 row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].


The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]


Import this aggregation with from tecton.aggregation_functions import first.

Then, define an Aggregation object, using function=first(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function=first(2), time_window=timedelta(days=1))


last_distinct(n)

An aggregation function that returns, for a materialization time window, the last N distinct row values for a column, per entity value (such as a user_id value).

For example, if the last 2 distinct row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].


The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]


Import this aggregation with from tecton.aggregation_functions import last_distinct.

Then, define an Aggregation object, using function=last_distinct(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function=last_distinct(2), time_window=timedelta(days=1))
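The following plain-Python sketch illustrates one way to read "last N distinct values, output in ascending timestamp order" for a single entity and window. It is not Tecton's implementation. The last_n_distinct helper and the event layout are hypothetical, and the sketch assumes that a repeated value is positioned by its most recent occurrence, which this page does not specify.

```python
def last_n_distinct(events, n):
    """Return the last n distinct values, ordered by ascending timestamp.

    Hypothetical helper; events is a list of (timestamp, value) pairs.
    Assumption: a repeated value is placed at its most recent occurrence.
    """
    seen = set()
    newest_first = []
    # Walk from newest to oldest, keeping the first sighting of each value.
    for _, value in sorted(events, key=lambda e: e[0], reverse=True):
        if value not in seen:
            seen.add(value)
            newest_first.append(value)
            if len(newest_first) == n:
                break
    # Reverse so the output is in ascending timestamp order.
    return newest_first[::-1]


events = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]

print(last_n_distinct(events, 2))  # ['a', 'c']
```

Under the stated assumption, "a" counts at timestamp 3 rather than timestamp 1, so it is more recent than "b" and is kept.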


last

An aggregation function that returns, for a materialization time window, the last row value for a column, per entity value (such as a user_id value).

Not currently supported with:

  • Tecton on Snowflake
  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64, Bool, String

Output column type

  • Int64, Float64, Bool, String


To use this aggregation, define an Aggregation object, using function="last", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="last", time_window=timedelta(days=1))


last(n)

An aggregation function that returns, for a materialization time window, the last N row values for a column, per entity value (such as a user_id value).

For example, if the last 2 row values for a column, in the materialization time window, are 10 and 20, then the function returns [10,20].


The output sequence is in ascending order based on the timestamp.

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • String

Output column type

  • Array[String]


Import this aggregation with from tecton.aggregation_functions import last.

Then, define an Aggregation object, using function=last(n), where n is an integer > 0 and <= 1000, in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function=last(2), time_window=timedelta(days=1))
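To contrast first(n) and last(n) with their distinct variants, the sketch below shows the non-distinct behavior in plain Python for a single entity and window: duplicates are kept, and the output stays in ascending timestamp order. The first_n and last_n helpers and the (timestamp, value) layout are hypothetical and are not part of Tecton.

```python
def _values_by_time(events):
    """Return the values sorted by ascending timestamp."""
    return [value for _, value in sorted(events, key=lambda e: e[0])]


def first_n(events, n):
    """First n values in the window (duplicates kept). Hypothetical helper."""
    return _values_by_time(events)[:n]


def last_n(events, n):
    """Last n values in the window (duplicates kept). Hypothetical helper."""
    # n is assumed to be > 0, matching the documented constraint on n.
    return _values_by_time(events)[-n:]


events = [(1, "10"), (2, "10"), (3, "20"), (4, "30")]

print(first_n(events, 2))  # ['10', '10']
print(last_n(events, 2))   # ['20', '30']
```

Note that first_n returns the duplicate "10" twice, whereas a distinct variant would return "10" and "20".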


max

An aggregation function that returns, for a materialization time window, the maximum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String


To use this aggregation, define an Aggregation object, using function="max", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="max", time_window=timedelta(days=1))


mean

An aggregation function that returns, for a materialization time window, the mean of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64


To use this aggregation, define an Aggregation object, using function="mean", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="mean", time_window=timedelta(days=1))


min

An aggregation function that returns, for a materialization time window, the minimum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String


To use this aggregation, define an Aggregation object, using function="min", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="min", time_window=timedelta(days=1))


stddev_pop

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the population mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64


To use this aggregation, define an Aggregation object, using function="stddev_pop", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="stddev_pop", time_window=timedelta(days=1))


stddev_samp

An aggregation function that returns, for a materialization time window, the standard deviation of the row values for a column around the sample mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64


To use this aggregation, define an Aggregation object, using function="stddev_samp", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="stddev_samp", time_window=timedelta(days=1))
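The difference between stddev_pop and stddev_samp is the usual population versus sample distinction. As an illustration only, Python's standard statistics module exposes the same two definitions: pstdev divides by N and stdev divides by N - 1. The sample values below are made up, and this sketch says nothing about how Tecton computes the result internally.

```python
import statistics

# Made-up "amt" values for one entity within one time window.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

pop_std = statistics.pstdev(values)   # population: divide by N
samp_std = statistics.stdev(values)   # sample: divide by N - 1

print(pop_std)              # 2.0
print(samp_std > pop_std)   # True
```

For the same data, the sample standard deviation is always at least as large as the population one, and the gap shrinks as the number of rows in the window grows.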


sum

An aggregation function that returns, for a materialization time window, the sum of the row values for a column, per entity value (such as a user_id value).

Input column types

  • Int64, Int32, Float64

Output column type

  • Int64 or Float64


To use this aggregation, define an Aggregation object, using function="sum", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="sum", time_window=timedelta(days=1))


var_pop

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the population mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64


To use this aggregation, define an Aggregation object, using function="var_pop", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="var_pop", time_window=timedelta(days=1))


var_samp

An aggregation function that returns, for a materialization time window, the variance of the row values for a column around the sample mean, per entity value (such as a user_id value).

Not currently supported with:

  • Serverless Feature Retrieval with Athena

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64


To use this aggregation, define an Aggregation object, using function="var_samp", in a Batch Feature View or a Stream Feature View.


Aggregation(column="amt", function="var_samp", time_window=timedelta(days=1))
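The only difference between the var_pop and var_samp definitions is the divisor. The sketch below spells out the textbook formulas in plain Python; the variance helper and the sample values are hypothetical, and the code is not Tecton's implementation.

```python
def variance(values, sample):
    """Textbook variance. Hypothetical helper.

    sample=False divides by N (population variance);
    sample=True divides by N - 1 (sample variance), so it needs N >= 2.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return squared_deviations / divisor


# Made-up "amt" values for one entity within one time window.
values = [1.0, 2.0, 3.0, 4.0]

var_pop = variance(values, sample=False)
var_samp = variance(values, sample=True)

print(var_pop)  # 1.25
```

Here the squared deviations sum to 5.0, so the population variance is 5.0 / 4 and the sample variance is 5.0 / 3. Taking the square root of either value gives the corresponding standard deviation.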
