This class describes a single aggregation that is applied in a batch or stream feature view.
The Aggregation constructor accepts a function input, which can be one of the built-in aggregation functions. For these functions, you can pass the function name as a string. Nulls are handled as in Spark SQL Function(column): for example, the sum of all nulls is null and the count of all nulls is 0.
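The null handling described above can be illustrated with a small plain-Python sketch. It does not call Tecton or Spark; the helper names `sql_sum` and `sql_count` are hypothetical and exist only to mimic how Spark SQL's SUM and COUNT treat nulls (represented here as `None`).

```python
from typing import Optional, Sequence


def sql_sum(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum that skips nulls and returns None when every input is null."""
    non_null = [v for v in values if v is not None]
    return sum(non_null) if non_null else None


def sql_count(values: Sequence[Optional[float]]) -> int:
    """Count of non-null inputs, so an all-null column counts as 0."""
    return sum(1 for v in values if v is not None)


all_nulls = [None, None, None]
mixed = [1.0, None, 2.5]

print(sql_sum(all_nulls), sql_count(all_nulls))
print(sql_sum(mixed), sql_count(mixed))
```

In practice this means a sum feature can be missing for an entity whose window contains only nulls, while a count feature over the same window is still a number.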
In addition to numeric aggregations, Aggregation supports the last non-distinct and distinct N aggregations, which compute the last N non-distinct or distinct values of the column, ordered by timestamp. Currently only string columns are supported as input to these aggregations, so the resulting feature value is a list of strings. Values in the list are in ascending timestamp order. Nulls are not included in the aggregated list.
You can use it via the last_distinct() helper function like this:
from tecton.aggregation_functions import last_distinct, last
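To make the last-N semantics above concrete, here is a minimal plain-Python sketch that does not use Tecton. The function `last_n` and the `Event` alias are hypothetical names for illustration only. It also assumes that, in the distinct case, a repeated value is kept at the position of its most recent occurrence; the text above does not specify this detail.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (timestamp, value) pairs; a value of None stands for a null.
Event = Tuple[datetime, Optional[str]]


def last_n(events: List[Event], n: int, distinct: bool = False) -> List[str]:
    """Return the last n non-null values in ascending timestamp order."""
    # Drop nulls, then order the remaining events by timestamp.
    ordered = sorted((e for e in events if e[1] is not None), key=lambda e: e[0])
    values = [v for _, v in ordered if v is not None]
    if distinct:
        # Assumption: keep only the most recent occurrence of each value.
        seen = set()
        kept_newest_first = []
        for v in reversed(values):
            if v not in seen:
                seen.add(v)
                kept_newest_first.append(v)
        values = list(reversed(kept_newest_first))
    return values[-n:] if n > 0 else []


events = [
    (datetime(2023, 1, 1), "a"),
    (datetime(2023, 1, 2), "b"),
    (datetime(2023, 1, 3), None),
    (datetime(2023, 1, 4), "c"),
    (datetime(2023, 1, 5), "c"),
]

print(last_n(events, 2))                 # non-distinct: duplicates are kept
print(last_n(events, 2, distinct=True))  # distinct: duplicates collapse
```

The two calls differ only in whether the repeated "c" occupies one slot or two, which is the practical difference between the non-distinct and distinct variants.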
The attributes are the same as the __init__ method parameters. See below.
Method generated by attrs for class Aggregation.
str) – Column name of the feature we are aggregating.
datetime.timedelta) – Duration to aggregate over. Example:
str) – The name of this feature. Defaults to an autogenerated name, e.g.