FeatureAggregation(column, function, time_windows)
This class describes a single aggregation that is applied in a batch or stream window aggregate feature view.
function can be one of the predefined numeric aggregation functions, namely "count", "sum", "mean", "min", or "max". To use one of these numeric aggregations, pass its name as a string. Nulls are handled as in the corresponding Spark SQL function applied to the column, e.g. SUM/MEAN/MIN/MAX of all nulls is null and COUNT of all nulls is 0.
In addition to numeric aggregations, FeatureAggregation supports “last-n” aggregations that compute the last N distinct values for the column by timestamp. Currently only string columns are supported as inputs to this aggregation, so the resulting feature value is a list of strings. Nulls are not included in the aggregated list.
You can use it via the last_distinct() helper function like this:
    from tecton.aggregation_functions import last_distinct

    my_fv = BatchWindowAggregateFeatureView(
        ...
        aggregations=[FeatureAggregation(
            column='my_column',
            function=last_distinct(15),
            time_windows=['7days'])],
        ...
    )
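To make the “last-n” semantics concrete, here is a rough plain-Python sketch of what "the last N distinct values by timestamp, with nulls excluded" could look like. It is not Tecton's implementation: `last_n_distinct` is a hypothetical function, the time window is not modeled, and returning the values from oldest to newest is an assumption, since the text above does not specify the order of the resulting list.

```python
from datetime import datetime
from typing import Iterable, List, Optional, Tuple

Row = Tuple[datetime, Optional[str]]


def last_n_distinct(rows: Iterable[Row], n: int) -> List[str]:
    """Return up to n distinct non-null values, chosen from the most recent rows."""
    if n <= 0:
        raise ValueError("n must be a positive integer")

    seen = set()
    newest_first: List[str] = []
    # Walk the rows from the most recent timestamp backwards.
    for _, value in sorted(rows, key=lambda row: row[0], reverse=True):
        if value is None or value in seen:
            continue  # skip nulls and values that were already collected
        seen.add(value)
        newest_first.append(value)
        if len(newest_first) == n:
            break
    # Assumption: present the result from oldest to newest.
    return list(reversed(newest_first))


events = [
    (datetime(2021, 1, 1), "a"),
    (datetime(2021, 1, 2), "b"),
    (datetime(2021, 1, 3), None),
    (datetime(2021, 1, 4), "a"),
    (datetime(2021, 1, 5), "c"),
]

print(last_n_distinct(events, 2))   # ['a', 'c']
print(last_n_distinct(events, 15))  # ['b', 'a', 'c']
```

With N larger than the number of distinct values, as in the second call, the list simply contains every distinct non-null value that was seen.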
__init__(column, function, time_windows)
Initialize self. See help(type(self)) for accurate signature.