tecton.FeatureAggregation

class tecton.FeatureAggregation(column, function, time_windows)

This class describes a single aggregation that is applied in a batch or stream window aggregate feature view.
Parameters
column (str) – Column name of the feature we are aggregating.
function (Union[str, AggregationFunction]) – One of the built-in aggregation functions.
time_windows (Union[str, List[str]]) – Duration to aggregate over in pytimeparse format. Examples: "30days", ["8hours", "30days", "365days"].
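As a rough illustration of how the time_windows values above map to durations, the following sketch converts them to timedelta objects. It is not pytimeparse and not Tecton's own parsing: window_to_timedelta and normalize_time_windows are hypothetical helpers that handle only single "<number><unit>" strings such as those in the examples.

```python
import re
from datetime import timedelta

# Hypothetical, simplified stand-in for pytimeparse: it only understands
# a single "<number><unit>" string such as "30days" or "8hours".
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}
_PATTERN = re.compile(r"^\s*(\d+)\s*(second|minute|hour|day|week)s?\s*$")


def window_to_timedelta(window: str) -> timedelta:
    """Convert one window string, e.g. "30days", to a timedelta."""
    match = _PATTERN.match(window)
    if match is None:
        raise ValueError(f"unsupported time window: {window!r}")
    amount, unit = match.groups()
    return timedelta(seconds=int(amount) * _UNIT_SECONDS[unit])


def normalize_time_windows(time_windows):
    """Accept a single string or a list of strings, like the parameter does."""
    if isinstance(time_windows, str):
        time_windows = [time_windows]
    return [window_to_timedelta(w) for w in time_windows]
```

Calling normalize_time_windows("30days") and normalize_time_windows(["8hours", "30days", "365days"]) both return lists, which mirrors the Union[str, List[str]] type of the parameter.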
function can be one of the predefined numeric aggregation functions: "count", "sum", "mean", "min", or "max". To use one of these, pass its name as a string. Nulls are handled as in the corresponding Spark SQL function applied to the column: SUM, MEAN, MIN, or MAX over all nulls is null, and COUNT over all nulls is 0.

In addition to numeric aggregations, FeatureAggregation supports “last-n” aggregations, which compute the last N distinct values for the column by timestamp. Currently only string columns are supported as inputs to this aggregation, so the resulting feature value is a list of strings. Nulls are not included in the aggregated list.

You can use it via the last_distinct() helper function like this:

    from tecton.aggregation_functions import last_distinct

    my_fv = BatchWindowAggregateFeatureView(
        ...
        aggregations=[FeatureAggregation(
            column='my_column',
            function=last_distinct(15),
            time_windows=['7days'])],
        ...
    )
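The null handling and “last-n” behavior described above can be sketched in plain Python. This is an illustration of the stated semantics only, not Tecton's implementation: aggregate and last_n_distinct are hypothetical helpers, and the order of the returned list (oldest to newest by most recent occurrence) is an assumption, since the text does not specify it.

```python
from typing import List, Optional, Sequence, Tuple


def aggregate(values: Sequence[Optional[float]], function: str) -> Optional[float]:
    """Numeric aggregation with the null handling described in the text."""
    non_null = [v for v in values if v is not None]
    if function == "count":
        return len(non_null)  # COUNT of all nulls is 0
    if not non_null:
        return None  # SUM/MEAN/MIN/MAX of all nulls is null
    if function == "sum":
        return sum(non_null)
    if function == "mean":
        return sum(non_null) / len(non_null)
    if function == "min":
        return min(non_null)
    if function == "max":
        return max(non_null)
    raise ValueError(f"unknown aggregation function: {function!r}")


def last_n_distinct(rows: Sequence[Tuple[int, Optional[str]]], n: int) -> List[str]:
    """Return the last n distinct non-null values from (timestamp, value) rows.

    Assumption: the result is ordered oldest to newest by each value's
    most recent occurrence.
    """
    seen: List[str] = []
    # Walk from the newest row to the oldest, keeping first sightings only.
    for _, value in sorted(rows, key=lambda row: row[0], reverse=True):
        if value is None or value in seen:
            continue  # nulls are skipped; repeats are not distinct
        seen.append(value)
        if len(seen) == n:
            break
    return list(reversed(seen))
```

For example, with rows [(1, "a"), (2, "b"), (3, None), (4, "a"), (5, "c")] and n=2, the null at timestamp 3 is ignored and the two most recent distinct values are "a" and "c".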
Methods

__init__(column, function, time_windows)
Initialize self. See help(type(self)) for accurate signature.