Version: 0.8

tecton.TimeWindow

Summary

This class describes a TimeWindow that is applied in an Aggregation within a Batch or Stream Feature View.

Description

Tecton aggregations are applied over a specified time window using the time_window parameter. Use the TimeWindow class to create an aggregation over a fixed window size as shown in the example below:

from datetime import timedelta

from tecton import batch_feature_view, FilteredSource, Aggregation, TimeWindow


# `transactions` (data source) and `user` (entity) are assumed to be defined elsewhere.
@batch_feature_view(
    sources=[FilteredSource(transactions)],
    mode="spark_sql",
    entities=[user],
    aggregation_interval=timedelta(days=1),
    # 7-day mean of `amount`, ending at the most recent aggregation interval.
    aggregations=[Aggregation(column="amount", function="mean", time_window=TimeWindow(window_size=timedelta(days=7)))],
)
def user_average_transaction_amount(transactions):
    return f"""
        SELECT user_id, timestamp, amount
        FROM {transactions}
        """
note

If you pass a datetime.timedelta object directly to the time_window parameter, as in time_window=datetime.timedelta(days=7), it is interpreted as time_window=TimeWindow(window_size=datetime.timedelta(days=7)).

The end time of this window will be the most recent aggregation interval relative to the online request time or offline spine timestamp.
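The end-time rule above can be sketched in plain Python. This is only an illustration, not Tecton's implementation: the helper names latest_interval_end and window_bounds are hypothetical, aggregation intervals are assumed to be aligned to the Unix epoch, and the window is assumed to be half-open, [start, end).

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def latest_interval_end(ts, aggregation_interval):
    """Floor ts to the most recent interval boundary (assumes epoch-aligned intervals)."""
    elapsed = ts - EPOCH
    return ts - (elapsed % aggregation_interval)


def window_bounds(ts, aggregation_interval, window_size):
    """Return (start, end) of the window for a request or spine timestamp."""
    end = latest_interval_end(ts, aggregation_interval)
    return end - window_size, end


# A request at 15:30 on 2022-05-18 with a 1-day interval and a 7-day window.
start, end = window_bounds(
    datetime(2022, 5, 18, 15, 30),
    aggregation_interval=timedelta(days=1),
    window_size=timedelta(days=7),
)
print(start, end)
```

Under these assumptions the window ends at 2022-05-18 00:00:00, not at the request time itself, and starts seven days earlier.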

Offset Time Windows

The end time of the time window can be shifted back with the offset parameter of the TimeWindow class, as shown in the example below. In this example, the window spans from 10 days to 3 days before the most recent aggregation interval (-10 days to -3 days):

from tecton import batch_feature_view, FilteredSource, Aggregation, TimeWindow
from datetime import timedelta


@batch_feature_view(
    sources=[FilteredSource(transactions)],
    mode="spark_sql",
    entities=[user],
    aggregation_interval=timedelta(days=1),
    aggregations=[
        Aggregation(
            column="amount",
            function="mean",
            time_window=TimeWindow(window_size=timedelta(days=7), offset=timedelta(days=-3)),
        )
    ],
)
def user_average_transaction_amount(transactions):
    return f"""
        SELECT user_id, timestamp, amount
        FROM {transactions}
        """
note

The offset parameter must always be negative.
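The "-10 days to -3 days" arithmetic can be checked with a short standalone sketch. The function offset_window is a hypothetical helper written for this illustration, not part of the Tecton SDK; it assumes the interval end has already been determined and that an omitted offset means no shift.

```python
from datetime import datetime, timedelta


def offset_window(interval_end, window_size, offset=timedelta(0)):
    """Return (start, end) of a window whose end is shifted by a negative offset."""
    if offset > timedelta(0):
        # Mirrors the documented rule that the offset must be negative.
        raise ValueError("offset must be negative")
    end = interval_end + offset
    return end - window_size, end


interval_end = datetime(2022, 5, 18)  # assumed most recent aggregation interval
start, end = offset_window(interval_end, timedelta(days=7), timedelta(days=-3))

# Express both bounds relative to the interval end, in days.
print((start - interval_end).days, (end - interval_end).days)
```

With a 7-day window and a -3 day offset, the bounds come out at -10 and -3 days relative to the interval end, matching the description above.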

Example

Consider the following example mock data:

   user_id  timestamp            value
0  user_1   2022-05-14 00:00:00      1
1  user_1   2022-05-15 00:00:00      3
2  user_1   2022-05-16 00:00:00      6
3  user_1   2022-05-17 00:00:00     11
4  user_1   2022-05-18 00:00:00     23

A Feature View can have aggregations with and without an offset.

from datetime import datetime, timedelta

from tecton import Entity, batch_feature_view, Aggregation, TimeWindow

user_entity = Entity(name="user", join_keys=["user_id"])


# `ds` is assumed to be a data source, defined elsewhere, that holds the mock data above.
@batch_feature_view(
    mode="spark_sql",
    sources=[ds],
    entities=[user_entity],
    aggregation_interval=timedelta(days=1),
    timestamp_field="timestamp",
    offline=True,
    online=True,
    feature_start_time=datetime(2022, 5, 1),
    aggregations=[
        # 2-day sum without an offset.
        Aggregation(column="value", function="sum", time_window=TimeWindow(window_size=timedelta(days=2))),
        # 2-day sum whose window ends 2 days earlier.
        Aggregation(
            column="value",
            function="sum",
            time_window=TimeWindow(window_size=timedelta(days=2), offset=timedelta(days=-2)),
        ),
    ],
)
def user_transaction_sums(input_table):
    return f"""
        SELECT user_id, timestamp, value
        FROM {input_table}
        """

When you pass a spine to get_historical_features, each aggregation is computed over its time window, shifted by any offset, relative to each spine timestamp. The example below shows the results for several spine timestamps.

import pandas as pd
from datetime import datetime

# Spine of (user_id, timestamp) rows to compute features for.
training_events = pd.DataFrame(
    {
        "user_id": ["user_1", "user_1", "user_1", "user_1"],
        "timestamp": [datetime(2022, 5, 15), datetime(2022, 5, 18), datetime(2022, 5, 19), datetime(2022, 5, 20)],
    }
)

df = user_transaction_sums.get_historical_features(training_events).to_pandas()
display(df)

   user_id  timestamp            user_transaction_sums__value_sum_2d_1d  user_transaction_sums__value_sum_2d_1d_offset_2d
0  user_1   2022-05-15 00:00:00  1                                       None
1  user_1   2022-05-18 00:00:00  17                                      4
2  user_1   2022-05-19 00:00:00  34                                      9
3  user_1   2022-05-20 00:00:00  23                                      17
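
The values in this output can be reproduced by hand. The sketch below recomputes both sums from the mock data in plain Python, without Tecton. It assumes each window is half-open, [start, end), and that the spine timestamps already fall on aggregation interval boundaries, which holds here because they are all at midnight; windowed_sum is a hypothetical helper for illustration only.

```python
from datetime import datetime, timedelta

# Mock data from the example: (timestamp, value) for user_1.
rows = [
    (datetime(2022, 5, 14), 1),
    (datetime(2022, 5, 15), 3),
    (datetime(2022, 5, 16), 6),
    (datetime(2022, 5, 17), 11),
    (datetime(2022, 5, 18), 23),
]


def windowed_sum(spine_ts, window_size, offset=timedelta(0)):
    """Sum values in [end - window_size, end), where end = spine_ts + offset."""
    end = spine_ts + offset
    start = end - window_size
    values = [value for ts, value in rows if start <= ts < end]
    return sum(values) if values else None  # None when the window is empty


spine = [datetime(2022, 5, 15), datetime(2022, 5, 18), datetime(2022, 5, 19), datetime(2022, 5, 20)]
results = [
    (
        windowed_sum(ts, timedelta(days=2)),
        windowed_sum(ts, timedelta(days=2), offset=timedelta(days=-2)),
    )
    for ts in spine
]
for ts, (plain, shifted) in zip(spine, results):
    print(ts, plain, shifted)
```

For example, for the spine timestamp 2022-05-18 the plain window covers May 16 and May 17 (6 + 11 = 17), while the offset window covers May 14 and May 15 (1 + 3 = 4).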

Attributes

The attributes are the same as the __init__ method parameters. See below.

Methods

__init__(...)

Parameters

  • window_size (datetime.timedelta) – The size of the window to aggregate over. Example: datetime.timedelta(days=30).

  • offset (datetime.timedelta) – The negative offset of the time window's end time relative to the most recent aggregation interval for a given request timestamp. Example: datetime.timedelta(days=-1).
