Version: 0.9

Tecton-managed offline event log

You can have Tecton maintain a historical event log of your ingested records by setting log_offline=True in the Stream Source's PushConfig. This log can be used to backfill newly created features or to generate point-in-time correct training data.

from tecton import PushConfig, StreamSource
from tecton.types import String, Int64, Timestamp, Field

input_schema = [
    Field(name="user_id", dtype=String),
    Field(name="timestamp", dtype=Timestamp),
    Field(name="clicked", dtype=Int64),
]

stream_config_log = PushConfig(log_offline=True)
impressions_event_source = StreamSource(
    name="impressions_event_source", schema=input_schema, stream_config=stream_config_log
)

Below is a Stream Feature View using the above Stream Source.

from datetime import datetime, timedelta
from tecton import StreamFeatureView
from tecton.types import String, Int64, Timestamp, Field
from ads.entities import user
from ads.data_sources.ad_impressions import impressions_event_source

schema = [
    Field(name="user_id", dtype=String),
    Field(name="timestamp", dtype=Timestamp),
    Field(name="clicked", dtype=Int64),
]

click_events_fv = StreamFeatureView(
    name="click_events_fv",
    source=impressions_event_source,
    entities=[user],
    online=True,
    offline=True,
    feature_start_time=datetime(2022, 1, 1),
    batch_schedule=timedelta(days=1),
    ttl=timedelta(days=7),
    description="Ad click events for a user",
    schema=schema,
)

Training Data Generation

Historical data can be retrieved by calling the Feature View's get_features_for_events() method in the Python SDK with from_source=True. Newly applied Feature Views that use the same Stream Source can also retrieve historical data that was ingested before they were applied.
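To make "point-in-time correct" concrete, here is a minimal standard-library sketch of the lookup that such a retrieval performs conceptually. It is not Tecton's implementation. The event_log contents, the point_in_time_lookup helper, and the exact boundary rules (a record counts if its timestamp is at or before the training event, and expires once it is older than the ttl) are assumptions made for illustration only.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Hypothetical logged records, shaped like the schema above:
# (user_id, timestamp, clicked).
event_log = [
    ("user_1", datetime(2022, 1, 1, 12, 0), 0),
    ("user_1", datetime(2022, 1, 3, 9, 0), 1),
    ("user_2", datetime(2022, 1, 2, 8, 0), 1),
]


def point_in_time_lookup(log, user_id, as_of, ttl):
    """Return the latest `clicked` value for user_id recorded at or before
    as_of, or None if there is no such record or it is older than ttl."""
    # Keep only this user's records, ordered by timestamp.
    rows = sorted((ts, clicked) for uid, ts, clicked in log if uid == user_id)
    timestamps = [ts for ts, _ in rows]
    # Index of the last record whose timestamp is <= as_of.
    idx = bisect_right(timestamps, as_of) - 1
    if idx < 0:
        return None  # nothing had been ingested yet at as_of
    ts, clicked = rows[idx]
    if as_of - ts > ttl:
        return None  # assumed expiry rule: record is older than the ttl
    return clicked


ttl = timedelta(days=7)  # mirrors the ttl in the Feature View above

# Each training event only sees records that existed at its own timestamp.
early = point_in_time_lookup(event_log, "user_1", datetime(2022, 1, 2), ttl)
later = point_in_time_lookup(event_log, "user_1", datetime(2022, 1, 4), ttl)
before_any = point_in_time_lookup(event_log, "user_2", datetime(2022, 1, 1), ttl)
expired = point_in_time_lookup(event_log, "user_1", datetime(2022, 1, 20), ttl)
```

The property the sketch illustrates is that a training row dated 2022-01-02 never sees the click logged on 2022-01-03, which is what prevents label leakage. In practice you would not write this lookup yourself: you would pass a DataFrame of training events (entity keys plus timestamps) to get_features_for_events() with from_source=True.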
