0.9 to 1.0 Upgrade Guide
Overview
With Tecton 1.0, we're excited to announce new capabilities along two major fronts:
GenAI. We are applying our 5+ years of experience helping enterprises run their AI applications in production to the GenAI world. With Tecton 1.0, you can now manage, enrich, and serve your prompts in production; cost-efficiently generate, store, and serve embeddings; and provide your LLM with additional context in the form of features as tools and knowledge.
Core platform. We are continuing to evolve our core platform, improving performance and cost efficiency at scale. New capabilities include Remote Dataset Generation, Compaction for Streaming Time Window Aggregation features, and improved controls for realtime compute and feature serving infrastructure. 1.0 also includes capabilities such as Plan Integration Tests, which further simplify our developer experience, and Model Generated Features, which let our customers make the most of all of their data. And there's much more!
Upgrade to 1.0
To begin, read the general Upgrade Process for background on migrating between SDK versions and view the changelog for a comprehensive list of improvements in 1.0.
The majority of code changes you'll make while upgrading from 0.9 are driven by a pair of framework improvements in 1.0, which are covered in depth below.
Before you begin, make sure your local repository is in sync with the version currently applied to Tecton. You can run tecton restore to sync from Tecton to your working directory.
Migrating to the features Parameter
In Tecton 1.0, all feature views declare their features with a new features parameter. The streamlined feature definition includes feature-level descriptions and tagging for improved discoverability.
# Before
user = Entity(name="user", join_keys=["user_id"])

@batch_feature_view(
    # ...
    entities=[user],
    schema=[
        Field("user_id", String),
        Field("signup_timestamp", Timestamp),
        Field("credit_card_issuer", String),
    ],
)
def my_bfv():
    pass
# After
user = Entity(name="user", join_keys=[Field("user_id", String)])

@batch_feature_view(
    # ...
    entities=[user],
    timestamp_field="signup_timestamp",
    features=[Attribute("credit_card_issuer", String)],
)
def my_bfv():
    pass
The features parameter consolidates and replaces the old schema and aggregations parameters previously defined on feature views.
New Filtered Source Default
Previous versions of Tecton have offered FilteredSource as a wrapper for DataSources. Filtering data at the source ensures that only the necessary data is processed, reducing computational overhead and improving the efficiency of your feature engineering pipeline. This was recommended for most use cases, and has become the default behavior as of 1.0.
To improve the flexibility and UX of data source filtering, Tecton 1.0 offers a new set of DataSource APIs and support for start_time/end_time bounds, as opposed to just start_time_offset.
# Before:
@batch_feature_view(
    # ...
    sources=[
        ds_one,  # Unfiltered by default
        FilteredSource(ds_two),
        FilteredSource(ds_three, start_time_offset=timedelta(days=-35)),
    ],
)
def bfv():
    pass
# Equivalent in 1.0.0
from tecton import TectonTimeConstant

@batch_feature_view(
    # ...
    sources=[
        ds_one.unfiltered(),
        ds_two,  # Filtered by default
        ds_three.select_range(
            start_time=TectonTimeConstant.MATERIALIZATION_START_TIME - timedelta(days=35),
            end_time=TectonTimeConstant.MATERIALIZATION_END_TIME,
        ),
    ],
)
def bfv():
    pass
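The time constants resolve to concrete bounds for each materialization run. As a plain-Python illustration of the arithmetic (the dates here are hypothetical, and no Tecton import is required), a select_range like the one above resolves to:

```python
from datetime import datetime, timedelta

# Hypothetical bounds for a single materialization run.
materialization_start = datetime(2024, 6, 1)
materialization_end = datetime(2024, 6, 2)

# select_range(start_time=MATERIALIZATION_START_TIME - timedelta(days=35),
#              end_time=MATERIALIZATION_END_TIME)
# resolves to these concrete bounds at run time:
range_start = materialization_start - timedelta(days=35)
range_end = materialization_end

print(range_start.isoformat())  # 2024-04-27T00:00:00
print(range_end.isoformat())    # 2024-06-02T00:00:00
```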
Upgrade Utilities
We've provided a set of tools to ease the transition between SDK versions. Notably:
- The v09_compat module contains objects compatible with Tecton 0.9. Note that these are meant as a temporary bridge between versions, and prolonged use is not recommended.
- tecton upgrade is a new CLI command which can inspect your workspace and provide an interactive upgrade guide with code suggestions.
- tecton plan / apply will block on changes which include both a tecton.v09_compat to tecton upgrade and a substantive change to feature views, including destruction or creation. This guard rail is meant to ensure that SDK upgrade work does not affect production feature serving or materialization.
Tecton recommends relying on tecton upgrade for upgrading feature definitions. The tool is capable of generating sample code specific to your workspace, such as type suggestions or imports. This guide covers the same material as the upgrade tool.
Prepare your Repository
Upgrade repo.yaml
- In the repo.yaml file at the base of your repository:
  - Spark Based Feature Views: Update the tecton_materialization_runtime parameter to 1.0.0
  - Rift Based Feature Views: Update the environment field to tecton-rift-core-1.0.0
- Install Tecton 1.0 locally with pip install tecton==1.0.0
Modify Import Paths
We will begin the upgrade process by first converting all objects to the backwards-compatible 0.9 versions using the v09_compat library. We will then go through each Tecton object and make the necessary changes to use the 1.0 version of the object.
Globally replace all tecton imports with tecton.v09_compat imports. Note that this does not apply to tecton.types or tecton.aggregation_functions, which can remain unchanged. This will allow you to migrate all Data Sources, Feature Views, Entities, Feature Services, and Transformations piece by piece.
# Before
from tecton import Entity, batch_feature_view
from tecton.types import Field, Int64
# After
from tecton.v09_compat import Entity, batch_feature_view
from tecton.types import Field, Int64
Consider a command line incantation such as
find . -type f -exec perl -i -pe 's/from tecton import/from tecton.v09_compat import/g' {} +
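If perl is unavailable, the same rewrite can be done with a short Python script. This is a sketch, not part of the Tecton tooling, and the rewrite_imports helper is hypothetical. Note that a plain string replacement leaves tecton.types and tecton.aggregation_functions imports untouched, because those lines do not contain the exact substring being replaced:

```python
from pathlib import Path

def rewrite_imports(text: str) -> str:
    """Point top-level `from tecton import ...` lines at the 0.9
    compatibility module. Lines importing from tecton.types or
    tecton.aggregation_functions are unaffected, since they do not
    contain the exact substring "from tecton import"."""
    return text.replace("from tecton import", "from tecton.v09_compat import")

example = "from tecton import Entity, batch_feature_view\nfrom tecton.types import Field, Int64"
print(rewrite_imports(example))

# To apply across a repository (destructive -- commit your work first):
# for path in Path(".").rglob("*.py"):
#     path.write_text(rewrite_imports(path.read_text()))
```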
Run tecton plan to validate these changes. You may see a list of yellow warnings along with an Update SDK Version confirmation on top. You should not see any destruction or creation of feature views.
~ Update SDK Version: 0.9.14 -> 1.0.0

~ Update Stream Data Source
    name: ad_impressions_stream

~ Update Batch Feature View
    name: user_approx_distinct_merchant_transaction_count_30d
    description: How many transactions the user has made to distinct merchants in the last 30 days.
    tecton_materialization_runtime: 0.9.0 -> 1.0

~ Update Stream Feature View
    name: user_continuous_transaction_count
    owner: demo-user@tecton.ai
    description: Number of transactions a user has made recently
    tecton_materialization_runtime: 0.9.0 -> 1.0
    aggregation_leading_edge: AGGREGATION_MODE_UNSPECIFIED -> AGGREGATION_MODE_LATEST_EVENT_TIME
    warning: `LATEST_EVENT_TIME` will be deprecated in Tecton sdk 1.1.
Upgrade your Feature Definitions
This section provides step-by-step instructions for upgrading your feature definitions. The tecton upgrade CLI tool accompanies this guide, providing updated code snippets and guidance specific to your workspace.
❯ tecton upgrade
✅ Imported 29 Python modules from the feature repository
Migration Progress:
Step 1: Migrate Data Sources - ✅ (4/4 Data Sources migrated).
Step 2: Migrate Entities - ✅ (4/4 Entities migrated).
Step 3: Migrate Batch Feature Views - ✅ (8/8 Batch Feature Views migrated).
Step 4: Migrate Stream Feature Views - 🚫 (0/5 Stream Feature Views migrated).
Step 5: Migrate On-Demand Feature Views - 🚫 (0/5 On Demand Feature Views migrated).
Step 6: Migrate Feature Tables - 🚫 (0/3 Feature Tables migrated).
Step 7: Migrate all other imports from `tecton.v09_compat` to `tecton`.

Step 4: Migrate Stream Feature Views.
Detected 5 Stream Feature Views in need of an upgrade:
# ...
PushSource
PushSource was deprecated in 0.9 and has been removed in 1.0. It should be replaced by a StreamSource using a PushConfig as shown below:
# Before
from tecton.v09_compat import PushSource

user_click_push_source = PushSource(
    name="user_event_source",
    schema=user_schema,
)

# After
from tecton import StreamSource, PushConfig

user_click_push_source = StreamSource(
    name="user_event_source",
    stream_config=PushConfig(),
    schema=user_schema,
)
Entity
Previously, Entity objects accepted untyped join key names. In 1.0 and beyond, Tecton requires that join_keys be typed Field objects. tecton upgrade can infer join_key types for you.
# Before
from tecton.v09_compat import Entity

user_entity = Entity(name="user", join_keys=["user_id"])

# After
from tecton import Entity
from tecton.types import Field, String

user_entity = Entity(name="user", join_keys=[Field("user_id", String)])
BatchFeatureView
In 1.0, Batch Feature Views accept a features parameter instead of schema. You will also need to set timestamp_field explicitly. The tecton upgrade command can generate properly typed features arguments and infer the feature schema for you.
You will also need to handle the new FilteredSource default. For more information on this change, see the accompanying documentation.
Migrate non-aggregate batch feature views:
# Before
@batch_feature_view(
    # ...
    entities=[user],
    schema=[
        Field("user_id", String),
        Field("value", Int64),
        Field("timestamp", Timestamp),
    ],
)
def my_feature_view(input):
    return f"""
        SELECT user_id, value, timestamp FROM {input}
        """

# After
@batch_feature_view(
    # ...  (sources are filtered by default in 1.0)
    entities=[user],
    features=[
        Attribute(name="value", dtype=Int64),
    ],
    timestamp_field="timestamp",
)
def my_feature_view(input):
    return f"""
        SELECT user_id, value, timestamp FROM {input}
        """
Migrate aggregation batch feature views:
# Before
@batch_feature_view(
    # ...
    entities=[user],
    aggregations=[
        Aggregation(
            column="value",
            function="count",
            time_window=timedelta(days=7),
        ),
    ],
)
def feature_view(input):
    return f"""
        SELECT user_id, value, timestamp FROM {input}
        """

# After
@batch_feature_view(
    # ...  (sources are filtered by default in 1.0)
    entities=[user],
    features=[
        Aggregate(
            input_column=Field("value", Int64),
            function="count",
            time_window=timedelta(days=7),
        ),
    ],
    timestamp_field="timestamp",
)
def feature_view(input):
    return f"""
        SELECT user_id, value, timestamp FROM {input}
        """
StreamFeatureView
All the updates required for BatchFeatureViews apply to StreamFeatureViews as well. These include the features parameter, a required timestamp_field, and the new FilteredSource default. There is one additional change:
- 1.0 introduces a new aggregation_leading_edge parameter, allowing users to set the aggregation strategy for processing stream events. To upgrade safely, this value should be set to aggregation_leading_edge=AggregationLeadingEdge.LATEST_EVENT_TIME.
Migrate non-aggregate stream feature views:
# Before
@stream_feature_view(
    # ...
    source=FilteredSource(transactions_stream),
)
def last_transaction_amount_sql(transactions):
    return f"""
        SELECT
            timestamp,
            user_id,
            value
        FROM
            {transactions}
        """

# After
@stream_feature_view(
    # ...
    source=transactions_stream,  # Filtered by default
    features=[
        Attribute("value", Int64),
    ],
    timestamp_field="timestamp",
    aggregation_leading_edge=AggregationLeadingEdge.LATEST_EVENT_TIME,
)
def last_transaction_amount_sql(transactions):
    return f"""
        SELECT
            timestamp,
            user_id,
            value
        FROM
            {transactions}
        """
Migrate aggregate stream feature views:
# Before
@stream_feature_view(
    # ...
    source=FilteredSource(transactions_stream),
    aggregations=[
        Aggregation(column="amt", function="sum", time_window=timedelta(hours=1)),
    ],
)
def last_transaction_amount_sql(transactions):
    return f"""
        SELECT
            timestamp,
            user_id,
            amt
        FROM
            {transactions}
        """

# After
@stream_feature_view(
    # ...
    source=transactions_stream,  # Filtered by default
    features=[
        Aggregate(input_column=Field("amt", Int64), function="sum", time_window=timedelta(hours=1)),
    ],
    timestamp_field="timestamp",
    aggregation_leading_edge=AggregationLeadingEdge.LATEST_EVENT_TIME,
)
def last_transaction_amount_sql(transactions):
    return f"""
        SELECT
            timestamp,
            user_id,
            amt
        FROM
            {transactions}
        """
OnDemandFeatureView -> RealtimeFeatureView
on_demand_feature_view has been renamed to realtime_feature_view. Upgrading these feature views requires the same features parameter change as Batch and Stream Feature Views.
# Before
@on_demand_feature_view(
    mode="pandas",
    sources=[transaction_request],
    schema=[Field("amount", Int64)],
)
def my_feature_view(request_df: pandas.DataFrame):
    pass

# After
@realtime_feature_view(
    mode="pandas",
    sources=[transaction_request],
    features=[Attribute("amount", Int64)],
)
def my_feature_view(request_df: pandas.DataFrame):
    pass
FeatureTable
Feature Tables now require a timestamp_field and the new features parameter.
# Before
ft = FeatureTable(
    entities=[user_entity],
    schema=[
        Field("user_id", String),
        Field("clicked", Int64),
        Field("timestamp", Timestamp),
    ],
)

# After
ft = FeatureTable(
    entities=[user_entity],
    features=[
        Attribute("clicked", Int64),
    ],
    timestamp_field="timestamp",
)
Unit Tests and Timestamps
In 1.0, TectonDataFrame#to_pandas returns timezone (TZ) aware datetime64[us, UTC] objects for Timestamp columns. This return type aligns with the rest of Tecton's framework, but diverges from previous SDKs, which returned TZ-naive objects from this call pattern.
This change has the potential to break Timestamp comparisons in unit tests or similar code paths. If this affects your repo, you have two options:
- Localize the timezone to UTC with df['timestamp'] = df['timestamp'].dt.tz_localize('UTC')
- Opt into the old behavior by setting conf.set("TECTON_STRIP_TIMEZONE_FROM_FEATURE_VALUES", "true")
Option 1 is recommended, as it aligns online / offline return values exactly and will be the preferred behavior moving forward.
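The failure mode is easy to reproduce with the standard library alone: Python refuses to order a TZ-naive datetime against a TZ-aware one, which is exactly what happens when an old naive test expectation meets the new UTC-aware values:

```python
from datetime import datetime, timezone

naive = datetime(2024, 1, 1)                       # what 0.9 returned
aware = datetime(2024, 1, 1, tzinfo=timezone.utc)  # what 1.0 returns

try:
    naive < aware
except TypeError as err:
    print(err)  # can't compare offset-naive and offset-aware datetimes

# Option 1 in plain-datetime terms: make the expectation TZ-aware,
# then comparisons behave normally.
assert naive.replace(tzinfo=timezone.utc) == aware
```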
Final Step
Congratulations! You've successfully migrated all objects which include a breaking change in Tecton 1.0. It is now safe to migrate existing tecton.v09_compat imports to the top-level tecton module. Apply this final change and return to feature development with the newfound power of Tecton 1.0.