0.5.0
New features
Materialization jobs can be manually triggered
With the Materialization API, you can manually trigger materialization via an API call. The Materialization API can be used in the Tecton SDK and in Airflow, through the Tecton Airflow provider.
Feature Output Streams
Feature View Output Streams enable your application to subscribe to the outputs of streaming feature pipelines. Your application accesses these outputs via a stream sink. Feature View Output Streams are designed to be used for asynchronous predictions, where model inference is triggered by newly arriving feature data.
The Tecton SDK can be used in any Python environment to retrieve features
Using the Tecton SDK with AWS Athena removes the requirement that you use a Databricks notebook or an AWS EMR notebook to retrieve features from Tecton's offline store.
When using the Tecton SDK with AWS Athena, you can retrieve features from Tecton's offline store in any Python environment that has access to AWS (for example, your local laptop, a Jupyter notebook, or Kubeflow pipelines).
Data Source Functions, for increased flexibility in working with Data Sources
When defining a BatchSource or StreamSource object, you set the batch_config or stream_config parameter, respectively. The value of these configs can be the name of an object (such as HiveConfig or KafkaConfig) or a Data Source Function.
Compared to using an object, a Data Source Function gives you more flexibility in connecting to an underlying data source and specifying logic for transforming the data retrieved from the underlying data source. However, using an object is recommended if you do not require the additional flexibility offered by a Data Source Function.
Rematerialization can be suppressed, to reduce infrastructure costs
After refactoring a Python function or migrating an upstream Data Source, you can run tecton plan or tecton apply with the --suppress-recreates flag to suppress rematerialization. When rematerialization is suppressed, feature values are not recalculated.
You should only use the --suppress-recreates flag when you are confident that changes to a Tecton repo will not affect feature values.
Struct Type Features in On-Demand Feature Views
You can include a Struct data type in the output schema of an On-Demand Feature View (ODFV). A Struct can contain multiple fields with mixed data types. A Struct can be nested within other complex types. For example, you can have a Struct within a Struct, or an array of Structs.
Using a Struct in the output schema of an ODFV allows you to easily parse the ODFV's output when it contains multiple feature values.
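The shape of a Struct-valued output can be pictured with plain Python dictionaries. The sketch below does not use the Tecton SDK and does not show how the output schema is declared; the function name transaction_summary, the field names, and the sample values are all hypothetical. It only illustrates a Struct with mixed field types and an array of Structs, and how related values are read by field name.

```python
from typing import Any, Dict, List


def transaction_summary(amounts: List[float], merchants: List[str]) -> Dict[str, Any]:
    """Hypothetical ODFV-style transformation returning struct-shaped output."""
    return {
        # A struct with mixed field types (float, int, bool).
        "stats": {
            "total": sum(amounts),
            "count": len(amounts),
            "has_large_purchase": any(a > 100.0 for a in amounts),
        },
        # An array of structs, one per transaction.
        "transactions": [
            {"merchant": m, "amount": a} for m, a in zip(merchants, amounts)
        ],
    }


output = transaction_summary([25.0, 140.0], ["grocer", "airline"])

# Related feature values are grouped and read by field name.
total = output["stats"]["total"]
first_merchant = output["transactions"][0]["merchant"]
```

Grouping related values under one named field keeps the consumer from having to match up several flat output columns by position or naming convention.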
Improvements and bug fixes
to_dict support on SDK methods returning tabular Displayable objects
All SDK methods returning a table now return a Displayable object with a to_dict() method. The following methods have been updated:
- materialization_status()
- summary()
- deletion_status()
- get_feature_freshness() (see Note below)
Note: get_feature_freshness no longer supports the to_dict parameter. Calls to the method can be updated by changing tecton.get_feature_freshness(to_dict=True) to tecton.get_feature_freshness().to_dict().
Alert email must now be set if monitor_freshness = True
For monitoring of Feature Views, the alert_email parameter must also be set if monitor_freshness = True. This is to ensure that alerting emails are sent for the desired Feature Views. See Alerts for more information.
get_historical_features() performance improvements on Spark
get_historical_features() has been updated with a more performant point-in-time join. This join results in faster feature value retrieval when both of the following are true:
- The call to get_historical_features() contains a spine.
- get_historical_features() returns feature values from non-aggregate Feature Views, custom aggregate Feature Views, or Feature Services that contain either of those two kinds of Feature View.
Batch Feature View skew reduction
To reduce online/offline skew, get_historical_features() now uses the _effective_timestamp (calculated internally) to retrieve feature values. The _effective_timestamp is the earliest time the feature will be available in the online store for inference. The _effective_timestamp column is automatically added to all feature records returned by calls to get_historical_features() which do not include a spine.
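To picture why a lookup keyed on _effective_timestamp reduces skew, here is a minimal sketch in plain Python. It is not Tecton's implementation and does not show how _effective_timestamp is calculated: the records, their effective timestamps, and the helper value_as_of are hypothetical and chosen only for illustration. For a given request time, the helper returns the latest value whose chosen time column is at or before that time.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature records for one entity, sorted by time.
# The _effective_timestamp values are assumed inputs, not computed here.
records = [
    {"timestamp": datetime(2022, 5, 1, 9),
     "_effective_timestamp": datetime(2022, 5, 2),
     "value": 10},
    {"timestamp": datetime(2022, 5, 2, 15),
     "_effective_timestamp": datetime(2022, 5, 3),
     "value": 20},
]


def value_as_of(records, request_time, key):
    """Return the latest value whose `key` time is <= request_time, or None."""
    times = [r[key] for r in records]
    idx = bisect_right(times, request_time)
    return records[idx - 1]["value"] if idx > 0 else None


request_time = datetime(2022, 5, 2, 18)

# Keyed on the raw event timestamp, the 15:00 record is picked even though,
# under the assumed effective timestamps, it was not yet online at 18:00.
offline_naive = value_as_of(records, request_time, "timestamp")

# Keyed on _effective_timestamp, the lookup returns the value that was
# already available online at the request time.
offline_effective = value_as_of(records, request_time, "_effective_timestamp")
```

In this sample data the two lookups disagree (20 versus 10), which is the kind of online/offline mismatch that keying on the effective timestamp is meant to avoid.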
Improved support for nulls in On-Demand Feature Views
On-Demand Feature Views now have improved support for nulls. On-Demand Feature Views that use Pandas still have some special handling for nulls; see the documentation.
Upgrading to 0.5
0.5 will no longer support compat definitions. Follow the instructions below to upgrade to 0.5 based on your current version. You will NOT need to re-materialize data to upgrade your objects.
In 0.5, you must set an alert email for Feature Views with monitoring enabled. If one is missing, you may see an error that blocks your apply. When upgrading from 0.3 or 0.4 in compatibility mode, please configure the alert email while upgrading your Feature Views. No other semantic changes can be made when upgrading.
When upgrading to 0.5 you will see updates to your Feature View's batch_trigger like the following as a result of the new Materialization API. These changes have no effect, and will only occur the first time you run tecton apply with Tecton 0.5:
~ Update BatchDataSource
name: transactions_batch
description: Batch Data Source for transactions stream
batch_trigger: BATCH_TRIGGER_TYPE_UNKNOWN -> BATCH_TRIGGER_TYPE_SCHEDULED
From 0.4 non-compat:
- You can move to 0.5 CLI without making any changes!
From 0.4 in compatibility mode (tecton.compat):
- You can move to 0.5 CLI directly if you upgrade all of your definitions to 0.4 definitions using this upgrade guide in one tecton apply.
- To upgrade definitions incrementally, i.e. in multiple tecton apply steps:
  1.) Upgrade objects to 0.4 definitions using 0.4 CLI with this guide.
  2.) Once all your objects are in 0.4 definitions you can move to 0.5 CLI.
From 0.3:
- You can move to 0.5 CLI directly if you upgrade all of your definitions to 0.4 definitions using this upgrade guide in one tecton apply.
- To upgrade incrementally, i.e. in multiple tecton apply steps:
  1.) You must first upgrade to 0.4 CLI with objects in compatibility mode. Follow these instructions.
  2.) Upgrade your objects from 0.4 compat to 0.4 definitions using these instructions.
  3.) Once all your objects are 0.4 definitions, you can move to 0.5 CLI.