What's New?
April 6, 2021
-
Online Feature Logging: Feature Services now have the ability to continuously log online requests and feature vector responses as Tecton Datasets. These logged feature datasets can be used for auditing, analysis, training dataset generation, and spine creation.
To enable feature logging on a FeatureService, add a LoggingConfig as in the example below and optionally specify a sample rate. You can also set log_effective_times=True to log the feature timestamps from the Feature Store. As a reminder, Tecton will always serve the latest stored feature values as of the time of the request. Run tecton apply to apply your changes.

    from tecton import FeatureService, LoggingConfig

    ctr_prediction_service = FeatureService(
        name='ctr_prediction_service',
        features=[
            ad_ground_truth_ctr_performance_7_days,
            user_total_ad_frequency_counts
        ],
        logging=LoggingConfig(
            sample_rate=0.5,
            log_effective_times=False
        )
    )

This will create a new Tecton Dataset under the Datasets tab in the Web UI. New feature logs are appended to this dataset every 30 minutes. If the features in the Feature Service change, a new dataset version will be created.
This dataset can be fetched in a notebook using the code snippet below.

    import tecton

    dataset = tecton.get_dataset('ctr_prediction_service.logged_requests.4')
    display(dataset.to_spark())

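To build intuition for sample_rate, the sketch below simulates per-request sampling. It assumes each request is logged independently with probability sample_rate; these notes do not describe how Tecton actually samples, and should_log and count_logged are hypothetical helpers, not part of the Tecton SDK.

```python
import random


def should_log(sample_rate: float, rng: random.Random) -> bool:
    """Return True if a simulated request should be logged.

    Hypothetical helper. Independent per-request sampling is an
    assumption, not documented Tecton behavior.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # rng.random() is in [0.0, 1.0), so 0.0 never logs and 1.0 always logs.
    return rng.random() < sample_rate


def count_logged(num_requests: int, sample_rate: float, seed: int = 0) -> int:
    """Count how many of num_requests simulated requests would be logged."""
    rng = random.Random(seed)
    return sum(should_log(sample_rate, rng) for _ in range(num_requests))


if __name__ == "__main__":
    # With sample_rate=0.5, roughly half of the requests are logged.
    print(count_logged(10_000, 0.5))
```

Under this assumption, a sample rate of 0.5 halves the expected size of the logged dataset compared with logging every request.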
-
Easier Dataset Retrieval: All Datasets can now be retrieved by name using the method below:

    import tecton

    dataset = tecton.get_dataset('my_dataset')
    display(dataset.to_spark())

March 5, 2021
-
Self-Serve User Management: Tecton now offers a self-service user management portal through the Web UI. Navigate to the Admin Console by clicking on your avatar at the top right of the screen.
From the Admin Console, cluster administrators can add new users or remove existing users from their Tecton instance.
February 17, 2021
- Faster FileDataSource Validation: FileDSConfig now supports a new optional schema_uri parameter, which significantly decreases tecton plan and tecton apply latency. This parameter lets users specify a subpath within the data source URI that is used as a de facto example of the schema. Example:

    clicks_sleepnumber_file = FileDSConfig(
        uri="s3://acme-my-data/batch_events/",
        file_format="parquet",
        schema_uri="s3://acme-my-data/batch_events/date=2020-06-09/part-00000-tid-644275213368719145-d040e31e-d1c6-42f1-8677-4e97f09df7bc-1358-1.c000.parquet",
    )

  Normally, FileDSConfig schema inference recurses through all partitions within a path (e.g. "s3://acme-my-data/batch_events/" above), which can be very expensive for large datasets with fine-grained partitioning. This process is not required if a Hive metastore such as Glue is used, since all partitions are already known.
January 22, 2021
New Features
- Workspace Home Page: A workspace dashboard is now available through our web UI. Navigate to your cluster.tecton.ai URL or click on the Tecton logo in the top left-hand corner of the screen to go to the new dashboard. This dashboard provides a high-level overview of the Tecton Primitives in your workspace, changes to your workspace, and quick links to helpful resources.
January 15, 2021
New Features
- Limited Destructive Updates: Changing feature_start_time, online_enabled, or offline_enabled in MaterializationConfig for TFPs and TAFPs is no longer a destructive update. Changing these parameters schedules the additional jobs necessary to fill in the gaps, without destroying existing materialized data or causing serving downtime.
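As an illustration of what "filling in the gaps" can mean, the sketch below computes the extra job windows needed when feature_start_time moves earlier. It assumes fixed-size windows aligned to the previous start time; gap_windows is a hypothetical helper and does not reflect Tecton's actual scheduler.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Window = Tuple[datetime, datetime]


def gap_windows(old_start: datetime, new_start: datetime,
                interval: timedelta) -> List[Window]:
    """Return the windows between new_start and old_start that lack data.

    Windows are walked backwards from old_start so they stay aligned with
    windows that are already materialized. The earliest window is clipped
    to new_start.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    windows: List[Window] = []
    end = old_start
    while end > new_start:
        start = max(end - interval, new_start)
        windows.append((start, end))
        end = start
    # Return the windows in chronological order.
    return list(reversed(windows))


if __name__ == "__main__":
    for window in gap_windows(datetime(2020, 6, 19), datetime(2020, 6, 16),
                              timedelta(days=1)):
        print(window)
```

Data from old_start onward is left untouched in this sketch, which is the property that makes the update non-destructive.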
Breaking Changes
- @online_transformation arguments must now be named identically to RequestContext schema fields to prevent accidentally swapping arguments.

    rc = RequestContext(
        schema={
            "field_A": StringType()
        })

    # OK
    @online_transformation(request_context=rc, output_schema=output_schema)
    def ad_is_displayed_as_banner_transformer(field_A: pandas.Series):
        pass

    # Error: RequestContext schema fields ['field_A'] do not
    # match transformation function arguments ['field_X'].
    @online_transformation(request_context=rc, output_schema=output_schema)
    def ad_is_displayed_as_banner_transformer(field_X: pandas.Series):
        pass

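A check of this kind can be sketched with the standard library's inspect module. The helper below is a hypothetical illustration of the rule, not Tecton's validation code; it compares names only and ignores argument order, which is an assumption.

```python
import inspect
from typing import Callable, Iterable


def check_argument_names(fn: Callable, schema_fields: Iterable[str]) -> None:
    """Raise ValueError unless fn's parameter names equal the schema fields.

    Hypothetical helper that mirrors the rule described above.
    Order is ignored; only the set of names must match.
    """
    expected = sorted(schema_fields)
    actual = sorted(inspect.signature(fn).parameters)
    if expected != actual:
        raise ValueError(
            f"RequestContext schema fields {expected} do not "
            f"match transformation function arguments {actual}."
        )


def good_transformer(field_A):
    pass


def bad_transformer(field_X):
    pass


# Matching names pass silently.
check_argument_names(good_transformer, ["field_A"])

# Mismatched names raise an error that names both sides.
try:
    check_argument_names(bad_transformer, ["field_A"])
except ValueError as err:
    print(err)
```

Matching by name rather than by position is what rules out silently swapped arguments when a schema has several fields of the same type.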
January 8, 2021
Breaking Changes
- The Tecton Primitive .get() accessor methods are now deprecated. To fetch a Tecton Primitive, please use the newer workspace methods such as workspace.get_entity("entity_name").

    from tecton import *

    workspace = get_workspace("prod")
    fp = workspace.get_feature_package("my_fp")
    fs = workspace.get_feature_service("my_fs")
    e = workspace.get_entity("my_entity")
    t = workspace.get_transformation("my_transform")
    vds = workspace.get_virtual_data_source("my_vds")

December 23, 2020
New Features
-
New Monitoring, Alerting, and Debugging Tools: Monitoring, alerting, and debugging tools are now available to help keep production FeaturePackages healthy. A brief list of the new tools is below; more information can be found in the documentation.
- Add alert_email to the MonitoringConfig of a FeaturePackage to enable email alerts.
- Navigate to the "Materialization" tab in the Web UI to see new information on the status of processing jobs and other helpful information.
- Use the tecton materialization-status [FP_NAME] command in the CLI to retrieve more detailed materialization processing job information for a specific FeaturePackage.
- Use the tecton freshness command in the CLI to retrieve cluster-level freshness information for all production FeaturePackages.
-
Feature Repo File Paths: All Tecton objects now show their Feature Repo file path in the UI to make them easier to discover and edit.
- FeatureService Metadata API: FeatureServices have a new metadata API for fetching information about expected input parameters and returned features.

    curl -X POST https://staging.tecton.ai/v1/feature-service/metadata \
      -H "Authorization: Tecton-key $API_KEY" -d \
      '{
        "params": {
          "feature_service_name": "yolo2"
        }
      }'

  Response:

    {
      "featureServiceType": "DEFAULT",
      "inputRequestContextKeys": [
        {"name": "device_type", "type": "string"},
        {"name": "a", "type": "string"}
      ],
      "featureValues": [
        {"name": "oofp.is_mobile_device", "type": "boolean"}
      ]
    }

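The same call can be sketched from Python using only the standard library. The endpoint, header format, and sample response are copied from the curl example; build_metadata_request and summarize_metadata are hypothetical helper names, and the request is built but not sent.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
METADATA_URL = "https://staging.tecton.ai/v1/feature-service/metadata"


def build_metadata_request(feature_service_name: str,
                           api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the metadata API."""
    body = json.dumps({"params": {"feature_service_name": feature_service_name}})
    return urllib.request.Request(
        METADATA_URL,
        data=body.encode("utf-8"),
        headers={"Authorization": f"Tecton-key {api_key}"},
        method="POST",
    )


def summarize_metadata(response_text: str) -> dict:
    """Extract input key names and feature names from a metadata response."""
    payload = json.loads(response_text)
    return {
        "inputs": [key["name"] for key in payload.get("inputRequestContextKeys", [])],
        "features": [value["name"] for value in payload.get("featureValues", [])],
    }


# Sample response copied from the example above.
SAMPLE_RESPONSE = """
{
  "featureServiceType": "DEFAULT",
  "inputRequestContextKeys": [
    {"name": "device_type", "type": "string"},
    {"name": "a", "type": "string"}
  ],
  "featureValues": [{"name": "oofp.is_mobile_device", "type": "boolean"}]
}
"""

if __name__ == "__main__":
    print(summarize_metadata(SAMPLE_RESPONSE))
    # To send the request for real, pass it to urllib.request.urlopen:
    # urllib.request.urlopen(build_metadata_request("yolo2", "<your API key>"))
```

A client could use the summarized input names to validate a request payload before calling the FeatureService.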
December 7, 2020
New Features
-
PushFeaturePackages: Users can now create PushFeaturePackages to ingest features generated outside of Tecton and load them into the offline and online Feature Stores for training or prediction. For a detailed example, check out Pushing Feature Values into Feature Stores.

    import tecton
    import pandas

    fp = tecton.get_feature_package('user_purchases_push_fp')
    pandas_df = pandas.DataFrame([{
        "timestamp": pandas.Timestamp("2020-09-18 12:00:06", tz="UTC"),
        "userid": "u123",
        "num_purchases": 91
    }])
    fp.ingest(pandas_df)

-
Improved Documentation and Usability: The Tecton documentation has been updated with easier navigation and a focus on practical examples. Recently, the Tecton team has also shipped a large number of usability improvements throughout the product including bug fixes, better error messages, better validations, and more.
November 6, 2020
New Features
- FeaturePackage Freshness Monitoring: FeaturePackages now show their "actual freshness" value under the "Materialization" tab.
October 30, 2020
New Features
- Tecton Feature Freshness CLI Overview: Users can now view the freshness of all features in the CLI by running tecton freshness.

    $ tecton freshness
    Feature Package                          Stale?   Freshness   Expected Freshness   Created At
    =================================================================================================
    ad_ground_truth_ctr_performance_7_days   N        14h 40m     2d                   10/01/20 2:25
    user_ad_impression_counts                N        40m 24s     2h                   10/01/20 2:16
    content_keyword_ctr_performance:v2       N        40m 25s     2h                   09/04/20 22:22
    ad_group_ctr_performance                 N        40m 26s     2h                   08/26/20 12:52
    ad_is_displayed_as_banner                -        -           -                    07/24/20 13:51

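For scripts that post-process this output, the duration strings can be parsed as sketched below. These notes do not define how the "Stale?" column is computed; the rule used here (stale when freshness exceeds expected freshness) is an assumption, and parse_duration and is_stale are hypothetical helpers.

```python
import re
from datetime import timedelta

# Unit suffixes that appear in the durations in these notes.
_UNITS = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '14h 40m' or '2d' into a timedelta."""
    parts = re.findall(r"(\d+)([wdhms])", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(
        (timedelta(**{_UNITS[unit]: int(value)}) for value, unit in parts),
        timedelta(),
    )


def is_stale(freshness: str, expected: str) -> bool:
    """Assumed rule: stale when actual freshness exceeds the expected value."""
    return parse_duration(freshness) > parse_duration(expected)


if __name__ == "__main__":
    # Values from the first row of the table above.
    print(is_stale("14h 40m", "2d"))
```

Rows that show "-" (packages without materialization) have no duration to parse, so callers should skip them or handle the ValueError.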
October 19, 2020
New Features
- Snowflake Data Sources: Tecton now supports Snowflake as a data source!

    click_stream_snowflake_ds = SnowflakeDSConfig(
        url="https://[your-cluster].eu-west-1.snowflakecomputing.com/",
        database="YOUR_DB",
        schema="CLICK_STREAM_SCHEMA",
        warehouse="COMPUTE_WH",
        table="CLICK_STREAM",
    )

    transaction_snowflake_vds = VirtualDataSource(
        name="click_stream_snowflake_vds",
        batch_ds_config=click_stream_snowflake_ds,
    )

October 9, 2020
New Features
-
Web UI Materialization Job Monitoring: Materialization jobs are now displayed to help monitor FeaturePackages.
The easiest way to check the health of a materialized FeaturePackage is now through the Web UI. Navigate to the FeaturePackage in question and switch to the "Materialization" tab to see FeaturePackage materialization diagnostics at a glance.
The new "Materialization Jobs" table displays the most relevant information about a FeaturePackage's materialization jobs. Retried jobs are grouped into rows, and the most recent job's status is displayed. Visit the "Run Page" for a row to view more specific job information or use the SDK to dive deeper into Materialization Jobs.
-
Easier Tecton CLI Login: Users can now log into the Tecton CLI by running tecton login [cluster URL]. This will automatically open a browser tab to authenticate. The tecton configure command will now be deprecated, as users no longer need to set keys manually.
September 25, 2020
New Features
- Improvements to SDK Materialization Status Monitoring: When running fp.materialization_status(verbose=True), users will now also see two additional columns for each run: "TERMINATION_REASON" and "STATE_MESSAGE". These columns should provide more information for failed materialization runs.
- Simpler Feature Service Definitions: online_serving_enabled is now set to True by default in FeatureServices, making the typical FeatureService definition simpler. Set online_serving_enabled=False if you want to create a batch-only FeatureService.
- Bug fixes in Saved Datasets.
September 18, 2020
New Features
- Materialization Status Improvements: The Materialization Status graph in a FeaturePackage's Materialization tab now shows clearer descriptors that indicate whether each bar relates to Streaming or Batch data. Hover over a bar to view its descriptor.
- In the interactive SDK, users can now pass a flag to only show materialization errors by calling my_feature_package.materialization_status(only_errors=True).
September 11, 2020
New Features
- Feature Package entities are now hyperlinks to specific Entity pages.
September 4, 2020
New Features and Breaking Changes
- Quicker Workspace Iteration: Users no longer have to confirm destructive changes when running tecton apply in non-prod workspaces. These safety checks are unnecessary because non-prod workspaces do not contain materialized data and can be easily restored to a prior state.
- The default_join_keys parameter in the Entity class has been renamed to join_keys. The old default_join_keys name will be deprecated.

    partner_entity = Entity(
        name="PartnerWebsite",
        join_keys=["partner_id"],
        description="The partner website participating in the ad network.")

August 21, 2020
New Features
- Feature Summary Statistics: Tecton now computes and displays data summary statistics in the Web UI for features that have offline materialization enabled.
August 14, 2020
New Features
-
Feature Freshness Custom Monitoring: Users can now customize the freshness monitoring of their Feature Packages using MonitoringConfig. The Web UI will also reflect these configurations.
In your Tecton declarative API configuration file, import MonitoringConfig to specify how your materialized Feature Package should be monitored. If you don't provide this config, we will compute a default threshold.

    from tecton import MaterializationConfig, MonitoringConfig, TemporalFeaturePackage
    ...
    my_feature_package = TemporalFeaturePackage(
        name="my_feature_package",
        ...
        materialization=MaterializationConfig(
            schedule_interval="3d",
            ...
        ),
        monitoring_config=MonitoringConfig(
            monitor_freshness=True,
            expected_feature="2w"
        )
    )

You can then find this in the Materialization tab on a Feature Package page in the Web UI.
August 6, 2020
New Features
- Kafka and Redshift Data Sources: Tecton now supports Kafka and Redshift as data sources!
July 31, 2020
New Features
- First Class Transformations: Transformations are now first-class objects in Tecton. They can be cataloged with metadata, viewed in the UI, and fetched in a notebook.

    import tecton

    # Prod workspace
    tecton.get_transformation('my_transformation')

    # Specified workspace
    ws = tecton.get_workspace('my_ws')
    ws.get_transformation('my_transformation')

    Property      Value
    ================================================================================
    name          my_transformation
    description   None
    created_at    2020-07-28 20:15:14
    defined_in    my/transformation.py
    owner         ravi
    type          SQL
    inputs        Transformations: ['transformation1']
                  Virtual Data Sources: None
    use_context   True
    transformer   def my_transformation(context, transformation1_view):
                      return f"""
                          SELECT
                              content_id,
                              SUM(clicked) as actual2,
                              to_timestamp('{context.feature_data_end_time}') as timestamp
                          FROM
                              {transformation1_view}
                          GROUP BY
                              content_id
                          """

July 27, 2020
New Features and Breaking Changes
- Simpler FeatureService Definitions: Specifying the features that are used in a FeatureService is now done in the constructor. Along with this change, the FeatureService.add() method has been deprecated. This change ensures that a FeatureService definition has a single source of truth, and makes the FeatureService class consistent with other Tecton classes.

    from tecton import FeatureService
    from feature_repo.features import my_package1, my_package2

    my_service = FeatureService(
        name='example_feature_service',
        features=[
            my_package1,
            my_package2
        ]
    )

-
Materialization parameters changes: When defining a FeaturePackage, all materialization parameters are now specified in a configuration class, MaterializationConfig. This change is expected to improve organization and reuse (a MaterializationConfig can be reused across many FeaturePackages). Some parameters have been renamed for clarity and brevity.

    from datetime import datetime

    from tecton import TemporalFeaturePackage, MaterializationConfig

    ad_ground_truth_ctr_performance_7_days = TemporalFeaturePackage(
        name="ad_ground_truth_ctr_performance_7_days",
        transformation=ad_ground_truth_ctr_performance_7_days_transformer,
        entities=[e.ad_entity],
        data_source_configs=[data_sources.ad_impressions_batch_config],
        materialization=MaterializationConfig(
            online_enabled=True,
            feature_start_time=datetime(2020, 6, 19),
            schedule_interval='1day',
            serving_ttl='1day',
            data_lookback_period='7days'
        ),
    )

The table below lists the full changes:

    Old                                New
    online_materialization_enabled     online_enabled
    offline_materialization_enabled    offline_enabled
    feature_store_start_time           feature_start_time
    batch_materialization_schedule     schedule_interval
    data_lookback                      data_lookback_period
    serving_tll                        serving_ttl

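When migrating existing definitions, the renames can be applied mechanically. The sketch below copies the mapping from the table above as written; migrate_kwargs is a hypothetical helper and not part of the Tecton SDK.

```python
from typing import Any, Dict

# Old -> new parameter names, copied from the table above.
RENAMED_PARAMS = {
    "online_materialization_enabled": "online_enabled",
    "offline_materialization_enabled": "offline_enabled",
    "feature_store_start_time": "feature_start_time",
    "batch_materialization_schedule": "schedule_interval",
    "data_lookback": "data_lookback_period",
    "serving_tll": "serving_ttl",
}


def migrate_kwargs(old_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of old_kwargs with renamed keys (hypothetical helper).

    Keys that were not renamed are passed through unchanged. Supplying
    both the old and the new name for one parameter is treated as an error.
    """
    migrated: Dict[str, Any] = {}
    for name, value in old_kwargs.items():
        new_name = RENAMED_PARAMS.get(name, name)
        if new_name in migrated:
            raise ValueError(f"both old and new names given for {new_name!r}")
        migrated[new_name] = value
    return migrated


if __name__ == "__main__":
    print(migrate_kwargs({
        "online_materialization_enabled": True,
        "batch_materialization_schedule": "1day",
    }))
```

The migrated dictionary could then be unpacked into a MaterializationConfig constructor call.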
-
DataSource class names: Datasource class names have been shortened for brevity. The table below lists the full changes:

    Old                        New
    HiveDataSourceConfig       HiveDSConfig
    KinesisDataSourceConfig    KinesisDSConfig
    FileDataSourceConfig       FileDSConfig

-
Interactive and Declarative class split: Tecton has fully split its Interactive and Declarative Python classes. The Reference API now lists separate pages for Interactive classes (which are used in notebooks and returned from functions such as tecton.get_feature_package()) and Declarative classes (which are used to declare Tecton objects in a Feature Repository).
-
timestamp_key is now optional: When declaring a FeaturePackage, Tecton will now infer the timestamp_key argument when possible.
July 6, 2020
New Features
- Metadata tagging: Tecton users can now add metadata tags to VirtualDataSources, Entities, Feature Packages, and Feature Services. Pass a Python dictionary containing all tags via the tags parameter to their constructors. These tags will show up in the Tecton Web UI.

    my_feature_package = TemporalFeaturePackage(
        name="my_example_temporal_feature_package",
        ...
        tags={
            'tag_key': 'tag_value',
            'experimental': 'true',
        }
    )

- Plan Hooks for OnlineTransformation testing: Tecton supports Plan Hooks that run automatically every time tecton plan or tecton apply is run. This lets you trigger custom behavior at key points in the Tecton workflow. Plan Hooks are well suited to unit testing OnlineTransformations, where errors would otherwise only be caught at runtime.
June 25, 2020
New Features and Breaking Changes
-
Configurable offline and online materialization: Users can now independently configure offline and online materialization for a FeaturePackage in order to optimize costs and store exactly the data that is needed.
This is enabled via the new parameters online_materialization_enabled and offline_materialization_enabled. The materialization_enabled parameter has been removed.

    # Example: online materialization is required for serving,
    # but offline materialization for historical look-up is not required.
    my_feature_package = TemporalFeaturePackage(
        name='my_feature',
        ...
        online_materialization_enabled=True,
        offline_materialization_enabled=False,
    )

-
Easier experimentation with Feature Services: Users can now create FeatureServices that depend on FeaturePackages which are not materializing, allowing for richer experimentation before materialization is enabled.
This is enabled via the new online_serving_enabled parameter on FeatureService, which configures whether a FeatureService can serve feature values online. By setting online_serving_enabled to False, users can now create FeatureServices with non-materializing FeaturePackages. online_serving_enabled defaults to False, meaning that the default behavior of FeatureServices has changed.

    # To use a FeatureService for online queries, online_serving_enabled
    # must be explicitly set to True.
    my_feature_service = FeatureService(
        name='feature_service_for_online_use',
        ...,
        online_serving_enabled=True
    )

June 11, 2020
New Features
- Faster Tecton CLI: The Spark driver initialization has been removed from the CLI, making it much quicker to run tecton plan and tecton apply. Try it out in the latest SDK!
- Workspace names are now included in the Web UI URL to enable direct linking to objects in a Workspace.
New Features and Breaking Changes
- The command for creating a Workspace has been changed from tecton workspace new [workspace] to tecton workspace create [workspace]. For a complete list of Workspace commands, check out Using Workspaces.
May 11, 2020
New Features
- Workspaces: Users can now define different Tecton Workspaces which offer an isolated environment for experimental iteration. Workspaces are designed to work well with code branches. To get started try entering the CLI commands below and then navigate to your new workspace in the Tecton Web UI. You can find detailed documentation on Workspaces here.

    $ git checkout -b [name]
    $ tecton workspace create [name]
    $ tecton apply

- File Data Sources now support Parquet and CSV file formats.
- The Tecton CLI now provides more helpful error messages that tell where offending objects are defined.
Bug Fixes
- http://[domain].tecton.ai now correctly redirects to https instead of hanging.