What's New?

July 9, 2021

Feature Monitoring Summary Dashboard

In addition to monitoring the materialization status for individual Feature Views, we've now added a summary dashboard to easily see if any of the Feature Views in a workspace are stale or have failing materialization jobs.

To view this dashboard, just click on Features in the left-hand navigation, then select the Monitoring Summary tab.

Feature Monitoring Summary

Databricks Runtime 6.4 Extended Support

Tecton Notebook clusters are now configured by default to run on DBR 6.4 Extended Support in order to stay on an officially supported runtime.

The option to use Databricks Runtimes 8+ is coming soon.

June 25, 2021

Unlock users for incorrect password attempts

Admins can now use the Admin Console to unlock users who have been locked out after too many incorrect password attempts.

See the User Management instructions for details on how to unlock a user.

Tecton SDK on PyPI

The Tecton SDK is now being published to PyPI at https://pypi.org/project/tecton/ for easier installation.

See the CLI Setup Instructions for details on how to install the Tecton package.

June 7, 2021

Continuous mode for StreamWindowAggregateFeatureView

Continuous mode for Stream Window Aggregate Feature Views now enables new event data to be included in feature values in less than a second. This low ingestion latency can dramatically improve model performance for many use cases, such as fraud detection or product recommendations.

To use continuous processing mode, set aggregation_slide_period="continuous" in your feature view definition.

May 24, 2021

Framework v2

We're excited to introduce you to Tecton's Framework v2! While the core concepts remain the same, we've improved the API and added popular new features based on user feedback.

  1. Transformation definitions now have
    • Streamlined authoring for single-transformation features; and
    • Flexible pipelines for composition and re-use.
  2. On Demand Feature Views (formerly Online Feature Packages) can now combine current request data with materialized batch or stream features.
  3. BatchFeatureViews (formerly TemporalFeaturePackage) can join multiple batch sources in their transformations.
  4. Backfilling for Batch Feature Views (formerly TemporalFeaturePackage) with data look back will be much more efficient.
  5. New object names make it easier to differentiate batch and stream processing.

See the Framework v2 documentation for more details on the new API.

May 3, 2021

Spark Configuration Options

Spark configuration options can now be added to NewEMRClusterConfig or NewDatabricksClusterConfig. This can be helpful if you're looking to materialize a particularly large dataset, and running into limitations on memory in Spark. The following options are currently supported:

  • spark.driver.memory
  • spark.driver.memoryOverhead
  • spark.executor.memory
  • spark.executor.memoryOverhead

MaterializationConfig(
    online_enabled=True,
    offline_enabled=True,
    feature_start_time=datetime(2021, 1, 1),
    batch_materialization=NewEMRClusterConfig(
        instance_type="m4.xlarge",
        spark_config={
            "spark.executor.memory": "2g",
            "spark.driver.memory": "2g",
        },
    ),
)
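
Because only the four options listed above are supported, it can help to check a spark_config dictionary before passing it to a cluster config. The following is a minimal sketch in plain Python; the helper name unsupported_spark_options is hypothetical and not part of the Tecton SDK.

```python
# Hypothetical helper, not part of the Tecton SDK: compare a spark_config
# dict against the four options this release note lists as supported.
SUPPORTED_SPARK_OPTIONS = frozenset({
    "spark.driver.memory",
    "spark.driver.memoryOverhead",
    "spark.executor.memory",
    "spark.executor.memoryOverhead",
})


def unsupported_spark_options(spark_config):
    """Return the keys of spark_config that are not in the supported list, sorted."""
    return sorted(set(spark_config) - SUPPORTED_SPARK_OPTIONS)


spark_config = {
    "spark.executor.memory": "2g",
    "spark.driver.memory": "2g",
}

bad_keys = unsupported_spark_options(spark_config)
if bad_keys:
    raise ValueError(f"Unsupported Spark options: {bad_keys}")
```

Running a check like this in a test or pre-commit step surfaces a mistyped option name before a materialization job is scheduled.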

April 26, 2021

  • Schema Override for FileDSConfig: FileDSConfig now supports an optional schema_override parameter, which can be used to specify a schema with a pyspark.sql.types.StructType object. If the parameter is set, Tecton uses that schema whenever it reads from the file instead of inferring it automatically. This can be helpful if your file contains a column type that Spark doesn't support, for example INT64 (TIMESTAMP_MICROS).
FileDSConfig(
    uri='s3://ad-impressions-data/ctr_events.pq',
    file_format="parquet",
    schema_override=(
        pyspark.sql.types.StructType()
        .add("ad_id", pyspark.sql.types.LongType(), True)
        .add("user_uuid", pyspark.sql.types.StringType(), True)
        .add("timestamp", pyspark.sql.types.TimestampType(), True)
        .add("clicked", pyspark.sql.types.LongType(), True)
    ),
)

April 6, 2021

  • Online Feature Logging: Feature Services now have the ability to continuously log online requests and feature vector responses as Tecton Datasets. These logged feature datasets can be used for auditing, analysis, training dataset generation, and spine creation.

    Feature Logging Diagram

    To enable feature logging on a FeatureService, add a LoggingConfig as in the example below and optionally specify a sample rate. You can also optionally set log_effective_times=True to log the feature timestamps from the Feature Store. As a reminder, Tecton will always serve the latest stored feature values as of the time of the request.

    Run tecton apply to apply your changes.

    from tecton import LoggingConfig
    
    ctr_prediction_service = FeatureService(
        name='ctr_prediction_service',
        features=[
            ad_ground_truth_ctr_performance_7_days,
            user_total_ad_frequency_counts
        ],
        logging=LoggingConfig(
            sample_rate=0.5,
            log_effective_times=False
        )
    )
    

    This will create a new Tecton Dataset under the Datasets tab in the Web UI. New feature logs are appended to this dataset every 30 minutes. If the features in the Feature Service change, a new dataset version will be created.

    Logged Features

    This dataset can be fetched in a notebook using the code snippet below.

    import tecton
    dataset = tecton.get_dataset('ctr_prediction_service.logged_requests.4')
    display(dataset.to_spark())
    

    Logged Features Dataset

  • Easier Dataset Retrieval: All Datasets can now be retrieved by name using the method below:

    import tecton
    dataset = tecton.get_dataset('my_dataset')
    display(dataset.to_spark())
    

March 5, 2021

  • Self-Serve User Management: Tecton now offers a self-service user management portal through the Web UI. Navigate to the Admin Console by clicking on your avatar at the top right of the screen as pictured below: Admin Panel

    From the Admin Console, cluster administrators can add new users or remove existing users from their Tecton instance. Self Serve User Management

February 17, 2021

  • Faster FileDataSource Validation: FileDSConfig now supports an optional new schema_uri parameter, which significantly decreases tecton plan and tecton apply latency. This parameter allows users to specify a specific subpath within the data source URI that will be used as a de facto example of the schema. Example:
      clicks_sleepnumber_file = FileDSConfig(
        uri="s3://acme-my-data/batch_events/",
        file_format="parquet",
        schema_uri="s3://acme-my-data/batch_events/date=2020-06-09/part-00000-tid-644275213368719145-d040e31e-d1c6-42f1-8677-4e97f09df7bc-1358-1.c000.parquet",
      )
    
    Normally, FileDSConfig schema inference recurses through all partitions within a path (e.g. "s3://acme-my-data/batch_events/" above), which can be very expensive for large datasets with fine-grained partitioning. This process is not required if a Hive metastore such as Glue is used, since all partitions are already known.
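
Since schema_uri is described as a subpath within the data source URI, a quick sanity check before running tecton plan can catch a path that points outside it. This is a minimal sketch; the helper is_under_uri and the shortened file name are hypothetical, and Tecton's own validation may work differently.

```python
def is_under_uri(uri, schema_uri):
    """Return True if schema_uri points strictly inside the directory given by uri."""
    # Normalize so "s3://bucket/events" and "s3://bucket/events/" behave the same.
    base = uri.rstrip("/") + "/"
    return schema_uri.startswith(base) and len(schema_uri) > len(base)


uri = "s3://acme-my-data/batch_events/"
# Hypothetical, shortened file name used only for illustration.
schema_uri = "s3://acme-my-data/batch_events/date=2020-06-09/part-00000.parquet"

if not is_under_uri(uri, schema_uri):
    raise ValueError(f"{schema_uri} is not a subpath of {uri}")
```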

January 22, 2021

New Features

  • Workspace Home Page: A workspace dashboard is now available through our web UI. Navigate to your cluster.tecton.ai URL or click on the Tecton logo in the top left-hand corner of the screen to go to the new dashboard. This dashboard provides a high-level overview of the Tecton Primitives in your workspace, changes to your workspace, and quick links to helpful resources. Workspace Dash

January 15, 2021

New Features

  • Limited Destructive Updates: Changing feature_start_time, online_enabled, or offline_enabled in MaterializationConfig for TemporalFeaturePackages (TFPs) and TemporalAggregateFeaturePackages (TAFPs) is no longer a destructive update. Changing these parameters schedules the additional jobs needed to fill in the gaps; it neither destroys existing materialized data nor causes serving downtime.

Breaking Changes

  • @online_transformation arguments must now be named identically to RequestContext schema fields to prevent accidentally swapping arguments.
      rc = RequestContext(
        schema={
          "field_A": StringType()
        }
      )
    
      # OK
      @online_transformation(request_context=rc, output_schema=output_schema)
      def ad_is_displayed_as_banner_transformer(field_A: pandas.Series):
        pass
    
      # Error: RequestContext schema fields ['field_A'] do not
      # match transformation function arguments ['field_X'].
      @online_transformation(request_context=rc, output_schema=output_schema)
      def ad_is_displayed_as_banner_transformer(field_X: pandas.Series):
        pass
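
The naming rule can be illustrated with a small stand-alone check built on Python's inspect module. This is only a sketch of the idea, not Tecton's implementation; the function arguments_match_schema and the two example transformers are hypothetical.

```python
import inspect


def arguments_match_schema(schema_fields, fn):
    """Return True if fn's parameter names are exactly the schema field names."""
    arg_names = list(inspect.signature(fn).parameters)
    return sorted(arg_names) == sorted(schema_fields)


def ok_transformer(field_A):
    pass


def mismatched_transformer(field_X):
    pass


schema_fields = ["field_A"]  # stands in for the RequestContext schema keys

for fn in (ok_transformer, mismatched_transformer):
    if not arguments_match_schema(schema_fields, fn):
        print(f"{fn.__name__}: arguments do not match schema fields {schema_fields}")
```

Matching on names rather than position is what prevents two same-typed arguments from being swapped silently.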
    

January 8, 2021

Breaking Changes

  • The Tecton Primitive .get() accessor methods are now deprecated. To fetch a Tecton Primitive, please use the newer workspace methods such as workspace.get_entity("entity_name")
      from tecton import *
      workspace = get_workspace("prod")
      fp = workspace.get_feature_package("my_fp")
      fs = workspace.get_feature_service("my_fs")
      e = workspace.get_entity("my_entity")
      t = workspace.get_transformation("my_transform")
      vds = workspace.get_virtual_data_source("my_vds")
    

December 23, 2020

New Features

  • New Monitoring, Alerting, and Debugging Tools: Monitoring, alerting, and debugging tools are now available to ensure that production FeaturePackages remain in a healthy state. A brief list of the new tools is below; more information can be found in the documentation.

    • Add alert_email to the MonitoringConfig of a FeaturePackage to enable email alerts. Alert Email
    • Navigate to the "Materialization" tab in the Web UI to see new information on the status of processing jobs and other helpful information.
    • Use the tecton materialization-status [FP_NAME] command in the CLI to retrieve more detailed materialization processing job information for a specific FeaturePackage.
    • Use the tecton freshness command in the CLI to retrieve cluster-level freshness information for all production FeaturePackages.
  • Feature Repo File Paths: All Tecton objects now show their Feature Repo file path in the UI to make them easier to discover and edit. Repo Link

  • FeatureService Metadata API: FeatureServices have a new metadata API for fetching information about expected input parameters and returned features.
      curl -X POST https://staging.tecton.ai/v1/feature-service/metadata -H "Authorization: Tecton-key $API_KEY" -d\
      '{ "params": { "feature_service_name": "yolo2" } }'
    
      {
        "featureServiceType": "DEFAULT",
        "inputRequestContextKeys": [{"name": "device_type", "type": "string"}, {"name": "a", "type": "string"}],
        "featureValues": [{"name": "oofp.is_mobile_device", "type": "boolean"}]
      }
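
The same request can be prepared from Python with the standard library. The sketch below reuses the URL, header format, and payload from the curl example; the TECTON_API_KEY environment variable name and both helper functions are assumptions made for illustration. The request is built but not sent, since sending needs network access and a valid key.

```python
import json
import os
import urllib.request

# URL copied from the curl example above.
METADATA_URL = "https://staging.tecton.ai/v1/feature-service/metadata"


def build_metadata_request(feature_service_name, api_key):
    """Build (but do not send) the POST request for the metadata endpoint."""
    body = json.dumps({"params": {"feature_service_name": feature_service_name}})
    return urllib.request.Request(
        METADATA_URL,
        data=body.encode("utf-8"),
        headers={"Authorization": f"Tecton-key {api_key}"},
        method="POST",
    )


def input_key_names(metadata):
    """Extract the request-context key names from a metadata response dict."""
    return [key["name"] for key in metadata.get("inputRequestContextKeys", [])]


# TECTON_API_KEY is an assumed environment variable name.
request = build_metadata_request("yolo2", os.environ.get("TECTON_API_KEY", ""))

# To send the request:
# with urllib.request.urlopen(request) as response:
#     metadata = json.load(response)
#     print(input_key_names(metadata))
```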
    

December 7, 2020

New Features

  • PushFeaturePackages: Users can now create PushFeaturePackages to ingest features generated outside of Tecton and load them into the offline and online Feature Stores for training or prediction. For a detailed example, check out Pushing Feature Values into Feature Stores.

      import tecton
      import pandas
    
      fp = tecton.get_feature_package('user_purchases_push_fp')
    
      pandas_df = pandas.DataFrame([{
        "timestamp": pandas.Timestamp("2020-09-18 12:00:06", tz="UTC"),
        "userid": "u123",
        "num_purchases": 91
      }])
    
      fp.ingest(pandas_df)
    

  • Improved Documentation and Usability: The Tecton documentation has been updated with easier navigation and a focus on practical examples. Recently, the Tecton team has also shipped a large number of usability improvements throughout the product including bug fixes, better error messages, better validations, and more.

November 6, 2020

New Features

  • FeaturePackage Freshness Monitoring: FeaturePackages now show their "actual freshness" value under the "Materialization" tab. Actual Freshness

October 30, 2020

New Features

  • Tecton Feature Freshness CLI Overview: Users can now view the freshness of all features in the CLI by running tecton freshness.
    $ tecton freshness
               Feature Package               Stale?   Freshness   Expected Freshness     Created At
    =================================================================================================
    ad_ground_truth_ctr_performance_7_days   N        14h 40m     2d                   10/01/20 2:25
    user_ad_impression_counts                N        40m 24s     2h                   10/01/20 2:16
    content_keyword_ctr_performance:v2       N        40m 25s     2h                   09/04/20 22:22
    ad_group_ctr_performance                 N        40m 26s     2h                   08/26/20 12:52
    ad_is_displayed_as_banner                -        -           -                    07/24/20 13:51
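
For scripting, the table printed by tecton freshness can be parsed into rows. The following sketch works on a captured copy of the output shown above and assumes columns are separated by two or more spaces. It also assumes a stale package is marked "Y" in the Stale? column, which the sample output does not show.

```python
import re

# A shortened copy of the sample output shown above.
SAMPLE_OUTPUT = """\
           Feature Package               Stale?   Freshness   Expected Freshness     Created At
=================================================================================================
ad_ground_truth_ctr_performance_7_days   N        14h 40m     2d                   10/01/20 2:25
user_ad_impression_counts                N        40m 24s     2h                   10/01/20 2:16
ad_is_displayed_as_banner                -        -           -                    07/24/20 13:51
"""


def parse_freshness(output):
    """Parse the freshness table into a list of dicts keyed by column header."""
    lines = [line for line in output.splitlines() if line.strip()]
    # Columns are separated by runs of two or more spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[1:]:
        if set(line.strip()) == {"="}:
            continue  # skip the separator rule under the header
        rows.append(dict(zip(header, re.split(r"\s{2,}", line.strip()))))
    return rows


def stale_packages(rows):
    """Return names of packages flagged stale (assumes the flag value is "Y")."""
    return [row["Feature Package"] for row in rows if row["Stale?"] == "Y"]


rows = parse_freshness(SAMPLE_OUTPUT)
for row in rows:
    print(row["Feature Package"], row["Freshness"], row["Expected Freshness"])
```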
    

October 19, 2020

New Features

  • Snowflake Data Sources: Tecton now supports Snowflake as a data source!
      click_stream_snowflake_ds = SnowflakeDSConfig(
        url="https://[your-cluster].eu-west-1.snowflakecomputing.com/",
        database="YOUR_DB",
        schema="CLICK_STREAM_SCHEMA",
        warehouse="COMPUTE_WH",
        table="CLICK_STREAM",
      )
    
      transaction_snowflake_vds = VirtualDataSource(
        name="click_stream_snowflake_vds",
        batch_ds_config=click_stream_snowflake_ds,
      )
    

October 9, 2020

New Features

  • Web UI Materialization Job Monitoring: Materialization jobs are now displayed to help monitor FeaturePackages.

    The easiest way to check the health of a materialized FeaturePackage is now through the Web UI. Navigate to the FeaturePackage in question and switch to the "Materialization" tab to see FeaturePackage materialization diagnostics at a glance.

    The new "Materialization Jobs" table displays the most relevant information about a FeaturePackage's materialization jobs. Retried jobs are grouped into rows, and the most recent job's status is displayed. Visit the "Run Page" for a row to view more specific job information or use the SDK to dive deeper into Materialization Jobs.

    Materialization Status UI

  • Easier Tecton CLI Login: Users can now log into the Tecton CLI simply by running tecton login [cluster URL]. This will automatically open a browser tab to authenticate. tecton configure will now be deprecated, as users no longer need to set keys manually. Tecton Login

September 25, 2020

New Features

  • Improvements to SDK Materialization Status Monitoring: When running fp.materialization_status(verbose=True), users will now also see two additional columns for each run: "TERMINATION_REASON" and "STATE_MESSAGE". These columns should provide more information for failed materialization runs.
  • Simpler Feature Service Definitions: online_serving_enabled is now set to True by default in FeatureServices, making the typical FeatureService definition simpler. Set online_serving_enabled=False if you want to create a batch-only FeatureService.
  • Bug fixes in Saved Datasets.

September 18, 2020

New Features

  • Materialization Status Improvements: The Materialization Status graph in a FeaturePackage's Materialization tab now shows better descriptors that make it clear which bar is related to Streaming vs Batch data. Hover over a bar to view its descriptor. Materialization Status
  • In the interactive SDK, users can now pass a flag to only show materialization errors by calling my_feature_package.materialization_status(only_errors=True).

September 11, 2020

New Features

  • Feature Package entities are now hyperlinks to specific Entity pages. Entity Links

September 4, 2020

New Features and Breaking Changes

  • Quicker Workspace Iteration: Users no longer have to confirm destructive changes when running tecton apply in non-prod workspaces. These safety checks are unnecessary because non-prod workspaces do not contain materialized data and can be easily restored to a prior state.
  • The default_join_keys parameter in the Entity class has been renamed to join_keys. default_join_keys will be deprecated.
      partner_entity = Entity(name="PartnerWebsite", join_keys=["partner_id"], description="The partner website participating in the ad network.")
    

August 21, 2020

New Features

  • Feature Summary Statistics: Tecton now computes and displays data summary statistics in the Web UI for features that have offline materialization enabled. Summary Stats

August 14, 2020

New Features

  • Feature Freshness Custom Monitoring: Users can now customize the freshness monitoring of their Feature Packages using MonitoringConfig. The Web UI will also reflect these configurations.

    In your Tecton declarative API configuration file, import MonitoringConfig to specify how your materialized Feature Package should be monitored. If you don't provide this config, we will compute a default threshold.

      from tecton import MaterializationConfig, MonitoringConfig, TemporalFeaturePackage
    
      ...
    
      my_feature_package = TemporalFeaturePackage(
        name="my_feature_package",
        ...
        materialization=MaterializationConfig(
            schedule_interval="3d",
            ...
        ),
        monitoring_config=MonitoringConfig(
            monitor_freshness=True,
            expected_feature="2w"
        )
      )
    
    You can then find this in the Materialization tab on a Feature Package page on the Web UI.

    Transform FCOs

August 6, 2020

New Features

  • Kafka and Redshift Data Sources: Tecton now supports Kafka and Redshift as data sources!

July 31, 2020

New Features

  • First Class Transformations: Transformations are now first-class objects in Tecton. They can be cataloged with metadata, viewed in the UI, and fetched in a notebook. Transform FCOs
      import tecton
    
      # Prod workspace
      tecton.get_transformation('my_transformation')
    
      # Specified workspace
      ws = tecton.get_workspace('my_ws')
      ws.get_transformation('my_transformation')
    
      Property                                   Value
      ================================================================================
      name          my_transformation
      description   None
      created_at    2020-07-28 20:15:14
      defined_in    my/transformation.py
      owner         ravi
      type          SQL
      inputs        Transformations: ['transformation1']
                    Virtual Data Sources: None
      use_context   True
      transformer   def my_transformation(context, transformation1_view):
                        return f"""
                            SELECT
                                content_id,
                                SUM(clicked) as actual2,
                                to_timestamp('{context.feature_data_end_time}') as
                    timestamp
                            FROM
                                {transformation1_view}
                            GROUP BY
                                content_id
                        """
    

July 27, 2020

New Features and Breaking Changes

  • Simpler FeatureService Definitions: Specifying the features that are used in a FeatureService is now done in the constructor. Along with this change, the FeatureService.add() method has been deprecated. This change ensures that a FeatureService definition has a single source of truth, and makes the FeatureService class consistent with other Tecton classes.
    from tecton import FeatureService
    from feature_repo.features import my_package1, my_package2
    
    my_service = FeatureService(
        name='example_feature_service',
        features=[
            my_package1,
            my_package2
        ]
    )
    
  • Materialization parameter changes: When defining a FeaturePackage, all materialization parameters are now specified in a configuration class, MaterializationConfig. This change improves organization and re-use, since one MaterializationConfig can be shared across many FeaturePackages. Some parameters have been renamed for clarity and brevity.

    from tecton import TemporalFeaturePackage, MaterializationConfig
    
    ad_ground_truth_ctr_performance_7_days = TemporalFeaturePackage(
        name="ad_ground_truth_ctr_performance_7_days",
        transformation=ad_ground_truth_ctr_performance_7_days_transformer,
        entities=[e.ad_entity],
        data_source_configs=[data_sources.ad_impressions_batch_config],
        materialization=MaterializationConfig(
            online_enabled=True,
            feature_start_time=datetime(2020, 6, 19),
            schedule_interval='1day',
            serving_ttl='1day',
            data_lookback_period='7days'
        ),
    )
    
    The table below lists the full changes:

    Old                                New
    online_materialization_enabled     online_enabled
    offline_materialization_enabled    offline_enabled
    feature_store_start_time           feature_start_time
    batch_materialization_schedule     schedule_interval
    data_lookback                      data_lookback_period
    serving_tll                        serving_ttl
  • DataSource class names: Datasource class names have been shortened for brevity. The table below lists the full changes:

    Old                        New
    HiveDataSourceConfig       HiveDSConfig
    KinesisDataSourceConfig    KinesisDSConfig
    FileDataSourceConfig       FileDSConfig
  • Interactive and Declarative class split: Tecton has fully split its Interactive and Declarative Python classes. The Reference API now lists separate pages for Interactive classes (which are used in notebooks and returned from functions such as tecton.get_feature_package()) and Declarative classes (which are used to declare Tecton objects in a Feature Repository).

  • timestamp_key is now optional: When declaring a FeaturePackage, Tecton will now infer the timestamp_key argument when possible.
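
When updating an existing Feature Repository, the materialization parameter renames listed above can be applied mechanically. The following sketch maps old keyword names to new ones using the table verbatim; the migrate_kwargs helper is hypothetical and only illustrates the mapping.

```python
# Old-to-new names copied from the materialization rename table above.
RENAMED_PARAMS = {
    "online_materialization_enabled": "online_enabled",
    "offline_materialization_enabled": "offline_enabled",
    "feature_store_start_time": "feature_start_time",
    "batch_materialization_schedule": "schedule_interval",
    "data_lookback": "data_lookback_period",
    "serving_tll": "serving_ttl",
}


def migrate_kwargs(old_kwargs):
    """Return a copy of old_kwargs with renamed keys; other keys pass through."""
    new_kwargs = {}
    for key, value in old_kwargs.items():
        new_key = RENAMED_PARAMS.get(key, key)
        if new_key in new_kwargs:
            raise ValueError(f"Both old and new names given for {new_key!r}")
        new_kwargs[new_key] = value
    return new_kwargs


old = {
    "online_materialization_enabled": True,
    "batch_materialization_schedule": "1day",
    "data_lookback": "7days",
}
print(migrate_kwargs(old))
```

The resulting dictionary holds the keyword arguments that would go into MaterializationConfig.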

July 6, 2020

New Features

  • Metadata tagging: Tecton users now have the ability to add metadata tags to VirtualDataSources, Entities, Feature Packages, and Feature Services. Simply pass a Python dictionary containing all tags via the tags parameter to their constructors. These tags will show up in the Tecton Web UI.
      my_feature_package = TemporalFeaturePackage(
        name="my_example_temporal_feature_package",
        ...
        tags={
          'tag_key':'tag_value',
          'experimental':'true',
        }
      )
    
  • Plan Hooks for OnlineTransformation testing: Tecton supports Plan Hooks that run automatically every time tecton plan or tecton apply is run. This lets you trigger custom behavior at key points in the Tecton workflow. Plan Hooks are great for creating unit tests for OnlineTransformations, where errors would otherwise only be caught at runtime.
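
As an illustration of the kind of unit test a Plan Hook could run, the sketch below tests plain transformation logic with Python's unittest module. The transformation function, the run entry point, and its integer return convention are all assumptions made for this example; consult the Plan Hooks documentation for the actual hook contract.

```python
import unittest


def ad_is_displayed_as_banner(ad_display_type):
    """Hypothetical transformation logic: 1 if the ad is a banner, else 0."""
    return 1 if ad_display_type == "Banner" else 0


class AdIsDisplayedAsBannerTest(unittest.TestCase):
    def test_banner(self):
        self.assertEqual(ad_is_displayed_as_banner("Banner"), 1)

    def test_other_type(self):
        self.assertEqual(ad_is_displayed_as_banner("Video"), 0)


def run():
    """Assumed hook entry point: return 0 if all tests pass, 1 otherwise."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AdIsDisplayedAsBannerTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return 0 if result.wasSuccessful() else 1


exit_code = run()
```

Keeping the transformation logic in an ordinary function makes it testable without a running cluster.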

June 25, 2020

New Features and Breaking Changes

  • Configurable offline and online materialization: Users can now independently configure offline and online materialization for a FeaturePackage in order to optimize costs and store exactly the data that is needed.

    This is enabled via two new parameters, online_materialization_enabled and offline_materialization_enabled. The materialization_enabled parameter has been removed.

    # Example: online materialization is required for serving,
    # but offline materialization for historical look-up is not required.
    
    my_feature_package = TemporalFeaturePackage(
        name='my_feature',
        ...
        online_materialization_enabled=True,
        offline_materialization_enabled=False,
    )
    
  • Easier experimentation with Feature Services: Users can now create FeatureServices that depend on FeaturePackages which are not materializing, allowing for richer experimentation before materialization is enabled.

    This is enabled via the new online_serving_enabled parameter on FeatureService, which configures whether a FeatureService can serve feature values online. By setting online_serving_enabled to False, users can now create FeatureServices with non-materializing FeaturePackages.

    online_serving_enabled defaults to False, meaning that the default behavior of FeatureServices has changed.

    # to use a FeatureService for online queries, online_serving_enabled
    # must be explicitly set to True.
    
    my_feature_service = FeatureService(
        name='feature_service_for_online_use',
        ...,
        online_serving_enabled=True
    )
    

June 11, 2020

New Features

  • Faster Tecton CLI: The Spark driver initialization has been removed from the CLI, making it much quicker to run tecton plan and tecton apply. Try it out in the latest SDK!
  • Workspace names are now included in the Web UI URL to enable direct linking to objects in a Workspace.
    Workspaces URL

New Features and Breaking Changes

  • The command for creating a Workspace has been changed from tecton workspace new [workspace] to tecton workspace create [workspace]. For a complete list of Workspace commands, check out Using Workspaces.

May 11, 2020

New Features

  • Workspaces: Users can now define different Tecton Workspaces, which offer an isolated environment for experimental iteration. Workspaces are designed to work well with code branches. To get started, try entering the CLI commands below and then navigate to your new workspace in the Tecton Web UI. You can find detailed documentation on Workspaces here.
    $ git checkout -b [name]
    $ tecton workspace create [name]
    $ tecton apply
    
  • File Data Sources now support Parquet and CSV file formats.
  • The Tecton CLI now provides more helpful error messages that tell where offending objects are defined.

Bug Fixes

  • http://[domain].tecton.ai now correctly redirects to https instead of hanging.