Interacting with the Feature Store

Overview

You can interact with the Tecton Feature Store in three ways:

  • To define the feature pipelines running in production, use the Tecton command line interface (CLI). In the CLI, you write configuration code to apply changes to a Feature Repository.
  • To monitor the state of the Feature Store and view feature metadata, use the Web UI. The Web UI provides a window into feature definitions, the health and status of feature pipelines, and performance metrics.
  • To interact with data from the Feature Store, use the Tecton SDK in a Databricks or EMR notebook. The SDK enables you to fetch metadata, preview feature data, and test online feature requests with Python. You cannot alter the state of the Feature Repository from the Tecton SDK.

The following diagram shows Tecton's high-level architecture. The three interaction tools are shown at the bottom of the picture.

The Tecton Architecture

The examples below show how to:

  1. Define a feature in the Feature Repository using the Tecton CLI
  2. Review the data by reading the feature metadata using the Web UI and then previewing data using the Python SDK

Defining a Feature Using the CLI

To define a feature:

  1. Make a change to your local Feature Repository
  2. Generate an execution plan for applying your changes to the Tecton cluster
  3. Apply your changes to the cluster

Making a Local Change

The state of the Feature Store is defined using a Feature Repository, which is represented by a set of local configuration files. In Tecton, these configuration files are written in Python. To make a change to the Feature Repository, start by editing its files on your local machine.

Following the example below, add a new Transformation and Feature Package to the Feature Store by adding a new file to the Feature Repository.

from datetime import datetime
from tecton import (
    FeatureAggregation,
    MaterializationConfig,
    TemporalAggregateFeaturePackage,
    sql_transformation,
)
from feature_repo.shared import data_sources, entities

@sql_transformation(inputs=data_sources.ad_impressions_stream)
def content_keyword_ctr_performance_transformer(input_df):
    return f"""
        select
            content_keyword,
            clicked,
            1 as impression,
            timestamp
        from
            {input_df}
        """

content_keyword_ctr_performance = TemporalAggregateFeaturePackage(
    name="content_keyword_ctr_performance",
    description="[Stream Feature] The aggregate CTR of a content_keyword across all impressions (clicks / total impressions)",
    entities=[entities.content_keyword_entity],
    transformation=content_keyword_ctr_performance_transformer,
    aggregation_slide_period="1h",
    aggregations=[
        FeatureAggregation(column="impression", function="count", time_windows=["1h", "12h", "24h", "72h", "168h"]),
        FeatureAggregation(column="clicked", function="sum", time_windows=["12h", "24h", "72h", "168h"]),
    ],
    materialization=MaterializationConfig(
        online_enabled=True,
        offline_enabled=True,
        feature_start_time=datetime(2020, 6, 1),
    )
)
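
To make the two aggregations concrete, here is a minimal sketch in plain Python (not Tecton code) of how a click-through rate could be derived from a count of impression and a sum of clicked over the same trailing window. The helper name windowed_ctr, the sample events, the half-open window boundary, and the choice to return None when a window has no impressions are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def windowed_ctr(events, as_of, window):
    """Return sum(clicked) / count(impression) over (as_of - window, as_of].

    `events` is a list of (timestamp, clicked) pairs, one per impression.
    Hypothetical helper for illustration; not part of the Tecton SDK.
    """
    start = as_of - window
    in_window = [clicked for ts, clicked in events if start < ts <= as_of]
    impressions = len(in_window)  # corresponds to count of "impression"
    clicks = sum(in_window)       # corresponds to sum of "clicked"
    if impressions == 0:
        return None  # assumption: CTR is undefined without impressions
    return clicks / impressions


as_of = datetime(2020, 6, 2, 12, 0)
events = [
    (datetime(2020, 6, 2, 11, 30), 1),
    (datetime(2020, 6, 2, 9, 0), 0),
    (datetime(2020, 6, 2, 3, 0), 1),
    (datetime(2020, 6, 1, 18, 0), 0),  # inside 24h, outside 12h
    (datetime(2020, 5, 30, 6, 0), 1),  # outside both windows
]
ctr_12h = windowed_ctr(events, as_of, timedelta(hours=12))
ctr_24h = windowed_ctr(events, as_of, timedelta(hours=24))
print(ctr_12h, ctr_24h)
```

Note that the definition above requests a 1h window for impression but not for clicked, so a 1h CTR could not be derived from these aggregates alone.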

For more information about the CLI, see Setting Up the CLI.

Generating an Execution Plan

In the previous step, you edited the configuration of the Feature Repository but did not apply it. Using software development as an analogy, you edited the source code of the Feature Store but haven't deployed the changes yet.

Use the Tecton CLI in your local terminal to compare the local configuration against the Tecton cluster. As in the example below, run the tecton plan command to generate an execution plan.

$ tecton plan
Using workspace "prod"
✅ Imported 15 Python modules from the feature repository
✅ Collecting local feature declarations
✅ Performing server-side validation of feature declarations
 ↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓

  + Create Transformation
        name: content_keyword_ctr_performance_transformer
        type: SQL

  + Create FeaturePackage
    name:            content_keyword_ctr_performance
    description:     [Stream Feature] The aggregate CTR of a content_keyword across all impressions (clicks / total impressions)
        transformation: content_keyword_ctr_performance_transformer
 ↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑

When you run tecton plan, Tecton fetches the current state of the cluster and determines what actions are necessary to reach the state specified in the configuration files.
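
One way to picture this comparison is as a diff between two sets of named objects. The sketch below is a simplified illustration in plain Python, not Tecton's implementation: the function compute_plan, the dictionary representation of state, and the "Update" and "Delete" action labels are assumptions. Only "Create" actions appear in the plan output shown above.

```python
def compute_plan(current, declared):
    """Return the (action, name) steps needed to move `current` to `declared`.

    Both arguments map object names to their definitions.
    Hypothetical sketch; not Tecton's planning logic.
    """
    plan = []
    for name in sorted(declared):
        if name not in current:
            plan.append(("Create", name))
        elif current[name] != declared[name]:
            plan.append(("Update", name))
    for name in sorted(current):
        if name not in declared:
            plan.append(("Delete", name))
    return plan


# State assumed to be on the cluster before the change.
current = {"ad_impressions_stream": {"type": "DataSource"}}

# State declared by the local configuration files.
declared = {
    "ad_impressions_stream": {"type": "DataSource"},
    "content_keyword_ctr_performance_transformer": {"type": "SQL"},
    "content_keyword_ctr_performance": {"type": "FeaturePackage"},
}

plan = compute_plan(current, declared)
for action, name in plan:
    print(f"{action} {name}")
```

If the declared state already matches the cluster, the same function returns an empty plan, which mirrors the idea that nothing needs to be applied.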

Applying Changes

If the output of tecton plan meets your expectations and you'd like to apply your changes to the Tecton cluster, run the tecton apply command.

$ tecton apply
Using workspace "prod"
✅ Imported 15 Python modules from the feature repository
✅ Collecting local feature declarations
✅ Performing server-side validation of feature declarations
 ↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓

  + Create Transformation
        name: content_keyword_ctr_performance_transformer
        type: SQL

  + Create FeaturePackage
    name:            content_keyword_ctr_performance
    description:     [Stream Feature] The aggregate CTR of a content_keyword across all impressions (clicks / total impressions)
        transformation: content_keyword_ctr_performance_transformer
 ↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑
Are you sure you want to apply this plan? [y/N]>

Enter "yes" or "y" to apply your changes to the Tecton cluster. If your update changes a feature that is running in production, Tecton prompts you for additional confirmation.

Are you sure you want to apply this plan? [y/N]> y
🎉 all done!

Reviewing the Feature Store

Next, view the Feature Package in the Web UI, then use the Tecton SDK to preview its data.

Examine the Feature Store Using the Web UI

The Tecton Web UI is a read-only view into the feature pipelines running in production. The Web UI for your cluster is available at <yourcluster>.tecton.ai.

Use the Web UI to view metadata about these pipelines and to monitor their health. In the screenshot below, see the list view of Feature Packages running in production for a cluster.

Web UI List of Features

Click into a Feature Package to see more detailed information, such as source code for the Feature Package's Transformation and monitoring information about the materialization of the Feature Package. Tecton exposes the following information about the health of feature pipelines:

  • The status of feature computation jobs
  • The date ranges of raw data that have been processed by Tecton, and any errors that were encountered during a date range
  • For serving endpoints, performance metrics such as serving latency, percentage of calls resulting in errors, and requests per second
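
As a rough illustration of what the serving metrics in the last bullet measure, the following plain-Python sketch summarizes a batch of request records. It is not Tecton code: the helper serving_metrics, the (latency_ms, ok) record shape, and the use of median latency are assumptions, and Tecton's own metric definitions may differ.

```python
from statistics import median


def serving_metrics(requests, window_seconds):
    """Summarize request records collected over `window_seconds`.

    Each record is a (latency_ms, ok) pair. Hypothetical helper;
    the metric definitions here are assumptions, not Tecton's.
    """
    if not requests:
        return {
            "median_latency_ms": None,
            "error_pct": 0.0,
            "requests_per_second": 0.0,
        }
    latencies = [latency for latency, _ in requests]
    errors = sum(1 for _, ok in requests if not ok)
    return {
        "median_latency_ms": median(latencies),
        "error_pct": 100.0 * errors / len(requests),
        "requests_per_second": len(requests) / window_seconds,
    }


sample = [(12.0, True), (15.0, True), (40.0, False), (11.0, True)]
metrics = serving_metrics(sample, window_seconds=2)
print(metrics)
```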

The screenshot below shows information about the ad_ground_truth_ctr_performance_7_days feature using the Materialization tab.

Web UI Feature Detail

Fetch Data Using the Tecton SDK

To fetch data from the Feature Store, use Tecton's Python SDK in a Databricks or EMR notebook. During setup, you'll connect your Spark cluster to Tecton in order to fetch offline features and training data. In the following example, you'll fetch data from the ad_ground_truth_ctr_performance_7_days Feature Package.

import tecton

# Fetch an instance of a FeaturePackage
my_fp = tecton.get_feature_package('ad_ground_truth_ctr_performance_7_days')

# Preview the data
my_fp.preview()

You can use similar code to interact with Feature Services, Data Sources, and Transformations. Reference documentation for the Python SDK is available in the API Reference under "Accessing Feature Primitives."