Feature Services

Feature Services are sets of features exposed as an API. This API can be used for batch lookups of feature values (e.g. generating training datasets or feature dataframes for batch prediction) or for low-latency requests for individual feature vectors.

Feature Services reference a set of features from Feature Views. Feature Views must be created before they can be served in a Feature Service, since Feature Services are only a consumption layer on top of existing features.

It is generally recommended that each model deployed in production have one associated Feature Service deployed, which serves features to the model.

A Feature Service provides:

  • A REST API to access feature values at the time of prediction
  • A one-line method call to rapidly construct training data for user-specified timestamps and labels
  • The ability to observe the endpoint where the data is served to monitor serving throughput, latency, and prediction success rate

Defining a Feature Service

Define a Feature Service using the FeatureService class.

Attributes

A Feature Service definition includes the following attributes:

  • name: The unique name of the Feature Service
  • features: The features defined in a Feature View or Feature Table, and served by the Feature Service
  • online_serving_enabled: Whether online serving is enabled for this Feature Service (defaults to True).
  • Metadata used to organize the FeatureService. Metadata parameters include description, owner, family, and tags.

Example: Defining a Feature Service

The following example defines a Feature Service.

from tecton import FeatureService
from feature_repo.shared.features.ad_ground_truth_ctr_performance_7_days import ad_ground_truth_ctr_performance_7_days
from feature_repo.shared.features.user_total_ad_frequency_counts import user_total_ad_frequency_counts
from feature_repo.shared.features.user_ad_impression_counts import user_ad_impression_counts

ctr_prediction_service = FeatureService(
    name='ctr_prediction_service',
    description='A Feature Service used for supporting a CTR prediction model.',
    online_serving_enabled=True,
    features=[
        # add all of the features in a Feature View
        user_total_ad_frequency_counts,
        # add a single feature from a Feature View using double-bracket notation
        user_ad_impression_counts[["count"]]
    ],
    family='ad_serving',
    tags={'release': 'production'},
    owner="matt@tecton.ai",
)
  • The Feature Service uses the user_total_ad_frequency_counts and user_ad_impression_counts Feature Views.
  • The list of features in the Feature Service is defined in the features argument. When you pass a Feature View in this argument, the Feature Service contains all of the features in the Feature View. To select a subset of features from a Feature View, use double-bracket notation (e.g. FeatureView[['my_feature', 'other_feature']]).

Using Feature Services

Using the low-latency REST API Interface

To use the FeatureService's REST API endpoint, send a request containing the join keys of the feature vector to be retrieved. Tecton's response contains the full feature vector, including the join keys.

To request a single feature vector from the REST API, use the /get-features endpoint. Pass the Feature Service name and the join keys as parameters. The response is a JSON object.

Important

The request is authenticated with an API key that you create using the tecton create-api-key CLI command. See Access Controls & Secrets for more information.

Example Request

$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
     -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
         '{
       "params": {
         "feature_service_name": "ad_ctr_feature_service",
         "join_key_map": {
           "ad_id": "5417",
           "user_id": "6c423390-9a64-52c8-9bb3-bbb108c74198"
         }
       }
     }'

Response

{
  "result": {
    "features": [
      "3",
      "46",
      "118"
    ]
  }
}
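The same request can be issued from Python instead of curl. The sketch below builds the request with the standard library; the cluster URL, feature service name, and join keys are the placeholder values from the curl example above, and the response would be read with urllib.request.urlopen.

```python
import json
import os
import urllib.request


def build_get_features_request(cluster_url, feature_service_name, join_key_map, api_key):
    """Build an HTTP request for the /get-features endpoint."""
    payload = {
        "params": {
            "feature_service_name": feature_service_name,
            "join_key_map": join_key_map,
        }
    }
    return urllib.request.Request(
        url=f"{cluster_url}/api/v1/feature-service/get-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Tecton-key {api_key}"},
        method="POST",
    )


request = build_get_features_request(
    cluster_url="https://<your_cluster>.tecton.ai",
    feature_service_name="ad_ctr_feature_service",
    join_key_map={"ad_id": "5417", "user_id": "6c423390-9a64-52c8-9bb3-bbb108c74198"},
    api_key=os.environ.get("TECTON_API_KEY", ""),
)
# Sending it: response = json.loads(urllib.request.urlopen(request).read())
```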

Metadata options for the REST API

You can specify metadata_options to get additional relevant information about your feature vector.

  • include_names: the name of each feature in the vector
  • include_effective_times: timestamp of the most recent feature value that was written to the online store
  • include_types: the types of each feature in the vector
  • include_slo_info: information about the server response time

Example Request

$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
     -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "params": {
    "feature_service_name": "ad_ctr_feature_service",
    "join_key_map": {
      "ad_id": "5417",
      "user_id": "6c423390-9a64-52c8-9bb3-bbb108c74198"
    },
    "metadata_options": {
      "include_names": true,
      "include_effective_times": true,
      "include_types": true,
      "include_slo_info": true
    }
  }
}'

Example Response

{
  "result": {
    "features": [
      "3",
      "46",
      "118"
    ]
  },
  "metadata": {
    "features": [
      {
        "name": "user_impression_counts.impression_count_1h_1h",
        "effectiveTime": "2021-06-11T01:00:00Z",
        "type": "int64"
      },
      {
        "name": "user_impression_counts.impression_count_24h_1h",
        "effectiveTime": "2021-06-11T01:00:00Z",
        "type": "int64"
      },
      {
        "name": "user_impression_counts.impression_count_72h_1h",
        "effectiveTime": "2021-06-11T01:00:00Z",
        "type": "int64"
      }
    ],
    "sloInfo": {
      "sloEligible": true,
      "sloServerTimeSeconds": 0.036373672,
      "dynamodbResponseSizeBytes": 15023,
      "serverTimeSeconds": 0.042795387
    }
  }
}
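In the example response above, the entries in metadata.features parallel the values in result.features, so a client can zip the two lists to recover a name-to-value mapping. A minimal sketch, using a trimmed copy of that response:

```python
import json

# Trimmed version of the example response above.
response_body = """
{
  "result": {"features": ["3", "46", "118"]},
  "metadata": {
    "features": [
      {"name": "user_impression_counts.impression_count_1h_1h", "type": "int64"},
      {"name": "user_impression_counts.impression_count_24h_1h", "type": "int64"},
      {"name": "user_impression_counts.impression_count_72h_1h", "type": "int64"}
    ]
  }
}
"""

response = json.loads(response_body)
# Pair each value with its metadata entry; the two lists share an ordering.
feature_vector = {
    meta["name"]: value
    for meta, value in zip(response["metadata"]["features"],
                           response["result"]["features"])
}
print(feature_vector["user_impression_counts.impression_count_1h_1h"])  # prints 3
```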

Important

Tecton represents double feature values as JSON numbers and int64 feature values as JSON strings.

This is because JSON does not specify a precision for numerical values, and most JSON libraries treat all numerical values as double-precision floating point numbers. Representing int64 values as double-precision floating point numbers is problematic because not all values can be represented exactly.

As a result, Tecton serializes int64 values in the response body as strings, as seen in the example response above. Parse the string as a signed 64-bit integer in your client application to maintain full precision.
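On the client side, the string-encoded values can be converted back using the type reported in the response metadata. A minimal sketch (Python ints are arbitrary precision, so a plain int() conversion is lossless):

```python
def decode_feature(value, feature_type):
    """Convert a JSON-decoded Tecton feature value to a native Python value.

    int64 values arrive as JSON strings to preserve full 64-bit precision.
    """
    if value is None:
        return None
    if feature_type == "int64":
        return int(value)
    return value


# Values and type from the example response above.
values = ["3", "46", "118"]
decoded = [decode_feature(v, "int64") for v in values]
print(decoded)  # prints [3, 46, 118]
```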

Using the Online Feature Retrieval SDK Interface

To fetch real-time data from a Feature Service using the Python SDK as a client, use the FeatureService.get_feature_vector() method.

import tecton

feature_service = tecton.get_feature_service("price_prediction_feature_service")

# Request features for a single (user, item) tuple
join_keys = { "user_id": "demo_user_123", "item_id": "demo_item_987" }

# Sample scoring code; assumes a pre-trained sklearn model ("model") whose
# expected feature order and types match the Feature Service's output
scoring_features = feature_service.get_feature_vector(join_keys=join_keys)
predicted_price = model.predict(scoring_features)

Using the Offline Feature Retrieval SDK Interface

Use the offline or batch interface for batch prediction jobs or to generate training datasets. To fetch a dataframe from a Feature Service with the Python SDK as a client, use the FeatureService.get_feature_dataframe() method.

To make a batch request, first create a context consisting of the join keys for prediction and the desired feature timestamps. Then, pass these events to the Feature Service method get_feature_dataframe().
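Conceptually, each row of the events dataframe (the "spine") carries the join keys for one prediction plus the timestamp at which feature values should be looked up. A minimal illustration of that shape in plain Python (column names are illustrative; in practice this would be a Spark dataframe like the one read below):

```python
from datetime import datetime

# Each row: join keys for one prediction plus the as-of timestamp.
sample_events = [
    {"user_id": "demo_user_123", "item_id": "demo_item_987",
     "timestamp": datetime(2021, 6, 10, 12, 0)},
    {"user_id": "demo_user_456", "item_id": "demo_item_654",
     "timestamp": datetime(2021, 6, 11, 9, 30)},
]
# As a Spark dataframe this would be, e.g.:
#   events = spark.createDataFrame(sample_events)
```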

events = spark.read.parquet('dbfs:/sample_events.pq')
display(events)

Events Table

Tecton then generates the feature values.

import tecton
feature_service = tecton.get_feature_service("price_prediction_feature_service")
result_spark_df = feature_service.get_feature_dataframe(events).to_spark()

Results Table

Using Feature Logging

Feature Services have the ability to continuously log online requests and feature vector responses as Tecton Datasets. These logged feature datasets can be used for auditing, analysis, training dataset generation, and spine creation.

Feature Logging Diagram

To enable feature logging on a FeatureService, add a LoggingConfig as in the example below, and optionally specify a sample rate. You can also set log_effective_times=True to log the feature timestamps from the Feature Store. As a reminder, Tecton always serves the latest stored feature values as of the time of the request.

Run tecton apply to apply your changes.

from tecton import FeatureService, LoggingConfig

ctr_prediction_service = FeatureService(
    name='ctr_prediction_service',
    features=[
        ad_ground_truth_ctr_performance_7_days,
        user_total_ad_frequency_counts
    ],
    logging=LoggingConfig(
        sample_rate=0.5,
        log_effective_times=False,
    )
)

This creates a new Tecton Dataset under the Datasets tab in the Web UI. New feature logs are appended to this dataset every 30 minutes. If the features in the Feature Service change, a new dataset version is created.

Logged Features

This dataset can be fetched in a notebook using the code snippet below.

import tecton
dataset = tecton.get_dataset('ctr_prediction_service.logged_requests.4')
display(dataset.to_spark())

Logged Features Dataset