Version: 0.8

tecton.FeatureService

Summary

A Tecton Feature Service.

In Tecton, a Feature Service exposes an API for accessing a set of FeatureViews.

Once deployed in production, each model has one associated Feature Service that serves the model its features. A Feature Service contains a list of the Feature Views associated with a model. It also includes user-provided metadata such as name, description, and owner that Tecton uses to organize feature data.

Example

from tecton import FeatureService, LoggingConfig

# Import the feature views declared in your feature repo directory
from feature_repo.features.feature_views import last_transaction_amount_sql, transaction_amount_is_high
...

# Declare the Feature Service
fraud_detection_feature_service = FeatureService(
    name='fraud_detection_feature_service',
    description='A FeatureService providing features for a model that predicts if a transaction is fraudulent.',
    features=[
        last_transaction_amount_sql,
        transaction_amount_is_high,
        ...
    ],
    # Log a sample of the feature requests sent to this service
    logging=LoggingConfig(
        sample_rate=0.5,
        log_effective_times=False,
    ),
)

Attributes

| Name | Data Type | Description |
| --- | --- | --- |
| created_at | Optional[str] | The time that this Tecton object was created or last updated. |
| defined_in | Optional[str] | The repo filename where this object was declared. |
| description | str | Returns the description of the Tecton object. |
| features | List[framework_feature_view.FeatureReference] | Returns the list of feature references included in this feature service. |
| feature_views | Set[framework_feature_view.FeatureView] | Returns the deduplicated set of feature views included in this feature service. Feature views may be included in feature service multiple times. |
| id | str | Returns the unique id of the Tecton object. |
| info | | |
| name | str | Returns the name of the Tecton object. |
| owner | Optional[str] | Returns the owner of the Tecton object. |
| tags | Dict[str, str] | Returns the tags of the Tecton object. |
| workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. |
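
The difference between features and feature_views can be illustrated with a small plain-Python sketch. The FeatureRef tuple and the view and feature names below are hypothetical stand-ins, not Tecton classes. The sketch only shows how a service that references one view twice ends up with more feature references than distinct views.

```python
from collections import namedtuple

# Hypothetical stand-in for a feature reference: a view name plus the
# subset of that view's features selected for the service.
FeatureRef = namedtuple("FeatureRef", ["feature_view", "features"])

references = [
    FeatureRef("user_transaction_counts", ("count_1d",)),
    FeatureRef("user_transaction_counts", ("count_30d",)),
    FeatureRef("transaction_amount_is_high", ("is_high",)),
]

# Analogue of `feature_views`: each view appears once, however many
# references point at it.
unique_views = {ref.feature_view for ref in references}

print(len(references))
print(sorted(unique_views))
```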

Methods

| Name | Description |
| --- | --- |
| __init__() | Instantiates a new FeatureService. |
| get_feature_columns() | Returns the list of all feature columns included in this feature service. |
| get_features_for_events(...) | Fetch a TectonDataFrame of feature values from this Feature Service. |
| get_historical_features(...) | Fetch a TectonDataFrame of feature values from this FeatureService. |
| get_online_features(...) | Returns a single Tecton tecton.FeatureVector from the Online Store. |
| query_features(...) | [Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index of the included FeatureViews. |
| summary() | Displays a human readable summary of this Feature Service. |
| validate() | Validate this Tecton object and its dependencies (if any). |

__init__(...)

Instantiates a new FeatureService.

Parameters

  • name (str) – A unique name for the Feature Service.

  • description (Optional[str]) – A human-readable description. (Default: None)

  • tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Object (key-value pairs of arbitrary metadata). (Default: None)

  • owner (Optional[str]) – Owner name (typically the email of the primary maintainer). (Default: None)

  • prevent_destroy (bool) – If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply. To remove or update this object, prevent_destroy must be first set to False via the same tecton apply or a separate tecton apply. prevent_destroy can be used to prevent accidental changes such as inadvertently deleting a Feature Service used in production or recreating a Feature View that triggers expensive rematerialization jobs. prevent_destroy also blocks changes to dependent Tecton objects that would trigger a recreate of the tagged object, e.g. if prevent_destroy is set on a Feature Service, that will also prevent deletions or re-creates of Feature Views used in that service. prevent_destroy is only enforced in live (i.e. non-dev) workspaces. (Default: False)

  • online_serving_enabled (bool) – If True, users can send realtime requests to this FeatureService, and only FeatureViews with online materialization enabled can be added to this FeatureService. (Default: True)

  • features (Optional[List[Union[FeatureReference, FeatureView]]]) – The list of FeatureView or FeatureReference that this FeatureService will serve. (Default: None)

  • logging (Optional[LoggingConfig]) – A configuration for logging feature requests sent to this Feature Service. (Default: None)

  • on_demand_environment (Optional[str]) – The environment in which all the on demand feature views for this feature service should be executed. Defaults to None, which means the on demand feature views are executed in the same environment as the feature server, without any resource isolation or dependencies. This may be preferred for low-latency feature services which do not have dependencies. Learn more about environments at Environments. (Default: None)

  • enable_online_caching (bool) – If True, the Feature Service will attempt to retrieve cached values for Feature Views that have caching set up. (Default: False)
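
The prevent_destroy rule described above can be sketched in plain Python. This is an illustration of the documented behavior under a simplified plan model, not Tecton's implementation: the blocked_changes helper, the action names, and the one-level dependency map are all assumptions made for the example.

```python
def blocked_changes(plan, protected, dependencies, live_workspace=True):
    """Return the planned changes that prevent_destroy would block.

    plan: list of (action, object_name) tuples, where action is
          "create", "update", "delete", or "recreate" (hypothetical names).
    protected: names of objects declared with prevent_destroy=True.
    dependencies: maps an object name to the names it directly depends on.
    """
    if not live_workspace:
        # The text above states that enforcement applies only to live workspaces.
        return []

    # A protected object also shields the objects it depends on
    # (only direct dependencies are considered in this sketch).
    shielded = set(protected)
    for name in protected:
        shielded.update(dependencies.get(name, ()))

    destructive = {"delete", "recreate"}
    return [(action, name) for action, name in plan
            if action in destructive and name in shielded]


plan = [
    ("recreate", "last_transaction_amount_sql"),
    ("update", "fraud_detection_feature_service"),
    ("delete", "unused_feature_view"),
]
deps = {"fraud_detection_feature_service": ["last_transaction_amount_sql"]}
blocked = blocked_changes(plan, {"fraud_detection_feature_service"}, deps)
print(blocked)
```

In this sketch the recreate of last_transaction_amount_sql is blocked because the protected Feature Service uses it, while the non-destructive update and the deletion of an unrelated view pass through.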

get_feature_columns()

Returns the list of all feature columns included in this feature service.

get_features_for_events(...)

Note: This method is functionally equivalent to get_historical_features(spine) and has been renamed in Tecton 0.8+ for clarity. get_historical_features() will be deprecated in a future release.

Fetch a TectonDataFrame of feature values from this Feature Service.

This method will return feature values for each row provided in the events DataFrame. The feature values returned by this method will respect the timestamp provided in the timestamp column of the events DataFrame.
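
Respecting the event timestamp amounts to a point-in-time lookup: for each event row, only feature values that existed before that row's timestamp are eligible. The following is a minimal stdlib sketch of that idea, not Tecton's implementation. The history list and the value_as_of helper are hypothetical, and treating a value stamped exactly at the event time as not yet visible is an assumption.

```python
from bisect import bisect_left
from datetime import datetime

# Hypothetical history of one feature for one user: (effective time, value),
# sorted by time.
history = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 1, 5), 25.0),
    (datetime(2024, 1, 9), 40.0),
]


def value_as_of(history, event_time):
    """Return the latest value recorded strictly before event_time, or None."""
    times = [t for t, _ in history]
    # bisect_left counts the entries whose time is earlier than event_time.
    idx = bisect_left(times, event_time)
    return history[idx - 1][1] if idx > 0 else None


events = [datetime(2024, 1, 3), datetime(2024, 1, 9), datetime(2023, 12, 31)]
results = [value_as_of(history, ts) for ts in events]
print(results)
```

An event earlier than every recorded value gets None in this sketch, which mirrors the idea that no feature value existed yet at that time.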

Parameters

  • events (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – A DataFrame of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between columns in the events DataFrame and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.

  • timestamp_key (str) – Name of the time column in the events DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the Feature Service strictly contains OnDemandFeatureViews with no feature view dependencies. (Default: None)

  • from_source (bool) – Whether feature values should be recomputed from the original data source. If None, feature values are fetched from the Offline Store for Feature Views that have offline materialization enabled and are otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to raise an error if any Feature Views are not materialized. (Default: None)

  • save (bool) – Whether to persist the DataFrame as a Dataset object. This parameter is not supported in Tecton on Snowflake. (Default: False)

  • save_as (str) – Name to save the DataFrame as. If unspecified and save=True, a name will be generated. This parameter is not supported in Tecton on Snowflake. (Default: None)

  • compute_mode (Union[str, tecton.ComputeMode, None]) – Compute mode to use to produce the DataFrame. Valid string values are "spark", "snowflake", "athena", and "rift".

Returns

A TectonDataFrame
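
The feature_view_name__feature_name labeling used in the returned DataFrame can be sketched as follows. The label_feature_columns helper and the feature names are hypothetical, and placing the event columns first is an assumption about column order rather than documented behavior.

```python
def label_feature_columns(event_columns, features_by_view):
    """Build output column names: event columns first, then one
    `<feature_view_name>__<feature_name>` column per feature."""
    labeled = [
        f"{view}__{feature}"
        for view, features in features_by_view.items()
        for feature in features
    ]
    return list(event_columns) + labeled


columns = label_feature_columns(
    ["user_id", "ad_id", "date"],
    {
        # Hypothetical feature names for the views in the example above.
        "last_transaction_amount_sql": ["amount"],
        "transaction_amount_is_high": ["transaction_amount_is_high"],
    },
)
print(columns)
```

The double underscore keeps a feature named amount distinguishable from an events column that is also named amount.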

Examples

A Feature Service fs that contains a Batch Feature View and a Stream Feature View with join keys user_id and ad_id.

  1. fs.get_features_for_events(events) where events=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetches historical features from the offline store for users 1, 2, and 3 for the timestamps and ad ids specified in the events DataFrame.

  2. fs.get_features_for_events(events, save_as='my_dataset') where events=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetches historical features from the offline store for users 1, 2, and 3 for the timestamps and ad ids specified in the events DataFrame. Saves the DataFrame as a Dataset named my_dataset.

  3. fs.get_features_for_events(events, timestamp_key='date_1') where events=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})

Fetches historical features from the offline store for users 1, 2, and 3 for the ad ids and the timestamps in the date_1 column of the events DataFrame.

A Feature Service fs_on_demand that contains only OnDemandFeatureViews and expects request time data for the key amount.

The request time data is defined in the feature definition as such:

request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

  1. fs_on_demand.get_features_for_events(events) where events=pandas.DataFrame({'amount': [30, 50, 10000]})

Fetches historical features from the offline store with request data inputs 30, 50, and 10000.

A Feature Service fs_all that contains feature views of all types with join key 'user_id' and expects request time data for the key amount.

  1. fs_all.get_features_for_events(events) where events=pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the events data.
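
The examples above differ mainly in which columns the events DataFrame must carry: join keys, request data keys, and a timestamp column where one applies. A minimal sketch of a pre-flight check follows, working on plain column names so that it runs without pandas. The missing_event_columns helper is hypothetical and is not part of the Tecton SDK.

```python
def missing_event_columns(columns, join_keys=(), request_keys=(), timestamp_key=None):
    """Return the required column names that are absent from `columns`, in order."""
    required = list(join_keys) + list(request_keys)
    if timestamp_key is not None:
        required.append(timestamp_key)
    present = set(columns)
    return [name for name in required if name not in present]


# fs: join keys user_id and ad_id, timestamps in "date".
fs_missing = missing_event_columns(
    ["user_id", "ad_id", "date"], ["user_id", "ad_id"], timestamp_key="date"
)
# fs_on_demand: request data only, so no timestamp column is checked.
on_demand_missing = missing_event_columns(["amount"], request_keys=["amount"])
# fs_all with the request data column forgotten.
fs_all_missing = missing_event_columns(["user_id", "date"], ["user_id"], ["amount"], "date")

print(fs_missing, on_demand_missing, fs_all_missing)
```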

get_historical_features(...)

Fetch a TectonDataFrame of feature values from this FeatureService.

This method will return feature values for each row provided in the spine DataFrame. The feature values returned by this method will respect the timestamp provided in the timestamp column of the spine DataFrame.

By default (i.e. from_source=None), this method fetches feature values from the Offline Store for Feature Views that have offline materialization enabled and otherwise computes feature values on the fly from raw data.
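
The three from_source settings documented for this method can be summarized as a small decision function. This is a sketch of the documented behavior for a single Feature View, not Tecton's code. The resolve_source helper, its return labels, and the error type are assumptions.

```python
def resolve_source(from_source, offline_materialized):
    """Decide where one Feature View's values come from.

    Returns "offline_store" or "raw_data"; raises ValueError when
    from_source=False is requested for a view that is not materialized.
    """
    if from_source is True:
        return "raw_data"  # forced recomputation from the data source
    if offline_materialized:
        return "offline_store"
    if from_source is False:
        raise ValueError("Feature View is not materialized offline")
    return "raw_data"  # from_source=None falls back to on-the-fly computation


for setting, materialized in [(None, True), (None, False), (True, True), (False, True)]:
    print(setting, materialized, resolve_source(setting, materialized))
```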

Parameters

  • spine (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – A DataFrame of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between spine columns and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.

  • timestamp_key (str) – Name of the time column in the spine DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the FeatureService strictly contains OnDemandFeatureViews with no feature view dependencies. (Default: None)

  • include_feature_view_timestamp_columns (bool) – Whether to include timestamp columns for each FeatureView in the FeatureService. (Default: False)

  • from_source (bool) – Whether feature values should be recomputed from the original data source. If None, feature values are fetched from the Offline Store for Feature Views that have offline materialization enabled and are otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to raise an error if any Feature Views are not materialized. (Default: None)

  • save (bool) – Whether to persist the DataFrame as a Dataset object. This parameter is not supported in Tecton on Snowflake. (Default: False)

  • save_as (str) – Name to save the DataFrame as. If unspecified and save=True, a name will be generated. This parameter is not supported in Tecton on Snowflake. (Default: None)

  • compute_mode (Union[str, tecton.ComputeMode, None]) – Compute mode to use to produce the data frame. Valid string values are "spark", "snowflake", "athena", and "rift".

Returns

A TectonDataFrame

Examples

A FeatureService fs that contains a BatchFeatureView and a StreamFeatureView with join keys user_id and ad_id.

  1. fs.get_historical_features(spine) where spine=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetch historical features from the offline store for users 1, 2, and 3 for the timestamps and ad ids specified in the spine.

  2. fs.get_historical_features(spine, save_as='my_dataset') where spine=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetch historical features from the offline store for users 1, 2, and 3 for the timestamps and ad ids specified in the spine. Save the DataFrame as a Dataset named my_dataset.

  3. fs.get_historical_features(spine, timestamp_key='date_1') where spine=pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})

Fetch historical features from the offline store for users 1, 2, and 3 for the ad ids and the timestamps in the date_1 column of the spine.

A FeatureService fs_on_demand that contains only OnDemandFeatureViews and expects request time data for the key amount.

The request time data is defined in the feature definition as such:

request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

  1. fs_on_demand.get_historical_features(spine) where spine=pandas.DataFrame({'amount': [30, 50, 10000]})

Fetch historical features from the offline store with request data inputs 30, 50, and 10000.

A FeatureService fs_all that contains feature views of all types with join key 'user_id' and expects request time data for the key amount.

  1. fs_all.get_historical_features(spine) where spine=pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})

Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the spine.

get_online_features(...)

Returns a single Tecton tecton.FeatureVector from the Online Store. At least one of join_keys or request_data is required.
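
The requirement that at least one of join_keys or request_data be supplied can be sketched as a simple argument check. The check_online_request helper is hypothetical and not part of the SDK. Treating an empty dictionary the same as a missing argument is an assumption made for this example.

```python
def check_online_request(join_keys=None, request_data=None):
    """Reject a request with neither argument; otherwise return both as dicts."""
    if not join_keys and not request_data:
        raise ValueError("At least one of join_keys or request_data is required.")
    return {
        "join_keys": dict(join_keys or {}),
        "request_data": dict(request_data or {}),
    }


request = check_online_request(join_keys={"user_id": 1}, request_data={"amount": 30})
print(request)
```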

Parameters

  • join_keys (Optional[Mapping[str, Union[int, int64, str, bytes]]]) – Join keys of the enclosed FeatureViews. (Default: None)

  • include_join_keys_in_response (bool) – Whether to include join keys as part of the response FeatureVector. (Default: False)

  • request_data (Optional[Mapping[str, Union[int, int64, str, bytes, float]]]) – Dictionary of request context values. Only applicable when the FeatureService contains OnDemandFeatureViews. (Default: None)

Returns

A tecton.FeatureVector of the results.

Examples

A FeatureService fs that contains a BatchFeatureView and a StreamFeatureView with join keys user_id and ad_id.

  1. fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'})

Fetch the latest features from the online store for user 1 and ad 'c234'.

  2. fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'}, include_join_keys_in_response=True)

Fetch the latest features from the online store for user 1 and ad id 'c234'. Include the join key information (user_id=1, ad_id='c234') in the returned FeatureVector.

A FeatureService fs_on_demand that contains only OnDemandFeatureViews and expects request time data for key amount.

The request time data is defined in the feature definition as such:

request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

  1. fs_on_demand.get_online_features(request_data={'amount': 30})

Fetch the latest features from the online store with amount = 30.

A FeatureService fs_all that contains feature views of all types with join key user_id and expects request time data for key amount.

  1. fs_all.get_online_features(join_keys={'user_id': 1}, request_data={'amount': 30})

Fetch the latest features from the online store for user 1 with amount = 30.

query_features(...)

[Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index of the included FeatureViews. Returns a TectonDataFrame of all matched records.

Parameters

  • join_keys (Mapping[str, Union[int, int64, str, bytes]]) – Query join keys, i.e., a union of join keys in the online_serving_index of all enclosed FeatureViews.

Returns

A TectonDataFrame
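
Querying with a partial set of join keys can be pictured as a filter that returns every stored record matching the keys that were supplied. The sketch below uses an in-memory list of dictionaries. The rows, the clicks_7d feature, and the query_rows helper are hypothetical, and the real lookup against the online store is not shown.

```python
# Hypothetical online rows keyed by (user_id, ad_id).
rows = [
    {"user_id": 1, "ad_id": "a234", "clicks_7d": 3},
    {"user_id": 1, "ad_id": "b256", "clicks_7d": 0},
    {"user_id": 2, "ad_id": "a234", "clicks_7d": 7},
]


def query_rows(rows, join_keys):
    """Return every row whose values match all of the supplied join keys."""
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in join_keys.items())
    ]


# Only user_id is supplied, so every ad for that user matches.
matches = query_rows(rows, {"user_id": 1})
print([m["ad_id"] for m in matches])
```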

summary()

Displays a human readable summary of this Feature Service.

validate()

Validate this Tecton object and its dependencies (if any).

Validation performs most of the same checks and operations as tecton plan.

  1. Check for invalid object configurations, e.g. setting conflicting fields.

  2. For Data Sources and Feature Views, test query code and derive schemas, e.g. test that a Data Source's specified S3 path exists or that a Feature View's SQL code executes and produces supported feature data types.

Objects already applied to Tecton do not need to be re-validated on retrieval (e.g. fv = tecton.get_workspace('prod').get_feature_view('my_fv')) since they have already been validated during tecton plan. Locally defined objects (e.g. my_ds = BatchSource(name="my_ds", ...)) may need to be validated before some of their methods can be called, e.g. my_feature_view.get_historical_features().
