FeatureService
Summary
In Tecton, a Feature Service exposes an API for accessing a set of FeatureViews.
Once deployed in production, each model has one associated Feature Service that serves the model its features. A Feature Service contains a list of the Feature Views associated with a model. It also includes user-provided metadata such as name, description, and owner that Tecton uses to organize feature data.
Example
from tecton import FeatureService, LoggingConfig
# Import your feature views declared in your feature repo directory
from feature_repo.features.feature_views import last_transaction_amount_sql, transaction_amount_is_high
...
# Declare Feature Service
fraud_detection_feature_service = FeatureService(
    name='fraud_detection_feature_service',
    description='A FeatureService providing features for a model that predicts if a transaction is fraudulent.',
    features=[
        last_transaction_amount_sql,
        transaction_amount_is_high,
        ...
    ],
    logging=LoggingConfig(
        sample_rate=0.5,
        log_effective_times=False,
    ),
)
Attributes
| Name | Data Type | Description |
|---|---|---|
| created_at | Optional[datetime.datetime] | Returns the time that this Tecton object was created or last updated. None for locally defined objects. |
| defined_in | Optional[str] | The repo filename where this object was declared. None for locally defined objects. |
| description | str | Description of the Feature Service. |
| enable_online_caching | bool | Whether Online Caching is enabled for this Feature Service. |
| feature_metadata | List[FeatureMetadata] | Returns the list of all feature columns included in this feature service as well as associated metadata including data type of the feature, user-defined descriptions, and user-defined tags. |
| feature_views | Set[FeatureView] | Returns the set of Feature Views directly depended on by this Feature Service. A single Feature View may be included multiple times in a Feature Service under different namespaces. See the FeatureReference documentation. This method dedupes those Feature Views. |
| features | List[FeatureReference] | Returns the list of feature references included in this Feature Service. FeatureReferences are references to Feature Views/Tables that may select a subset of features, override the Feature View/Table namespace, or re-map join-keys. |
| id | str | Returns the unique id of the Tecton object. |
| info | | |
| name | str | Name of the Feature Service. |
| online_serving_enabled | bool | Whether Online Serving is enabled for this Feature Service. |
| owner | str | Owner of the Feature Service. |
| prevent_destroy | bool | Whether this Feature Service will be blocked from being deleted or re-created during tecton plan/apply. |
| tags | Dict[str, str] | Returns the tags of the Tecton object. |
| workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. None for locally defined objects. |
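The difference between the features and feature_views attributes can be pictured with a small stand-alone sketch. It does not use the Tecton SDK: Ref is a hypothetical stand-in for a FeatureReference, and the view names and namespaces are invented for illustration only.

```python
from typing import List, NamedTuple, Set


class Ref(NamedTuple):
    """Hypothetical stand-in for a FeatureReference (not a Tecton class)."""
    feature_view: str  # name of the referenced Feature View
    namespace: str     # namespace the view's features are exposed under


def distinct_feature_views(refs: List[Ref]) -> Set[str]:
    """Collapse a list of references to the set of distinct Feature View names."""
    return {ref.feature_view for ref in refs}


# The same (invented) view is referenced twice under different namespaces.
refs = [
    Ref("user_transaction_counts", "sender"),
    Ref("user_transaction_counts", "recipient"),
    Ref("transaction_amount_is_high", "transaction_amount_is_high"),
]

views = distinct_feature_views(refs)
print(f"{len(refs)} references, {len(views)} distinct feature views")
```

In this picture, features corresponds to the full list of references, while feature_views corresponds to the deduplicated set.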
Methods
| Name | Description |
|---|---|
| __init__(...) | Instantiates a new FeatureService. |
| get_feature_columns() | Returns the list of all feature columns included in this feature service. |
| get_features_for_events(...) | Fetch a TectonDataFrame of feature values from this FeatureService. |
| get_historical_features(...) | [Deprecated in SDK 0.9] Fetch a TectonDataFrame of feature values from this FeatureService. |
| get_job(...) | Retrieves data about the specified job. |
| get_online_features(...) | Returns a single FeatureVector from the Online Store. |
| list_jobs() | Retrieves the list of dataset jobs created for this feature service. |
| query_features(...) | [Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index. |
| summary() | Displays a human-readable summary. |
| validate() | [Deprecated in SDK 1.0] Method is deprecated and will be removed in a future version. As of Tecton version 1.0, objects are validated upon object creation, so validation is unnecessary. |
__init__(...)
Instantiates a new FeatureService.
Parameters
- name: str
  A unique name for the Feature Service.
- description: Optional[str] = None
  A human-readable description.
- tags: Optional[Dict[str, str]] = None
  Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).
- owner: Optional[str] = None
  Owner name (typically the email of the primary maintainer).
- prevent_destroy: bool = False
  If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply. To remove or update this object, prevent_destroy must be set to False via the same tecton apply or a separate tecton apply. prevent_destroy can be used to prevent accidental changes such as inadvertently deleting a Feature Service used in production or recreating a Feature View that triggers expensive rematerialization jobs. prevent_destroy also blocks changes to dependent Tecton objects that would trigger a recreate of the tagged object, e.g. if prevent_destroy is set on a Feature Service, that will also prevent deletions or re-creates of Feature Views used in that service. prevent_destroy is only enforced in live (i.e. non-dev) workspaces.
- online_serving_enabled: bool = True
  If True, users can send realtime requests to this FeatureService, and only FeatureViews with online materialization enabled can be added to this FeatureService.
- features: Optional[List[Union[FeatureReference, FeatureView]]] = None
  The list of FeatureView or FeatureReference that this FeatureService will serve.
- logging: Optional[LoggingConfig] = None
  A configuration for logging feature requests sent to this Feature Service.
- on_demand_environment: Optional[str] = None
  (Deprecated) Renamed to realtime_environment.
- realtime_environment: Optional[str] = None
  The environment in which all the Realtime Feature Views for this feature service should be executed. Defaults to None, which means the Realtime Feature Views are executed in the same environment as the feature service, without any resource isolation. This may be preferred for low-latency feature services which do not have dependencies. Learn more about environments at https://docs.tecton.ai/docs/defining-features/feature-views/realtime-feature-view/realtime-feature-view-environments.
- options: Optional[Dict[str, str]] = None
  Additional options to configure the Feature Service. Used for advanced use cases and beta features.
- enable_online_caching: bool = False
  If True, the feature server will read and write feature values to the online serving cache for feature views and tables that have caching enabled (have cache_config set).
- transform_server_group: Optional[TransformServerGroup] = None
  Optional, the Transform Server Group used for executing all Realtime Feature Views in the Feature Service. Defaults to None, which means the Realtime Feature Views are executed in the same environment as the feature service, without any resource isolation.
- feature_server_group: FeatureServerGroup = None
  Optional, the Feature Server Group used for online feature serving.
get_feature_columns()
Returns the list of all feature columns included in this feature service.
Returns
List[str]
get_features_for_events(...)
This method is functionally equivalent to get_historical_features(spine) and
has been renamed in Tecton 0.8+ for clarity. get_historical_features() will be
deprecated in a future release.
Fetch a TectonDataFrame of feature values from this FeatureService.
This method will return feature values for each row provided in the events DataFrame (the spine). The feature values returned by this method will respect the timestamp provided in the timestamp column of the events DataFrame.
By default (i.e. from_source=None), this method fetches feature values from the Offline Store for Feature Views that have offline materialization enabled and otherwise computes feature values on the fly from raw data.
Parameters
- events: Union[pyspark.sql.dataframe.DataFrame, pandas.core.frame.DataFrame, TectonDataFrame, str]
  A dataframe of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between the events DataFrame columns and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.
- timestamp_key: Optional[str] = None
  Name of the time column in the events DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the FeatureService strictly contains RealtimeFeatureViews with no feature view dependencies.
- from_source: Optional[bool] = None
  Whether feature values should be recomputed from the original data source. If None, feature values will be fetched from the Offline Store for Feature Views that have offline materialization enabled and otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to error if any Feature Views are not materialized. Defaults to None.
- compute_mode: Union[tecton_core.compute_mode.ComputeMode, str, NoneType] = None
  Compute mode to use to produce the data frame.
Returns
TectonDataFrame
Examples
A Feature Service fs that contains a Batch Feature View and Stream Feature View with join keys user_id and ad_id.
fs.get_features_for_events(events)
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the events DataFrame.
fs.get_features_for_events(events, save_as='my_dataset')
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the events DataFrame. Saves the DataFrame as a dataset with the name my_dataset.
fs.get_features_for_events(events, timestamp_key='date_1')
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})
Fetches historical features from the offline store for users 1, 2, and 3 for the ad ids and specified timestamps in the date_1 column in the events DataFrame.
A Feature Service fs_on_demand that contains only RealtimeFeatureViews and expects request time data for the key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_features_for_events(events)
where events = pandas.DataFrame({'amount': [30, 50, 10000]})
Fetches historical features from the offline store with request data inputs 30, 50, and 10000.
A Feature Service fs_all that contains feature views of all types with join key 'user_id' and expects request time data for the key amount.
fs_all.get_features_for_events(events)
where events = pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the events data.
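The three-way behavior of from_source described for get_features_for_events can be summarized in a short decision sketch. This is plain Python, not Tecton's implementation: plan_retrieval, the materialized mapping, and the string labels are hypothetical names used only to restate the documented rules.

```python
from typing import Dict, Optional


def plan_retrieval(materialized: Dict[str, bool], from_source: Optional[bool]) -> Dict[str, str]:
    """Decide, per Feature View, where offline feature values would come from.

    `materialized` maps a Feature View name to whether offline materialization
    is enabled for it (illustrative input, not data read from the SDK).
    """
    plan: Dict[str, str] = {}
    for view, is_materialized in materialized.items():
        if from_source is True:
            # Forced recomputation from raw data, regardless of materialization.
            plan[view] = "compute_from_source"
        elif is_materialized:
            # from_source is None or False and the view is materialized.
            plan[view] = "offline_store"
        elif from_source is False:
            # from_source=False errors if any Feature View is not materialized.
            raise ValueError(f"{view} is not materialized and from_source=False")
        else:
            # Default (None): fall back to computing on the fly.
            plan[view] = "compute_from_source"
    return plan


views = {"batch_fv": True, "stream_fv": False}
default_plan = plan_retrieval(views, None)
forced_plan = plan_retrieval(views, True)
print(default_plan)
print(forced_plan)
```

With the default, the materialized view is read from the Offline Store and the other is computed from raw data; passing from_source=False for the same input raises an error because stream_fv is not materialized.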
get_historical_features(...)
get_historical_features() is replaced by get_features_for_events() and get_features_in_range(). See Offline Retrieval Methods for details.
Fetch a TectonDataFrame of feature values from this FeatureService.
This method will return feature values for each row provided in the spine DataFrame. The feature values returned by this method will respect the timestamp provided in the timestamp column of the spine DataFrame.
By default (i.e. from_source=None), this method fetches feature values from the Offline Store for Feature Views that have offline materialization enabled and otherwise computes feature values on the fly from raw data.
Parameters
- spine: Union[pyspark.sql.dataframe.DataFrame, pandas.core.frame.DataFrame, TectonDataFrame, str]
  A dataframe of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between spine columns and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.
- timestamp_key: Optional[str] = None
  Name of the time column in the spine DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the FeatureService strictly contains RealtimeFeatureViews with no feature view dependencies.
- include_feature_view_timestamp_columns: bool = False
  Whether to include timestamp columns for each FeatureView in the FeatureService. Default is False.
- from_source: Optional[bool] = None
  Whether feature values should be recomputed from the original data source. If None, feature values will be fetched from the Offline Store for Feature Views that have offline materialization enabled and otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to error if any Feature Views are not materialized. Defaults to None.
- compute_mode: Union[tecton_core.compute_mode.ComputeMode, str, NoneType] = None
  Compute mode to use to produce the data frame.
Returns
TectonDataFrame
Examples
A FeatureService fs that contains a BatchFeatureView and StreamFeatureView with join keys user_id and ad_id.
fs.get_historical_features(spine)
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the spine.
fs.get_historical_features(spine, save_as='my_dataset')
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the spine. Save the DataFrame as a dataset with the name my_dataset.
fs.get_historical_features(spine, timestamp_key='date_1')
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})
Fetch historical features from the offline store for users 1, 2, and 3 for the ad ids and specified timestamps in the date_1 column in the spine.
A FeatureService fs_on_demand that contains only RealtimeFeatureViews and expects request time data for the key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_historical_features(spine)
where spine = pandas.DataFrame({'amount': [30, 50, 10000]})
Fetch historical features from the offline store with request data inputs 30, 50, and 10000.
A FeatureService fs_all that contains feature views of all types with join key 'user_id' and expects request time data for the key amount.
fs_all.get_historical_features(spine)
where spine = pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the spine.
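To make the timestamp semantics of offline retrieval concrete, the following stand-alone sketch mimics a point-in-time lookup over an in-memory history and names the output column in the feature_view_name__feature_name style. It is not how Tecton computes features. The history data, the feature name amount, and both helper functions are invented for illustration, and treating "before" as strictly earlier than the spine timestamp is an assumption about boundary behavior.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical feature history for one Feature View:
# join key -> list of (timestamp, value) pairs sorted by timestamp.
History = Dict[Any, List[Tuple[datetime, float]]]


def latest_before(history: History, key: Any, ts: datetime) -> Optional[float]:
    """Return the most recent value recorded strictly before `ts`, if any."""
    rows = history.get(key, [])
    idx = bisect_left([t for t, _ in rows], ts)  # first index with timestamp >= ts
    return rows[idx - 1][1] if idx > 0 else None


def join_features(
    spine: List[Dict[str, Any]],
    timestamp_key: str,
    join_key: str,
    view_name: str,
    feature_name: str,
    history: History,
) -> List[Dict[str, Any]]:
    """Append one feature column, named <view>__<feature>, to each spine row."""
    column = f"{view_name}__{feature_name}"
    return [
        {**row, column: latest_before(history, row[join_key], row[timestamp_key])}
        for row in spine
    ]


history = {1: [(datetime(2023, 1, 1), 10.0), (datetime(2023, 1, 5), 25.0)]}
spine = [
    {"user_id": 1, "date": datetime(2023, 1, 3)},
    {"user_id": 1, "date": datetime(2023, 1, 9)},
    {"user_id": 2, "date": datetime(2023, 1, 9)},
]
result = join_features(spine, "date", "user_id", "last_transaction_amount_sql", "amount", history)
for row in result:
    print(row)
```

Each spine row receives the value that was current at its own timestamp rather than the newest value overall, and a key with no history yields None.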
get_job(...)
Retrieves data about the specified job.
Parameters
- job_id: str
  ID string of the job.
Returns
TectonJob: JobData object for the job.
get_online_features(...)
Returns a single FeatureVector from the Online Store.
Parameters
- join_keys: Optional[Mapping[str, Union[int, numpy.int64, str, bytes]]] = None
  Join keys of the enclosed FeatureViews.
- include_join_keys_in_response: bool = False
  Whether to include join keys as part of the response FeatureVector.
- request_data: Optional[Mapping[str, Union[int, numpy.int64, str, bytes, float]]] = None
  Dictionary of request context values. Only applicable when the FeatureService contains RealtimeFeatureViews.
Returns
FeatureVector: FeatureVector of the results.
Examples
A FeatureService fs that contains a BatchFeatureView and StreamFeatureView with join keys user_id and ad_id.
- fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'})
  Fetch the latest features from the online store for user 1 and ad 'c234'.
- fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'}, include_join_keys_in_response=True)
  Fetch the latest features from the online store for user 1 and ad id 'c234'. Include the join key information (user_id=1, ad_id='c234') in the returned FeatureVector.
A FeatureService fs_on_demand that contains only RealtimeFeatureViews and expects request time data for key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_online_features(request_data={'amount': 30})
Fetch the latest features from the online store with amount = 30.
A FeatureService fs_all that contains feature views of all types with join key user_id and expects request time data for key amount.
fs_all.get_online_features(join_keys={'user_id': 1}, request_data={'amount': 30})
Fetch the latest features from the online store for user 1 with amount = 30.
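As a rough mental model of how join_keys, request_data, and include_join_keys_in_response combine, here is a toy in-memory version of an online lookup. It does not call the Tecton SDK or reflect its internals. ONLINE_STORE, toy_get_online_features, the feature names, and the 1000 threshold are all invented, and a plain dict stands in for a FeatureVector.

```python
from typing import Any, Dict, Mapping, Optional

# Toy online store: latest stored feature values keyed by user_id (invented data).
ONLINE_STORE: Dict[int, Dict[str, Any]] = {
    1: {"transaction_count_7d": 4},
}


def toy_get_online_features(
    join_keys: Optional[Mapping[str, Any]] = None,
    include_join_keys_in_response: bool = False,
    request_data: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a feature-vector-like dict from stored and request-time values."""
    vector: Dict[str, Any] = {}
    if include_join_keys_in_response and join_keys:
        # Echo the join keys back alongside the features.
        vector.update(join_keys)
    if join_keys:
        # Stored (batch/stream) features are looked up by join key.
        vector.update(ONLINE_STORE.get(join_keys["user_id"], {}))
    if request_data:
        # A realtime feature is computed from the request payload itself.
        vector["amount_is_high"] = request_data["amount"] > 1000
    return vector


combined = toy_get_online_features(join_keys={"user_id": 1}, request_data={"amount": 30})
with_keys = toy_get_online_features(join_keys={"user_id": 1}, include_join_keys_in_response=True)
print(combined)
print(with_keys)
```

The sketch mirrors the three example shapes above: join keys only, request data only, and both together.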
list_jobs()
Retrieves the list of dataset jobs created for this feature service.
Returns
List[TectonJob]: List of JobData objects.
query_features(...)
[Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index of the included FeatureViews. Returns TectonDataFrame of all matched records.
Parameters
- join_keys: Mapping[str, Union[int, numpy.int64, str, bytes]]
  Query join keys, i.e., a union of join keys in the online_serving_index of all enclosed FeatureViews.
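The idea of querying with a partial set of join keys can be illustrated with a plain-Python filter. This is only a conceptual sketch and says nothing about how the online_serving_index is implemented. RECORDS, the clicks feature, and match_partial_keys are hypothetical.

```python
from typing import Any, Dict, List, Mapping

# Invented records, each carrying the full set of join keys plus a feature value.
RECORDS: List[Dict[str, Any]] = [
    {"user_id": 1, "ad_id": "a234", "clicks": 3},
    {"user_id": 1, "ad_id": "b256", "clicks": 0},
    {"user_id": 2, "ad_id": "a234", "clicks": 7},
]


def match_partial_keys(records: List[Dict[str, Any]], join_keys: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Return every record whose values agree with all supplied join keys."""
    return [
        rec for rec in records
        if all(rec.get(key) == value for key, value in join_keys.items())
    ]


# Supplying only user_id matches every ad recorded for that user.
matches = match_partial_keys(RECORDS, {"user_id": 1})
for rec in matches:
    print(rec)
```

Supplying fewer keys widens the match, which is why the method returns a collection of records rather than a single feature vector.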
Returns
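TectonDataFrame
summary()
Displays a human-readable summary.
(Deprecated) validate()
Deprecated in SDK 1.0. As of Tecton version 1.0, objects are validated upon object creation, so calling validate() is unnecessary.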