FeatureService
Summary
In Tecton, a Feature Service exposes an API for accessing a set of Feature Views.
Once deployed in production, each model has one associated Feature Service that serves the model its features. A Feature Service contains a list of the Feature Views associated with a model. It also includes user-provided metadata such as name, description, and owner that Tecton uses to organize feature data.
Example
from tecton import FeatureService, LoggingConfig
# Import your feature views declared in your feature repo directory
from feature_repo.features.feature_views import last_transaction_amount_sql, transaction_amount_is_high
...
# Declare Feature Service
fraud_detection_feature_service = FeatureService(
    name='fraud_detection_feature_service',
    description='A FeatureService providing features for a model that predicts if a transaction is fraudulent.',
    features=[
        last_transaction_amount_sql,
        transaction_amount_is_high,
        ...
    ],
    # Log a sample of the feature requests sent to this Feature Service
    logging=LoggingConfig(
        sample_rate=0.5,
        log_effective_times=False,
    ),
)
Attributes
Name | Data Type | Description |
---|---|---|
created_at | Optional[datetime.datetime] | Returns the time that this Tecton object was created or last updated. None for locally defined objects. |
defined_in | Optional[str] | The repo filename where this object was declared. None for locally defined objects. |
description | str | Description of the Feature Service. |
enable_online_caching | bool | Whether Online Caching is enabled for this Feature Service. |
feature_metadata | List[FeatureMetadata] | Returns the list of all feature columns included in this feature service as well as associated metadata including data type of the feature, user-defined descriptions, and user-defined tags. |
feature_views | Set[FeatureView] | Returns the set of Feature Views directly depended on by this Feature Service. A single Feature View may be included multiple times in a Feature Service under different namespaces (see the FeatureReference documentation); this attribute deduplicates those Feature Views. |
features | List[FeatureReference] | Returns the list of feature references included in this Feature Service. FeatureReferences are references to Feature Views/Tables that may select a subset of features, override the Feature View/Table namespace, or re-map join keys. |
id | str | Returns the unique id of the Tecton object. |
info | ||
name | str | Name of the Feature Service. |
online_serving_enabled | bool | Whether Online Serving is enabled for this Feature Service. |
owner | str | Owner of the Feature Service. |
prevent_destroy | bool | Whether this Feature Service will be blocked from being deleted or re-created during tecton plan/apply. |
tags | Dict[str, str] | Returns the tags of the Tecton object. |
workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. None for locally defined objects. |
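The difference between features (one entry per reference) and feature_views (deduplicated) can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written for this illustration, not Tecton types, and the sketch assumes that two references to the same Feature View differ only by namespace.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class FeatureViewStub:
    """Hypothetical stand-in for a Feature View; only the name matters here."""
    name: str


@dataclass(frozen=True)
class FeatureReferenceStub:
    """Hypothetical stand-in for a FeatureReference with an optional namespace."""
    feature_view: FeatureViewStub
    namespace: Optional[str] = None


def distinct_feature_views(references: List[FeatureReferenceStub]) -> Set[FeatureViewStub]:
    """Collapse a list of references into the set of Feature Views they point to."""
    return {ref.feature_view for ref in references}


user_stats = FeatureViewStub("user_stats")
merchant_stats = FeatureViewStub("merchant_stats")

# The same Feature View is referenced twice under different namespaces.
references = [
    FeatureReferenceStub(user_stats, namespace="sender"),
    FeatureReferenceStub(user_stats, namespace="recipient"),
    FeatureReferenceStub(merchant_stats),
]
views = distinct_feature_views(references)
```

Here references has three entries while views has two, which mirrors how the two attributes are described in the table above.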
Methods
Name | Description |
---|---|
__init__(...) | Instantiates a new FeatureService. |
get_feature_columns() | Returns the list of all feature columns included in this feature service. |
get_features_for_events(...) | Fetch a TectonDataFrame of feature values from this FeatureService. |
get_historical_features(...) | Fetch a TectonDataFrame of feature values from this FeatureService. |
get_job(...) | Retrieves data about the specified job. |
get_online_features(...) | Returns a single tecton.FeatureVector from the Online Store. |
list_jobs() | Retrieves the list of dataset jobs created for this feature service. |
query_features(...) | [Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index of the included FeatureViews. |
summary() | Displays a human-readable summary. |
validate() | Method is deprecated and will be removed in a future version. As of Tecton version 1.0, objects are validated upon object creation, so validation is unnecessary. |
__init__(...)
Instantiates a new FeatureService.
Parameters
name (str) - A unique name for the Feature Service.
description (Optional[str]) - A human-readable description. Default: None
tags (Optional[Dict[str, str]]) - Tags associated with this Tecton Object (key-value pairs of arbitrary metadata). Default: None
owner (Optional[str]) - Owner name (typically the email of the primary maintainer). Default: None
prevent_destroy (bool) - If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply. To remove or update this object, prevent_destroy must be set to False via the same tecton apply or a separate tecton apply. prevent_destroy can be used to prevent accidental changes such as inadvertently deleting a Feature Service used in production or recreating a Feature View that triggers expensive rematerialization jobs. prevent_destroy also blocks changes to dependent Tecton objects that would trigger a recreate of the tagged object, e.g. if prevent_destroy is set on a Feature Service, that will also prevent deletions or re-creates of Feature Views used in that service. prevent_destroy is only enforced in live (i.e. non-dev) workspaces. Default: False
online_serving_enabled (bool) - If True, users can send realtime requests to this FeatureService, and only FeatureViews with online materialization enabled can be added to this FeatureService. Default: True
features (Optional[List[Union[FeatureReference, FeatureView]]]) - The list of FeatureView or FeatureReference that this FeatureService will serve. Default: None
logging (Optional[LoggingConfig]) - A configuration for logging feature requests sent to this Feature Service. Default: None
on_demand_environment (Optional[str]) - (Deprecated) Renamed to realtime_environment. Default: None
realtime_environment (Optional[str]) - The environment in which all the Realtime Feature Views for this feature service should be executed. Defaults to None, which means the Realtime Feature Views are executed in the same environment as the feature service, without any resource isolation. This may be preferred for low-latency feature services which do not have dependencies. Learn more about environments at https://docs.tecton.ai/docs/defining-features/feature-views/realtime-feature-view/realtime-feature-view-environments. Default: None
options (Optional[Dict[str, str]]) - Additional options to configure the Feature Service. Used for advanced use cases and beta features. Default: None
enable_online_caching (bool) - If True, the feature server will read and write feature values to the online serving cache for feature views and tables that have caching enabled (have cache_config set). Default: False
transform_server_group (TransformServerGroup) - Optional, the Transform Server Group used for executing all Realtime Feature Views in the Feature Service. Default: None
feature_server_group (FeatureServerGroup) - Optional, the Feature Server Group used for online feature serving. Default: None
get_feature_columns()
Returns the list of all feature columns included in this feature service.
Returns
List[str]
get_features_for_events(...)
This method is functionally equivalent to get_historical_features(spine) and has been renamed in Tecton 0.8+ for clarity. get_historical_features() will be deprecated in a future release.
Fetch a TectonDataFrame of feature values from this FeatureService.
This method returns feature values for each row provided in the events DataFrame (the spine). The returned feature values respect the timestamp provided in the timestamp column of the events DataFrame.
By default (i.e. from_source=None), this method fetches feature values from the Offline Store for Feature Views that have offline materialization enabled and otherwise computes feature values on the fly from raw data.
Parameters
events (Union[pyspark.sql.dataframe.DataFrame, pandas.core.frame.DataFrame, TectonDataFrame, str]) - A dataframe of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between the event dataframe columns and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.
timestamp_key (Optional[str]) - Name of the time column in the events DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the FeatureService strictly contains RealtimeFeatureViews with no feature view dependencies. Default: None
from_source (Optional[bool]) - Whether feature values should be recomputed from the original data source. If None, feature values are fetched from the Offline Store for Feature Views that have offline materialization enabled and are otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to error if any Feature Views are not materialized. Default: None
compute_mode (Union[tecton_core.compute_mode.ComputeMode, str, NoneType]) - Compute mode to use to produce the data frame. Default: None
Returns
TectonDataFrame
Examples
A Feature Service fs that contains a Batch Feature View and Stream Feature View with join keys user_id and ad_id.
fs.get_features_for_events(events)
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the events DataFrame.
fs.get_features_for_events(events, save_as='my_dataset')
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the events DataFrame. Saves the DataFrame as a dataset with the name my_dataset.
fs.get_features_for_events(events, timestamp_key='date_1')
where events = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})
Fetches historical features from the offline store for users 1, 2, and 3 for the ad ids and specified timestamps in the date_1 column in the events DataFrame.
A Feature Service fs_on_demand that contains only RealtimeFeatureViews and expects request time data for the key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_features_for_events(events)
where events = pandas.DataFrame({'amount': [30, 50, 10000]})
Fetches historical features from the offline store with request data inputs 30, 50, and 10000.
A Feature Service fs_all that contains feature views of all types with join key user_id and expects request time data for the key amount.
fs_all.get_features_for_events(events)
where events = pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetches historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the events data.
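The timestamp semantics described for timestamp_key ("the latest features computed before the specified timestamps") amount to a point-in-time lookup. The following is a minimal, standard-library sketch of that idea and of the feature_view_name__feature_name column naming. It is not Tecton's implementation: the function names, the in-memory history structure, and the choice to treat "before" as strictly earlier are all assumptions made for illustration.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Any, Dict, List, Tuple

# Hypothetical feature history: per join key, (feature timestamp, value) pairs
# sorted by timestamp.
FeatureHistory = Dict[Any, List[Tuple[datetime, Any]]]


def latest_value_before(history: List[Tuple[datetime, Any]], event_time: datetime) -> Any:
    """Return the most recent value whose timestamp is strictly before event_time."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_left(timestamps, event_time)  # count of entries strictly earlier
    return history[idx - 1][1] if idx > 0 else None


def features_for_events(
    events: List[Dict[str, Any]],
    timestamp_key: str,
    join_key: str,
    feature_view_name: str,
    feature_name: str,
    history_by_key: FeatureHistory,
) -> List[Dict[str, Any]]:
    """Attach one feature column to each event row using a point-in-time lookup."""
    column = f"{feature_view_name}__{feature_name}"
    rows = []
    for event in events:
        history = history_by_key.get(event[join_key], [])
        value = latest_value_before(history, event[timestamp_key])
        rows.append({**event, column: value})
    return rows


history = {1: [(datetime(2024, 1, 1), 10.0), (datetime(2024, 1, 5), 25.0)]}
events = [
    {"user_id": 1, "date": datetime(2024, 1, 3)},
    {"user_id": 1, "date": datetime(2024, 1, 10)},
    {"user_id": 2, "date": datetime(2024, 1, 3)},
]
rows = features_for_events(events, "date", "user_id", "user_spend", "amount_sum", history)
```

Each output row keeps its event columns and gains a user_spend__amount_sum column. A row gets None when no feature value exists before its timestamp, so a value written after the event time never leaks into that row.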
get_historical_features(...)
Fetch a TectonDataFrame of feature values from this FeatureService.
This method returns feature values for each row provided in the spine DataFrame. The returned feature values respect the timestamp provided in the timestamp column of the spine DataFrame.
By default (i.e. from_source=None), this method fetches feature values from the Offline Store for Feature Views that have offline materialization enabled and otherwise computes feature values on the fly from raw data.
Parameters
spine (Union[pyspark.sql.dataframe.DataFrame, pandas.core.frame.DataFrame, TectonDataFrame, str]) - A dataframe of possible join keys, request data keys, and timestamps that specify which feature values to fetch. To distinguish between spine columns and feature columns, feature columns are labeled as feature_view_name__feature_name in the returned DataFrame.
timestamp_key (Optional[str]) - Name of the time column in the spine DataFrame. This method will fetch the latest features computed before the specified timestamps in this column. Not applicable if the FeatureService strictly contains RealtimeFeatureViews with no feature view dependencies. Default: None
include_feature_view_timestamp_columns (bool) - Whether to include timestamp columns for each FeatureView in the FeatureService. Default: False
from_source (Optional[bool]) - Whether feature values should be recomputed from the original data source. If None, feature values are fetched from the Offline Store for Feature Views that have offline materialization enabled and are otherwise computed on the fly from raw data. Use from_source=True to force computing from raw data and from_source=False to error if any Feature Views are not materialized. Default: None
compute_mode (Union[tecton_core.compute_mode.ComputeMode, str, NoneType]) - Compute mode to use to produce the data frame. Default: None
Returns
TectonDataFrame
Examples
A FeatureService fs that contains a BatchFeatureView and StreamFeatureView with join keys user_id and ad_id.
fs.get_historical_features(spine)
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the spine.
fs.get_historical_features(spine, save_as='my_dataset')
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and ad ids in the spine. Save the DataFrame as a dataset with the name my_dataset.
fs.get_historical_features(spine, timestamp_key='date_1')
where spine = pandas.DataFrame({'user_id': [1,2,3], 'ad_id': ['a234', 'b256', 'c9102'], 'date_1': [datetime(...), ...], 'date_2': [datetime(...), ...]})
Fetch historical features from the offline store for users 1, 2, and 3 for the ad ids and specified timestamps in the date_1 column in the spine.
A FeatureService fs_on_demand that contains only RealtimeFeatureViews and expects request time data for the key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_historical_features(spine)
where spine = pandas.DataFrame({'amount': [30, 50, 10000]})
Fetch historical features from the offline store with request data inputs 30, 50, and 10000.
A FeatureService fs_all that contains feature views of all types with join key user_id and expects request time data for the key amount.
fs_all.get_historical_features(spine)
where spine = pandas.DataFrame({'user_id': [1,2,3], 'amount': [30, 50, 10000], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and request data inputs in the spine.
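The three-way behavior of from_source (None, True, False) described in the parameter list can be summarized as a small decision function. This is a sketch of the documented rules only, not Tecton code: resolve_read_plan, the plan labels, and the dictionary describing which Feature Views are materialized offline are all hypothetical.

```python
from typing import Dict, Optional


def resolve_read_plan(
    offline_materialized: Dict[str, bool], from_source: Optional[bool]
) -> Dict[str, str]:
    """Decide, per Feature View, where feature values would be read from.

    offline_materialized maps a Feature View name to whether offline
    materialization is enabled for it (hypothetical input for this sketch).
    """
    plan = {}
    for name, materialized in offline_materialized.items():
        if from_source is True:
            # Forced recomputation, even for materialized Feature Views.
            plan[name] = "raw_data"
        elif materialized:
            # from_source is None or False and materialized data is available.
            plan[name] = "offline_store"
        elif from_source is False:
            # from_source=False errors if any Feature View is not materialized.
            raise ValueError(f"Feature View {name!r} is not materialized")
        else:
            # from_source=None falls back to computing on the fly.
            plan[name] = "raw_data"
    return plan


feature_views = {"user_spend": True, "ad_clicks": False}
default_plan = resolve_read_plan(feature_views, from_source=None)
forced_plan = resolve_read_plan(feature_views, from_source=True)
```

With the inputs above, the default plan mixes both sources, the forced plan reads everything from raw data, and from_source=False would raise because ad_clicks is not materialized.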
get_job(...)
Retrieves data about the specified job.
Parameters
job_id (str) - ID string of the job.
Returns
TectonJob: JobData object for the job.
get_online_features(...)
Returns a single tecton.FeatureVector from the Online Store.
Parameters
join_keys (Optional[Mapping[str, Union[int, numpy.int64, str, bytes]]]) - Join keys of the enclosed FeatureViews. Default: None
include_join_keys_in_response (bool) - Whether to include join keys as part of the response FeatureVector. Default: False
request_data (Optional[Mapping[str, Union[int, numpy.int64, str, bytes, float]]]) - Dictionary of request context values. Only applicable when the FeatureService contains RealtimeFeatureViews. Default: None
Returns
FeatureVector: tecton.FeatureVector of the results.
Examples
A FeatureService fs that contains a BatchFeatureView and StreamFeatureView with join keys user_id and ad_id.
- fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'})
  Fetch the latest features from the online store for user 1 and ad 'c234'.
- fs.get_online_features(join_keys={'user_id': 1, 'ad_id': 'c234'}, include_join_keys_in_response=True)
  Fetch the latest features from the online store for user 1 and ad id 'c234'. Include the join key information (user_id=1, ad_id='c234') in the returned FeatureVector.
A FeatureService fs_on_demand that contains only RealtimeFeatureViews and expects request time data for key amount.
The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField("amount", DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)
fs_on_demand.get_online_features(request_data={'amount': 30})
Fetch the latest features from the online store with amount = 30.
A FeatureService fs_all that contains feature views of all types with join key user_id and expects request time data for key amount.
fs_all.get_online_features(join_keys={'user_id': 1}, request_data={'amount': 30})
Fetch the latest features from the online store for user 1 with amount = 30.
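The type annotations on join_keys and request_data suggest a simple pre-flight check a caller might run before issuing a request. The sketch below is a hypothetical helper, not part of the Tecton SDK. To stay standard-library only it omits numpy.int64 from the allowed types, so a numpy.int64 value would be flagged here even though the documented signature accepts it.

```python
from typing import Any, List, Mapping, Optional

# Value types taken from the documented signature, minus numpy.int64 (assumption:
# omitted to avoid a numpy dependency in this sketch).
JOIN_KEY_TYPES = (int, str, bytes)
REQUEST_DATA_TYPES = (int, str, bytes, float)


def check_online_request(
    join_keys: Optional[Mapping[str, Any]] = None,
    request_data: Optional[Mapping[str, Any]] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look valid."""
    problems = []
    for key, value in (join_keys or {}).items():
        if not isinstance(value, JOIN_KEY_TYPES):
            problems.append(f"join key {key!r} has unsupported type {type(value).__name__}")
    for key, value in (request_data or {}).items():
        if not isinstance(value, REQUEST_DATA_TYPES):
            problems.append(f"request data {key!r} has unsupported type {type(value).__name__}")
    return problems


ok = check_online_request(join_keys={"user_id": 1, "ad_id": "c234"}, request_data={"amount": 30.0})
bad = check_online_request(join_keys={"user_id": 1.5}, request_data={"amount": [30]})
```

Note the asymmetry the signature implies: a float is acceptable as a request data value but not as a join key value.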
list_jobs()
Retrieves the list of dataset jobs created for this feature service.
Returns
List[TectonJob]: List of JobData objects.
query_features(...)
[Advanced Feature] Queries the FeatureService with a partial set of join_keys defined in the online_serving_index of the included FeatureViews. Returns a TectonDataFrame of all matched records.
Parameters
join_keys (Mapping[str, Union[int, numpy.int64, str, bytes]]) - Query join keys, i.e., a union of join keys in the online_serving_index of all enclosed FeatureViews.
Returns
TectonDataFrame
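Querying with a partial set of join keys means that every record matching the supplied keys is returned, whatever the values of the remaining keys. A minimal in-memory sketch of that matching rule follows. It says nothing about how Tecton stores or indexes data; query_partial and the sample records are hypothetical.

```python
from typing import Any, Dict, List, Mapping


def query_partial(records: List[Dict[str, Any]], join_keys: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Return every record whose values agree with all of the supplied join keys."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in join_keys.items())
    ]


# Hypothetical records keyed by (user_id, ad_id).
records = [
    {"user_id": 1, "ad_id": "a234", "clicks": 3},
    {"user_id": 1, "ad_id": "b256", "clicks": 7},
    {"user_id": 2, "ad_id": "a234", "clicks": 1},
]

# Supplying only user_id matches both ads seen by user 1.
matches = query_partial(records, {"user_id": 1})
```

Supplying both keys narrows the result to at most one record, which corresponds to an ordinary full-key lookup.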
summary()
Displays a human-readable summary.
(Deprecated) validate()
Method is deprecated and will be removed in a future version. As of Tecton version 1.0, objects are validated upon object creation, so validation is unnecessary.
Returns
None