Read Feature Data
Reading feature data from Tecton enables you to use engineered features in your machine learning applications and pipelines. This overview provides context on these use cases, outlines the methods available for reading features in each scenario, and points you to relevant documentation with implementation details and examples.
Testing
During feature development, you can interactively test new feature definitions in your notebook environment to ensure their accuracy. Then, to prevent future regressions, you can define unit tests. See Testing Features for details on interactive and unit testing for Feature Views.
Training
To generate training data from your Tecton feature store, you can read historical feature data using the Feature Services API via the get_historical_features() method. When calling this method, provide a "spine" DataFrame containing the keys and timestamps for the samples you want to include, and Tecton returns a DataFrame with the feature values joined on. These values are point-in-time correct: each row contains only feature data that was available as of that row's timestamp, so no future data leaks into the training set.
See Constructing Training Data for more details.
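To make "point-in-time correct" concrete, here is a minimal pure-Python sketch of the join semantics. It is not Tecton's implementation, and every name in it (point_in_time_join, the user_id key, the txn_count feature) is hypothetical. It also assumes that a feature value stamped exactly at the sample's timestamp counts as available. For each spine row, it picks the latest feature value whose timestamp is at or before the row's timestamp.

```python
from bisect import bisect_right
from collections import defaultdict


def point_in_time_join(spine, feature_rows, key="user_id", feature="txn_count"):
    """Attach to each spine row the latest feature value known at its timestamp.

    spine: list of dicts with `key` and "timestamp".
    feature_rows: list of dicts with `key`, "timestamp", and `feature`.
    Timestamps only need to be mutually comparable (ints, datetimes, ...).
    """
    # Group feature rows by key, sorted by timestamp, so we can binary search.
    history = defaultdict(list)
    for row in sorted(feature_rows, key=lambda r: r["timestamp"]):
        history[row[key]].append((row["timestamp"], row[feature]))

    joined = []
    for sample in spine:
        rows = history.get(sample[key], [])
        timestamps = [ts for ts, _ in rows]
        # Index of the first feature row strictly after the sample timestamp.
        idx = bisect_right(timestamps, sample["timestamp"])
        # Rows at idx and beyond lie in the future and are never used.
        value = rows[idx - 1][1] if idx > 0 else None
        joined.append({**sample, feature: value})
    return joined


# Hypothetical spine: the keys and timestamps of the training samples.
spine = [
    {"user_id": "u1", "timestamp": 5},
    {"user_id": "u1", "timestamp": 15},
    {"user_id": "u2", "timestamp": 1},
]
# Hypothetical feature history.
feature_rows = [
    {"user_id": "u1", "timestamp": 3, "txn_count": 2},
    {"user_id": "u1", "timestamp": 10, "txn_count": 7},
    {"user_id": "u2", "timestamp": 4, "txn_count": 1},
]
training_rows = point_in_time_join(spine, feature_rows)
```

The sample for u2 at timestamp 1 gets no value, because the only feature row for u2 was written later, at timestamp 4. Using it would leak future data into training.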
Inference
For online inference, you have a few options:
- Use the Tecton HTTP API to fetch single feature vectors at low latency.
- Use the Java Client Library, a wrapper for the HTTP API that handles best practices.
- Use the Python Client Library, a wrapper for the HTTP API that handles best practices.
- Subscribe your application to Feature View Output Streams to receive feature updates asynchronously.
For offline batch inference, read historical features as you would for training, using get_historical_features().
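As a rough illustration of the first online option, the sketch below builds (but does not send) a single feature-vector request using only the Python standard library. The endpoint path, the Tecton-key authorization scheme, and the payload field names are assumptions here and should be checked against the HTTP API reference. The cluster URL, API key, and feature service name are hypothetical placeholders.

```python
import json
import urllib.request


def build_get_features_request(cluster_url, api_key, workspace, feature_service, join_keys):
    """Build (but do not send) a POST request for a single feature vector."""
    # Assumed payload shape; confirm field names in the HTTP API reference.
    payload = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": feature_service,
            "join_key_map": join_keys,
        }
    }
    return urllib.request.Request(
        # Assumed endpoint path; confirm against the HTTP API reference.
        url=f"{cluster_url.rstrip('/')}/api/v1/feature-service/get-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme; confirm against the HTTP API reference.
            "Authorization": f"Tecton-key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_get_features_request(
    cluster_url="https://example.tecton.ai",  # hypothetical cluster URL
    api_key="MY_API_KEY",  # placeholder, not a real key
    workspace="prod",
    feature_service="fraud_detection_feature_service",  # hypothetical name
    join_keys={"user_id": "u1"},
)
# To actually send it: urllib.request.urlopen(request, timeout=2)
```

In a real service, the Java or Python Client Library is likely the better choice, since the page describes both as wrappers that handle best practices for you.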
Using Materialized Feature Data
When reading feature data using get_features_for_events(), get_features_in_range(), get_online_features(), or the GetFeatures endpoint of the HTTP API, materialized feature data is used if all of the following are true:
- Your feature service is running in a live workspace.
- The constituent feature views have the option offline=True (when using get_features_for_events() or get_features_in_range()) or online=True (when using get_online_features() or the GetFeatures endpoint of the HTTP API).
- (Applies to get_features_for_events() and get_features_in_range() only): You omitted the from_source option or set it to False.
Using get_online_features() is not recommended in production. It's much slower than the GetFeatures endpoint of the HTTP API, and is not designed for production workloads.
When reading feature data using get_features_for_events(), get_features_in_range(), or get_online_features(), materialized feature data is not used if any of the following are true:
- Your feature service is running in a development workspace.
- Any of the constituent feature views have the option offline=False (when using get_features_for_events() or get_features_in_range()) or online=False (when using get_online_features() or the GetFeatures endpoint of the HTTP API).
- (Applies to get_features_for_events() and get_features_in_range() only): You specified from_source=True.
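The two rule lists in this section are complements of each other, so they can be summarized as one predicate. The sketch below only restates the rules for illustration. It is not part of the Tecton SDK, and the function name, argument names, and the dict representation of a feature view are all hypothetical.

```python
OFFLINE_METHODS = {"get_features_for_events", "get_features_in_range"}
ONLINE_METHODS = {"get_online_features", "GetFeatures"}


def uses_materialized_data(method, live_workspace, feature_views, from_source=None):
    """Return True if a read would be served from materialized feature data.

    feature_views: list of dicts with boolean "offline" and "online" flags,
        one per constituent feature view (hypothetical representation).
    from_source: None means the option was omitted.
    """
    if method not in OFFLINE_METHODS | ONLINE_METHODS:
        raise ValueError(f"unknown read method: {method}")
    # Rule 1: the feature service must run in a live workspace.
    if not live_workspace:
        return False
    # Rule 2: every constituent feature view needs the matching flag set.
    flag = "offline" if method in OFFLINE_METHODS else "online"
    if not all(fv[flag] for fv in feature_views):
        return False
    # Rule 3: from_source=True bypasses materialized data (offline reads only).
    if method in OFFLINE_METHODS and from_source is True:
        return False
    return True


# Hypothetical feature service with two feature views.
views = [
    {"offline": True, "online": True},
    {"offline": True, "online": False},
]
offline_read = uses_materialized_data("get_features_for_events", True, views)
online_read = uses_materialized_data("GetFeatures", True, views)
forced_source = uses_materialized_data("get_features_for_events", True, views, from_source=True)
dev_workspace = uses_materialized_data("get_features_in_range", False, views)
```

Here only offline_read is True: the online read fails because one feature view has online=False, forced_source fails because from_source=True, and dev_workspace fails because the workspace is not live.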