Version: Beta 🚧

Read Feature Data

Reading feature data from Tecton enables you to use engineered features in your machine learning applications and pipelines. This overview provides context on these use cases, outlines the methods available for reading features in each scenario, and points you to relevant documentation with implementation details and examples.

Testing

During feature development, you can interactively test new feature definitions in your notebook environment to ensure their accuracy. Then, to prevent future regressions, you can define unit tests. See Testing Features for details on interactive and unit testing for Feature Views.

Training

To generate training data from your Tecton feature store, read historical feature data with the get_features_for_events() method of the Feature Services API. When calling this method, provide an "events" DataFrame containing the keys and timestamps for the samples you want to include, and Tecton returns a DataFrame with the feature values joined onto it. These values are point-in-time correct: each value reflects only data available at the corresponding event's timestamp, so no future data leaks into the training set.
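To make the idea of point-in-time correctness concrete, here is a minimal, library-free sketch of the lookup it implies. It is not Tecton's implementation: the feature_history data, the point_in_time_value helper, and the choice to include values stamped exactly at the event time are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one key: (effective timestamp, value),
# sorted by timestamp. Real feature data would come from the feature store.
feature_history = {
    "user_1": [
        (datetime(2024, 1, 1), 10),
        (datetime(2024, 1, 5), 25),
        (datetime(2024, 1, 9), 40),
    ],
}


def point_in_time_value(history, key, event_time):
    """Return the latest value at or before event_time, or None if there is none."""
    rows = history.get(key, [])
    timestamps = [ts for ts, _ in rows]
    # bisect_right counts the rows whose timestamp is <= event_time.
    idx = bisect_right(timestamps, event_time)
    return rows[idx - 1][1] if idx > 0 else None


# The "events" input: one (key, timestamp) pair per training sample.
events = [
    ("user_1", datetime(2024, 1, 3)),
    ("user_1", datetime(2024, 1, 9)),
    ("user_1", datetime(2023, 12, 31)),
]

# Join each event with the feature value that was valid at its timestamp.
training_rows = [
    (key, ts, point_in_time_value(feature_history, key, ts)) for key, ts in events
]
```

The event on January 3 picks up the value recorded on January 1 rather than the later ones, and the event that predates all history gets no value at all, which is the behavior that keeps future data out of a training set.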

See Constructing Training Data for more details.

Inference

For online inference, you have several options:

  • Use the Tecton HTTP API to fetch single feature vectors at low latency.
  • Use the Java Client Library, a wrapper for the HTTP API that applies best practices for you.
  • Use the Python Client Library, a wrapper for the HTTP API that applies best practices for you.
  • Subscribe your application to Feature View Output Streams to receive feature updates asynchronously.

For offline batch inference, read historical features with get_features_for_events(), as you would for training.
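As a rough illustration of the HTTP API option, the sketch below builds (but does not send) a request for a single feature vector using only the Python standard library. The endpoint path, the payload field names, the "Tecton-key" authorization scheme, and every value passed in (cluster URL, workspace, feature service, join key) are assumptions for illustration; confirm them against the HTTP API reference before relying on them.

```python
import json
import urllib.request


def build_get_features_request(base_url, api_key, workspace, feature_service, join_keys):
    """Build a POST request for one feature vector without sending it."""
    # Assumed payload shape; verify field names in the HTTP API reference.
    payload = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": feature_service,
            "join_key_map": join_keys,
        }
    }
    return urllib.request.Request(
        # Assumed endpoint path for the GetFeatures endpoint.
        url=f"{base_url}/api/v1/feature-service/get-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Tecton-key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All argument values below are hypothetical placeholders.
request = build_get_features_request(
    base_url="https://example.tecton.ai",
    api_key="my-api-key",
    workspace="prod",
    feature_service="fraud_detection_feature_service",
    join_keys={"user_id": "user_1"},
)

# To actually send it, something like the following would be used:
#   with urllib.request.urlopen(request, timeout=2) as response:
#       feature_vector = json.load(response)
```

In practice the Java or Python client libraries would handle this request construction for you.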

Using Materialized Feature Data

When reading feature data using get_features_for_events(), get_features_in_range(), get_online_features(), or the GetFeatures endpoint of the HTTP API, materialized feature data is used if all of the following are true:

  • Your feature service is running in a live workspace

  • The constituent feature views have the option offline=True (when using get_features_for_events() or get_features_in_range()) or online=True (when using get_online_features() or the GetFeatures endpoint of the HTTP API)

  • (Applies to get_features_for_events() and get_features_in_range() only): You omitted the from_source option or set it to False

Danger: Using get_online_features() is not recommended in production. It is much slower than the GetFeatures endpoint of the HTTP API and is not designed for production workloads.

When reading feature data using get_features_for_events(), get_features_in_range(), or get_online_features(), materialized feature data is not used if any of the following are true:

  • Your feature service is running in a development workspace

  • Any of the constituent feature views have the option offline=False (when using get_features_for_events() or get_features_in_range()) or online=False (when using get_online_features() or the GetFeatures endpoint of the HTTP API)

  • (Applies to get_features_for_events() and get_features_in_range() only): You specified from_source=True
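The conditions in this section can be summarized as a small decision function. This is only a sketch of the rules as stated here, not Tecton code: FeatureViewConfig, uses_materialized_data, and the method-name strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Read methods grouped by which store they would read from.
OFFLINE_METHODS = {"get_features_for_events", "get_features_in_range"}
ONLINE_METHODS = {"get_online_features", "GetFeatures"}


@dataclass
class FeatureViewConfig:
    """Hypothetical stand-in for a feature view's materialization flags."""
    offline: bool
    online: bool


def uses_materialized_data(
    method: str,
    live_workspace: bool,
    feature_views: Iterable[FeatureViewConfig],
    from_source: Optional[bool] = None,
) -> bool:
    """Return True if, per the rules above, materialized data would be used."""
    feature_views = list(feature_views)
    # Development workspaces never use materialized data.
    if not live_workspace:
        return False
    if method in OFFLINE_METHODS:
        # from_source must be omitted (None) or False, and every
        # constituent feature view needs offline=True.
        return not from_source and all(fv.offline for fv in feature_views)
    if method in ONLINE_METHODS:
        # from_source does not apply; every feature view needs online=True.
        return all(fv.online for fv in feature_views)
    raise ValueError(f"Unknown read method: {method}")


views = [FeatureViewConfig(offline=True, online=False)]
offline_read = uses_materialized_data("get_features_for_events", True, views)
online_read = uses_materialized_data("GetFeatures", True, views)
```

With the single feature view above, the offline read qualifies for materialized data while the online read does not, because the view has online=False.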
