Read Feature Data for Inference
This section covers how to read feature data from Tecton for model inference. There are four main methods:
- Using the Tecton HTTP API: This is the recommended method for reading feature data for online inference. The HTTP API provides low-latency reads of individual or batch sets of feature vectors. See Reading Online Features for Inference Using the HTTP API for details.
- Using the Java Client Library: This open-source library provides a convenient wrapper for the Tecton HTTP API. It handles best practices like request retries and response deserialization. See Reading Online Features for Inference using the Java Client for details.
- Using the Python Client Library: This open-source library provides a convenient wrapper for the Tecton HTTP API. It handles response deserialization and supports batch requests for efficient online feature retrieval. See Reading Online Features for Inference using the Python Client for details.
- Using Feature Output Streams: This method allows your application to subscribe to the outputs of streaming feature pipelines. As new events arrive, feature values are written to the stream. This method is designed for asynchronous predictions triggered by new data, and is currently available for Spark-based Feature Views. See Reading Online Features for Inference via a Feature Output Stream for details.
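As a minimal sketch of the first method, the snippet below assembles an HTTP API request with Python's standard library plus `requests`-style usage. The endpoint path, payload shape, cluster URL, workspace name, feature service name, and join keys are assumptions for illustration; consult the HTTP API reference for your deployment's exact contract.

```python
import json

def build_get_features_request(cluster_url, api_key, workspace, feature_service, join_keys):
    """Assemble the URL, headers, and JSON body for a get-features call.

    All parameter values passed in by the caller are deployment-specific;
    the shapes below mirror a typical get-features request.
    """
    url = f"{cluster_url}/api/v1/feature-service/get-features"
    headers = {
        "Authorization": f"Tecton-key {api_key}",  # service account API key
        "Content-Type": "application/json",
    }
    body = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": feature_service,
            "join_key_map": join_keys,  # entity keys identifying the feature row
        }
    }
    return url, headers, json.dumps(body)

# Hypothetical cluster, workspace, and service names for illustration.
url, headers, payload = build_get_features_request(
    "https://example.tecton.ai",
    "<API_KEY>",
    "prod",
    "fraud_detection_service",
    {"user_id": "u123"},
)
# Send with, e.g.: requests.post(url, headers=headers, data=payload)
```

Keeping request construction separate from the actual `POST` makes it easy to unit-test the payload and to add retries or timeouts around the network call.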
This section provides an overview of each method, details on implementation, code samples, and links to relevant API references. Reading feature data for inference unlocks model predictions in production using features engineered and stored in Tecton.
Aggregations during online feature read
This section explains how aggregations are performed when reading feature data from Tecton.
During materialization, Tecton keeps track of the latest timestamp that has been written to the online store for each feature view in an internal status table.
For a batch materialization job, the status table is updated when the job completes, even if no new rows were written.
For a stream materialization job, the status table is updated only when newer rows arrive, based on their timestamps.
When reading feature data from Tecton, all aggregations are performed relative to the latest timestamp written to the online store for the feature view, rather than relative to the current wall clock time. This means that retrieved feature values update at the same rate as newer stream values arrive (i.e. slowly-updating feature views have slowly-updating feature values).
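The anchoring behavior described above can be illustrated with a toy example: a one-hour count aggregation is computed relative to the latest timestamp written to the online store, not the current wall-clock time. The event data and function names below are invented for the example.

```python
from datetime import datetime, timedelta

# Invented event timestamps for one entity key.
events = [
    datetime(2024, 1, 1, 11, 10),
    datetime(2024, 1, 1, 11, 50),
    datetime(2024, 1, 1, 12, 5),
]

# Stand-in for the latest timestamp recorded in the internal status table.
latest_materialized = max(events)

def count_last_hour(events, anchor):
    """Count events in the one-hour window ending at the anchor timestamp."""
    return sum(1 for t in events if anchor - timedelta(hours=1) < t <= anchor)

# Anchored at the latest materialized timestamp, all three events fall
# inside the window.
print(count_last_hour(events, latest_materialized))  # → 3

# Anchored at a later wall-clock "now", the same feature would read 0,
# even though no new data has arrived -- which is why Tecton anchors at
# the latest materialized timestamp instead.
print(count_last_hour(events, datetime(2024, 1, 1, 15, 0)))  # → 0
```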
Server Groups
Server Groups provide a way to isolate and scale the serving infrastructure for different feature services. Each server group can be configured with a different desired number of nodes and autoscaling policies. There can be two different kinds of server groups:
- Feature Server Groups: These server groups are responsible for reading and serving feature vectors from the online store.
- Transform Server Groups: These server groups are responsible for running real-time feature view transformations.
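The two kinds of server groups can be modeled conceptually as below. This is not the Tecton SDK API; every name and field here is hypothetical, and the sketch only captures the idea that each group is isolated to one kind of work and carries its own autoscaling bounds.

```python
from dataclasses import dataclass

@dataclass
class ServerGroupConfig:
    """Hypothetical model of a server group's configuration."""
    name: str
    kind: str        # "feature" (serves vectors from the online store)
                     # or "transform" (runs real-time feature view transformations)
    min_nodes: int   # autoscaling lower bound
    max_nodes: int   # autoscaling upper bound

# One group per kind, each scaled independently of the other.
serving = ServerGroupConfig("prod-feature-serving", "feature", min_nodes=2, max_nodes=10)
transforms = ServerGroupConfig("prod-transforms", "transform", min_nodes=1, max_nodes=4)
```

Separating the two kinds lets a spike in real-time transformation load scale the transform group without over-provisioning the feature-serving group, and vice versa.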