Version: 0.8

Materialize Features

Materialization is an essential part of how Tecton manages the lifecycle of operational ML features. It is the process of precomputing feature data with a feature pipeline and then publishing the results to the Online Feature Store, the Offline Feature Store, or both.

The main objective of materialization is to enable quick feature retrieval during training and inference, thereby reducing latencies and improving the efficiency of machine learning applications.

Types of Materialization

Tecton handles backfill and steady-state materialization for batch and stream features based on your Feature View configuration.

Steady-state Materialization

Steady-state materialization is materialization performed on new data as it arrives. It runs continuously on every Feature View that has materialization enabled.

When a Feature View has materialization enabled, Tecton schedules steady-state materialization jobs on an ongoing basis to keep feature values fresh. The frequency of these jobs is controlled by the batch_schedule parameter.

If you use Delta for the offline store, Tecton also runs background maintenance tasks on a 7-day schedule. These tasks perform optimize and vacuum operations, which manage the files in your Delta tables to keep performance high.
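To make the role of batch_schedule concrete, the following is a minimal sketch of how a fixed schedule divides time into materialization windows. It is a simplified model for illustration only, not Tecton's scheduler: the function name due_windows and the idea of tracking the end of the last materialized window are assumptions introduced here.

```python
from datetime import datetime, timedelta


def due_windows(last_end, now, batch_schedule):
    """Return the (start, end) windows that have fully elapsed since last_end.

    Hypothetical helper: models one job per batch_schedule period.
    """
    windows = []
    start = last_end
    # A window is only due once its end time has passed.
    while start + batch_schedule <= now:
        windows.append((start, start + batch_schedule))
        start += batch_schedule
    return windows


batch_schedule = timedelta(days=1)   # same role as the batch_schedule parameter
last_end = datetime(2024, 1, 1)      # end of the last materialized window
now = datetime(2024, 1, 3, 12)       # current time in this example

windows = due_windows(last_end, now, batch_schedule)
for start, end in windows:
    print(start.date(), "->", end.date())
```

In this model, a shorter batch_schedule produces more frequent, smaller windows and therefore fresher feature values, at the cost of running more jobs.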

Backfill Materialization

Backfill refers to any materialization operations performed on data in the past. There are two Backfill operations.

The initial materialization of a Feature View is referred to as a bootstrap backfill. When materialization is first enabled for a Feature View, Tecton performs this bootstrap backfill, processing existing raw data into feature values. The amount of data materialized during the bootstrap is controlled by the feature_start_time parameter.
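As a rough illustration of how feature_start_time determines the size of a bootstrap backfill, the sketch below counts the whole batch_schedule periods between feature_start_time and the moment materialization is enabled. The function bootstrap_periods and the enabled_at argument are hypothetical names; how Tecton actually groups backfill periods into jobs is not described here.

```python
from datetime import datetime, timedelta


def bootstrap_periods(feature_start_time, enabled_at, batch_schedule):
    """Count whole batch_schedule periods between feature_start_time and enabled_at.

    Hypothetical helper: an earlier feature_start_time means more history
    to process during the bootstrap backfill.
    """
    if enabled_at <= feature_start_time:
        return 0
    # Dividing one timedelta by another with // yields an integer.
    return (enabled_at - feature_start_time) // batch_schedule


enabled_at = datetime(2024, 1, 1)

one_year = bootstrap_periods(datetime(2023, 1, 1), enabled_at, timedelta(days=1))
one_month = bootstrap_periods(datetime(2023, 12, 1), enabled_at, timedelta(days=1))
print(one_year, one_month)
```

Moving feature_start_time closer to the present reduces the amount of raw data the bootstrap has to process, which is the practical lever this parameter provides.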

Enabling Feature View materialization

Batch Feature Views and Stream Feature Views can enable materialization to the online store, the offline store, or both by setting online=True and/or offline=True in the Feature View decorator parameters.

On-Demand Feature Views cannot be materialized since they are calculated only at request-time.

Determining if materialized feature data is being used when reading feature data

When reading feature data using get_historical_features(), get_online_features(), or the GetFeatures endpoint of the HTTP API, materialized feature data is used if all of the following are true:

  • Your feature service is running in a live workspace

  • The constituent feature views have the option offline=True (when using get_historical_features()) or online=True (when using get_online_features() or the GetFeatures endpoint of the HTTP API)

  • (Applies to get_historical_features() only): You omitted the from_source option or set it to False

Danger: Using get_online_features() is not recommended in production. It is much slower than the GetFeatures endpoint of the HTTP API and is not designed for production workloads.

When reading feature data using get_historical_features() or get_online_features(), materialized feature data is not used if any of the following are true:

  • Your feature service is running in a development workspace

  • Any of the constituent feature views have the option offline=False (when using get_historical_features()) or online=False (when using get_online_features() or the GetFeatures endpoint of the HTTP API)

  • (Applies to get_historical_features() only): You specified from_source=True
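The two lists above amount to a single decision rule, which can be sketched in plain Python. This is an illustrative model only and does not call the Tecton SDK: the FeatureViewFlags class, the uses_materialized_data function, and the string names used for the read methods are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class FeatureViewFlags:
    """Hypothetical stand-in for a Feature View's online/offline settings."""
    online: bool
    offline: bool


def uses_materialized_data(method, live_workspace, feature_views, from_source=None):
    """Return True if a read would use materialized feature data.

    method is one of "get_historical_features", "get_online_features",
    or "GetFeatures" (the HTTP API endpoint).
    """
    if method not in ("get_historical_features", "get_online_features", "GetFeatures"):
        raise ValueError(f"unknown read method: {method}")
    # Development workspaces never read materialized data.
    if not live_workspace:
        return False
    if method == "get_historical_features":
        # from_source=True forces computation from the raw data source.
        if from_source:
            return False
        return all(fv.offline for fv in feature_views)
    # Online reads require every constituent Feature View to have online=True.
    return all(fv.online for fv in feature_views)


views = [FeatureViewFlags(online=True, offline=True),
         FeatureViewFlags(online=False, offline=True)]

print(uses_materialized_data("get_historical_features", True, views))
print(uses_materialized_data("GetFeatures", True, views))
```

Because every constituent Feature View must have the relevant flag set, a single Feature View with online=False is enough to rule out materialized data for online reads of the whole feature service.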

Monitoring

Tecton provides tools to monitor and debug production Feature Views via the Web UI, SDK, and CLI. More information on monitoring is available in Monitoring Materialization.
