Version: 1.2

Materialization

Overview

Materialization processes are run by Tecton to keep production features up-to-date. Monitoring these materialization processes helps ensure that feature pipelines are continuously delivering data to your models.

Tecton offers various tools to facilitate materialization monitoring, including dashboards in the Web UI, email alerts, and the Metrics API.

Where to View Job Statuses

To view the status of materialization jobs:

  1. Navigate to the Feature View details page in the Tecton Web UI
  2. Select the "Materialization" tab to see all jobs for that Feature View
  3. For currently running jobs, follow the details link from the jobs table to see detailed job information from your compute provider

Job States

State: Description

  • RUNNING: The task is running. New attempts may be created if errors are encountered.
  • DRAINING: The task is draining. Any running attempt will be cancelled. No new attempts will be scheduled.
  • MANUAL_RETRY: The terminated task was manually requested to be re-executed. The retry policy is reset.
  • MANUAL_CANCELLATION_REQUESTED: Task cancellation was requested by a user. No new attempts will be scheduled.
  • FAILURE: The task failed permanently. No new attempts will be made.
  • MANUALLY_CANCELLED: The task is cancelled. The scheduler will not attempt to fill the materialization gap.
  • DRAINED: Temporary managed job state in which Tecton will automatically decide next steps. No action required.
  • SUCCESS: The task completed successfully. Only applicable to batch materialization tasks.
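When automating monitoring around these states, it helps to separate states where Tecton will still act on the task from terminal ones. The sketch below groups the states from the table; the grouping is our reading of the state descriptions, not an official API:

```python
# States from the table above, grouped by whether Tecton will still act on
# the task. This grouping is our interpretation of the descriptions, not an
# official Tecton API.
ACTIVE_STATES = {"RUNNING", "DRAINING", "MANUAL_RETRY",
                 "MANUAL_CANCELLATION_REQUESTED", "DRAINED"}
TERMINAL_STATES = {"FAILURE", "MANUALLY_CANCELLED", "SUCCESS"}

def needs_attention(state: str) -> bool:
    """A job needs human attention if it terminated without success:
    FAILURE means no new attempts will be made, and MANUALLY_CANCELLED
    means the scheduler will not fill the materialization gap."""
    return state in TERMINAL_STATES and state != "SUCCESS"

print(needs_attention("FAILURE"))             # True
print(needs_attention("RUNNING"))             # False
print(needs_attention("MANUALLY_CANCELLED"))  # True
```

A check like this can drive a paging rule: alert only on FAILURE and MANUALLY_CANCELLED, since every other state either resolves itself or was user-initiated.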

Batch Materialization Jobs

Batch materialization jobs run on the cadence defined by the batch_schedule of a Batch or Stream Feature View. Each run processes batch data sources, computes feature values, and writes them to the online and offline stores; the same mechanism handles both initial backfills and ongoing incremental updates. You can learn more about materialization job scheduling behavior in the materialization documentation.

How Batch Jobs are Triggered and Retried

Batch jobs are automatically triggered based on the schedule defined in your Feature View. When a job fails, Tecton will automatically retry it according to the retry policy.
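The details of Tecton's retry policy are managed internally, but the general shape of such a policy is a bounded retry loop with increasing delays. A minimal sketch, with made-up attempt counts and delays:

```python
import time

def run_with_retries(job, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Illustrative retry loop: re-run a failed job with exponentially
    increasing delays between attempts. Tecton's real retry policy is
    internal; the parameters here are made-up examples."""
    for attempt in range(max_attempts):
        try:
            return job()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # corresponds to FAILURE: no new attempts will be made
            sleep(base_delay * 2 ** attempt)  # 60s, 120s, 240s, ...

# Example: a job that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return "materialized"

print(run_with_retries(flaky_job, sleep=lambda s: None))  # materialized
```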

The Online Store Write Rate chart under the Feature View Monitoring tab shows how many records are written to the Online Store per second. If you have an idea of the total number of records your job needs to output, viewing the writes per second can give you an idea of how long the job will take to complete.
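That back-of-the-envelope estimate is just remaining records divided by the observed write rate. A small sketch with made-up numbers:

```python
def estimated_seconds_remaining(total_records, records_written, writes_per_second):
    """Rough completion estimate from the Online Store Write Rate chart.
    Assumes the observed write rate stays roughly constant."""
    if writes_per_second <= 0:
        raise ValueError("write rate must be positive")
    return (total_records - records_written) / writes_per_second

# Hypothetical backfill: 90M records to write, chart shows ~25,000 writes/sec.
remaining = estimated_seconds_remaining(90_000_000, 0, 25_000)
print(remaining / 3600)  # 1.0 (hours)
```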

Streaming Materialization Jobs

For Stream Feature Views with Spark, Tecton orchestrates Spark Structured Streaming jobs to continuously update feature values when new data arrives. These jobs continuously process incoming stream data (from Kafka, Kinesis, or Push API), compute feature values, write them to the online/offline stores, and maintain fresh feature values with sub-second latency.

Metrics to Monitor Stream Health

Even if a stream job is running, it may be failing to produce up-to-date features. The Stream Feature View Monitoring tab contains several metrics to help assess the progress of your Stream Feature View.

These metrics are also available through the Metrics API, allowing you to create custom dashboards and alerts in your Application Performance Monitoring system.
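As a sketch of how these metrics might feed a custom alert, the thresholding logic below is plain Python; the metric samples would come from whatever you poll through the Metrics API, and the cadence and threshold values are assumptions, not part of the real API surface:

```python
def breaches_freshness_slo(samples, threshold_seconds, min_consecutive=3):
    """Fire only when the last `min_consecutive` samples of a staleness
    metric (e.g. processed event age, in seconds) all exceed the threshold,
    so a single noisy reading does not page anyone. `samples` is assumed to
    be ordered oldest to newest, polled at a fixed interval."""
    if len(samples) < min_consecutive:
        return False
    return all(s > threshold_seconds for s in samples[-min_consecutive:])

# Hypothetical samples polled once a minute, with a 300s freshness SLO.
print(breaches_freshness_slo([12, 15, 400, 14, 13], 300))     # False (one spike)
print(breaches_freshness_slo([12, 350, 400, 380, 390], 300))  # True (sustained)
```

Requiring several consecutive breaches trades a few minutes of detection latency for far fewer false alarms, which is usually the right trade for streaming pipelines that recover on their own.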

Processed Event Age

Processed Event Age is the key metric for understanding how up-to-date your features are. It measures the difference between the time the write to the online store completes and the timestamp of the event. This metric includes both upstream processing time and the time taken by Tecton to transform and persist the event.
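The computation itself is a simple timestamp difference. A minimal sketch with made-up timestamps:

```python
from datetime import datetime, timezone

def processed_event_age(event_time: datetime, write_completed_at: datetime) -> float:
    """Processed Event Age in seconds: time from the event's own timestamp to
    when the write to the online store completed. This window includes both
    upstream processing time and Tecton's transform-and-persist time."""
    return (write_completed_at - event_time).total_seconds()

event_time = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
write_time = datetime(2024, 5, 1, 12, 0, 4, tzinfo=timezone.utc)
print(processed_event_age(event_time, write_time))  # 4.0
```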

Input Rate

Input Rate shows the rate of messages read from the stream. It helps identify changes in the volume of records being emitted by the upstream data source.

Online Store Write Rate

Online Store Write Rate is the number of records being written to the Online Store as the output of the stream feature pipeline. This may be lower than the Input Rate due to:

  • Filtering logic in the Data Source post-processor or Feature View transformation logic
  • Multiple records for the same entity ID arriving in the same microbatch, causing events to be aggregated before write
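The second effect can be seen in a toy micro-batch: several input records for the same entity ID collapse into a single write. The records and the sum-style aggregation below are made-up stand-ins, not Tecton's actual write path:

```python
from collections import defaultdict

def writes_for_microbatch(records):
    """Records sharing an entity ID within one micro-batch are combined
    before writing, so the write count can be lower than the input count.
    Summing values here is a stand-in for the real aggregation logic."""
    per_entity = defaultdict(list)
    for entity_id, value in records:
        per_entity[entity_id].append(value)
    return {entity: sum(values) for entity, values in per_entity.items()}

batch = [("user_1", 5), ("user_1", 3), ("user_2", 7)]  # input: 3 records
writes = writes_for_microbatch(batch)                  # output: 2 writes
print(len(batch), len(writes))  # 3 2
```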

Average Serving Delay

Average serving delay measures the difference between the time Tecton received the get-features request and the event timestamp of the feature retrieved (for an aggregation, the most recent event/tile).

Micro-batch Processing Latency

Micro-batch processing latency shows the time between completed micro-batches. By default, this number should remain below 30 seconds, since Stream Feature View micro-batches are 30 seconds long. A sustained value above 30 seconds indicates that the stream processing job is under-resourced and will fall behind.

If using continuous processing, then micro-batch latency should be close to 0.
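The check reduces to comparing observed latency against the micro-batch interval. A minimal sketch, with the 30-second default from above:

```python
def microbatch_is_falling_behind(latency_seconds, batch_interval_seconds=30.0):
    """Stream Feature View micro-batches default to 30 seconds, so sustained
    latency above the interval means the job cannot keep up with the stream.
    For continuous processing, pass a near-zero expected interval instead."""
    return latency_seconds > batch_interval_seconds

print(microbatch_is_falling_behind(18.0))  # False: keeping up
print(microbatch_is_falling_behind(45.0))  # True: under-resourced
```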

Interpreting Stream Lag and Write Rate

If your Processed Event Age suddenly increases, one of two things is likely:

  • The stream processing is falling behind (look for increased microbatch processing latency)
  • Your upstream data source is outputting stale records (check input rate changes)

Feature Freshness measures how up-to-date the stream feature data is. If no new data is coming in on the stream, or the stream feature pipeline is falling behind, then the freshness measurement will increase.

Specifically, Online Serving Feature Freshness measures the most recent timestamp written to the Online Store. Because this metric is polled periodically, the value reported here may be higher than the true value.
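Freshness is the age of the most recent write at the moment the metric is polled, which is why the reported value can overstate true staleness. A minimal sketch with made-up timestamps:

```python
from datetime import datetime, timezone

def online_serving_freshness(most_recent_write_ts, now=None):
    """Feature freshness in seconds, based on the most recent timestamp
    written to the Online Store. Because the metric is polled periodically,
    the reported value can exceed the true staleness by up to one polling
    interval."""
    now = now or datetime.now(timezone.utc)
    return (now - most_recent_write_ts).total_seconds()

last_write = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
polled_at = datetime(2024, 5, 1, 12, 1, 30, tzinfo=timezone.utc)
print(online_serving_freshness(last_write, now=polled_at))  # 90.0
```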

Other Job Types

Deletion

  • Removes feature data from online/offline stores when features or data are deleted
  • Cleans up obsolete feature values after TTL expiration

Delta Maintenance

  • Performs periodic maintenance tasks on Delta tables in the offline store
  • Runs OPTIMIZE and VACUUM operations to manage file compaction and cleanup
  • Typically runs on a 7-day schedule

Ingest

  • Processes data pushed through the Stream Ingest API
  • Validates incoming data against schema
  • Writes records to online/offline stores

Feature Publish

  • Publishes materialized feature data to data warehouses for analysis
  • Makes historical feature data available for exploration and feature selection
  • Runs after successful materialization jobs

For more details, see the feature publish jobs documentation.

Dataset Generation

  • Creates training datasets by joining features with provided training examples
  • Ensures point-in-time correctness when retrieving historical feature values
  • Supports both offline batch and streaming features
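Point-in-time correctness means each training example only sees the feature value that existed at or before its own timestamp, never a later one. A toy sketch of the lookup rule (illustrative only, not Tecton's implementation):

```python
def point_in_time_value(feature_history, event_ts):
    """Return the most recent feature value with timestamp <= event_ts, or
    None if no value existed yet. feature_history is a list of
    (timestamp, value) pairs sorted ascending by timestamp."""
    value = None
    for ts, v in feature_history:
        if ts <= event_ts:
            value = v
        else:
            break
    return value

history = [(100, "v1"), (200, "v2"), (300, "v3")]  # toy integer timestamps
print(point_in_time_value(history, 250))  # v2 (using v3 would leak the future)
print(point_in_time_value(history, 50))   # None (feature did not exist yet)
```

Joining each training example against its as-of value this way is what prevents label leakage when generating historical training datasets.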

Integration Test

Stream

  • Validates streaming feature pipelines end-to-end
  • Tests stream processing, materialization and feature freshness
  • Runs as part of CI/CD

Batch

  • Validates batch feature pipelines end-to-end
  • Tests materialization, retrieval and correctness of batch features
  • Runs as part of CI/CD

For more information about integration testing, see the integration test documentation.

Compaction

  • Optimizes storage of aggregation features in the online store
  • Combines partial aggregates into fewer, more efficient tiles
  • Reduces storage costs and improves query performance
