Job Types
Materialization
Stream
- Continuously processes incoming stream data (from Kafka, Kinesis, or Push API) to compute feature values and write them to the online/offline stores
- Maintains fresh feature values with sub-second latency
Batch
- Runs on a schedule to process batch data sources and compute feature values
- Writes computed features to online/offline stores according to the defined `batch_schedule` (see the sketch after this list)
- Handles both initial backfills and ongoing updates
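For example, a batch feature view's job cadence comes directly from its definition. A minimal sketch, assuming a Spark-based Tecton feature repo; the source, entity, and transformation are illustrative, and exact decorator arguments vary by SDK version:

```python
from datetime import datetime, timedelta

from tecton import batch_feature_view

# `transactions` (a BatchSource) and `user` (an Entity) are assumed to be
# defined elsewhere in the feature repo.
from features.data_sources import transactions  # hypothetical module
from features.entities import user               # hypothetical module

@batch_feature_view(
    sources=[transactions],
    entities=[user],
    mode="spark_sql",
    online=True,                               # write to the online store
    offline=True,                              # write to the offline store
    batch_schedule=timedelta(days=1),          # one materialization job per day
    feature_start_time=datetime(2024, 1, 1),   # start of the backfill window
    ttl=timedelta(days=30),
    timestamp_field="timestamp",
)
def user_daily_spend(transactions):
    return f"""
        SELECT user_id, timestamp, SUM(amount) AS daily_spend
        FROM {transactions}
        GROUP BY user_id, timestamp
    """
```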
Materialization Job States
RUNNING: The task is running, meaning a materialization task attempt is executing or scheduled to execute. New attempts may be created if errors are encountered.
DRAINING: The task is draining. Any running attempt will be cancelled. No new MaterializationTaskAttempts will be scheduled from this task.
MANUAL_RETRY: A terminated task (SUCCESS / FAILURE) was manually requested to be re-executed. The retry policy is reset for the manually retried task, as if it had no prior attempts.
MANUAL_CANCELLATION_REQUESTED: Task cancellation has been requested by a user. Similar to DRAINING: no new MaterializationTaskAttempts will be scheduled from this task.
FAILURE: The task failed permanently. No new MaterializationTaskAttempts will be made from this task.
MANUALLY_CANCELLED: The task is cancelled. Similar to DRAINED, but the scheduler will not attempt to fill the resulting materialization gap.
DRAINED: A temporary, Tecton-managed state in which Tecton automatically decides the job's next steps. No action is required on the customer side.
SUCCESS: The task completed successfully. No new MaterializationTaskAttempts will be made from this task. Only applicable to batch materialization tasks.
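When scripting against these states, it helps to distinguish states in which the scheduler may still create attempts from terminal ones. The enum below simply mirrors the list above; it is plain Python, not a Tecton API:

```python
from enum import Enum

class MaterializationTaskState(Enum):
    RUNNING = "RUNNING"
    DRAINING = "DRAINING"
    MANUAL_RETRY = "MANUAL_RETRY"
    MANUAL_CANCELLATION_REQUESTED = "MANUAL_CANCELLATION_REQUESTED"
    FAILURE = "FAILURE"
    MANUALLY_CANCELLED = "MANUALLY_CANCELLED"
    DRAINED = "DRAINED"
    SUCCESS = "SUCCESS"

# Terminal states: no new MaterializationTaskAttempts will ever be made.
TERMINAL_STATES = {
    MaterializationTaskState.SUCCESS,
    MaterializationTaskState.FAILURE,
    MaterializationTaskState.MANUALLY_CANCELLED,
}

def may_create_attempts(state: MaterializationTaskState) -> bool:
    """True only for states in which new attempts can still be scheduled."""
    return state in (MaterializationTaskState.RUNNING,
                     MaterializationTaskState.MANUAL_RETRY)
```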
Deletion
- Removes feature data from online/offline stores when features or data are deleted
- Cleans up obsolete feature values after TTL expiration
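Conceptually, TTL cleanup drops any row whose feature timestamp has aged out of the serving window. A self-contained pandas sketch (column names are illustrative):

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

def expire_rows(features: pd.DataFrame, ttl: timedelta) -> pd.DataFrame:
    """Keep only rows still inside the TTL window, mirroring what a
    deletion job removes from the online/offline stores."""
    cutoff = datetime.now(timezone.utc) - ttl
    return features[features["timestamp"] >= cutoff]

df = pd.DataFrame({
    "user_id": ["a", "b"],
    "daily_spend": [12.5, 40.0],
    "timestamp": [
        datetime.now(timezone.utc) - timedelta(days=2),   # still fresh
        datetime.now(timezone.utc) - timedelta(days=45),  # past a 30-day TTL
    ],
})
fresh = expire_rows(df, ttl=timedelta(days=30))  # keeps only user "a"
```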
Delta Maintenance
- Performs periodic maintenance tasks on Delta tables in the offline store
- Runs OPTIMIZE and VACUUM operations to manage file compaction and cleanup
- Typically runs on a 7-day schedule
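OPTIMIZE and VACUUM are standard Delta Lake commands, so the equivalent manual operation looks like this in PySpark (the table path is illustrative; VACUUM's retention here matches the 7-day cadence):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # assumes Delta Lake is configured

table_path = "s3://my-bucket/offline-store/user_daily_spend"  # illustrative

# Compact many small files into fewer, larger ones.
spark.sql(f"OPTIMIZE delta.`{table_path}`")

# Delete files no longer referenced by the table, retaining 7 days of
# history (168 hours) for time travel and concurrent readers.
spark.sql(f"VACUUM delta.`{table_path}` RETAIN 168 HOURS")
```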
Ingest
- Processes data pushed through the Stream Ingest API
- Validates incoming records against the expected schema
- Writes records to online/offline stores
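Pushes are plain JSON over HTTPS. A hedged sketch using `requests`; the endpoint path, payload shape, workspace, and push source names are assumptions to be adapted from the Stream Ingest API reference:

```python
import requests

resp = requests.post(
    "https://<your-cluster>.tecton.ai/ingest",  # illustrative endpoint
    headers={"Authorization": "Tecton-key <API_KEY>"},
    json={
        "workspace_name": "prod",                  # hypothetical workspace
        "dry_run": False,
        "records": {
            "user_clicks_push_source": [           # hypothetical push source
                {
                    "record": {
                        "user_id": "a",
                        "clicks": 3,
                        "timestamp": "2024-01-01T00:00:00Z",
                    }
                }
            ]
        },
    },
)
resp.raise_for_status()  # raises on error responses, e.g. schema validation failures
```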
Feature Publish
- Publishes materialized feature data to data warehouses for analysis
- Makes historical feature data available for exploration and feature selection
- Runs after successful materialization jobs
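Once published, the data can be explored with plain SQL in the warehouse. An illustrative query using an in-memory SQLite stand-in for the warehouse connection (table and column names are assumptions):

```python
import sqlite3

import pandas as pd

# Stand-in for a real warehouse connection (Snowflake, BigQuery, Redshift, ...).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_daily_spend (user_id TEXT, daily_spend REAL, ts TEXT)")
conn.executemany(
    "INSERT INTO user_daily_spend VALUES (?, ?, ?)",
    [("a", 12.5, "2024-01-01"), ("b", 40.0, "2024-01-01")],
)

# Explore the published feature data, e.g. while assessing a candidate feature.
df = pd.read_sql("SELECT user_id, daily_spend, ts FROM user_daily_spend", conn)
print(df.describe(include="all"))
```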
Dataset Generation
- Creates training datasets by joining features with provided training examples
- Ensures point-in-time correctness when retrieving historical feature values
- Supports both offline batch and streaming features
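Point-in-time correctness means each training event sees only the most recent feature value at or before its own timestamp, never a later one. A pandas sketch of that as-of join (the data and column names are illustrative):

```python
import pandas as pd

# Historical feature values, as materialized to the offline store.
features = pd.DataFrame({
    "user_id": ["a", "b", "a"],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
    "daily_spend": [10.0, 7.5, 25.0],
})

# Training events (the "spine") with labels.
events = pd.DataFrame({
    "user_id": ["b", "a"],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-03"]),
    "label": [0, 1],
})

# For each event, take the latest feature row at or before the event time,
# so feature values from the future never leak into the training set.
training = pd.merge_asof(
    events.sort_values("timestamp"),
    features.sort_values("timestamp"),
    on="timestamp",
    by="user_id",
    direction="backward",
)
# User "b"'s event predates its only feature value, so it gets NaN --
# exactly the value that would have been served at that moment.
```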
Integration Test
Stream
- Validates streaming feature pipelines end-to-end
- Tests stream processing, materialization and feature freshness
- Runs as part of CI/CD
Batch
- Validates batch feature pipelines end-to-end
- Tests materialization, retrieval and correctness of batch features
- Runs as part of CI/CD
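In CI these checks typically reduce to materializing a fixed window with known inputs and asserting on the output. A pytest-style sketch; `materialize_window` is a hypothetical helper standing in for however your project executes a feature view against test data:

```python
from datetime import datetime

import pandas as pd

from my_project.testing import materialize_window  # hypothetical helper

def test_user_daily_spend_output():
    # Run the pipeline over a fixed historical window with known inputs.
    out: pd.DataFrame = materialize_window(
        feature_view="user_daily_spend",
        start=datetime(2024, 1, 1),
        end=datetime(2024, 1, 2),
    )
    # Schema, correctness, and uniqueness checks.
    assert {"user_id", "daily_spend", "timestamp"} <= set(out.columns)
    assert (out["daily_spend"] >= 0).all()
    assert not out.duplicated(subset=["user_id", "timestamp"]).any()
```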
Compaction
- Optimizes storage of aggregation features in the online store
- Combines partial aggregates into fewer, more efficient tiles
- Reduces storage costs and improves query performance
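Compaction works because partial aggregates such as count and sum combine associatively: many small tiles can be folded into one wider tile without recomputing from raw events. A plain-Python sketch of the idea:

```python
from dataclasses import dataclass

@dataclass
class Tile:
    """A partial aggregate over one time slice (illustrative structure)."""
    start: int    # slice start, e.g. epoch seconds
    end: int      # slice end
    count: int    # partial COUNT
    total: float  # partial SUM

def compact(tiles: list[Tile]) -> Tile:
    """Fold contiguous partial aggregates into a single wider tile.
    Fewer tiles means fewer reads per online aggregation query."""
    return Tile(
        start=min(t.start for t in tiles),
        end=max(t.end for t in tiles),
        count=sum(t.count for t in tiles),
        total=sum(t.total for t in tiles),
    )

hourly = [Tile(0, 3600, 4, 10.0), Tile(3600, 7200, 2, 5.5)]
daily = compact(hourly)  # one tile for both hours: count=6, total=15.5
```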