Version: 0.9

Suppress Recreates

The --suppress-recreates option for tecton apply enables users to override Tecton's default Object recreate behavior for certain kinds of modifications.

This page explains the recreate process and when it may be appropriate to use the --suppress-recreates option.

caution

Use the --suppress-recreates flag with caution. Only use the flag when you are confident that changes to a Tecton repo will not affect feature values in a way that disrupts your downstream consumers. Using the flag incorrectly can degrade model performance if the feature logic used for inference no longer matches the historical training data.

Only workspace owners are authorized to apply plans computed with --suppress-recreates.

Recreating Objects in Tecton

During tecton apply, Tecton attempts to identify modifications to business logic that could invalidate features for downstream consumers, in order to prevent potentially breaking changes to production systems. These modifications are highlighted as "recreating" the Tecton object in the plan output.

Recreating a Feature View with materialization enabled deletes any materialized data, and re-runs the backfill process from the feature start time. This process, called re-materialization, incurs cost and downtime.

Sometimes you know that the desired modification will not impact any downstream consumers, even though Tecton detects the state change as a recreate. In these scenarios, you can use the --suppress-recreates option during tecton plan or tecton apply to force Tecton to treat the modification as a simple update.

Supported use-cases for --suppress-recreates

Conceptually, --suppress-recreates can be used in scenarios where the upstream and downstream contracts of a feature pipeline are unchanged. For example, --suppress-recreates can be used when the transformation logic is modified, so long as the same schema is output.

The primary applications for --suppress-recreates are:

  1. Refactoring Python functions
  2. Upstream Data Source migrations

Unsupported use-cases

In some scenarios, Tecton will not allow the apply to proceed with --suppress-recreates. The best way to test whether your scenario is valid is to run tecton plan --suppress-recreates with the desired modification.

Recreates cannot be suppressed in any of the following cases; attempting to do so results in a plan failure:

  • Modification of the schema (such as adding a column or removing a column) of a Feature View.
  • Modification of the schema of the RequestSource object that is used in an On-Demand Feature View.
  • Some modifications of a Spark Stream Feature View with window aggregates, where the modification would invalidate the checkpoint.
  • Increasing the ttl duration for a Feature View
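The schema rules in the list above can be checked informally before running tecton plan --suppress-recreates. The following is a minimal sketch, not Tecton's own validation: the schema representation (a dict of column name to type name), the function schema_diff, and the column names are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical representation: column name -> type name.
Schema = Dict[str, str]


def schema_diff(old: Schema, new: Schema) -> List[str]:
    """Return human-readable differences between two schemas.

    An empty list means the schemas are equal, which is the precondition
    the page describes for suppressing a recreate.
    """
    problems = []
    for col in sorted(old.keys() - new.keys()):
        problems.append(f"removed column: {col}")
    for col in sorted(new.keys() - old.keys()):
        problems.append(f"added column: {col}")
    for col in sorted(old.keys() & new.keys()):
        if old[col] != new[col]:
            problems.append(f"type changed: {col} ({old[col]} -> {new[col]})")
    return problems


# Illustrative schemas; the column names are made up for this example.
old_schema = {"user_id": "string", "signup_ts": "timestamp", "age": "int64"}
new_schema = {**old_schema, "country": "string"}

diff = schema_diff(old_schema, new_schema)
for line in diff:
    print(line)
```

Any non-empty result corresponds to a schema modification, which the list above says cannot be suppressed. The plan output remains the authoritative check.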

Refactoring Python functions

If you are updating a Python function in a way that does not impact feature values, such as a refactor that adds comments or whitespace, you can use the --suppress-recreates flag with tecton apply and tecton plan to suppress rematerialization. The Python functions that can be changed, prior to using --suppress-recreates, are:

  1. The function referenced in the post_processor parameter of the batch_config or stream_config object (in 0.4 compat this is the raw_batch_translator or raw_stream_translator).

    Example plan output when refactoring a batch_config object's post_processor:

    ↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓

    ~ Update BatchDataSource
    name: users_batch
    owner: demo-user@tecton.ai
    hive_ds_config.common_args.post_processor.body:

    @@ -1,4 +1,5 @@
     def post_processor(df):
    +    # drop geo location columns
         return df \
             .drop('lat') \
             .drop('long')

    ↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑

    ⚠️ ⚠️ ⚠️ WARNING: This plan was computed with --suppress-recreates, which force-applies changes without causing recreation or rematerialization. Updated feature data schemas have been validated and are equal, but please triple check the plan output before applying.
  2. Transformation functions including the transformation for a Feature View.
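Before relying on --suppress-recreates for a refactor, it can help to confirm that the old and new versions of a function produce the same output on representative input. The sketch below is an assumption-laden stand-in: it uses plain lists of dicts instead of Spark DataFrames, and the names post_processor_v1, post_processor_v2, same_output, and the sample rows are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Processor = Callable[[List[Row]], List[Row]]


def post_processor_v1(rows: List[Row]) -> List[Row]:
    return [{k: v for k, v in row.items() if k not in ("lat", "long")} for row in rows]


def post_processor_v2(rows: List[Row]) -> List[Row]:
    # Refactored version: same behavior, with a comment and a named constant.
    # drop geo location columns
    geo_columns = {"lat", "long"}
    return [{k: v for k, v in row.items() if k not in geo_columns} for row in rows]


def same_output(old_fn: Processor, new_fn: Processor, sample: List[Row]) -> bool:
    """Return True if both versions agree on the sample input."""
    return old_fn(sample) == new_fn(sample)


# Made-up sample rows for illustration only.
sample_rows = [
    {"user_id": "u1", "lat": 37.77, "long": -122.42},
    {"user_id": "u2", "lat": 40.71, "long": -74.01},
]

refactor_is_safe = same_output(post_processor_v1, post_processor_v2, sample_rows)
print(refactor_is_safe)
```

Agreement on a sample does not prove the two versions are equivalent on all data, so treat this as a sanity check alongside reviewing the plan output, not a replacement for it.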

Upstream Data Source migrations

If you need to migrate an underlying data source that backs a Tecton Data Source, you can use the --suppress-recreates flag with tecton apply and tecton plan to point your Tecton Data Source at the new underlying data source without rematerialization. This assumes the schema and data in the new underlying data source are the same as those of the original underlying data source.

Supported changes you can make, prior to using --suppress-recreates, are:

  1. Updating an existing batch_config or stream_config object (such as a HiveConfig), where the schema and data in the underlying data source used by the batch_config or stream_config object are the same.

    This is useful when migrating to a replica table within the same database. Example:

    ↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓

    ~ Update BatchDataSource
    name: users_batch
    owner: demo-user@tecton.ai
    hive_ds_config.table: customers -> customers_replica

    ↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑

    ⚠️ ⚠️ ⚠️ WARNING: This plan was computed with --suppress-recreates, which force-applies changes without causing recreating or rematerialization. Updated feature data schemas have been validated and are equal, but please triple check the plan output before applying.
  2. Replacing an existing batch_config or stream_config object with a new one, where the schema and data in the underlying data source used by the new batch_config or stream_config object are the same as those of the original object.

    This is useful when migrating to a new data source format (e.g. from a Parquet format File Data Source to a Hive Data Source), to improve performance.

  3. Creating a new Tecton Data Source for a new replica source, and then changing an existing Batch Feature View to use the new Data Source.

    This is useful when the Data Source is used by many Feature Views and you want to migrate one at a time. Example:

    ↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓

    + Create BatchDataSource
    name: users_batch_replica
    owner: demo-user@tecton.ai

    ~ Update FeatureView
    name: user_date_of_birth
    owner: demo-user@tecton.ai
    description: User date of birth, entered at signup.
    DependencyChanged(DataSource): -> users_batch_replica

    ↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑

    ⚠️ ⚠️ ⚠️ WARNING: This plan was computed with --suppress-recreates, which force-applies changes without causing recreation or rematerialization. Updated feature data schemas have been validated and are equal, but please triple check the plan output before applying.
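All three migration paths above assume that the replica holds the same schema and data as the original. The following sketch shows one way to check that assumption, using an in-memory SQLite database as a stand-in for the real warehouse. The table names customers and customers_replica come from the plan output above; the columns, rows, and helper functions are hypothetical.

```python
import sqlite3


def table_schema(conn: sqlite3.Connection, table: str):
    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


def table_rows(conn: sqlite3.Connection, table: str):
    # Sort so the comparison ignores physical row order.
    return sorted(conn.execute(f"SELECT * FROM {table}").fetchall())


def replica_matches(conn: sqlite3.Connection, original: str, replica: str) -> bool:
    """Return True if both tables have identical schemas and identical rows."""
    return (
        table_schema(conn, original) == table_schema(conn, replica)
        and table_rows(conn, original) == table_rows(conn, replica)
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (user_id TEXT, date_of_birth TEXT);
    CREATE TABLE customers_replica (user_id TEXT, date_of_birth TEXT);
    INSERT INTO customers VALUES ('u1', '1990-01-01'), ('u2', '1985-06-15');
    INSERT INTO customers_replica VALUES ('u2', '1985-06-15'), ('u1', '1990-01-01');
    """
)

migration_is_safe = replica_matches(conn, "customers", "customers_replica")
print(migration_is_safe)
```

A full row comparison like this only scales to small tables. For large sources, comparing row counts and per-column aggregates or checksums is a common substitute, at the cost of weaker guarantees.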

Special behavior for edge cases

Modifications to Stream Feature View Checkpoints

Tecton uses checkpointing to track its position when reading from streams. When some of the changes described above are applied with --suppress-recreates, Tecton cannot guarantee that the current checkpoint for a Stream Feature View is still valid according to the Spark Streaming docs. Such changes include:

  • Swapping the Stream Feature View to read from a different Stream Data Source
  • Modifying anything in the Stream Feature View's Data Source stream_config, except the post_processor

When the checkpoint for a Stream Feature View is no longer valid, the checkpoint is discarded and the current streaming job is restarted. The stream job may take some time to catch up to its previous location, temporarily affecting freshness.

Most of the time, however, changes will not invalidate the checkpoint, but may still modify the definition of the Feature View. These include:

  • Modifying the Stream Data Source's stream_config's post_processor function
  • Modifying any transformation function in a Stream Feature View's pipeline, including its primary transformation.

In these cases, the current streaming job is restarted to use the new definition of the Feature View, but the checkpoint is reused. The stream job may take some time to catch up to its previous location, temporarily affecting freshness.

When in doubt, review the output of tecton plan/apply --suppress-recreates, which displays all intended changes to the streaming materialization job before they are applied.
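The checkpoint behavior described in this section can be summarized as a small lookup. This is only an illustrative sketch of the rules as stated on this page: the StreamChange enum and the checkpoint_is_reused function are hypothetical names, not part of the Tecton SDK.

```python
from enum import Enum


class StreamChange(Enum):
    """Hypothetical labels for the four kinds of change listed above."""

    SWAP_STREAM_DATA_SOURCE = "swap_stream_data_source"
    MODIFY_STREAM_CONFIG = "modify_stream_config"  # anything except post_processor
    MODIFY_POST_PROCESSOR = "modify_post_processor"
    MODIFY_TRANSFORMATION = "modify_transformation"


# Changes after which the existing checkpoint can no longer be trusted.
INVALIDATES_CHECKPOINT = {
    StreamChange.SWAP_STREAM_DATA_SOURCE,
    StreamChange.MODIFY_STREAM_CONFIG,
}


def checkpoint_is_reused(change: StreamChange) -> bool:
    """Return True if the restarted streaming job keeps its checkpoint."""
    return change not in INVALIDATES_CHECKPOINT


for change in StreamChange:
    action = "reuse checkpoint" if checkpoint_is_reused(change) else "discard checkpoint"
    print(f"{change.value}: restart job, {action}")
```

In every case the streaming job restarts, so freshness may be affected temporarily whether or not the checkpoint survives.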

Modifications to the ttl parameter in Feature Views

By default, updating the ttl value in a Feature View (assuming offline or online is set to True) results in a destructive recreate. If you want to decrease the ttl value without rematerialization, use the --suppress-recreates flag when running tecton plan or tecton apply to avoid recomputing your feature values. If you want to increase the ttl value, however, you cannot use --suppress-recreates and must re-materialize the Feature View data.
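The ttl rule can be expressed as a single comparison. The helper below is a hypothetical sketch of that rule using datetime.timedelta values; the function name is not a Tecton API, and it treats an unchanged ttl as trivially safe.

```python
from datetime import timedelta


def can_suppress_ttl_change(old_ttl: timedelta, new_ttl: timedelta) -> bool:
    """Return True if the ttl is decreased or unchanged.

    Per the rule above, an increased ttl cannot be applied with
    --suppress-recreates and requires re-materialization.
    """
    return new_ttl <= old_ttl


print(can_suppress_ttl_change(timedelta(days=30), timedelta(days=7)))   # decrease
print(can_suppress_ttl_change(timedelta(days=7), timedelta(days=30)))   # increase
```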
