Version: Beta 🚧

Online Compaction: Upgrade Guide

Private Preview

This feature is currently in Private Preview.

This feature has the following limitations:
  • Must be enabled by Tecton Support.
  • Available for Spark-based Feature Views -- coming to Rift in a future release.
  • See additional limitations & requirements below.
If you would like to participate in the preview, please file a feature request.

See the architecture overview and usage guide for background on online compaction and how to use it.

This article covers how users should upgrade existing feature views to use online compaction and what factors to consider.

Before Upgrading: Contact Tecton Support

Online compaction is in Private Preview and must be enabled by Tecton Support. It also requires granting Tecton new permissions to enable bulk loading for DynamoDB.

Enabling online compaction for a feature view changes how it materializes data to the online and offline stores. This means that 1) enabling online compaction for a feature view requires rematerializing that view, and 2) the online and offline retrieval semantics for the feature view may change. For those reasons, we recommend first creating a new version of the feature view being upgraded and cutting over to the new version after experimentation or other validation.
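As one way to approach the validation step, the sketch below compares feature values retrieved from the existing feature view with values retrieved from the new, compacted version for the same entity. It is a minimal illustration in plain Python, not a Tecton API: the function name `compare_feature_vectors`, the feature names, and the tolerance values are all hypothetical, and the two dictionaries stand in for whatever retrieval results you collect yourself. A relative tolerance is included because compacted Stream Feature Views may return slightly different aggregate values.

```python
import math
from typing import Any, Dict, Tuple


def compare_feature_vectors(
    old: Dict[str, Any],
    new: Dict[str, Any],
    rel_tol: float = 0.0,
    abs_tol: float = 0.0,
) -> Dict[str, Tuple[Any, Any]]:
    """Return {feature_name: (old_value, new_value)} for every mismatch.

    Numeric values are compared with math.isclose using the given
    tolerances; everything else (strings, None, missing keys) must
    match exactly.
    """
    mismatches: Dict[str, Tuple[Any, Any]] = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        both_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (a, b)
        )
        if both_numeric:
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches[name] = (a, b)
        elif a != b:
            mismatches[name] = (a, b)
    return mismatches


# Hypothetical values for one entity, retrieved from each version.
old_values = {"txn_sum_7d": 100.0, "txn_count_7d": 4}
new_values = {"txn_sum_7d": 103.0, "txn_count_7d": 4}

strict = compare_feature_vectors(old_values, new_values)
tolerant = compare_feature_vectors(old_values, new_values, rel_tol=0.05)
```

With these sample values, the strict comparison flags `txn_sum_7d`, while the 5% tolerance accepts it. What tolerance is appropriate depends on your use case and is not something this sketch can decide for you.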

Factors to consider:

  1. Rematerialization cost
    • Because online compaction uses DynamoDB's S3 bulk import, backfilling compacted Feature Views to the online store is usually 10-100x cheaper than backfilling Feature Views without online compaction, thanks to much cheaper online store writes. Note that online compaction does not reduce batch compute costs, but batch compute is typically a much smaller cost contributor. Consider materializing a sampled dataset or a shorter period first to estimate costs.
  2. New behavior for online compacted Stream Feature View windowing
    • Compacted Stream Feature Views introduce some fuzziness to improve performance. See this explanation of sawtooth windows for more info.
  3. Changes to offline retrieval
    • Compacted Feature Views (Batch or Stream) materialize and retrieve data from the offline store in a new way. The new approach improves performance for most Stream Feature Views but may degrade offline retrieval performance for Batch Feature Views in some use cases. (We're working on further improvements here.) Consider testing offline retrieval performance for your use case before migrating.
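
To make the cost suggestion in item 1 concrete, here is a rough sketch of extrapolating a full backfill cost from a trial run over a shorter period or a sampled dataset. It assumes cost scales linearly with both the time range and the fraction of data materialized, which is a simplification that real workloads may not follow. The function name, parameters, and dollar figures are hypothetical and do not come from Tecton.

```python
def estimate_full_backfill_cost(
    trial_cost_usd: float,
    trial_days: float,
    full_days: float,
    sample_fraction: float = 1.0,
) -> float:
    """Linearly extrapolate a trial materialization cost to a full backfill.

    trial_cost_usd:  observed cost of the trial run
    trial_days:      length of the period the trial covered
    full_days:       length of the period the full backfill will cover
    sample_fraction: share of the data the trial materialized (0 < f <= 1)
    """
    if trial_days <= 0 or full_days <= 0:
        raise ValueError("trial_days and full_days must be positive")
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    return trial_cost_usd * (full_days / trial_days) / sample_fraction


# Hypothetical trial: 30 days of data at a 50% sample cost $10.
estimate = estimate_full_backfill_cost(10.0, 30, 300, sample_fraction=0.5)
print(f"Estimated full backfill cost: ${estimate:.2f}")
```

Treat the result as an order-of-magnitude guide only. Fixed overheads and uneven data volume over time can push the actual cost away from a linear estimate.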
