Version: 1.0

Understanding Cost Components

When operating Tecton in production, it's important to understand which components drive infrastructure costs and how to optimize them. This guide explains the key cost components and factors that influence them.

Stream Feature Views

Key Cost Components

  • Online store reads and writes
  • Spark stream cluster compute
  • Tecton Spark streaming compute credits
  • S3 storage
  • Online store storage costs

Cost-Influencing Parameters

| Parameter | Description | Cost Impact |
| --- | --- | --- |
| Stream Processing Mode | Defines how quickly Stream Feature Views process new events (TIME_INTERVAL vs CONTINUOUS) | CONTINUOUS mode leads to more frequent checkpointing and writes, increasing costs |
| Aggregation Window | The time window over which features are aggregated | Larger windows typically require more compute resources |
| AggregationLeadingEdge | Controls how recent data is included in aggregations | Can affect both compute and storage costs. AggregationLeadingEdge.WALL_CLOCK_TIME (default in v1.0+) reduces read costs |
| Compaction Enabled | Whether to use Tecton's data compaction capabilities | Reduces storage costs and improves read performance |
| Watermark | Defines how long to wait for late-arriving data | Longer watermarks increase memory requirements |
| Cluster Configuration | Spark cluster settings for stream processing | Directly affects compute costs |
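To make the Stream Processing Mode row concrete, the sketch below compares hourly online store write counts under the two modes. It is a rough model, not Tecton's billing logic: it assumes CONTINUOUS mode issues one write per event, and TIME_INTERVAL mode issues at most one write per entity key per interval. The `StreamWorkload` class, its fields, and both functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamWorkload:
    """Hypothetical description of a streaming workload (not a Tecton type)."""
    events_per_key_per_hour: float  # average event rate per entity key
    active_keys: int                # entity keys that receive events
    interval_minutes: float         # aggregation interval used in TIME_INTERVAL mode


def hourly_writes_continuous(w: StreamWorkload) -> float:
    # Assumption: every incoming event produces one online store write.
    return w.events_per_key_per_hour * w.active_keys


def hourly_writes_time_interval(w: StreamWorkload) -> float:
    # Assumption: events are grouped per interval, so each key is written
    # at most once per interval, and never more often than it has events.
    if w.interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    intervals_per_hour = 60 / w.interval_minutes
    writes_per_key = min(w.events_per_key_per_hour, intervals_per_hour)
    return writes_per_key * w.active_keys


if __name__ == "__main__":
    # Placeholder workload for illustration only.
    workload = StreamWorkload(
        events_per_key_per_hour=120.0, active_keys=1000, interval_minutes=5.0
    )
    print("CONTINUOUS writes/hour:   ", hourly_writes_continuous(workload))
    print("TIME_INTERVAL writes/hour:", hourly_writes_time_interval(workload))
```

Under these assumptions the gap between the two modes grows with the per-key event rate. For sparse keys that see fewer events than there are intervals, the two estimates converge.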

Batch Feature Views

Key Cost Components

  • Online store reads and writes
  • Spark job cluster compute
  • Tecton Spark batch compute credits
  • S3 storage

Cost-Influencing Parameters

| Parameter | Description | Cost Impact |
| --- | --- | --- |
| Aggregation Window | The time window over which features are aggregated | Larger windows typically require more compute resources |
| Compaction Enabled | Whether to use Tecton's data compaction capabilities | Reduces storage costs and improves read performance |
| Cluster Configuration | Spark cluster settings for batch jobs | Directly affects compute costs |

Cost Impact Hierarchy

While all components contribute to overall costs, some have a more significant impact than others. Here's a general hierarchy of cost impact, from highest to lowest:

  1. Online Store Operations

    • Writes to the online store typically represent the highest infrastructure cost
    • This is especially true during backfills or when using CONTINUOUS stream processing
  2. Compute Resources

    • Spark cluster costs for stream and batch processing
    • Impact varies based on cluster configuration and workload
  3. Storage

    • Online store storage and read costs
    • S3 storage costs
    • Generally lower impact compared to compute and online store operations
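One way to check whether a given deployment follows this general hierarchy is to rank its own billed costs by share of total. This is a minimal sketch, assuming you have already collected per-component monthly costs from your cloud and Tecton billing data. The function name, the component keys, and the figures in the usage example are placeholders, not measurements.

```python
def rank_cost_components(monthly_costs: dict[str, float]) -> list[tuple[str, float]]:
    """Return (component, share_of_total) pairs, largest share first."""
    total = sum(monthly_costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    shares = [(name, cost / total) for name, cost in monthly_costs.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder figures for illustration only; substitute your own billing data.
    example_costs = {
        "online_store_writes": 500.0,
        "spark_compute": 300.0,
        "online_store_storage_and_reads": 150.0,
        "s3_storage": 50.0,
    }
    for name, share in rank_cost_components(example_costs):
        print(f"{name}: {share:.0%}")
```

If the ranking for your workload differs from the hierarchy above, for example because compute dominates, that points to cluster configuration as the first place to look.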

Next Steps

To optimize costs for your Tecton deployment:
