Understanding Cost Components
When operating Tecton in production, you need to know which components drive infrastructure costs before you can optimize them. This guide lists the key cost components for Stream and Batch Feature Views and the parameters that influence them.
Stream Feature Views
Key Cost Components
- Online store reads and writes
- Spark stream cluster compute
- Tecton Spark streaming compute credits
- S3 storage
- Online store storage costs
Cost-Influencing Parameters
Parameter | Description | Cost Impact |
---|---|---|
Stream Processing Mode | Defines how quickly Stream Feature Views process new events (TIME_INTERVAL vs CONTINUOUS) | CONTINUOUS mode leads to more frequent checkpointing and writes, increasing costs |
Aggregation Window | The time window over which features are aggregated | Larger windows typically require more compute resources |
AggregationLeadingEdge | Controls how recent data is included in aggregations | Can affect both compute and storage costs. AggregationLeadingEdge.WALL_CLOCK_TIME (default in v1.0+) reduces read costs |
Compaction Enabled | Whether to use Tecton's data compaction capabilities | Reduces storage costs and improves read performance |
Watermark | Defines how long to wait for late-arriving data | Longer watermarks increase memory requirements |
Cluster Configuration | Spark cluster settings for stream processing | Directly affects compute costs |
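As a rough illustration of why the processing mode matters for online store write volume, the sketch below compares two simplified write models in plain Python: one write per event versus one write per key per interval. It does not use the Tecton SDK and is not how Tecton computes writes internally. `Event`, `writes_per_event`, and `writes_per_interval` are hypothetical names, and mapping these two models to CONTINUOUS and TIME_INTERVAL is an assumption made for illustration only.

```python
from collections import namedtuple

# Hypothetical event record: an entity key and an event time in seconds.
Event = namedtuple("Event", ["key", "timestamp_s"])


def writes_per_event(events):
    """Simplified model: every incoming event triggers one online store write.

    Assumption: used here as a stand-in for a CONTINUOUS-like mode.
    """
    return len(events)


def writes_per_interval(events, interval_s):
    """Simplified model: at most one write per (key, interval bucket).

    Assumption: used here as a stand-in for a TIME_INTERVAL-like mode,
    where events for the same key in the same interval are combined.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    buckets = {(e.key, e.timestamp_s // interval_s) for e in events}
    return len(buckets)


if __name__ == "__main__":
    # Illustrative events only, not measured traffic.
    sample = [Event("a", 0), Event("a", 10), Event("a", 70), Event("b", 5)]
    print("per-event writes:", writes_per_event(sample))
    print("per-interval writes (60s):", writes_per_interval(sample, 60))
```

Under this model, the gap between the two counts grows as more events for the same key arrive within one interval, which is the situation where the choice of processing mode is most likely to affect write costs.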
Batch Feature Views
Key Cost Components
- Online store reads and writes
- Spark job cluster compute
- Tecton Spark batch compute credits
- S3 storage
Cost-Influencing Parameters
Parameter | Description | Cost Impact |
---|---|---|
Aggregation Window | The time window over which features are aggregated | Larger windows typically require more compute resources |
Compaction Enabled | Whether to use Tecton's data compaction capabilities | Reduces storage costs and improves read performance |
Cluster Configuration | Spark cluster settings for batch jobs | Directly affects compute costs |
Cost Impact Hierarchy
While all components contribute to overall costs, some have a more significant impact than others. Here's a general hierarchy of cost impact, from highest to lowest:
- Online Store Operations
  - Writes to the online store typically represent the highest infrastructure cost
  - This is especially true during backfills or when using CONTINUOUS stream processing
- Compute Resources
  - Spark cluster costs for stream and batch processing
  - Impact varies based on cluster configuration and workload
- Storage
  - Online store storage and read costs
  - S3 storage costs
  - Generally lower impact compared to compute and online store operations
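To check how this general hierarchy applies to a specific deployment, one option is to rank the components of an actual bill by their share of total spend. The following is a minimal sketch, not a Tecton API: `rank_cost_components` is a hypothetical helper, and the amounts in the usage example are placeholders rather than measured figures.

```python
def rank_cost_components(costs):
    """Return (component, share_of_total) pairs, largest share first.

    `costs` maps a component name to its spend for one billing period.
    The component names are whatever your own billing export uses.
    """
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    return [(name, amount / total) for name, amount in ranked]


if __name__ == "__main__":
    # Placeholder amounts for illustration only.
    example_bill = {
        "online_store_writes": 3.0,
        "spark_compute": 2.0,
        "storage": 1.0,
    }
    for name, share in rank_cost_components(example_bill):
        print(f"{name}: {share:.0%}")
```

If the ranking for your own bill differs from the hierarchy above, for example because compute dominates, that is a signal about where to focus optimization first.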
Next Steps
To optimize costs for your Tecton deployment:
- Implement Development Best Practices
- Set up Cost Monitoring and Alerting
- Review Online Store Costs
- Optimize Spark Materialization Clusters