Caching Feature Views in Production
For use cases with high traffic and many duplicate requests, caching can reduce both cost and latency in your production deployment.
A cache hit skips the Feature Server computation needed to process feature values, including aggregations and joins, as well as the overhead of retrieving those values from the online store.
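To make the mechanism concrete, here is a minimal sketch of a read-through cache with a per-entry TTL. It is an illustration only, not Tecton's implementation: `ReadThroughCache` and `compute_features` are hypothetical names, and `compute_features` stands in for the uncached path (online store read plus aggregation).

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Illustrative read-through cache with a per-entry TTL (hypothetical, not Tecton's code)."""

    def __init__(
        self,
        compute: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._compute = compute  # expensive path: online store read + aggregation
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            # Cache hit: return the stored value without recomputing.
            self.hits += 1
            return entry[1]
        # Cache miss or expired entry: compute, then store with a fresh expiry.
        self.misses += 1
        value = self._compute(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def compute_features(entity_key: str) -> dict:
    # Hypothetical stand-in for the uncached Feature Server path.
    return {"entity": entity_key, "sum_30d": 42.0}


cache = ReadThroughCache(compute_features, ttl_seconds=60)
for key in ["user_1", "user_2", "user_1", "user_1"]:
    cache.get(key)
print(cache.hit_rate)  # 2 hits out of 4 lookups -> 0.5
```

The more often the same keys repeat within the TTL, the higher the hit rate and the more of the expensive path is skipped.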
Benchmarks were run for a sample Feature Service to measure latencies at different simulated cache hit rates. The benchmark details are as follows:
- The Feature Service contains 10 Aggregate Feature Views, each with 3 aggregate features (sum, count, mean).
- Feature Service uses 3 entity keys.
- Each aggregate feature has a time window of 30 days and a 1-day aggregation interval.
- The benchmark was run with 10 minutes of random traffic at a constant 10,000 QPS.
Latency Benchmark Results
| Cache Hit Rate | P50 FS Latency | P90 FS Latency | P95 FS Latency | P99 FS Latency |
NOTE: These latencies do not include network latency to or from the client, only the time the Feature Server takes to compute the feature values.
Caching high-duplication use cases also saves cost in two ways:
- Reduces the volume of queries to your online store in proportion to your average cache hit rate.
- Reduces the number of Tecton credits you consume.
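The proportional reduction in online store reads can be sketched as simple arithmetic. The function below is a rough estimate under stated assumptions, not a pricing calculator: it assumes a cache hit skips every online store read for that request, `reads_per_request=10` assumes one read per Feature View (which the benchmark description does not state), and `price_per_million_reads=1.0` is a placeholder rather than a real price. Take the actual unit price from your online store's pricing page.

```python
from dataclasses import dataclass

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


@dataclass(frozen=True)
class ReadEstimate:
    reads_per_second: float   # online store reads that still happen (cache misses)
    monthly_read_cost: float  # in the currency of price_per_million_reads


def estimate_online_store_reads(
    request_qps: float,
    reads_per_request: int,
    cache_hit_rate: float,
    price_per_million_reads: float,
) -> ReadEstimate:
    """Estimate remaining online store reads and their monthly cost for a given hit rate."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    # Only cache misses reach the online store.
    reads_per_second = request_qps * reads_per_request * (1.0 - cache_hit_rate)
    monthly_reads = reads_per_second * SECONDS_PER_MONTH
    return ReadEstimate(reads_per_second, monthly_reads / 1_000_000 * price_per_million_reads)


# 10,000 QPS as in the benchmark; the other inputs are assumptions (see above).
for hit_rate in (0.0, 0.5, 0.75):
    estimate = estimate_online_store_reads(10_000, 10, hit_rate, price_per_million_reads=1.0)
    print(hit_rate, estimate.reads_per_second, estimate.monthly_read_cost)
```

Under these assumptions, a 50% hit rate halves both the read volume and the read cost relative to no caching.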
Below is a scenario with the Feature Service used above for benchmarking. Cost analysis is done assuming AWS region us-west-1 and the default on-demand mode.
| Cache Hit Rate | DynamoDB Read Cost/Month |
The cache cluster is secured through the following attributes:
- The cache is managed in your Tecton Account's dedicated Cloud Account & VPC.
- All connections to the cache are between private endpoints in the VPC, so traffic never goes through the public internet.
- All connections are TLS encrypted.
- You control the TTL of the cached data; the TTL must be less than or equal to 1 day.
- No data is ever persisted by Tecton.
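If you generate cache settings programmatically, the 1-day TTL limit can be checked before deployment. The helper below is a hypothetical sketch, not part of the Tecton SDK; the names `MAX_CACHE_TTL` and `validate_cache_ttl` are invented for illustration, and the requirement that the TTL be positive is an added assumption.

```python
from datetime import timedelta

MAX_CACHE_TTL = timedelta(days=1)  # upper bound stated above


def validate_cache_ttl(ttl: timedelta) -> timedelta:
    """Return ttl unchanged if it is within the allowed range, else raise ValueError."""
    if ttl <= timedelta(0):
        # Assumption: a zero or negative TTL is treated as a configuration error.
        raise ValueError("cache TTL must be positive")
    if ttl > MAX_CACHE_TTL:
        raise ValueError(f"cache TTL {ttl} exceeds the maximum of {MAX_CACHE_TTL}")
    return ttl


validate_cache_ttl(timedelta(hours=6))  # within the limit
validate_cache_ttl(timedelta(days=1))   # exactly at the limit is allowed
```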