Online Feature Serving
What are your availability guarantees for feature serving?
Does Tecton support cross-region and cross-datacenter availability?
By default, Tecton runs with high availability in one region.
You may optionally choose to have multi-region availability for online feature serving to improve reliability during AWS outages or reduce network latency for multi-region applications. There may be additional costs associated with multi-region serving.
What QPS can Tecton support?
Tecton is a horizontally scalable system designed to reliably meet the needs of high volume operational ML.
We have load tested Tecton's feature serving at 100,000 queries per second. During this load, Tecton processed over 3 million DynamoDB requests per second. More details are available in our blog here.
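As a rough back-of-the-envelope check, the two figures quoted above imply an average fan-out of about 30 DynamoDB requests per feature query:

```python
# Illustrative arithmetic using the load-test figures quoted above.
feature_queries_per_sec = 100_000
dynamodb_requests_per_sec = 3_000_000

# Average number of DynamoDB requests issued per feature query.
fanout = dynamodb_requests_per_sec / feature_queries_per_sec
print(fanout)  # → 30.0
```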
What factors impact request latency when calling GetFeatures?
You can view the internal amount of time Tecton spends processing each request in the Web UI by navigating to your Feature Service and selecting the Monitoring tab.
The most important factors that contribute to this processing time are:
- Your choice of online store. Configuring Redis can lead to significantly reduced latency.
- The number of Feature Views in your Feature Service.
- If you use Feature Aggregations, the amount of data aggregated at request time can vary. The `aggregation_interval` (a greater value is faster), the longest `time_window` (a smaller value is faster), and the number of events for a given entity can all influence the time needed.
- If you use OnDemand Feature Views, the time it takes to execute your transformation logic.
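For reference, online feature retrieval typically goes through Tecton's HTTP API. The sketch below builds such a request; the cluster URL, workspace, feature service, and join-key values are placeholders, and the exact payload shape can vary by Tecton version, so treat this as an illustration rather than a definitive client.

```python
import json

# Hypothetical cluster URL -- substitute your own Tecton cluster.
CLUSTER_URL = "https://example.tecton.ai"

def build_get_features_request(feature_service, workspace, join_keys):
    """Build the URL and JSON body for a get-features call."""
    url = f"{CLUSTER_URL}/api/v1/feature-service/get-features"
    body = {
        "params": {
            "feature_service_name": feature_service,
            "workspace_name": workspace,
            "join_key_map": join_keys,
        }
    }
    return url, json.dumps(body)

# Placeholder service, workspace, and entity key for illustration.
url, body = build_get_features_request(
    "fraud_detection_service", "prod", {"user_id": "u_123"}
)
print(url)
print(body)
# To send: POST `body` to `url` with headers
#   {"Authorization": "Tecton-key <API_KEY>", "Content-Type": "application/json"}
```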
How can I improve request latency when calling QueryFeatures?
If your specified online serving index returns a large amount of data, you may have longer response times. The time it takes for Tecton to return is a function of the number of rows matched by your query, the number of features in your feature service, and the amount of aggregation needed for your Aggregate Feature View.
If you need to retrieve a large number of rows, you can try:
- Reducing the data aggregated at request time by increasing the `aggregation_slide_period` or shortening the longest `time_window`.
- Removing unimportant features from your model.