Consuming Feature Services
What are the feature serving limits?
Tecton is built on DynamoDB. DynamoDB's default ingestion rate is 40 writes per second; writing above that rate becomes more expensive. DynamoDB's read SLO is a P99 of 100ms, subject to a limit on the amount of data being aggregated for that request: when aggregations run at request time, at most 2 MB of data may be aggregated per request to continue meeting the 100ms threshold.
For serving aggregate features, what happens if the feature requires more than the 2MB limit?
You can aggregate more than 2 MB, but beyond that point we can't guarantee latency, which depends on how much data the query reads. When measuring compliance with our SLO, we exclude requests that aggregate more than 2 MB of data. It's a soft limit in the sense that a query exceeding 2 MB will still work; it just may be slower.
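The soft limit above can be checked with a back-of-envelope estimate. The sketch below is illustrative only (the row sizes and the estimation helper are assumptions, not a Tecton API): it approximates how much data a request-time aggregation reads and compares it against the 2 MB threshold.

```python
# Rough estimate of data read by one request-time aggregation, checked
# against the 2 MB soft limit. Illustrative assumption, not Tecton internals.
SOFT_LIMIT_BYTES = 2 * 1024 * 1024  # 2 MiB

def estimated_aggregation_bytes(rows_in_window: int, avg_row_bytes: int) -> int:
    """Approximate bytes scanned for one request-time aggregation."""
    return rows_in_window * avg_row_bytes

def within_latency_slo(rows_in_window: int, avg_row_bytes: int) -> bool:
    """True if the request stays under the 2 MB soft limit, and thus inside
    the population of requests used to measure the P99 100ms SLO."""
    return estimated_aggregation_bytes(rows_in_window, avg_row_bytes) <= SOFT_LIMIT_BYTES

# 10,000 rows at ~150 bytes each is ~1.5 MB: inside the limit.
print(within_latency_slo(10_000, 150))  # True
# 20,000 rows at ~150 bytes is ~3 MB: the query still works, just maybe slower.
print(within_latency_slo(20_000, 150))  # False
```

A request over the limit is not rejected; the `False` case simply falls outside the measured SLO population.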
What are your availability guarantees for feature serving?
For a single region, we have an SLO of 99.99% uptime (about 52 minutes of downtime per year). If your system requires multi-region support, we provide an SLO of five 9's, i.e. 99.999% (about 5 minutes of downtime per year).
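The downtime figures follow directly from the availability percentages. A quick sketch of the arithmetic:

```python
# Convert an availability SLO into a yearly downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability: float) -> float:
    """Maximum downtime per year allowed by an availability SLO."""
    return MINUTES_PER_YEAR * (1 - availability)

print(round(downtime_minutes_per_year(0.9999), 1))   # 52.6  (four nines, single region)
print(round(downtime_minutes_per_year(0.99999), 2))  # 5.26  (five nines, multi-region)
```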
Does Tecton support cross-region and cross-datacenter availability?
Tecton's default implementation is for all components to be highly available in one region. If your use case requires more than that, Tecton does support feature serving in satellite regions: we mirror the feature store to the other regions in which you want to serve features. This supports low-latency, local serving and can also be used for global failover from one region to another. Tecton charges an additional fee for every cluster maintained in additional regions, to offset the operational overhead.
When serving features, what limits have you tested for throughput?
Tecton has load tested our feature serving at 100,000 queries per second to the feature server. During this load, the feature services had a combined 65 dependent feature views, and we were processing over 3 million DynamoDB requests per second. More details are available in our blog. That said, our system is capable of scaling higher than this if your system requires it.
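The load-test numbers above imply a fan-out from feature-server queries to storage reads; a quick back-of-envelope calculation:

```python
# Fan-out implied by the load test: 100,000 feature-server queries/sec
# driving ~3,000,000 DynamoDB requests/sec across 65 feature views.
feature_server_qps = 100_000
dynamodb_rps = 3_000_000

dynamo_requests_per_query = dynamodb_rps / feature_server_qps
print(dynamo_requests_per_query)  # 30.0 DynamoDB reads per feature-server query, on average
```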
How granular is user access? Can we restrict which users can add vs delete features?
ACLs are on a per-workspace level with read, write, and delete rights.
Does Tecton provide a set of "shared" features that others can pull from for their models?
Yes. Multiple teams can be included in a feature repository and share features within it. These features can be further organized to delineate shared vs. use case-specific features. We strongly encourage the sharing and reuse of features where possible and have seen organizations experience significant productivity and model gains when doing so.
How can I improve the request latency when calling FeatureService.query_features()?
If your specified online serving index returns a large amount of data, you may have longer response times. The time it takes for Tecton to return is a function of the number of rows matched by your query, the number of features in your feature service, and the amount of aggregation needed for your Aggregate Feature View.
If you need to retrieve a large number of rows, you can try:
- Reducing the data aggregated at request time by increasing the aggregation_slide_period or shortening the longest aggregation time window.
- Removing unimportant features from your model.
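The first tip can be sketched with a simple model: a time-window aggregate is typically stored as partial aggregates, one per slide period, and the server merges roughly window/slide of them per request. The model below is illustrative only, not Tecton's actual storage layout.

```python
# Why increasing aggregation_slide_period reduces request-time work:
# fewer pre-aggregated tiles need to be merged per request.
# Illustrative model only -- not Tecton's actual storage layout.

def partial_aggregates_merged(window_secs: int, slide_secs: int) -> int:
    """Number of slide-sized partial aggregates merged at request time."""
    return window_secs // slide_secs

DAY = 24 * 3600
# A 7-day window with a 10-minute slide merges many tiles per request...
print(partial_aggregates_merged(7 * DAY, 600))    # 1008
# ...while a 1-hour slide over the same window merges far fewer.
print(partial_aggregates_merged(7 * DAY, 3600))   # 168
```

Shortening the longest window has the same effect from the other direction: it shrinks the numerator rather than growing the denominator.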
Do you support alerting and monitoring via AWS?
If you are using a tool like CloudWatch, we can add tags to the DynamoDB tables to facilitate your requirements. Contact support for more details.
What monitoring and alerting capabilities do you have?
In the Tecton Web UI, you can find basic metrics per feature set, including total query rate, error rate, and latency distributions. Please also note that Tecton is a managed service: if there's an issue with performance or errors in your services, our rotation of on-call engineers will be alerted as well.