Tecton's architecture is made up of several high-level components: the Feature Registry, the Serving Layer, the Storage Layer, the Transformation Layer, and the Monitoring System.
Let's look at how these components fit together to serve an online feature request. Suppose we're trying to detect fraud and we want to know if a transaction is suspiciously high. We make a request to Tecton...
- The request to the Fraud Detection Feature Service is received by the Serving Layer, which begins to retrieve the feature vector. One of the features requested is user_transaction_amount_metrics, which tells us how the current transaction amount compares to historical transaction amounts.
- In the Feature Registry, we look up the definition for the feature and check whether the feature value is precomputed (i.e., pre-materialized). In this example, our feature depends on values from the last 10 transactions. We find that this feature has materialization turned on.
- We go to the Storage Layer, where the precomputed value has been stored in the online store for quick retrieval.
- The feature value is available thanks to our Transformation Layer, which precomputed the feature ahead of time. Because we're using materialization, the Transformation Layer continuously recomputes the feature from streaming data and writes the updated values to the Storage Layer. This ensures we're always working with the most recent 10 transactions.
- Returning to our Serving Layer, it takes the precomputed values (the last 10 transaction amounts), combines them with data that came in with the request (the current transaction amount), and returns the derived value (the percentile of the current transaction amount) to the caller.
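The derivation in that last step can be sketched in a few lines of Python. This is purely illustrative and not the Tecton API: the list stands in for the precomputed values fetched from the online store, and the function stands in for the derived-value computation the Serving Layer performs at request time.

```python
# Illustrative sketch only; names are hypothetical, not Tecton code.

def transaction_amount_percentile(last_amounts: list, current: float) -> float:
    """Fraction of historical amounts that the current amount meets or exceeds."""
    if not last_amounts:
        return 0.0
    return sum(1 for a in last_amounts if a <= current) / len(last_amounts)

# Precomputed by the Transformation Layer, fetched from the online store:
last_10_amounts = [12.0, 8.5, 40.0, 22.0, 15.0, 9.9, 31.0, 18.0, 27.5, 11.0]

# Arrives with the request:
current_amount = 35.0

score = transaction_amount_percentile(last_10_amounts, current_amount)
# A score near 1.0 flags a suspiciously high transaction.
```

A fraud model would consume `score` alongside other features in the returned vector.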
As you can see, all of the components in Tecton's architecture work together to fulfill the data flow that you defined using Tecton's Feature Definition framework. We take care of operational concerns like storage and serving, freeing you to focus on feature logic and model development.
Let's take a look at each of these components of Tecton's Feature Store in a little more depth.
Feature Serving Layer
Tecton's Feature Serving layer provides endpoints for fetching feature data in a consistent way across training and serving.
When retrieving data offline (e.g., for training), feature values are accessed through the notebook and IDE-friendly Tecton SDK. Tecton provides point-in-time correct feature values for each example used to train a model (a.k.a. “time-travel”).
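"Point-in-time correct" means each training example only sees the feature value that was current at that example's timestamp. A minimal stand-alone sketch of such a lookup (not the Tecton SDK, which handles this for you):

```python
# Sketch of a point-in-time ("time-travel") lookup; not the Tecton SDK.
from bisect import bisect_right
from typing import Optional

def value_as_of(history, ts: int) -> Optional[float]:
    """Return the latest feature value whose timestamp is <= ts.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    times = [t for t, _ in history]
    i = bisect_right(times, ts)
    return history[i - 1][1] if i else None

history = [(100, 1.0), (200, 2.0), (300, 3.0)]
# A training example at ts=250 must see the value written at ts=200,
# never the "future" value written at ts=300.
```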
For online serving, Tecton delivers a single vector of the latest features. Responses are served through a REST API backed by a low-latency database.
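As a rough sketch, an online request is a small JSON payload naming the feature service and the entity keys to look up. The field names and structure below are assumptions for illustration; consult Tecton's HTTP API reference for the exact contract of your deployment.

```python
import json

# Assumed request shape for illustration -- verify against Tecton's
# HTTP API documentation before relying on these field names.
payload = {
    "params": {
        "workspace_name": "prod",
        "feature_service_name": "fraud_detection",
        "join_key_map": {"user_id": "user_123"},
    }
}
body = json.dumps(payload)
# A client would POST `body` to the cluster's feature-serving endpoint
# and receive back a single vector of the latest feature values.
```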
Feature Storage Layer
Tecton stores your feature data in both an offline and an online feature store to support the different requirements of feature serving systems.
The offline store contains historical feature values across time and is accessed in batch for training data generation or for batch inference. The offline feature store that Tecton uses is configurable, but Tecton defaults to using S3 as the offline store.
The online store contains the latest feature values for low-latency retrieval. Tecton uses DynamoDB as an online feature store.
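The two stores differ in what they retain: the online store keeps only the latest value per entity key, while the offline store accumulates history. An in-memory sketch of that distinction, with plain dicts standing in for DynamoDB and S3:

```python
# In-memory stand-ins for the two stores; real deployments use DynamoDB / S3.
online_store = {}   # latest value per entity key
offline_store = {}  # full (timestamp, value) history per entity key

def materialize(key: str, ts: int, value: float) -> None:
    """Write a freshly computed feature value to both stores."""
    online_store[key] = value                              # overwrite: latest only
    offline_store.setdefault(key, []).append((ts, value))  # append: history

materialize("user_123", 100, 1.0)
materialize("user_123", 200, 2.0)
# online_store now holds only 2.0; offline_store keeps both points.
```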
Tecton populates the online and offline stores by executing the feature pipelines you authored and storing the results for serving.
Feature Transformation Layer
Tecton can record and orchestrate the transformations that are executed against raw data to produce features.
Transformations are registered in the Feature Registry, as seen in the previous Frameworks & Concepts section. By tracking transformation logic, Tecton can guarantee that features used for training and serving are calculated identically, hence giving you consistent data between training and serving a model.
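The consistency guarantee follows from there being a single registered transformation feeding both paths. A minimal sketch of the idea, with a hypothetical transformation function reused offline and online:

```python
# Sketch: one transformation definition reused for both offline (training)
# and online (serving) computation, so values match by construction.
# The function and data are hypothetical.

def amount_zscore(amount: float, mean: float, std: float) -> float:
    """A single registered transformation used by both paths."""
    return (amount - mean) / std if std else 0.0

# Offline: applied in batch over historical rows to build training data.
training_rows = [(10.0, 20.0, 5.0), (30.0, 20.0, 5.0)]
training_features = [amount_zscore(a, m, s) for a, m, s in training_rows]

# Online: the identical function applied to one request at serving time.
serving_feature = amount_zscore(30.0, 20.0, 5.0)
# training_features[1] == serving_feature, so there is no train/serve skew.
```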
Tecton supports three main types of Transformations:
| Feature Type | Definition | Common input data source | Example |
| --- | --- | --- | --- |
| Batch Transform | Transformations that are applied only to data at rest | Data warehouse, data lake, database | User country, product category |
| Streaming Transform | Transformations that are applied to streaming sources | Kafka, Kinesis, PubSub | # of clicks per vertical per user in last 30 minutes; # of views per listing in past hour |
| On-demand Transform | Transformations that produce features from data that is only available at the time of the prediction; these features cannot be precomputed | User-facing application | Is the user currently in a supported location?; similarity score between listing and search query |
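The defining property of an on-demand transform is that it consumes only request-time data. A hypothetical sketch (not Tecton's decorator syntax) of the "supported location" example from the table:

```python
# Sketch of an on-demand transform: it can only use data available at
# prediction time (the request payload); nothing is precomputed.
# The lookup set and field names are hypothetical.
SUPPORTED_COUNTRIES = {"US", "CA", "GB"}

def user_in_supported_location(request: dict) -> bool:
    """Feature computed purely from the incoming request."""
    return request.get("country") in SUPPORTED_COUNTRIES
```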
Tecton's Transformation Layer also manages the execution of these transformations. Since models need access to fresh feature values for inference, Tecton can run transformations on a regular schedule and store the results. When users need historical feature values, Tecton can run "backfill" jobs that generate past values of a feature.
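Conceptually, a backfill replays one transformation over past time windows to fill in the values that regular scheduled runs would have produced. An illustrative sketch (names and windowing are hypothetical, not Tecton's job machinery):

```python
# Sketch of a backfill: replay one windowed transformation over past
# time ranges to generate historical feature values.

def count_in_window(events: list, start: int, end: int) -> int:
    """Number of event timestamps with start <= t < end."""
    return sum(1 for t in events if start <= t < end)

def backfill(events: list, t0: int, t1: int, window: int) -> list:
    """(window_end, feature_value) for each window boundary in (t0, t1]."""
    return [(end, count_in_window(events, end - window, end))
            for end in range(t0 + window, t1 + 1, window)]

events = [1, 2, 5, 7, 8, 9]           # raw event timestamps
history = backfill(events, 0, 10, 5)  # two 5-unit windows: [0,5) and [5,10)
```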
Feature Registry
Tecton's central registry tracks feature definitions and metadata. It serves as a source of truth for what's running in production.
Teams use the registry as a catalog to explore, develop, and publish new feature definitions. The Tecton system uses the feature registry to decide how to compute, store and serve feature values.
The state of the Feature Registry can be read from the Web UI. To edit the Feature Registry, edit the repository that defines the registry, then apply the changes using the Tecton CLI.
Tecton's architecture enables you to create a fully operationalized feature store from a repository of declarative feature definitions. Serving, storage, materialization and monitoring are fully managed, allowing you to focus on feature logic and model development.