Tecton is a Feature Store, which is made up of several high-level components.
Tecton provides serving endpoints, which are used to fetch feature data in a consistent way for training and serving.
When retrieving data offline (e.g. for training), feature values are accessed through the notebook-friendly Tecton SDK. Tecton provides point-in-time correct feature values for each example used to train a model (a.k.a. “time-travel”).
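To make the "time-travel" idea concrete, here is a minimal sketch of a point-in-time lookup in plain Python. It does not use the Tecton SDK; the `history` data and the `value_as_of` helper are hypothetical names chosen for illustration. The point is that each training example only sees the feature value that was known at its own timestamp, never a later one.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one entity: (effective_time, value), sorted by time.
history = [
    (datetime(2021, 1, 1), 3),
    (datetime(2021, 1, 8), 5),
    (datetime(2021, 1, 15), 9),
]


def value_as_of(history, event_time):
    """Return the latest value whose timestamp is <= event_time, or None."""
    times = [t for t, _ in history]
    idx = bisect_right(times, event_time)
    return history[idx - 1][1] if idx else None


# Each training example carries its own event timestamp.
example_times = [datetime(2021, 1, 10), datetime(2021, 1, 1), datetime(2020, 12, 31)]
training_values = [value_as_of(history, ts) for ts in example_times]
```

An example dated before the first recorded value gets `None` rather than a value from the future, which is what prevents label leakage in the training set.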
For online serving, Tecton delivers a single vector of the latest features. Responses are served through a REST API backed by a low-latency database.
Tecton stores your feature data in both an offline and online feature store to support the different requirements of feature serving systems.
The offline store contains historical feature values across time, and is accessed in batch for training data generation. The offline feature store that Tecton uses is configurable, but Tecton defaults to using Delta Lake as the offline store.
The online store contains the latest feature values for low-latency retrieval. Tecton uses DynamoDB as an online feature store.
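The difference between the two stores can be sketched with a toy in-memory model. This is an illustration only, not how Tecton, Delta Lake, or DynamoDB are implemented; `ToyFeatureStore` and its methods are hypothetical. The offline side keeps every value over time, while the online side keeps only the latest value per key.

```python
class ToyFeatureStore:
    """Illustrative only: a toy model of offline vs. online storage."""

    def __init__(self):
        self.offline = {}  # entity key -> list of (timestamp, value)
        self.online = {}   # entity key -> (timestamp, value) of the latest write

    def write(self, key, timestamp, value):
        # The offline store appends every value so history is preserved.
        self.offline.setdefault(key, []).append((timestamp, value))
        # The online store keeps only the most recent value per key.
        current = self.online.get(key)
        if current is None or timestamp >= current[0]:
            self.online[key] = (timestamp, value)

    def get_online(self, key):
        entry = self.online.get(key)
        return None if entry is None else entry[1]

    def get_history(self, key):
        return sorted(self.offline.get(key, []), key=lambda row: row[0])


store = ToyFeatureStore()
store.write("user_1", 1, "US")
store.write("user_1", 3, "CA")
store.write("user_1", 2, "MX")  # late-arriving value; must not replace the newer one
```

The timestamp comparison in `write` is one possible design choice: it keeps a late-arriving record from overwriting a fresher value in the online view, while the record is still retained in the offline history.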
Feature data arrives in Tecton in one of two ways. If you author feature pipelines using Tecton, Tecton can execute them on a schedule and store the results for serving. Alternatively, if your feature pipelines run outside Tecton, you can push feature values to Tecton using our Python SDK.
Tecton can record and orchestrate the transformations that are executed against raw data to produce features.
Transformations are configured in the Feature Registry. By tracking transformation logic, Tecton can guarantee that features used for training and serving are calculated identically.
Tecton supports three main types of Transformations:
| Feature Type | Definition | Common input data source | Example |
|---|---|---|---|
| Batch Transform | Transformations that are applied only to data at rest | Data warehouse, data lake, database | User country, product category |
| Streaming Transform | Transformations that are applied to streaming sources | Kafka, Kinesis, PubSub | # of clicks per vertical per user in last 30 minutes, # of views per listing in past hour |
| On-demand Transform | Transformations that are used to produce features based on data that is only available at the time of the prediction. These features cannot be pre-computed. | User-facing application | Is the user currently in a supported location? / Similarity score between listing and search query |
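As an illustration of the streaming example in the table (clicks per user in the last 30 minutes), the following sketch keeps a sliding time window over events. It is plain Python rather than a Tecton transformation, `SlidingWindowCounter` is a hypothetical name, and it assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindowCounter:
    """Counts events that fall within `window` of the query time.

    Assumes events are added in non-decreasing timestamp order.
    """

    def __init__(self, window):
        self.window = window
        self.events = deque()

    def add(self, event_time):
        self.events.append(event_time)

    def count(self, now):
        # Evict events at or before the cutoff, then count what remains.
        cutoff = now - self.window
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()
        return len(self.events)


t0 = datetime(2021, 1, 1, 12, 0)
clicks = SlidingWindowCounter(window=timedelta(minutes=30))
for offset in (0, 10, 25):
    clicks.add(t0 + timedelta(minutes=offset))

count_at_29 = clicks.count(t0 + timedelta(minutes=29))
count_at_35 = clicks.count(t0 + timedelta(minutes=35))
count_at_60 = clicks.count(t0 + timedelta(minutes=60))
```

Note that `count` evicts expired events as a side effect, so queries are expected to move forward in time, as they would against a live stream.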
A key benefit is that different types of features can be used together in the same models.
Tecton also manages the execution of these transformations. Because models need fresh feature values for inference, Tecton can compute and store feature values on a regular schedule. When users need historical feature values, Tecton can run "backfill jobs" that generate them.
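Conceptually, a backfill re-runs the same transformation over past time periods. The sketch below shows that idea in plain Python under simplifying assumptions: a daily schedule, an in-memory list of raw rows, and hypothetical function names (`daily_purchase_total`, `backfill`). It is not how Tecton schedules or executes jobs.

```python
from datetime import date, timedelta


def daily_purchase_total(raw_rows, day):
    """Hypothetical batch transformation: sum of purchase amounts on `day`."""
    return sum(amount for row_day, amount in raw_rows if row_day == day)


def backfill(transform, raw_rows, start, end):
    """Run `transform` once per day in [start, end] and collect the results."""
    results = {}
    day = start
    while day <= end:
        results[day] = transform(raw_rows, day)
        day += timedelta(days=1)
    return results


# Hypothetical raw data: (purchase_date, amount).
raw_rows = [
    (date(2021, 1, 1), 10.0),
    (date(2021, 1, 1), 5.0),
    (date(2021, 1, 3), 7.5),
]

backfilled = backfill(daily_purchase_total, raw_rows, date(2021, 1, 1), date(2021, 1, 3))
```

Because the backfill calls the same `transform` function that a scheduled run would call, historical and fresh values are computed by identical logic.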
Tecton helps monitor both data quality and operational performance, so your models can reliably receive fresh feature data within latency requirements.
Currently, Tecton monitors the following operational metrics:
Tecton will also soon provide data quality monitoring and data drift monitoring. Stay tuned!
Tecton serves as a central registry for features and as the source of truth for what is running in production.
Teams use the registry as a catalog to explore, develop, and publish new feature definitions. The Tecton system uses the feature registry to decide how to compute, store and serve feature values.
The state of the Feature Registry can be read from the Web UI. To edit the Feature Registry, edit the repository that defines the registry using the Tecton CLI.
If you're just getting started using a feature store, you should start by looking at the Tecton tutorial, which walks through the core workflows of interacting with the feature store including all of the components previously described.