Tecton is a Feature Platform for the enterprise.
The Feature Store is a core component of the Feature Platform.
Features in Tecton are typically defined as transformation pipelines managed by Tecton. This brings many benefits such as:
- Feature definitions as code
- Feature pipeline lineage
- Automatic backfilling
- Pipeline orchestration and job management
Features can also be computed externally and written directly to the Feature Store via Tecton's ingestion APIs. This data can then be fetched offline for model training or online for model inference.
A Feature Store is a system designed to store and serve feature values for machine learning applications.
A Feature Platform addresses the full range of challenges in building and maintaining features for operational machine learning. In addition to storing and serving features, Feature Platforms have support for defining, testing, orchestrating, monitoring, and managing features.
For more information on Feature Platforms, see What is a Feature Platform?
Key Tecton Concepts
A Tecton Repository is a collection of Python files containing Tecton Object Definitions, which define feature pipelines and other dataflows within Tecton's framework. These repositories are typically stored in a Source Control Repository, like Git, enabling version control and collaboration.
Workspaces are remote environments where Tecton Object Definitions are applied and turned into data pipelines and services orchestrated and managed by Tecton.
When a Tecton Repository is applied to a workspace using the tecton apply command, via the Tecton CLI or a CI/CD pipeline, the Object Definitions within the repository update the Workspace Configuration. The Workspace Configuration encompasses the set of Tecton Object Definitions currently active in a workspace.
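To make the relationship between a repository and the Workspace Configuration concrete, here is a minimal sketch in plain Python. It does not use the Tecton SDK: `ObjectDefinition` and `plan_apply` are hypothetical names, and the sketch assumes the repository is treated as the complete desired state, so objects missing from it are removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectDefinition:
    """Hypothetical stand-in for a Tecton Object Definition."""
    name: str
    body: str  # e.g. the source text of the definition


def plan_apply(workspace_config, repository):
    """Compare a repository against the current workspace configuration.

    Both arguments map object name -> ObjectDefinition. Returns the names
    to create, update, and delete so the workspace matches the repository.
    """
    create = sorted(n for n in repository if n not in workspace_config)
    delete = sorted(n for n in workspace_config if n not in repository)
    update = sorted(
        n for n in repository
        if n in workspace_config and repository[n] != workspace_config[n]
    )
    return {"create": create, "update": update, "delete": delete}


current = {
    "user_clicks": ObjectDefinition("user_clicks", "v1"),
    "old_view": ObjectDefinition("old_view", "v1"),
}
repo = {
    "user_clicks": ObjectDefinition("user_clicks", "v2"),
    "user_spend": ObjectDefinition("user_spend", "v1"),
}
plan = plan_apply(current, repo)
print(plan)
```

The point of the sketch is only that applying is a comparison between two sets of definitions; how Tecton computes and presents its own plan may differ.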
It's common for different teams and organizations to have their own workspaces set up with their own access controls.
There are two types of workspaces:
Live Workspaces generate real-time endpoints accessible in the production environment. These workspaces utilize data infrastructure resources, which can lead to significant infrastructure costs. They are intended for serving production applications and can also be used to create staging environments for testing feature pipelines before moving to production.
Feature definitions applied to a live workspace begin materializing according to the materialization configuration of their Feature Views.
Development Workspaces are used for development and testing. They do not connect to the production environment, serve features in real time, or materialize feature data (regardless of the materialization configuration), and so avoid substantial infrastructure costs. Feature pipelines in development workspaces are discoverable via the Web UI and can be fetched and run ad hoc from a notebook for testing.
Materialization is the process of precomputing feature data by executing a feature pipeline and then publishing the results to the Online or Offline Feature Store. Materialization enables low-latency feature retrieval online and faster offline feature retrieval queries.
Materialization can be scheduled by Tecton or triggered via an API.
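The idea can be illustrated with a toy sketch in plain Python. This is not Tecton code: `feature_pipeline` and `materialize` are hypothetical, the offline store is modeled as a list that keeps every computed row, and the online store as a dict that keeps only the latest row per key.

```python
from collections import defaultdict


def feature_pipeline(events):
    """Toy pipeline: running total of purchase amounts per user."""
    totals = defaultdict(float)
    rows = []
    for event in sorted(events, key=lambda e: e["ts"]):
        totals[event["user_id"]] += event["amount"]
        rows.append({
            "user_id": event["user_id"],
            "ts": event["ts"],
            "total_spend": totals[event["user_id"]],
        })
    return rows


def materialize(events, offline_store, online_store):
    """Run the pipeline once and publish the precomputed results."""
    for row in feature_pipeline(events):
        offline_store.append(row)           # keep every historical value
        online_store[row["user_id"]] = row  # rows arrive in time order, so the last write is the latest


events = [
    {"user_id": "u1", "ts": 1, "amount": 10.0},
    {"user_id": "u2", "ts": 2, "amount": 5.0},
    {"user_id": "u1", "ts": 3, "amount": 2.5},
]
offline_store, online_store = [], {}
materialize(events, offline_store, online_store)
```

After this runs, a lookup such as `online_store["u1"]` is a single dictionary access with no pipeline execution at read time, which is the reason precomputing helps retrieval latency.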
Feature Compute Engines
Feature pipelines can use different compute engines depending on the requirements of the use case, including Python, Spark SQL, PySpark, Snowflake SQL, and Snowpark.
Offline Feature Store
The Offline Feature Store holds all historical values of features to be used for training or batch inference. Tecton plugs directly into your data lake or warehouse and uses it as the offline store.
Online Feature Store
The Online Feature Store is a low-latency key-value store that holds the latest values of pre-computed features. It is located in the online environment and is used to look up features for online inference.
The Feature Server is a managed Tecton service that fetches real-time feature values from the Online Feature Store, runs real-time feature transformations, and returns feature vectors via a REST API.
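The three steps the Feature Server performs can be sketched in plain Python. Everything here is illustrative: the store is a dict, and `realtime_transform`, `get_feature_vector`, and the feature names are hypothetical rather than part of Tecton.

```python
def realtime_transform(precomputed, request_data):
    """Hypothetical real-time feature: is this amount above the user's average?"""
    return {"amount_over_avg": request_data["amount"] > precomputed["avg_purchase"]}


def get_feature_vector(online_store, join_key, request_data):
    """Look up precomputed values, run the real-time transformation, and merge."""
    precomputed = online_store[join_key]
    vector = dict(precomputed)  # copy so the store itself is not mutated
    vector.update(realtime_transform(precomputed, request_data))
    return vector


online_store = {"u1": {"avg_purchase": 20.0, "purchases_7d": 4}}
vector = get_feature_vector(online_store, "u1", {"amount": 35.0})
print(vector)
```

The returned vector combines values that were computed ahead of time with a value that can only be computed once the request data is known.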
Monitoring and Alerting
There are two types of monitoring that Tecton manages:
- System Monitoring: Tecton monitors for job health, service uptime, and request latencies.
- Data Quality Monitoring (Public Preview): Tecton monitors the quality of the data generated by feature pipelines, as well as data drift in features between training and serving.
Tecton has built-in alerting for system-related and data-quality-related issues.
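As an illustration of what a drift check between training and serving data can look like, here is a minimal sketch. It is not Tecton's method, which this page does not describe: the metric (shift of the serving mean in units of the training standard deviation), the threshold, and the function names are all assumptions chosen for simplicity.

```python
from statistics import mean, pstdev


def drift_score(training_values, serving_values):
    """Shift of the serving mean, in units of the training standard deviation."""
    spread = pstdev(training_values)
    shift = abs(mean(serving_values) - mean(training_values))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


def check_drift(training_values, serving_values, threshold=0.5):
    """Return an alert message if the score exceeds the threshold, else None."""
    score = drift_score(training_values, serving_values)
    if score > threshold:
        return f"drift alert: score {score:.2f} exceeds {threshold}"
    return None


training = [10.0, 12.0, 11.0, 9.0, 13.0]
stable = [11.0, 10.0, 12.0]
shifted = [18.0, 19.0, 20.0]

print(check_drift(training, stable))
print(check_drift(training, shifted))
```

A production check would typically compare full distributions rather than means, but the structure is the same: compute a score per feature, compare it to a threshold, and raise an alert when it is exceeded.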
Tecton Product Interfaces
Tecton's Python SDK is used for:
- Defining and testing feature pipelines and services
- Generating datasets for training and batch inference
Tecton Web UI
The Tecton Web UI allows users to browse, discover, and manage feature pipelines. In the web UI you can also:
- Understand lineage
- Configure access controls
- Manage users
- Check the status of backfills
- Monitor, cancel, and re-run materialization jobs
- Understand data quality
- Monitor online feature services
Each Tecton account is associated with a URL and contains all of the workspaces created in that account.
Tecton CLI
The Tecton CLI is primarily used for applying feature repository changes to a workspace (via the tecton apply command) and creating API keys.
Tecton HTTP API
Tecton's HTTP API is used to retrieve feature vectors online at low latency in order to power online model inference.
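A request for a feature vector might be built as in the sketch below, which uses only the Python standard library and does not send anything. The endpoint path, the `Tecton-key` authorization scheme, and the body field names are assumptions that should be checked against the HTTP API reference for your Tecton version; the base URL, API key, workspace, and feature service names are placeholders.

```python
import json
import urllib.request


def build_get_features_request(base_url, api_key, workspace, feature_service, join_keys):
    """Build (but do not send) a POST request for one feature vector."""
    body = {
        "params": {
            "workspace_name": workspace,              # assumed field name
            "feature_service_name": feature_service,  # assumed field name
            "join_key_map": join_keys,                # assumed field name
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/feature-service/get-features",  # assumed path
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Tecton-key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_get_features_request(
    base_url="https://example.invalid",  # placeholder, not a real account URL
    api_key="PLACEHOLDER_KEY",
    workspace="prod",
    feature_service="fraud_detection",
    join_keys={"user_id": "u1"},
)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```

Keeping request construction separate from sending makes the payload easy to inspect and test before pointing it at a live workspace.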