Tecton Concepts
Tecton is a Feature Platform for the enterprise.
The Feature Store is a core component of the Feature Platform.
Features in Tecton are typically defined as transformation pipelines managed by Tecton. This brings many benefits such as:
- Feature definitions as code
- Feature pipeline lineage
- Automatic backfilling
- Pipeline orchestration and job management
- Monitoring
Features can also be computed externally and written directly to the Feature Store via Tecton's ingestion APIs. This data can then be fetched offline for model training or online for model inference.
A Feature Store is a system designed to store and serve feature values for machine learning applications.
A Feature Platform addresses the full range of challenges in building and maintaining features for operational machine learning. In addition to storing and serving features, Feature Platforms have support for defining, testing, orchestrating, monitoring, and managing features.
For more information on Feature Platforms, see What is a Feature Platform?
Key Tecton Concepts
Tecton Repository
A Tecton Repository is a collection of Python files containing Tecton Object Definitions, which define feature pipelines and other dataflows within Tecton's framework. These repositories are typically stored in a Source Control Repository, like Git, enabling version control and collaboration.
Workspace
Workspaces are remote environments where Tecton Object Definitions are applied and turned into data pipelines and services orchestrated and managed by Tecton.
When a Tecton Repository is applied to a workspace using the tecton apply command, via the Tecton CLI or a CI/CD pipeline, the Object Definitions within the repository update the Workspace Configuration. The Workspace Configuration encompasses the set of Tecton Object Definitions currently active in a workspace.
It's common for different teams and organizations to have their own workspaces set up with their own access controls.
There are two types of workspaces:
- Live: Live Workspaces generate real-time endpoints accessible in the production environment. These workspaces utilize data infrastructure resources, which can lead to significant infrastructure costs. They are intended for serving production applications and can also be used to create staging environments for testing feature pipelines before moving to production. Feature definitions that are applied to a live workspace will begin materialization according to the materialization configuration of the Feature Views (e.g. online=True or offline=True).
- Development: Development Workspaces are used for development and testing purposes. These workspaces do not connect to the production environment, serve features in real time, or materialize feature data (regardless of the materialization configuration), thus avoiding substantial infrastructure costs. Features in development workspaces are discoverable via the Web UI and can be fetched and run ad hoc from a notebook for testing.
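The difference between the two workspace types can be summarized as a small decision rule. The sketch below is a toy model in plain Python, not Tecton SDK code: the FeatureViewConfig class and the materialization_targets function are hypothetical names introduced only for illustration, and only the online and offline flags come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FeatureViewConfig:
    """Hypothetical stand-in for a Feature View's materialization flags."""
    name: str
    online: bool = False
    offline: bool = False


def materialization_targets(workspace_is_live: bool, fv: FeatureViewConfig) -> List[str]:
    """Return the stores that would be materialized for this Feature View."""
    if not workspace_is_live:
        # Development workspaces never materialize, regardless of the flags.
        return []
    targets = []
    if fv.online:
        targets.append("online")
    if fv.offline:
        targets.append("offline")
    return targets


# Hypothetical Feature View with both flags enabled.
fv = FeatureViewConfig(name="user_transaction_counts", online=True, offline=True)
live_targets = materialization_targets(True, fv)
dev_targets = materialization_targets(False, fv)
print("live:", live_targets)
print("development:", dev_targets)
```

The same definition applied to a development workspace yields no materialization targets, which is why development workspaces avoid the infrastructure costs described above.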
Materialization
Materialization is the process of precomputing feature data by executing a feature pipeline and then publishing the results to the Online or Offline Feature Store. Materialization enables low-latency feature retrieval online and faster offline feature retrieval queries.
Materialization can be scheduled by Tecton or triggered via an API.
Feature Compute Engines
Tecton comes with a built-in feature compute engine called Rift, which can compute batch, stream, and real-time features consistently online and offline, all using vanilla Python and SQL. Rift can also optionally plug into data warehouses like Snowflake or BigQuery and push compute down to the warehouse when appropriate.
Tecton can also leverage Spark for computing batch and stream features or generating training data. Tecton connects to data platforms like Google Cloud Dataproc, AWS EMR, and Databricks to execute and manage Spark jobs.
Offline Feature Store
The Offline Feature Store holds all historical values of features to be used for training or batch inference. Tecton plugs directly into your data lake or warehouse and uses it as the offline store.
Online Feature Store
The Online Feature Store is a low-latency key-value store that holds the latest values of pre-computed features. It is located in the online environment and is used to look up features for online inference.
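To make the relationship between materialization and the two stores concrete, here is a minimal sketch in plain Python. It is a toy model under stated assumptions, not how Tecton implements its stores: the events, the total_spend feature, and the materialize function are all hypothetical. It shows only the shape of the data: the offline store keeps every historical value, while the online store keeps the latest value per key.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: (user_id, event_time, amount).
events = [
    ("u1", datetime(2024, 1, 1), 10.0),
    ("u1", datetime(2024, 1, 2), 5.0),
    ("u2", datetime(2024, 1, 2), 7.5),
]


def materialize(raw_events):
    """Run a toy feature pipeline (running total per user) and publish the results."""
    offline_store = []  # every historical feature value, with its timestamp
    online_store = {}   # only the latest feature value per key
    totals = defaultdict(float)
    for user_id, ts, amount in sorted(raw_events, key=lambda e: e[1]):
        totals[user_id] += amount
        offline_store.append(
            {"user_id": user_id, "timestamp": ts, "total_spend": totals[user_id]}
        )
        # Overwrite: the online store is a key-value lookup of the latest value.
        online_store[user_id] = {"total_spend": totals[user_id]}
    return offline_store, online_store


offline_store, online_store = materialize(events)
print(len(offline_store), "historical rows")
print(online_store)
```

Training and batch inference would read from the list of historical rows, while online inference would do a single key lookup in the dictionary.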
Feature Server
The Feature Server is a managed Tecton service that fetches feature values from the Online Feature Store, runs real-time feature transformations, and returns feature vectors via a REST API.
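The three steps a feature server performs (look up precomputed values, run a real-time transformation on request data, return a feature vector) can be sketched as follows. This is an illustrative toy in plain Python, not Tecton's Feature Server: the ONLINE_STORE dictionary, the avg_amount_30d feature, and both function names are hypothetical.

```python
# Toy online store: latest precomputed features per user (hypothetical data).
ONLINE_STORE = {"u1": {"avg_amount_30d": 20.0}}


def amount_ratio(request_amount, avg_amount):
    """Hypothetical real-time transformation combining request data with a stored feature."""
    return request_amount / avg_amount if avg_amount else None


def get_feature_vector(user_id, request_amount):
    """Look up precomputed features, run the real-time transformation, return a vector."""
    stored = ONLINE_STORE.get(user_id, {})
    avg = stored.get("avg_amount_30d")
    return {
        "avg_amount_30d": avg,
        "amount_vs_avg": amount_ratio(request_amount, avg),
    }


known = get_feature_vector("u1", 50.0)
unknown = get_feature_vector("u404", 50.0)
print(known)
print(unknown)
```

For a key that is missing from the store, this sketch returns None for both features; how a real service handles missing keys is a design decision outside the scope of this example.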
Monitoring and Alerting
There are two types of monitoring that Tecton manages:
- System Monitoring: Tecton monitors for job health, service uptime, and request latencies.
- Data Quality Monitoring (Public Preview): Tecton monitors the quality of the data generated by feature pipelines, as well as data drift in features between training and serving.
Tecton has built-in alerting for system-related and data-quality-related issues.
Tecton Product Interfaces
Tecton SDK
Tecton's Python SDK is used for:
- Defining and testing feature pipelines and services
- Generating datasets for training and batch inference
Tecton Web UI
The Tecton Web UI allows users to browse, discover, and manage feature pipelines. In the Web UI you can also:
- Understand lineage
- Configure access controls
- Manage users
- Check the status of backfills
- Monitor, cancel, and re-run materialization jobs
- Understand data quality
- Monitor online feature services
Each Tecton account is associated with a URL (e.g. <your-account>.tecton.ai) and contains all of the workspaces created in that account.
Tecton CLI
The Tecton CLI is primarily used for managing workspaces, applying feature repository changes to a workspace (via the tecton apply command), and creating API keys.
Tecton HTTP API
Tecton's HTTP API is used for retrieving feature vectors at low latency online in order to power online model inference.
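As a rough illustration of what a feature-vector request might look like, the sketch below builds (but does not send) an HTTP request using only the Python standard library. The endpoint path, the Tecton-key authorization scheme, and the body field names are assumptions that should be checked against Tecton's HTTP API reference; the account, API key, workspace, feature service, and join key values are placeholders.

```python
import json
import urllib.request


def build_get_features_request(account, api_key, workspace, feature_service, join_keys):
    """Build a POST request for a feature vector. Nothing is sent over the network."""
    # Assumed endpoint path; verify against the HTTP API reference.
    url = f"https://{account}.tecton.ai/api/v1/feature-service/get-features"
    # Assumed request body shape; verify against the HTTP API reference.
    body = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": feature_service,
            "join_key_map": join_keys,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Tecton-key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All argument values below are placeholders.
req = build_get_features_request(
    account="your-account",
    api_key="MY_API_KEY",
    workspace="prod",
    feature_service="fraud_detection_feature_service",
    join_keys={"user_id": "u1"},
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it. In practice the API key would be read from a secret store or environment variable rather than hard-coded.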