Version: 0.4

Tecton Concepts

Tecton is a Feature Platform for the enterprise.

Feature Store Components

The Feature Store is a core component of the Feature Platform.

Features can be computed externally and written directly to the Feature Store via Tecton's ingestion APIs. This data can then be fetched offline for model training or online for model inference.
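For intuition, here is a minimal sketch of what a record pushed through an ingestion API might look like. The field names (`workspace_name`, `push_source_name`, and the feature columns) are illustrative assumptions, not the actual Tecton ingestion schema:

```python
import json
from datetime import datetime, timezone

# Hypothetical payload for pushing an externally computed feature value
# into the Feature Store. Field names are illustrative only.
record = {
    "workspace_name": "prod",                 # assumed target workspace
    "push_source_name": "user_click_counts",  # hypothetical push source
    "record": {
        "user_id": "user_123",                # entity join key
        "clicks_7d": 42,                      # precomputed feature value
        "timestamp": datetime.now(timezone.utc).isoformat(),
    },
}

payload = json.dumps(record)
```

Once ingested, the same value would be retrievable offline for training and online for inference.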

Features can also be defined as transformation pipelines managed by Tecton. This brings many benefits such as:

  • Feature definitions as code
  • Feature pipeline lineage
  • Automatic backfilling
  • Pipeline orchestration and job management
  • Monitoring
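To illustrate the "feature definitions as code" idea, here is a toy sketch in plain Python: transformations are declared as functions and collected into a registry that a platform could interpret and orchestrate. This mimics the declarative style only; it is not the Tecton SDK, and all names are hypothetical:

```python
# Toy registry illustrating "feature definitions as code".
FEATURE_REGISTRY = {}

def feature_view(name, entities):
    """Register a transformation function as a named feature view."""
    def decorator(fn):
        FEATURE_REGISTRY[name] = {"entities": entities, "transform": fn}
        return fn
    return decorator

@feature_view(name="user_click_counts", entities=["user_id"])
def user_click_counts(events):
    # Count click events per user from a list of event dicts.
    counts = {}
    for e in events:
        counts[e["user_id"]] = counts.get(e["user_id"], 0) + 1
    return counts

events = [{"user_id": "u1"}, {"user_id": "u1"}, {"user_id": "u2"}]
features = FEATURE_REGISTRY["user_click_counts"]["transform"](events)
```

Because the definition lives in code, it can be versioned in Git, reviewed, and traced end to end, which is where the lineage and orchestration benefits above come from.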
Feature Store vs. Feature Platform

A Feature Store is a system designed to store and serve feature values for machine learning applications.

A Feature Platform addresses the full range of challenges in building and maintaining features for operational machine learning. In addition to storing and serving features, Feature Platforms have support for defining, testing, orchestrating, monitoring, and managing features.

For more information on Feature Platforms, see What is a Feature Platform?

Key Tecton Concepts

Feature Repository

A feature repository is a collection of declarative Python files that define feature pipelines using Tecton’s Feature Definition Framework (part of the Tecton Python SDK). Feature repositories are typically backed by Git.

Workspace

Workspaces are remote environments that manage collections of feature pipelines and services. Feature repositories get “applied” to workspaces via the Tecton CLI or a CI/CD pipeline. Tecton then interprets the feature definitions, turns them into physical pipelines that it manages, and makes the features discoverable in Tecton's Web UI. Typically, different teams or organizations set up their own workspaces with their own access controls.

There are two different types of workspaces:

  1. Live:

    Live workspaces have full access to compute, storage, serving, and monitoring resources and are used for serving production applications. Organizations can have one or more live workspaces depending on their organizational needs for sharing and access controls. Live workspaces can also be useful for creating staging environments to test live feature pipelines before moving them to production.

    Feature definitions that are applied to a live workspace will begin materialization according to the materialization configuration of the Feature Views (e.g. online=True or offline=True).

  2. Development:

    Development workspaces are used for experimentation and development of new feature pipelines. Development workspaces have no access to compute, storage, serving, and monitoring resources. Feature definitions that are applied to a development workspace will not use any resources or incur any cost, regardless of the materialization configuration.

Features in development workspaces are discoverable via the Web UI and can be fetched and run ad hoc from a notebook for testing.

Materialization

Materialization is the process of precomputing feature data by executing a feature pipeline and then publishing the results to the Online or Offline Feature Store. Materialization enables low-latency feature retrieval online and faster offline feature retrieval queries.
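The process can be sketched in a few lines: run a feature pipeline over raw data, append the results to an offline store (full history), and upsert them into an online store (latest value per entity key). The store layouts here are illustrative, not Tecton internals:

```python
from datetime import datetime

# Minimal sketch of materialization into two stores.
offline_store = []   # append-only history of feature rows
online_store = {}    # entity key -> latest feature row

def materialize(raw_rows):
    for row in raw_rows:
        feature_row = {
            "user_id": row["user_id"],
            "clicks_7d": row["clicks"],   # precomputed feature value
            "timestamp": row["timestamp"],
        }
        offline_store.append(feature_row)            # keep all history offline
        online_store[row["user_id"]] = feature_row   # overwrite latest online

materialize([
    {"user_id": "u1", "clicks": 3, "timestamp": datetime(2023, 1, 1)},
    {"user_id": "u1", "clicks": 5, "timestamp": datetime(2023, 1, 2)},
])
```

After materialization, an online lookup returns the latest row for `u1`, while the offline store retains both rows for training queries.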

Feature Compute Engines

Feature pipelines can leverage different types of feature compute depending on the requirements of the use case. Supported options include Python, Spark SQL, PySpark, Snowflake SQL, and Snowpark.

Offline Feature Store

The Offline Feature Store holds all historical values of features to be used for training or batch inference. Tecton plugs directly into your data lake or data warehouse and uses it as the offline store.
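A key use of historical values is point-in-time ("as of") retrieval: for training, you fetch the latest feature value at or before each training example's timestamp, so training data matches what would have been served at that moment. A minimal sketch, with an assumed in-memory history:

```python
# Illustrative feature history: (entity key, timestamp, value) rows.
# ISO-format date strings compare correctly in lexicographic order.
history = [
    ("u1", "2023-01-01", 3),
    ("u1", "2023-01-05", 7),
    ("u1", "2023-01-09", 9),
]

def get_feature_as_of(user_id, timestamp):
    """Return the latest feature value at or before `timestamp`."""
    candidates = [row for row in history
                  if row[0] == user_id and row[1] <= timestamp]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[1])[2]
```

For example, a training row dated 2023-01-06 would receive the value written on 2023-01-05, not the later one, avoiding label leakage from the future.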

Online Feature Store

The Online Feature Store is a low-latency key-value store that holds the latest values of pre-computed features. It is located in the online environment and is used to look up features for online inference.

Feature Server

The Feature Server is a managed Tecton service that fetches precomputed feature values from the Online Feature Store, runs real-time feature transformations, and returns feature vectors via a REST API.
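Conceptually, each request combines two kinds of features: precomputed values looked up by join key, and values computed on the fly from request-time data. A minimal sketch, with all names and thresholds being illustrative assumptions:

```python
# Assumed precomputed features keyed by entity (from materialization).
online_store = {"u1": {"clicks_7d": 5, "avg_order_value": 20.0}}

def real_time_features(request_data):
    # Hypothetical real-time transform computed from the request payload.
    return {"amount_over_100": request_data["amount"] > 100}

def get_feature_vector(user_id, request_data):
    """Merge precomputed and request-time features into one vector."""
    vector = dict(online_store.get(user_id, {}))     # precomputed lookup
    vector.update(real_time_features(request_data))  # request-time features
    return vector
```

The resulting vector is what a model would receive at inference time; the low-latency requirement applies to the whole lookup-plus-transform path.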

Monitoring and Alerting

There are two types of monitoring that Tecton manages:

  1. System Monitoring: Tecton monitors for job health, service uptime, and request latencies.
  2. Data Quality Monitoring (Private Preview): Tecton monitors the quality of the data generated by feature pipelines, as well as for data drift in features between training and serving.

Tecton has built-in alerting for system-related and data-quality-related issues.

Tecton Product Interfaces

Tecton SDK

Tecton's Python SDK is used for defining and testing feature pipelines and services, and generating datasets for training and batch inference.

Tecton's SDK has two APIs:

  1. Declarative API: This is the API used to define feature pipelines in the Feature Repository.
  2. Interactive API: This is the API used (often in a notebook) to fetch remote Tecton objects (registered with a workspace), test feature pipelines, and create training datasets.

Tecton Web UI

The Tecton Web UI allows users to browse, discover, and manage feature pipelines. In the Web UI you can also:

  • Understand lineage
  • Configure access controls
  • Manage users
  • Check the status of backfills
  • Monitor, cancel, and re-run materialization jobs
  • Understand data quality
  • Monitor online feature services

Each Tecton account is associated with a URL (e.g. <your-account>.tecton.ai) and contains all of the workspaces created in that account.

Tecton CLI

The Tecton CLI is primarily used for managing workspaces, applying feature repository changes to a workspace (via the tecton apply command), and creating API keys.

Tecton HTTP API

Tecton's HTTP API is used to retrieve feature vectors online at low latency in order to power online model inference.
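As a hedged sketch, a request to the feature-serving endpoint might look like the following. The endpoint path follows the pattern in Tecton's API documentation, but the workspace, feature service, and join-key values here are hypothetical, and no network call is made:

```python
import json

# Assumed feature-serving endpoint on your account's cluster.
url = "https://<your-account>.tecton.ai/api/v1/feature-service/get-features"

# Illustrative request body: which feature service to query and the
# entity join keys identifying the row to look up.
body = json.dumps({
    "params": {
        "workspace_name": "prod",                   # assumed workspace
        "feature_service_name": "fraud_detection",  # hypothetical service
        "join_key_map": {"user_id": "u1"},          # entity keys
    }
})
# An HTTP client would POST `body` to `url` with an API-key
# Authorization header; the response contains the feature vector.
```

The API key used in the Authorization header is the kind created via the Tecton CLI, as noted above.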