Tecton Concepts and Frameworks

Tecton makes building operational ML data flows and consuming ML data as easy as possible. Using Tecton involves two main sets of APIs: one for composing data flows and one for consuming feature data.

Declarative Framework APIs

Compose your feature pipelines from Tecton objects such as Feature Views, Data Sources, and Feature Services using Tecton's declarative framework.

Read APIs

Access feature values through Tecton's read APIs for online serving or offline model training.

A minimal end-to-end example is illustrated here and can be found in our sample repo on GitHub.

Declarative Pipeline Composition

Defining Data Flows with Tecton's Framework

Tecton's definitions framework is designed to let you express ML data flows. There are five main Tecton objects:

  1. Data Sources: Data sources define a connection to a batch, stream, or request data source (i.e. request-time parameters) and are used as inputs to feature pipelines, known as "Feature Views" in Tecton.
  2. Feature Views: Feature Views take in data sources as inputs, or in some cases other Feature Views, and define a pipeline of transformations to compute one or more features. Feature Views also provide Tecton with additional information such as metadata and orchestration, serving, and monitoring configurations. There are many types of Feature Views, each designed to support a common data flow pattern.
  3. Transformations: Each Feature View has a single pipeline of transformations that define the computation of one or more features. Transformations can be modularized and stitched together into a pipeline.
  4. Entities: An Entity is an object or concept that can be modeled and that has features associated with it. Examples include User, Ad, Product, and Product Category. In Tecton, every Feature View is associated with one or more entities.
  5. Feature Services: A Feature Service represents a set of features that power a model. Typically there is one Feature Service for each version of a model. Feature Services provide convenient endpoints for fetching training data through the Tecton SDK or fetching real-time feature vectors from Tecton's REST API.

In practice, composing pipelines with Tecton means connecting Data Sources to Feature Views to Feature Services.

All of these objects are declared in Python. We recommend managing your source files with Git, as your repository becomes the source of truth for the Feature Store we spin up on your behalf.
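To make this concrete, here is a minimal sketch of all five object types wired together. Treat it as a sketch rather than a drop-in definition: the source configuration, names, and SQL are invented for the example, and exact class and decorator signatures vary across Tecton SDK versions.

```python
from datetime import datetime, timedelta

from tecton import (
    BatchSource,
    Entity,
    FeatureService,
    HiveConfig,
    batch_feature_view,
)

# 1. Data Source: a connection to a (hypothetical) batch table of transactions.
transactions = BatchSource(
    name="transactions",
    batch_config=HiveConfig(
        database="demo", table="transactions", timestamp_field="timestamp"
    ),
)

# 2. Entity: the concept the features describe, keyed by user_id.
user = Entity(name="user", join_keys=["user_id"])

# 3 & 4. Feature View with its Transformation pipeline: here, a single
# SQL transformation computing one feature value per user per event.
@batch_feature_view(
    sources=[transactions],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2023, 1, 1),
)
def user_transaction_amount(transactions):
    return f"""
        SELECT user_id, amount AS transaction_amount, timestamp
        FROM {transactions}
    """

# 5. Feature Service: the set of features that powers one model version.
fraud_model_v1 = FeatureService(
    name="fraud_model_v1",
    features=[user_transaction_amount],
)
```

Once applied to a workspace (for example with the tecton CLI's apply command), these definitions become the source of truth described above.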

From Definitions to Operations

With your feature data flows properly defined, Tecton takes care of all of the operational concerns involved in actually running these data flows and serving the features, including:

  • Materialization: orchestration of all transformations, and saving computed feature values in Tecton's online and offline stores
  • Low latency serving: orchestrating feature computation and caching to minimize serving latency
  • Point-in-time correctness: guaranteeing that future signal does not leak into training datasets, which preserves the accuracy of the trained model and avoids data skew (see the sketch after this list)
  • Monitoring: showing you the status of your data flow pipelines and alerting you to any upstream outages.
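To make the point-in-time guarantee concrete, the following is a minimal pandas illustration of the as-of join Tecton performs on your behalf when building training data. It is a conceptual sketch, not Tecton code, and the column names are invented for the example.

```python
import pandas as pd

# Feature values as they were materialized over time.
features = pd.DataFrame({
    "user_id": ["u1", "u1"],
    "timestamp": pd.to_datetime(["2023-01-01", "2023-02-01"]),
    "transaction_count": [3, 7],
})

# Training events (the "spine"): each row asks for feature values
# as of its own timestamp.
spine = pd.DataFrame({
    "user_id": ["u1"],
    "timestamp": pd.to_datetime(["2023-01-15"]),
})

# An as-of join only picks feature rows at or before each event's
# timestamp, so the count of 7 (only knowable from 2023-02-01 on)
# never leaks into an event from 2023-01-15.
training = pd.merge_asof(
    spine.sort_values("timestamp"),
    features.sort_values("timestamp"),
    on="timestamp",
    by="user_id",
)
print(training)  # transaction_count == 3, not 7
```

Each spine row only ever sees feature values that were knowable at its own timestamp, which is exactly the leakage guarantee described above.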

Consuming Feature Data through Feature Service Endpoints

Depending on the usage scenario, you will use different parts of the consumption API for fetching feature data.

  • Serving Online (Guide) – when your application needs up-to-date feature values in real time in production:
    • api/v1/feature-service/get-features as a REST API request
    • get_feature_vector from a Feature Service (via the Python SDK)
  • Training Offline (Guide) – when you need to train models on historical data:
    • get_historical_features from a Feature Service (via the Python SDK)
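Both paths are sketched below, reusing the fraud_model_v1 service defined earlier. The cluster URL, workspace name ("prod"), join keys, and the TECTON_API_KEY environment variable are placeholders for your own values, and exact SDK method names vary somewhat across versions.

```python
import os

import pandas as pd
import requests
import tecton

# Online: fetch a real-time feature vector over the REST API.
resp = requests.post(
    "https://<your-cluster>.tecton.ai/api/v1/feature-service/get-features",
    headers={"Authorization": "Tecton-key " + os.environ["TECTON_API_KEY"]},
    json={
        "params": {
            "workspace_name": "prod",
            "feature_service_name": "fraud_model_v1",
            "join_key_map": {"user_id": "u123"},
        }
    },
)
print(resp.json())

# Offline: build point-in-time-correct training data via the Python SDK,
# joining features onto a spine of (join key, timestamp) rows.
ws = tecton.get_workspace("prod")
fs = ws.get_feature_service("fraud_model_v1")
spine = pd.DataFrame({
    "user_id": ["u123"],
    "timestamp": [pd.Timestamp("2023-06-01")],
})
training_df = fs.get_historical_features(spine).to_pandas()
```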

That's the core of it. Compose your data flows with the Definitions framework, then read feature values from the Feature Store we spin up on your behalf.

Next: Learn how we turn the code describing your data flows into a functioning ML data application in Tecton Architecture →