Tecton Fundamentals Tutorial
You can run this tutorial on Tecton on Databricks or Tecton on EMR.
The last step of the tutorial requires Databricks to serve a model for inference. In general, though, you can serve the model with the system of your choice.
Welcome to the Tecton Fundamentals Tutorial. This tutorial will introduce you to the basics of working with Tecton. We will use fraud detection as a practical use case, a common challenge in industries such as banking and e-commerce. Tecton is particularly well-suited to such scenarios, as they often require a blend of real-time and batch-style features.
This tutorial is designed for:
- Data Scientists: To gain an understanding of how Tecton can aid in feature engineering and managing feature pipelines.
- Data Engineers: To learn how Tecton automates the scheduling of feature data pipelines.
- Machine Learning Engineers: To explore how Tecton can assist in creating training datasets and feature vectors for online inference.
By the end of this tutorial, you should be able to:
- Define and manage the various objects associated with a feature.
- Define and manage features for a machine learning model using Tecton.
- Control the materialization of feature values.
- Use Tecton to retrieve feature data for model training.
- Make real-time predictions with a model trained using feature data from Tecton.
The tutorial is divided into two parts:
Part 1: Creating features
In this part, you'll learn to define and create features using Tecton. We'll guide you through the creation of five specific features, which will later be used in a model to predict whether a real-time transaction is fraudulent:
- user_credit_card_issuer: Determines the user's credit card issuer, based on the user's credit card number.
- user_transaction_counts: Calculates the number of transactions per user over the last 1 day, 30 days, and 90 days.
- transaction_amount_is_high: Returns whether the transaction amount is greater than 100 dollars.
- user_home_location: Outputs the latitude and longitude of the user's home location. These outputs are used by the next feature.
- transaction_distance_from_home: Calculates how far from the user's home a transaction is made.
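To make the feature descriptions above concrete, here is a minimal plain-Python sketch of the logic behind three of them. It does not use the Tecton API, and it is not how the tutorial defines these features. The function signatures, the half-open window boundaries, the haversine formula, and the kilometer unit are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

HIGH_AMOUNT_THRESHOLD = 100.0  # dollars, taken from the feature description


def transaction_amount_is_high(amount):
    """Return True when the transaction amount is greater than 100 dollars."""
    return amount > HIGH_AMOUNT_THRESHOLD


def user_transaction_counts(timestamps, as_of):
    """Count one user's transactions in the 1-, 30-, and 90-day windows ending at as_of.

    Assumption: a window includes as_of and excludes its start boundary.
    """
    counts = {}
    for days in (1, 30, 90):
        window_start = as_of - timedelta(days=days)
        counts[f"count_{days}d"] = sum(
            1 for ts in timestamps if window_start < ts <= as_of
        )
    return counts


def transaction_distance_from_home(home_lat, home_lon, txn_lat, txn_lon):
    """Great-circle distance in kilometers between home and transaction.

    Assumption: haversine distance on a sphere with a mean Earth radius of 6371 km.
    """
    earth_radius_km = 6371.0
    lat1, lon1, lat2, lon2 = map(radians, (home_lat, home_lon, txn_lat, txn_lon))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))
```

Note that transaction_distance_from_home takes the home latitude and longitude as inputs, which mirrors how it consumes the outputs of user_home_location. The counts depend only on past transactions, so they suit batch computation, while the amount check and the distance depend on the incoming transaction itself.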
Additionally, you'll learn about managing feature-related objects, such as data sources, and controlling the pre-computation of feature values, a process known as materialization.
Part 2: Training a model and making a prediction
In the second part, you'll use the features created in the first part to train a machine learning model. This part demonstrates how to retrieve feature data from Tecton, use it to train your model, and make real-time predictions.
Let's get started!