⏱️ Building Realtime Features
Many critical features for real-time models can only be calculated at the time of a request, either because:
- They require data that is only available at request time (e.g. a user's current location)
- They can't be efficiently pre-computed (e.g. computing the embedding similarity between all possible pairs of users)
Running transformations at request time can also be useful for:
- Post-processing feature data (example: imputing null values)
- Running additional transformations after Tecton-managed aggregations
- Defining new features without needing to rematerialize Feature Store data
This is where "Realtime" features come in. In Tecton, a Realtime Feature View lets you calculate features at the time of a request, using either data passed in with the request or pre-computed batch and stream features.
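As a rough illustration of the post-processing case listed above, the following plain-Python sketch imputes null values in a dictionary of feature values. It does not use the Tecton SDK; the function name `impute_nulls`, the feature names, and the default values are hypothetical and chosen only for this example.

```python
# Illustrative only: plain Python, no Tecton SDK involved.
# The feature names and defaults below are hypothetical.

def impute_nulls(features, defaults):
    """Return a copy of `features` with None values replaced by defaults.

    Features that have no entry in `defaults` are left as None.
    """
    return {
        name: defaults.get(name) if value is None else value
        for name, value in features.items()
    }


# Pretend these values were read from pre-computed features.
precomputed = {"user_transaction_count_30d": None, "user_avg_amount_30d": 52.5}
defaults = {"user_transaction_count_30d": 0}

imputed = impute_nulls(precomputed, defaults)
print(imputed)
```

The input dictionary is left unmodified, which keeps the function easy to test in isolation.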
This tutorial shows how to develop, test, and productionize realtime features for real-time models. It is centered around a fraud detection use case, where we need to predict in real time whether a transaction that a user is making is fraudulent.
This tutorial assumes some basic familiarity with Tecton. If you are new to Tecton, we recommend first checking out Building a Production AI Application with Tecton, which walks through an end-to-end journey of building a real-time ML application with Tecton.
⚙️ Install Pre-Reqs
First things first, let's install the Tecton SDK and the other libraries used by this tutorial (we recommend doing this in a virtual environment):
!pip install 'tecton[rift]==1.0.0' gcsfs s3fs -q
After installing, run the following command to log in to your organization's Tecton account. Be sure to use your own account name.
Note: You need to press enter after pasting in your authentication code.
import tecton
tecton.login("explore.tecton.ai") # replace with your URL
Let's then run some basic imports and setup that we will use later in the tutorial.
from tecton import *
from tecton.types import *
from datetime import datetime, timedelta
from pprint import pprint
import pandas as pd
tecton.conf.set("TECTON_OFFLINE_RETRIEVAL_COMPUTE_MODE", "rift")
👩‍💻 Create a realtime feature that leverages request data
For our fraud detection model, let's say we want to use information about the transaction that is currently being evaluated. That information is only available at the time of evaluation, so any features derived from it need to be computed in real time.
Realtime Feature Views are able to leverage real-time request data for building features. In this case, we will do a very simple check to see if the current transaction amount is over $1000. This is a pretty basic feature, but in the next section we will look at more complex operations.
To define a realtime feature that leverages request data, we first define a Request Source. The Request Source specifies the expected schema for the data that will be passed in with the request.
When using mode='python', the inputs and outputs of the Realtime Feature View are dictionaries. For more information on modes in Realtime Feature Views, see Realtime Feature View Best Practices.
# Schema of the data expected with each request
transaction_request = RequestSource(schema=[Field("amount", Float64)])
# In "python" mode the function receives and returns plain dictionaries
@realtime_feature_view(
    sources=[transaction_request],
    mode="python",
    features=[Attribute("transaction_amount_is_high", Bool)],
)
def transaction_amount_is_high(request):
    return {"transaction_amount_is_high": request["amount"] > 1000}
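The Request Source above declares that each request carries an `amount` field of type Float64. To make the idea of an "expected schema" concrete, here is a minimal plain-Python sketch of checking a request payload against such a schema. This is not how Tecton validates requests; `EXPECTED_SCHEMA` and `check_request` are hypothetical names used only for illustration.

```python
# Illustrative only: this is NOT Tecton's request validation.
# EXPECTED_SCHEMA and check_request are hypothetical names.

# Loosely mirrors Field("amount", Float64) using a built-in Python type.
EXPECTED_SCHEMA = {"amount": float}


def check_request(request, schema=EXPECTED_SCHEMA):
    """Return a list of problems found in `request`; empty if it conforms."""
    problems = []
    for name, expected_type in schema.items():
        if name not in request:
            problems.append(f"missing field: {name}")
        elif not isinstance(request[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    return problems


print(check_request({"amount": 182.4}))    # conforming payload
print(check_request({}))                   # field missing
print(check_request({"amount": "182.4"}))  # wrong type (string)
```

This sketch is deliberately strict about types; a real system may coerce or reject values differently.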
Now that we've defined our feature, we can test it out with some mock data using .run_transformation().
input_data = {"request": {"amount": 182.4}}
transaction_amount_is_high.run_transformation(input_data=input_data)
Out:
{'transaction_amount_is_high': False}
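Because the transformation in Python mode is an ordinary function over dictionaries, its logic can also be checked without a Tecton connection. The sketch below copies the function body into an undecorated function (the name `amount_is_high_logic` is hypothetical) and runs it against a few amounts, including the boundary value. The comparison is strict, so an amount of exactly 1000 is not flagged.

```python
# Plain-Python copy of the feature logic, without the Tecton decorator.
# `amount_is_high_logic` is a hypothetical name used only for this sketch.

def amount_is_high_logic(request):
    return {"transaction_amount_is_high": request["amount"] > 1000}


# Map each mock amount to the feature value computed for it.
results = {
    amount: amount_is_high_logic({"amount": amount})["transaction_amount_is_high"]
    for amount in [182.4, 1000.0, 1000.01, 2500.0]
}
print(results)
```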