Read Offline Features for Inference
Overview
This example demonstrates how to perform batch inference in Tecton. Batch inference in Tecton is very similar to generating training data.
Fetch a Batch of Data from Tecton
If your model was trained with data from Tecton, you already created a FeatureService to generate that training data. You will use the same FeatureService to fetch a batch of data for inference.
Similar to how you built training data, you'll need to generate a DataFrame that represents the data you wish to retrieve from Tecton. This DataFrame should be composed of rows containing:
- The join keys associated with each of your features
- Timestamps at which you'd like to retrieve data
- Columns corresponding to the RequestSource of any RealtimeFeatureView features, if your FeatureService includes one or more RealtimeFeatureViews
If you're not sure which join keys are associated with your features, the page corresponding to your FeatureService in the Web UI will list the entities associated with all of your features. Each entity maps to a join key that you will need.
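Before requesting features, it can help to check that the DataFrame actually carries the required columns. The helper below is a minimal pandas sketch and is not part of the Tecton SDK: validate_prediction_context is a hypothetical name, and the join keys and timestamp column passed to it are assumptions that should be replaced with those of your own FeatureService.

```python
import pandas as pd


def validate_prediction_context(df, join_keys, timestamp_key):
    """Raise ValueError if a required column is missing, has nulls, or is mistyped."""
    required = list(join_keys) + [timestamp_key]

    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    null_cols = [col for col in required if df[col].isna().any()]
    if null_cols:
        raise ValueError(f"null values in required columns: {null_cols}")

    if not pd.api.types.is_datetime64_any_dtype(df[timestamp_key]):
        raise ValueError(f"column {timestamp_key!r} is not a datetime column")

    return df


# Hypothetical two-row context, for illustration only.
example = pd.DataFrame(
    {
        "user_id": ["C1", "C2"],
        "merchant_id": ["M1", "M2"],
        "timestamp": pd.to_datetime(["2020-12-01 01:00:00", "2020-12-01 02:00:00"]),
    }
)
validate_prediction_context(
    example, join_keys=["user_id", "merchant_id"], timestamp_key="timestamp"
)
```

Failing early on a missing join key or an untyped timestamp column tends to be easier to debug than an error raised partway through feature retrieval.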
Example: Building a Prediction Context for Fraud Detection
In this example, let's imagine we have a fraud detection model that we would like to run nightly on the last 24 hours of transactions. The features for our model describe transactions, users, and merchants. To create our prediction context, we fetch a log of the transactions in the last day, which should look like this:
transaction_id | user_id | merchant_id | timestamp |
---|---|---|---|
51812359 | C1231006815 | M1979787155 | 2020-12-01 01:00:02.595066019 |
51812360 | C1666544295 | M2044282225 | 2020-12-01 01:00:02.940659192 |
51812361 | C1305486145 | M5532624065 | 2020-12-01 01:00:03.336173880 |
51812362 | C840083671 | M3899427010 | 2020-12-01 01:00:06.033070635 |
51812363 | C2048537720 | M1230701703 | 2020-12-01 01:00:06.711752585 |
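One way to build such a context in pandas is sketched below. The rows are copied from the table above; the helper name last_24_hours and the chosen run time (midnight following the day shown) are assumptions for illustration. In practice the transactions would be read from your own transaction store rather than written inline.

```python
import pandas as pd

# Rows copied from the table above.
transactions = pd.DataFrame(
    {
        "transaction_id": [51812359, 51812360, 51812361, 51812362, 51812363],
        "user_id": [
            "C1231006815", "C1666544295", "C1305486145", "C840083671", "C2048537720",
        ],
        "merchant_id": [
            "M1979787155", "M2044282225", "M5532624065", "M3899427010", "M1230701703",
        ],
        "timestamp": pd.to_datetime(
            [
                "2020-12-01 01:00:02.595066019",
                "2020-12-01 01:00:02.940659192",
                "2020-12-01 01:00:03.336173880",
                "2020-12-01 01:00:06.033070635",
                "2020-12-01 01:00:06.711752585",
            ]
        ),
    }
)


def last_24_hours(df, run_time):
    """Return the rows whose timestamp falls in [run_time - 24h, run_time)."""
    start = run_time - pd.Timedelta(hours=24)
    mask = (df["timestamp"] >= start) & (df["timestamp"] < run_time)
    return df.loc[mask].reset_index(drop=True)


# Assumed run time for the nightly job.
transaction_log = last_24_hours(transactions, pd.Timestamp("2020-12-02 00:00:00"))
```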
Retrieve Data with the Prediction Context
Now that you have a prediction context, you can use the Tecton SDK to retrieve features for inference. This will be the same code you used to generate a dataset:
import tecton

# transaction_log is a dataframe containing the prediction context made above
ws = tecton.get_workspace("prod")
fs = ws.get_feature_service("demo_fraud_model")
# Join feature values onto each row as of that row's timestamp
batch_data = fs.get_features_for_events(transaction_log, timestamp_key="timestamp")
The call to get_features_for_events will return a Tecton DataFrame, where your feature values have been joined onto the prediction context. An example with a single feature joined onto the above context would look like:
transaction_id | user_id | merchant_id | timestamp | transaction_details.amount |
---|---|---|---|---|
51812359 | C1231006815 | M1979787155 | 2020-12-01 01:00:02.595066019 | 35.0 |
51812360 | C1666544295 | M2044282225 | 2020-12-01 01:00:02.940659192 | 522.2 |
51812361 | C1305486145 | M5532624065 | 2020-12-01 01:00:03.336173880 | 1.2 |
51812362 | C840083671 | M3899427010 | 2020-12-01 01:00:06.033070635 | 90.2 |
51812363 | C2048537720 | M1230701703 | 2020-12-01 01:00:06.711752585 | 555.6 |
Perform Inference
The Tecton DataFrame above can easily be used to perform batch inference; simply convert your data to a Pandas DataFrame:
batch_data_pandas = batch_data.to_pandas()
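Once the data is in pandas, scoring usually means separating the feature columns from the context columns and passing the features to the model. The sketch below assumes a model object with a predict method; ThresholdModel is a hypothetical placeholder standing in for your trained model, and the inline DataFrame stands in for the result of to_pandas(), with values copied from the example above.

```python
import pandas as pd

# Stand-in for batch_data.to_pandas(); values copied from the example above.
batch_data_pandas = pd.DataFrame(
    {
        "transaction_id": [51812359, 51812360, 51812361, 51812362, 51812363],
        "user_id": [
            "C1231006815", "C1666544295", "C1305486145", "C840083671", "C2048537720",
        ],
        "merchant_id": [
            "M1979787155", "M2044282225", "M5532624065", "M3899427010", "M1230701703",
        ],
        "transaction_details.amount": [35.0, 522.2, 1.2, 90.2, 555.6],
    }
)

# Columns that identify a row but are not model inputs.
CONTEXT_COLUMNS = ["transaction_id", "user_id", "merchant_id"]


class ThresholdModel:
    """Hypothetical placeholder: flags a transaction when its amount exceeds a limit."""

    def __init__(self, limit):
        self.limit = limit

    def predict(self, features):
        return (features["transaction_details.amount"] > self.limit).astype(int).to_numpy()


model = ThresholdModel(limit=500.0)  # replace with your trained model
features = batch_data_pandas.drop(columns=CONTEXT_COLUMNS)

# Keep the transaction ID next to each prediction so results can be joined back.
predictions = batch_data_pandas[["transaction_id"]].assign(
    is_fraud=model.predict(features)
)
```

Carrying transaction_id alongside each prediction makes it straightforward to write the scores back against the original transactions.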
For other inference frameworks, you can persist your data to a file, then perform inference by loading from this file.
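As a minimal sketch of the file-based route, the example below writes the pandas DataFrame to CSV and reads it back, as a separate inference job might. The file name batch_features.csv and the temporary directory are assumptions for illustration, and the inline DataFrame stands in for the result of to_pandas(), using the first two rows of the example above. CSV is used only because it needs no extra dependencies; a columnar format may suit larger batches better.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for batch_data.to_pandas(); first two rows of the example above.
batch_data_pandas = pd.DataFrame(
    {
        "transaction_id": [51812359, 51812360],
        "user_id": ["C1231006815", "C1666544295"],
        "merchant_id": ["M1979787155", "M2044282225"],
        "timestamp": pd.to_datetime(
            ["2020-12-01 01:00:02.595066019", "2020-12-01 01:00:02.940659192"]
        ),
        "transaction_details.amount": [35.0, 522.2],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output location; use durable storage in a real job.
    path = Path(tmp_dir) / "batch_features.csv"

    # Writer side: persist the batch without the pandas index column.
    batch_data_pandas.to_csv(path, index=False)

    # Reader side: the inference job loads the file and restores the timestamp type.
    loaded = pd.read_csv(path, parse_dates=["timestamp"])
```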