Version: 0.9

Test Batch Features

Import libraries and select your workspace

import tecton
import pandas
from datetime import datetime, timedelta

ws = tecton.get_workspace("prod")

Load a Batch Feature View

fv = ws.get_feature_view("user_transaction_counts")

Run a Feature View transformation pipeline

The BatchFeatureView::run_transformation function performs a dry run of a Feature View transformation pipeline over a given time range. This is useful for checking the output of your feature transformation logic or for debugging a materialization job.


The output is not guaranteed to match the feature values that would be materialized for this time range, for example in the following cases:

  • When using incremental backfills, feature data for a given time range may depend on multiple executions of the Feature View transformation pipeline.
  • Feature values may depend on scheduling information (e.g. batch_schedule, data_delay, feature_start_time) that doesn't match the start_time and end_time you provide.
  • Aggregations may require more input data than the window you provide with start_time and end_time.

If you want to produce feature values for a given time range, you should use get_features_in_range(start_time, end_time).

result_dataframe = fv.run_transformation(start_time=datetime(2021, 1, 1), end_time=datetime(2022, 1, 2)).to_pandas()
0  user_600003278485  2021-01-01 06:25:57  other
1  user_469998441571  2021-01-01 07:16:06  Visa
2  user_502567604689  2021-01-01 04:39:10  Visa
3  user_930691958107  2021-01-01 10:52:31  Visa
4  user_782510788708  2021-01-01 20:15:25  other
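
The third caveat above (aggregations may need more input data than the window you pass) can be made concrete with a little date arithmetic. The sketch below is a simplified model and not Tecton's internal logic: `required_input_range` is a hypothetical helper, and it assumes a trailing-window aggregation with no data_delay.

```python
from datetime import datetime, timedelta


def required_input_range(start_time, end_time, time_window):
    """Return the raw-data range needed to fully compute a trailing-window
    aggregation for every feature timestamp between start_time and end_time.

    Hypothetical helper for illustration only.
    """
    # A feature value at `start_time` aggregates events going back
    # `time_window`, so the input range has to begin that much earlier.
    return start_time - time_window, end_time


input_start, input_end = required_input_range(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    time_window=timedelta(days=30),
)
print(input_start, input_end)
```

Under these assumptions, a one-day run over 2022-05-01 with a 30-day window would need raw data starting on 2022-04-01, which is why a one-day run_transformation call may not see enough input to reproduce the final aggregated values.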

Run with mock sources

Mock input data sources can be passed into the BatchFeatureView::run_transformation function using the same source names from the Feature View definition.

users_data = pandas.DataFrame(
    {
        "user_id": ["user_1", "user_1", "user_2"],
        "cc_num": ["423456789012", "567890123456", "678901234567"],
        "signup_timestamp": [
            datetime(2022, 1, 1, 2),
            datetime(2022, 1, 1, 4),
            datetime(2022, 1, 1, 3),
        ],
    }
)

result_dataframe = fv.run_transformation(
    start_time=datetime(2022, 1, 1),
    end_time=datetime(2022, 1, 2),
    mock_inputs={"users": users_data},  # `users` is the name of this Feature View input.
).to_pandas()

0  user_1  2022-01-01 02:00:00  Visa
1  user_1  2022-01-01 04:00:00  MasterCard
2  user_2  2022-01-01 03:00:00  Discover

Run a Batch Feature View with tiled aggregations

When a Feature View uses tiled aggregations, the query operates in three logical steps:

  1. The Feature View query is run over the provided time range, applying the user-defined transformations to the data source.
  2. The result of #1 is aggregated into tiles the size of the aggregation_interval.
  3. The tiles from #2 are combined to form the final feature values. The number of tiles that are combined is based on the time_window of the aggregation.

To see the output of #1, use run_transformation(). For #2, use get_partial_aggregates(). For #3, use get_features_in_range().

For more details on aggregate_tiles, refer to Creating Features that use Time-Windowed Aggregations.

agg_fv = ws.get_feature_view("user_transaction_counts")

result_dataframe = agg_fv.run_transformation(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
).to_pandas()

0  user_222506789984  1  2022-05-01 21:04:38
1  user_26990816968   1  2022-05-01 19:45:14
2  user_337750317412  1  2022-05-01 15:18:48
3  user_337750317412  1  2022-05-01 07:11:31
4  user_337750317412  1  2022-05-01 01:50:51

result_dataframe = agg_fv.get_partial_aggregates(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
).to_pandas()

0  user_222506789984  1  2022-05-01 00:00:00  2022-05-02 00:00:00
1  user_26990816968   1  2022-05-01 00:00:00  2022-05-02 00:00:00
2  user_337750317412  4  2022-05-01 00:00:00  2022-05-02 00:00:00
3  user_402539845901  2  2022-05-01 00:00:00  2022-05-02 00:00:00
4  user_461615966685  1  2022-05-01 00:00:00  2022-05-02 00:00:00
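
The three logical steps described above can be sketched in plain Python. This is an illustration of the idea only, not Tecton's implementation: the sample rows and the helpers `tile_start` and `feature_value` are hypothetical, the transformation is assumed to emit a count of 1 per event, and the window is assumed to cover only tiles that have fully closed by the feature time.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical raw rows: (user_id, event timestamp).
raw_events = [
    ("user_a", datetime(2022, 5, 1, 1, 50)),
    ("user_a", datetime(2022, 5, 1, 7, 11)),
    ("user_a", datetime(2022, 5, 2, 15, 18)),
    ("user_b", datetime(2022, 5, 2, 19, 45)),
]

aggregation_interval = timedelta(days=1)  # tile size
time_window = timedelta(days=2)           # aggregation window
epoch = datetime(1970, 1, 1)

# Step 1: the transformation. Here it emits one row per event with value 1.
transformed = [(user_id, ts, 1) for user_id, ts in raw_events]


def tile_start(ts):
    """Round a timestamp down to the start of its tile."""
    return ts - (ts - epoch) % aggregation_interval


# Step 2: aggregate the transformed rows into tiles (partial aggregates).
tiles = defaultdict(int)
for user_id, ts, value in transformed:
    tiles[(user_id, tile_start(ts))] += value


def feature_value(user_id, feature_time):
    """Step 3: combine the tiles that lie fully inside the trailing window."""
    window_start = feature_time - time_window
    return sum(
        count
        for (uid, start), count in tiles.items()
        if uid == user_id
        and window_start <= start
        and start + aggregation_interval <= feature_time
    )


print(dict(tiles))
print(feature_value("user_a", datetime(2022, 5, 3)))
```

In this sketch, `transformed` plays the role of the run_transformation() output, `tiles` the role of the get_partial_aggregates() output, and `feature_value` the role of the final values returned by get_features_in_range().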

Get a Range of Feature Values from the Offline Store

BatchFeatureView::get_features_in_range reads feature values from the offline store between a given start_time and end_time.

Pass from_source=True to bypass the offline store and compute features on the fly against the raw data source. This is useful for testing the expected output of feature values.

Use from_source=False (the default) to see what data is materialized in the offline store.

result_dataframe = fv.get_features_in_range(start_time=datetime(2022, 5, 1), end_time=datetime(2022, 5, 2)).to_pandas()
0  user_205125746682  2022-05-01 00:00:00  2834    2022-05-01 00:00:00
1  user_222506789984  2022-05-01 00:00:00  142141  2022-05-01 00:00:00
2  user_268514844966  2022-05-01 00:00:00  12966   2022-05-01 00:00:00
3  user_394495759023  2022-05-01 00:00:00  12168   2022-05-01 00:00:00
4  user_459842889956  2022-05-01 00:00:00  11439   2022-05-01 00:00:00

Read the Latest Features from Online Feature Store


For performance reasons, this function should only be used for testing and not in a production environment. To read features online efficiently, see Reading Features for Inference.

fv.get_online_features({"user_id": "user_609904782486"}).to_dict()
Out: {
    "transaction_count_1d_1d": 1,
    "transaction_count_30d_1d": 17,
    "transaction_count_90d_1d": 56,
}

Read Historical Features from Offline Feature Store with Time-Travel

Create an events DataFrame with events to look up. For more information on the events dataframe, check out Selecting Sample Keys and Timestamps.

events = pandas.DataFrame(
    {
        "user_id": ["user_722584453020", "user_461615966685"],
        "timestamp": [datetime(2022, 5, 1, 3, 20, 0), datetime(2022, 6, 6, 2, 30, 0)],
    }
)

0  user_722584453020  2022-05-01 03:20:00
1  user_461615966685  2022-06-06 02:30:00

Pass from_source=True to bypass the offline store and compute features on the fly against the raw data source. However, this is slower than reading feature data that has already been materialized to the offline store.

result_dataframe = fv.get_features_for_events(events, from_source=True).to_pandas()
0  user_461615966685  2022-06-06 02:30:00  01340
1  user_722584453020  2022-05-01 03:20:00  02873
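
The general idea behind a time-travel lookup is that each event receives the most recent feature value known at or before its own timestamp, so no information from the future leaks in. The following is a minimal sketch of that idea in plain Python. It does not reproduce Tecton's exact semantics (for example effective timestamps or data_delay), and `feature_history`, `sample_events`, and `lookup_as_of` are hypothetical names.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history: user_id -> list of (timestamp, value),
# sorted by timestamp.
feature_history = {
    "user_a": [
        (datetime(2022, 5, 1), 3),
        (datetime(2022, 5, 2), 5),
        (datetime(2022, 5, 3), 4),
    ],
}


def lookup_as_of(user_id, event_time):
    """Return the latest feature value at or before event_time, or None."""
    history = feature_history.get(user_id, [])
    timestamps = [ts for ts, _ in history]
    # Index of the first history entry strictly after event_time.
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx else None


# Hypothetical events to look up: (user_id, event timestamp).
sample_events = [
    ("user_a", datetime(2022, 5, 2, 3, 20)),  # falls after the 2022-05-02 value
    ("user_a", datetime(2022, 4, 30)),        # before any feature value exists
    ("user_b", datetime(2022, 5, 2)),         # unknown user
]
results = [(u, t, lookup_as_of(u, t)) for u, t in sample_events]
for row in results:
    print(row)
```

Events that precede all known feature values, or that refer to an unknown key, come back with no value in this sketch.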
