# Testing Batch Features
This guide covers how to test batch features in Tecton using offline retrieval methods with mock data.
## Overview
Testing batch features allows you to validate your feature transformation logic and aggregation computations before deploying to production. You can test batch features by passing mock data to the offline retrieval methods.
## Testing Methods
Batch features support multiple testing approaches:
### 1. Testing Final Feature Values
#### Time Range Testing
Use `get_features_in_range()` to test the complete feature computation, including aggregations, over a time range:

```python
# Test final aggregated feature values over a time range
result_df = batch_fv.get_features_in_range(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    mock_inputs={"source_name": mock_data},
)
```
Result
| user_id | transaction_count_1d_1d | transaction_count_30d_1d | transaction_count_90d_1d | _valid_from | _valid_to |
|---|---|---|---|---|---|
| user_1 | 2 | 8 | 34 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 |
| user_2 | 1 | 42 | 141 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 |
#### Point-in-Time Testing
Use `get_features_for_events()` to test final feature values for specific entity/timestamp combinations:

```python
# Test final feature values for specific events
events_df = pandas.DataFrame(
    {"user_id": ["user_1", "user_2"], "timestamp": [datetime(2022, 5, 1, 12), datetime(2022, 5, 1, 15)]}
)
result_df = batch_fv.get_features_for_events(events=events_df, mock_inputs={"source_name": mock_data})
```
Result
| user_id | timestamp | user_transaction_counts__transaction_count_1d_1d | user_transaction_counts__transaction_count_30d_1d |
|---|---|---|---|
| user_1 | 2022-05-01 12:00:00 | 0 | 28 |
| user_2 | 2022-05-01 15:00:00 | 0 | 13 |
### 2. Testing Partial Aggregates
For batch features with aggregations, use `get_partial_aggregates()` to test intermediate aggregation results (time range only):

```python
# Test partial aggregation tiles
result_df = batch_fv.get_partial_aggregates(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    mock_inputs={"source_name": mock_data},
)
```
Result
| user_id | transaction_count_1d | _interval_start_time | _interval_end_time |
|---|---|---|---|
| user_1 | 4 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 |
| user_2 | 1 | 2022-05-01 00:00:00 | 2022-05-02 00:00:00 |
### 3. Testing Transformation Logic Only
Use `run_transformation()` to test just the transformation logic without aggregations (time range only). This runs the transformation function as it would for a materialization job over the time range `[start_time, end_time)`:

```python
# Test transformation logic only
result_df = batch_fv.run_transformation(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    mock_inputs={"source_name": mock_data},
)
```
Result
| user_id | timestamp | transaction | signup_timestamp | credit_card_issuer |
|---|---|---|---|---|
| user_1 | 2022-05-01 21:04:38 | 1 | 2021-01-01 06:25:57 | other |
| user_2 | 2022-05-01 19:45:14 | 1 | 2021-01-01 07:16:06 | Visa |
### 4. Testing Online Feature Store
Test the latest feature values from the online feature store using `get_online_features()`:

Note: do not use `get_online_features()` to read features in production. This method is intended for testing and does not have production-level performance. To read features online efficiently in production, see Reading Features for Inference.

```python
# Test reading the latest features from the online store
online_result = batch_fv.get_online_features({"user_id": "user_1"}).to_dict()
print(online_result)
```
Result
```python
{
    "transaction_count_1d_1d": 1,
    "transaction_count_30d_1d": 17,
    "transaction_count_90d_1d": 56,
}
```
## Understanding Batch Feature View Aggregation Levels
When a feature view has tile aggregations, the query operates in three logical steps:

- **Transformation** (`run_transformation()`): the feature view query runs over the provided time range `[start_time, end_time)` with user-defined transformations applied
- **Partial Aggregation** (`get_partial_aggregates()`): results are aggregated into tiles based on the `aggregation_interval`
- **Final Aggregation** (`get_features_in_range()` or `get_features_for_events()`): tiles are combined to form final feature values based on the `time_window` of each aggregation
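The three levels can be sketched in plain pandas. This is a simplified model of the tiling idea, not Tecton's implementation: a count aggregation is rolled into daily tiles, and tiles are then summed over a window. All column names and values here are illustrative.

```python
import pandas as pd
from datetime import datetime

# 1. Transformation output: raw rows as the transformation step would emit them.
rows = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "timestamp": pd.to_datetime([
        "2022-05-01 03:00", "2022-05-01 22:00",
        "2022-05-02 10:00", "2022-05-02 11:00",
    ]),
    "transaction": [1, 1, 1, 1],
})

# 2. Partial aggregation: roll rows into daily tiles (aggregation_interval = 1 day).
tiles = (
    rows.groupby(["user_id", rows["timestamp"].dt.floor("D")])["transaction"]
    .count()
    .rename("transaction_count_1d")
    .reset_index()
)

# 3. Final aggregation: combine tiles into a 2-day window starting 2022-05-01.
window = tiles[tiles["timestamp"] >= datetime(2022, 5, 1)]
final = window.groupby("user_id")["transaction_count_1d"].sum()
print(final.to_dict())
```

Because tiles are reused across windows, the 30-day and 90-day features in the earlier examples can be served from the same daily tiles without recomputing the transformation.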
## Mock Data Guidelines
When creating mock data for batch feature testing:
### Data Requirements
- Include all columns referenced in your feature view transformation
- Ensure timestamp columns are properly formatted
- Include entity join key columns
- Provide sufficient data to cover your test scenarios
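A minimal mock input meeting these requirements might look like the following. The `amount` column and the schema are assumptions for illustration; your mock data must mirror whatever columns your own transformation references.

```python
import pandas as pd

# Hypothetical mock input for a feature view keyed on user_id.
mock_data = pd.DataFrame({
    "user_id": ["user_1", "user_2"],                     # entity join key
    "timestamp": pd.to_datetime(["2022-05-01 12:00:00",  # proper datetime dtype
                                 "2022-05-01 15:00:00"]),
    "amount": [25.0, 110.5],                             # column used by the transformation
})

# Quick sanity checks before passing mock_data via mock_inputs.
assert {"user_id", "timestamp", "amount"} <= set(mock_data.columns)
assert pd.api.types.is_datetime64_any_dtype(mock_data["timestamp"])
```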
### Best Practices
- Use realistic data ranges that match your expected production data
- Include edge cases (nulls, extreme values, empty results)
- Test with data that spans multiple aggregation intervals
- Verify timestamp filtering works correctly with your mock data
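As one way to apply these practices, the hypothetical mock below packs in a null, an extreme value, an exact interval-start timestamp, and rows spanning several daily intervals, then checks the multi-interval coverage:

```python
import pandas as pd
import numpy as np

# Edge-case rows: a null amount, an extreme value, and timestamps spanning
# several daily aggregation intervals.
edge_data = pd.DataFrame({
    "user_id": ["user_1", "user_1", "user_2", "user_2"],
    "timestamp": pd.to_datetime([
        "2022-05-01 00:00:00",  # exact interval start (boundary case)
        "2022-05-03 23:59:59",  # end of a later interval
        "2022-05-02 12:00:00",
        "2022-05-04 06:00:00",
    ]),
    "amount": [np.nan, 1e9, 50.0, 50.0],
})

# Confirm the data spans more than one daily interval per user.
days_per_user = edge_data.groupby("user_id")["timestamp"].apply(
    lambda ts: ts.dt.floor("D").nunique()
)
print(days_per_user.to_dict())
```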
## Common Testing Scenarios
### Testing Non-Aggregate Features
For simple batch features without aggregations:
- Focus on transformation logic correctness
- Verify data filtering and joins work as expected
- Test with various data patterns and edge cases
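For transformation-logic checks, it can help to express the expected behavior as plain pandas and assert on it directly. The `status`/`amount` schema and the `transform` function below are hypothetical stand-ins for your own transformation:

```python
import pandas as pd

# A hypothetical transformation: keep completed transactions, derive a flag.
def transform(df: pd.DataFrame) -> pd.DataFrame:
    out = df[df["status"] == "completed"].copy()
    out["is_large"] = out["amount"] > 100
    return out[["user_id", "timestamp", "is_large"]]

mock = pd.DataFrame({
    "user_id": ["user_1", "user_1", "user_2"],
    "timestamp": pd.to_datetime(["2022-05-01", "2022-05-01", "2022-05-01"]),
    "status": ["completed", "failed", "completed"],
    "amount": [150.0, 30.0, 20.0],
})

result = transform(mock)
# The failed row is filtered out; the flag reflects the amount threshold.
assert list(result["is_large"]) == [True, False]
```

The same assertions can then be run against the output of `run_transformation()` with the same mock input.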
### Testing Aggregate Features
For batch features with Tecton-managed aggregations:
- Test each aggregation level (transformation, partial, final)
- Verify aggregation windows compute correctly
- Test boundary conditions (start/end of windows)
- Validate that different time windows produce expected results
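Boundary conditions follow the half-open `[start_time, end_time)` convention described above for `run_transformation()`. A small pandas check of that convention (illustrative timestamps, hypothetical data):

```python
import pandas as pd
from datetime import datetime

events = pd.DataFrame({
    "user_id": ["user_1"] * 3,
    "timestamp": pd.to_datetime([
        "2022-04-30 23:59:59",  # just before the window: excluded
        "2022-05-01 00:00:00",  # exactly at the window start: included
        "2022-05-02 00:00:00",  # exactly at the window end: excluded
    ]),
})

start, end = datetime(2022, 5, 1), datetime(2022, 5, 2)
# Half-open window [start, end): inclusive start, exclusive end.
in_window = events[(events["timestamp"] >= start) & (events["timestamp"] < end)]
assert len(in_window) == 1
```

Mock rows placed exactly on these boundaries make off-by-one errors in window logic easy to spot in the partial and final aggregation outputs.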
### Testing Feature Views that Use Incremental Backfills
For features using incremental backfills:
- Test that the incremental backfill job reads the expected amount of data
- Ensure that you run `run_transformation()` with a time period equal to the `batch_schedule` of the feature view
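To exercise a multi-day backfill one batch period at a time, you can generate the `(start_time, end_time)` pairs with plain datetime arithmetic; the one-day `batch_schedule` below is an assumption for illustration:

```python
from datetime import datetime, timedelta

batch_schedule = timedelta(days=1)  # assumed batch_schedule of the feature view
backfill_start = datetime(2022, 5, 1)
backfill_end = datetime(2022, 5, 4)

# Build one (start, end) pair per batch period.
periods = []
start = backfill_start
while start < backfill_end:
    periods.append((start, start + batch_schedule))
    start += batch_schedule

# Each pair would then be passed to run_transformation() in turn, e.g.:
# batch_fv.run_transformation(start_time=s, end_time=e, mock_inputs={...})
print(periods)
```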
## Example: Testing Aggregate Features End-to-End
This example demonstrates testing all three aggregation levels for a batch feature view with Tecton-managed aggregations:
```python
import tecton
import pandas as pd
from datetime import datetime

ws = tecton.get_workspace("prod")
agg_fv = ws.get_feature_view("user_transaction_counts")

# Create transaction data spanning multiple days for aggregation testing
transaction_data = pd.DataFrame(
    {
        "user_id": ["user_1", "user_2", "user_3"] * 5,
        "timestamp": [
            datetime(2022, 5, 1, 21, 4, 38),
            datetime(2022, 5, 1, 19, 45, 14),
            datetime(2022, 5, 1, 15, 18, 48),
            datetime(2022, 5, 1, 7, 11, 31),
            datetime(2022, 5, 1, 1, 50, 51),
            datetime(2022, 5, 2, 9, 30, 15),
            datetime(2022, 5, 2, 14, 20, 22),
            datetime(2022, 5, 2, 18, 45, 33),
            datetime(2022, 5, 3, 8, 15, 44),
            datetime(2022, 5, 3, 12, 30, 55),
            datetime(2022, 5, 3, 16, 45, 11),
            datetime(2022, 5, 3, 20, 0, 22),
            datetime(2022, 5, 4, 10, 15, 33),
            datetime(2022, 5, 4, 14, 30, 44),
            datetime(2022, 5, 4, 18, 45, 55),
        ],
        "transaction": [1] * 15,
    }
)

# 1. Test transformation logic
transformation_result = agg_fv.run_transformation(
    start_time=datetime(2022, 5, 1), end_time=datetime(2022, 5, 2), mock_inputs={"transactions": transaction_data}
)
print("Transformation output:")
display(transformation_result.to_pandas())

# 2. Test partial aggregates (tiles)
partial_result = agg_fv.get_partial_aggregates(
    start_time=datetime(2022, 5, 1), end_time=datetime(2022, 5, 2), mock_inputs={"transactions": transaction_data}
)
print("Partial aggregates (tiles):")
display(partial_result.to_pandas())

# 3. Test final aggregated features
final_result = agg_fv.get_features_in_range(
    start_time=datetime(2022, 5, 1), end_time=datetime(2022, 5, 2), mock_inputs={"transactions": transaction_data}
)
print("Final aggregated features:")
display(final_result.to_pandas())
```