Version: Beta 🚧

Realtime Feature View

A Realtime Feature View runs row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, Realtime Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations at request time.

Use Cases

Realtime Feature Views are useful for:

  • Calculating features from request-time data (e.g. current transaction, user location)
  • Creating features based on one or more upstream Materialized Feature Views
  • Post-processing feature data (e.g. null imputation)

Common Examples

  • Converting GPS coordinates to geohash
  • Parsing search strings
  • Comparing transactions against user averages
  • Calculating Z-Score or other statistical metrics
  • Computing embedding similarities
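
To make the first example above concrete, here is a minimal sketch of the standard geohash encoding in plain Python. It is not a Tecton API: the names `geohash_encode` and `_BASE32` are hypothetical, and the function is only the kind of row-level logic that could sit inside a python mode transformation.

```python
# Hypothetical helper, not part of the Tecton SDK.
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=5))  # u4pru
```

Shorter geohashes cover larger cells, so the `precision` argument controls how coarse the resulting location feature is.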

Defining Realtime Feature Views

Using Calculation Features

Calculations define row-level, SQL-like expressions that the Feature Server executes directly, without the overhead of a Python or Pandas transformation.

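from tecton import RealtimeFeatureView, Calculation

# transaction_metrics is an upstream Feature View defined elsewhere
transaction_analysis = RealtimeFeatureView(
    name="transaction_analysis",
    sources=[transaction_metrics],
    features=[
        Calculation(
            name="transaction_z_score",
            expr="COALESCE(transaction_metrics.amount, 0) / COALESCE(transaction_metrics.stddev, 1)",
        ),
    ],
)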

Caution

Feature Views that use Calculation Features cannot also use a python mode or pandas mode transformation function.
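
To show what the `COALESCE` expression in the Calculation above evaluates to when an input is null, here is a rough plain-Python equivalent. It is only a sketch of the expression's semantics, not how the Feature Server implements it; the helper names `coalesce` and `transaction_z_score` are hypothetical.

```python
from typing import Optional


def coalesce(*values):
    """Return the first argument that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def transaction_z_score(amount: Optional[float], stddev: Optional[float]) -> float:
    # Mirrors: COALESCE(amount, 0) / COALESCE(stddev, 1)
    return coalesce(amount, 0) / coalesce(stddev, 1)


print(transaction_z_score(50.0, 25.0))  # 2.0
print(transaction_z_score(None, 25.0))  # 0.0
print(transaction_z_score(50.0, None))  # 50.0
```

Note that `COALESCE` only guards against nulls: a `stddev` of exactly 0 still reaches the division (in this Python sketch it raises `ZeroDivisionError`), so the expression may need an explicit guard if that value can occur.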

Using Transformation Functions with Python Mode

For more complex transformations, Tecton supports python mode, which lets you define arbitrary Python transformations using a decorator pattern.

A Python transformation function accepts a dictionary of input sources and returns a dictionary of output features.

Attribute features are used to project the output of the python transformation function.

from tecton import realtime_feature_view, Attribute
from tecton.types import Float64


@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="python",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # Arbitrary Python calculating risk_score
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}
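
The dictionary-in, dictionary-out contract can be exercised without Tecton by calling the undecorated logic on sample inputs. The sketch below does that with made-up values. The second function, `calculate_risk_safe`, is a hypothetical variant that assumes a missing upstream feature arrives as `None`; that is an assumption about the input, so check how your sources represent missing values.

```python
def calculate_risk(txn_request, user_metrics):
    # Same body as the python mode transformation, without the decorator.
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}


def calculate_risk_safe(txn_request, user_metrics):
    # Hypothetical variant: assumes a missing fraud_score shows up as None.
    fraud_score = user_metrics.get("fraud_score")
    if fraud_score is None:
        return {"risk_score": None}
    return {"risk_score": txn_request["amount"] * fraud_score}


# Sample inputs: one dictionary per source, keyed by field name.
txn_request = {"amount": 120.0}
user_metrics = {"fraud_score": 0.25}

print(calculate_risk(txn_request, user_metrics))  # {'risk_score': 30.0}
print(calculate_risk_safe(txn_request, {"fraud_score": None}))  # {'risk_score': None}
```

The keys of the returned dictionary match the names declared in `features`, which is how the Attribute features project the output.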

Using Transformation Functions with pandas Mode

Similarly, pandas mode lets you define transformations that accept Pandas DataFrames as input and return a Pandas DataFrame of output features.

Attribute features are used to project the output of the pandas transformation function.

import pandas as pd
from tecton import realtime_feature_view, Attribute
from tecton.types import Float64


@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="pandas",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # Calculate risk score using pandas operations
    result = pd.DataFrame()
    result["risk_score"] = txn_request["amount"] * user_metrics["fraud_score"]
    return result
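
As with python mode, the pandas logic can be tried locally on sample DataFrames. The sketch below runs the undecorated function on two made-up rows; the sample values are assumptions chosen for illustration.

```python
import pandas as pd


def calculate_risk(txn_request: pd.DataFrame, user_metrics: pd.DataFrame) -> pd.DataFrame:
    # Same body as the pandas mode transformation, without the decorator.
    result = pd.DataFrame()
    result["risk_score"] = txn_request["amount"] * user_metrics["fraud_score"]
    return result


# Sample inputs: one DataFrame per source, one row per request.
txn_request = pd.DataFrame({"amount": [120.0, 40.0]})
user_metrics = pd.DataFrame({"fraud_score": [0.25, 0.5]})

out = calculate_risk(txn_request, user_metrics)
print(out["risk_score"].tolist())  # [30.0, 20.0]
```

The multiplication is applied to whole columns at once, and pandas aligns the two Series by index, so both inputs here share the same default row index.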

Realtime Feature View Best Practices

Use Calculation Features when:

  • Your use case can be accomplished using the set of supported SQL functions.
  • Performance is critical. Because they avoid the overhead of a Python or Pandas transformation, Calculation Features are more efficient than python mode and pandas mode transformations for most use cases.

Use Python or Pandas Mode Transformations when you need:

  • Full Python capabilities
  • Complex algorithms / logic
  • External libraries
  • External API calls

Performance Considerations

python mode is recommended for more efficient online serving. pandas mode is recommended if you want to optimize for offline retrieval.

Calculation Features are recommended for simple transformations and are most performant for both online and offline retrieval.

Examples

from tecton import RealtimeFeatureView, RequestSource, Calculation
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages


# Request-time input: the amount of the transaction being scored
transaction_request = RequestSource(schema=[Field("amount", Float64)])

# True when the current amount exceeds the user's average (treated as 0 if missing)
transaction_amount_comparison = RealtimeFeatureView(
    name="transaction_amount_comparison",
    sources=[transaction_request, user_transaction_amount_averages],
    features=[
        Calculation(
            name="transaction_amount_is_higher_than_average",
            feature_type=Bool,
            expr="transaction_request.amount > COALESCE(user_transaction_amount_averages.amount_mean_24h_10m, 0)",
        )
    ],
)
