Version: Beta 🚧

Realtime Feature View

A Realtime Feature View runs row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, Realtime Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations at request time.

Use Cases

Realtime Feature Views are useful for:

  • Calculating features from request-time data (e.g. current transaction, user location)
  • Creating features based on one or more upstream Materialized Feature Views
  • Defining features without rematerializing Feature Store data
  • Post-processing feature data (e.g. null imputation)

Common Examples

  • Converting GPS coordinates to geohash
  • Parsing search strings
  • Comparing transactions against user averages
  • Calculating Z-Score or other statistical metrics
  • Computing embedding similarities
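
As an illustration of the last item, here is a minimal sketch of an embedding similarity computed in plain Python. It is not Tecton-specific: the function name cosine_similarity and the convention of returning 0.0 for a zero vector are assumptions made for this example. The same logic could be placed inside a python mode transformation, as described later on this page.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: a zero vector has no direction, so report 0.0
        return 0.0
    return dot / (norm_a * norm_b)


# Orthogonal vectors have no similarity
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```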

Defining Realtime Feature Views

Using Attribute Features

Attribute features extract an input field from a source or from the output of a transformation. To project an input field, name the Attribute with the source name and the field name joined by a double underscore (__):

transaction_analysis = RealtimeFeatureView(
    sources=[transaction_aggregates],  # A Batch Feature View with Aggregate Features
    features=[
        Attribute("transaction_aggregates__amount_avg_7d_1d", Float64),
    ],
)

Using Calculation Features

Calculation features define SQL-like expressions that execute directly in the Feature Server, avoiding the overhead of a Python or Pandas transformation.

transaction_analysis = RealtimeFeatureView(
    sources=[transaction_metrics],
    features=[
        Calculation(
            name="transaction_z_score",
            expr="COALESCE(transaction_metrics.amount, 0) / COALESCE(transaction_metrics.stddev, 1)",
        ),
    ],
)

Caution: Feature Views that use Calculation Features cannot use a transformation function.
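
To make the null handling in the expression above concrete, the following is a plain-Python sketch of the same arithmetic. It is only an analogy: the helper coalesce and the function transaction_z_score are hypothetical names written for this example, and the sketch assumes COALESCE follows the usual SQL meaning (return the first non-null argument). It does not describe how the Feature Server evaluates the expression.

```python
from typing import Optional


def coalesce(*values):
    """Return the first argument that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def transaction_z_score(amount: Optional[float], stddev: Optional[float]) -> float:
    # Mirrors: COALESCE(amount, 0) / COALESCE(stddev, 1)
    return coalesce(amount, 0) / coalesce(stddev, 1)


print(transaction_z_score(50.0, 10.0))  # 5.0
print(transaction_z_score(None, 10.0))  # 0.0 (missing amount treated as 0)
print(transaction_z_score(50.0, None))  # 50.0 (missing stddev treated as 1)
```

COALESCE only substitutes for missing values. A stddev that is present but equal to zero is not replaced, and in this Python sketch it would raise ZeroDivisionError.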

Using Transformation Functions with Python Mode

For more complex transformations, Tecton supports python mode, which lets you define arbitrary Python transformations using a decorator pattern.

A Python transformation function accepts a dictionary of input sources and returns a dictionary of output features.

@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="python",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # Any Python logic can go here; this example scales the amount by the fraud score
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}
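
Because the transformation is a dictionary-in, dictionary-out function, its logic can be exercised as ordinary Python before it is wrapped in the decorator. The sketch below does that with made-up input values. The second function, calculate_risk_null_safe, is a hypothetical variant that illustrates null imputation. It assumes a missing upstream feature arrives as None, and the default of 0.0 is chosen only for illustration.

```python
def calculate_risk(txn_request, user_metrics):
    # Same body as the python mode example, without the Tecton decorator
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}


def calculate_risk_null_safe(txn_request, user_metrics):
    # Hypothetical variant: impute a default when the upstream feature is missing
    fraud_score = user_metrics.get("fraud_score")
    if fraud_score is None:
        fraud_score = 0.0  # assumed default, chosen for this example
    return {"risk_score": txn_request["amount"] * fraud_score}


print(calculate_risk({"amount": 120.0}, {"fraud_score": 0.25}))
# {'risk_score': 30.0}
print(calculate_risk_null_safe({"amount": 120.0}, {"fraud_score": None}))
# {'risk_score': 0.0}
```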

Using Transformation Functions with pandas Mode

Similarly, pandas mode lets you define transformations that accept Pandas DataFrames as input and return a Pandas DataFrame of output features.

@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="pandas",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # Imported inside the function so the transformation is self-contained
    import pandas as pd

    # Calculate risk score using pandas operations
    result = pd.DataFrame()
    result["risk_score"] = txn_request["amount"] * user_metrics["fraud_score"]
    return result

Realtime Feature View Best Practices

Use Calculation Features when:

  • Your use case can be accomplished using the set of supported SQL functions.
  • Performance is critical. Because they avoid the overhead of a Python or Pandas transformation, Calculation Features are more efficient than python and pandas mode transformations for most use cases.

Use Python or Pandas Mode Transformations when you need:

  • Full Python capabilities
  • Complex algorithms / logic
  • External libraries
  • External API calls

Performance Considerations

Use python mode to optimize for online serving, and pandas mode to optimize for offline retrieval.

Calculation Features are recommended for simple transformations; they are performant for both online and offline retrieval.

Examples

from tecton import RealtimeFeatureView, RequestSource, Calculation
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages


# Request-time input: the amount of the transaction being scored
transaction_request = RequestSource(schema=[Field("amount", Float64)])

# True when the requested amount exceeds the user's 24-hour mean (a missing mean is treated as 0)
transaction_amount_comparison = RealtimeFeatureView(
    name="transaction_amount_comparison",
    sources=[transaction_request, user_transaction_amount_averages],
    features=[
        Calculation(
            name="transaction_amount_is_higher_than_average",
            feature_type=Bool,
            expr="transaction_request.amount > COALESCE(user_transaction_amount_averages.amount_mean_24h_10m, 0)",
        )
    ],
)
