Realtime Feature View
A Realtime Feature View runs row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, Realtime Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations at request time.
Use Cases
Realtime Feature Views are useful for:
- Calculating features from request-time data (e.g. current transaction, user location)
- Creating features based on one or more upstream Materialized Feature Views
- Defining features without rematerializing Feature Store data
- Post-processing feature data (e.g. null imputation)
Common Examples
- Converting GPS coordinates to geohash
- Parsing search strings
- Comparing transactions against user averages
- Calculating Z-Score or other statistical metrics
- Computing embedding similarities
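Several of these examples reduce to small row-level functions. As a rough sketch in plain Python (not Tecton API; the function names `z_score` and `cosine_similarity` are illustrative assumptions), a Z-Score and an embedding similarity might be computed like this:

```python
import math


def z_score(value, mean, stddev):
    """Return how many standard deviations `value` lies from `mean`."""
    if not stddev:  # guards against both None and 0
        return 0.0
    return (value - mean) / stddev


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 is a choice
    return dot / (norm_a * norm_b)
```

Returning 0.0 for a missing standard deviation or a zero vector is a design choice made for this sketch; a real feature may prefer to return a null instead.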
Defining Realtime Feature Views
Using Attribute Features
Attribute features extract an input field from a source or from the output of a transformation. To project an input field, use the double-underscore syntax, <source>__<field>, to access the field in the source:
transaction_analysis = RealtimeFeatureView(
    sources=[transaction_aggregates],  # A Batch Feature View with Aggregate Features
    features=[
        Attribute("transaction_aggregates__amount_avg_7d_1d", Float64),
    ],
)
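To make the naming convention concrete, here is a minimal sketch of how a `<source>__<field>` name could be resolved against a set of source rows. The helper `project_attribute` and the sample value are hypothetical and only illustrate the convention; this is not how Tecton resolves Attribute features internally.

```python
def project_attribute(feature_name, sources):
    """Resolve a '<source>__<field>' name against a dict of source rows."""
    # Split on the first double underscore only, because field names may
    # themselves contain single underscores (e.g. amount_avg_7d_1d).
    source_name, sep, field_name = feature_name.partition("__")
    if not sep:
        raise ValueError(f"expected '<source>__<field>', got {feature_name!r}")
    return sources[source_name][field_name]


# Hypothetical source row keyed by source name.
sources = {"transaction_aggregates": {"amount_avg_7d_1d": 42.5}}
value = project_attribute("transaction_aggregates__amount_avg_7d_1d", sources)
```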
Using Calculation Features
Calculations are used to define SQL-like expressions which will be efficiently executed directly in the Feature Server without the overhead of a Python or Pandas transformation.
transaction_analysis = RealtimeFeatureView(
    sources=[transaction_metrics],
    features=[
        Calculation(
            name="transaction_z_score",
            expr="COALESCE(transaction_metrics.amount, 0) / COALESCE(transaction_metrics.stddev, 1)",
        ),
    ],
)
Feature Views that use Calculation Features cannot use a transformation function.
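For readers less familiar with SQL, the expr string above can be read as the following plain-Python sketch. The `coalesce` helper and the `transaction_z_score` function are written here for illustration only and assume that a missing field appears as None; the Feature Server evaluates the expression itself, not this code.

```python
def coalesce(*values):
    """Return the first argument that is not None, like SQL COALESCE."""
    for value in values:
        if value is not None:
            return value
    return None


def transaction_z_score(transaction_metrics):
    """Plain-Python reading of the expr string in the Calculation above."""
    amount = coalesce(transaction_metrics.get("amount"), 0)
    stddev = coalesce(transaction_metrics.get("stddev"), 1)
    return amount / stddev
```

The defaults mean a missing amount yields 0 and a missing stddev leaves the amount unscaled. A stddev of exactly 0 would raise ZeroDivisionError in this Python version; how a Calculation treats that case is not covered here.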
Using Transformation Functions with Python Mode
For more complex transformations, Tecton supports python mode, which allows you to define arbitrary Python transformations using a decorator pattern.
A Python transformation function accepts a dictionary of input sources and returns a dictionary of output features.
@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="python",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # Arbitrary Python calculating risk_score
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}
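The transformation logic itself is plain Python, so the dictionary-in, dictionary-out contract can be checked without Tecton. The sketch below repeats the function body from the example above without the decorator; the input values are made up for illustration.

```python
def calculate_risk(txn_request, user_metrics):
    # Each source arrives as a dictionary keyed by field name.
    return {"risk_score": txn_request["amount"] * user_metrics["fraud_score"]}


# Hypothetical inputs for a single request.
output = calculate_risk({"amount": 200.0}, {"fraud_score": 0.25})
print(output)  # {'risk_score': 50.0}
```

The returned keys should match the names declared in features, here risk_score.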
Using Transformation Functions with pandas Mode
Similarly, pandas mode allows you to define transformations that accept Pandas DataFrames as input and return a Pandas DataFrame of output features.
@realtime_feature_view(
    sources=[txn_request, user_metrics],
    mode="pandas",
    features=[Attribute("risk_score", Float64)],
)
def calculate_risk(txn_request, user_metrics):
    # pd must be in scope; importing it here keeps the snippet self-contained
    import pandas as pd

    # Calculate risk score using pandas operations
    result = pd.DataFrame()
    result["risk_score"] = txn_request["amount"] * user_metrics["fraud_score"]
    return result
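The same logic can be tried locally on small DataFrames. This is a sketch with the decorator removed and with made-up input rows; it assumes both inputs share the same row index, since pandas aligns the multiplication on the index.

```python
import pandas as pd


def calculate_risk(txn_request, user_metrics):
    # Each source arrives as a DataFrame; the multiplication applies to
    # whole columns and is aligned row by row on the index.
    result = pd.DataFrame()
    result["risk_score"] = txn_request["amount"] * user_metrics["fraud_score"]
    return result


# Hypothetical inputs: two rows, one per request.
txn_request = pd.DataFrame({"amount": [200.0, 80.0]})
user_metrics = pd.DataFrame({"fraud_score": [0.25, 0.5]})
output = calculate_risk(txn_request, user_metrics)
```

Building a fresh result DataFrame, rather than adding columns to an input, keeps the output limited to the declared features.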
Realtime Feature View Best Practices
Use Calculation Features when:
- Your use case can be accomplished using the set of supported SQL functions.
- Performance is critical. Because they avoid the overhead of a Python or Pandas transformation, Calculation Features are more efficient than Python and Pandas mode transformations for most use cases.
Use Python or Pandas Mode Transformations when you need:
- Full Python capabilities
- Complex algorithms / logic
- External libraries
- External API calls
python mode is recommended for more efficient online serving. pandas mode is recommended if you want to optimize for more efficient offline retrieval.
Calculation Features are recommended for simple transformations that are performant for both online and offline retrieval.
Examples
The following three examples implement the same feature using Calculations, Python mode, and Pandas mode, in that order.
# Calculation Feature version
from tecton import RealtimeFeatureView, RequestSource, Calculation
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages

transaction_request = RequestSource(schema=[Field("amount", Float64)])

transaction_amount_comparison = RealtimeFeatureView(
    name="transaction_amount_comparison",
    sources=[transaction_request, user_transaction_amount_averages],
    features=[
        Calculation(
            name="transaction_amount_is_higher_than_average",
            feature_type=Bool,
            # A missing 24h average is treated as 0
            expr="transaction_request.amount > COALESCE(user_transaction_amount_averages.amount_mean_24h_10m, 0)",
        )
    ],
)
# python mode version
from tecton import realtime_feature_view, RequestSource, Attribute
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages

transaction_request = RequestSource(schema=[Field("amount", Float64)])


@realtime_feature_view(
    sources=[transaction_request, user_transaction_amount_averages],
    mode="python",
    features=[Attribute("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_amount_averages):
    # A missing (None) 24h average is treated as 0
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"] or 0
    return {"transaction_amount_is_higher_than_average": transaction_request["amount"] > amount_mean}
# pandas mode version
from tecton import realtime_feature_view, RequestSource, Attribute
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages

transaction_request = RequestSource(schema=[Field("amount", Float64)])


@realtime_feature_view(
    sources=[transaction_request, user_transaction_amount_averages],
    mode="pandas",
    features=[Attribute("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_amount_averages):
    import pandas as pd

    # Treat a missing 24h average as 0, matching the Calculation and Python versions
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"].fillna(0)
    # Build a new DataFrame instead of mutating the input
    result = pd.DataFrame()
    result["transaction_amount_is_higher_than_average"] = transaction_request["amount"] > amount_mean
    return result