tecton.on_demand_feature_view

tecton.on_demand_feature_view(output_schema, mode, inputs, description='', owner='', family='', tags=None, name_override=None)

Declare an on-demand feature view.
- Parameters
  output_schema (StructType) – Spark schema matching the expected output.
  mode (str) – Whether the annotated function is a pipeline function (PIPELINE_MODE) or a transformation function (SPARK_SQL_MODE, PYSPARK_MODE or PANDAS_MODE). If it is a transformation mode, the pipeline function is inferred.
  inputs (Dict[str, Input]) – The inputs passed into the pipeline.
  description (str) – (Optional) description.
  owner (str) – Owner name (typically the email of the primary maintainer).
  family (str) – (Optional) Family of this Feature View, used to group Tecton Objects.
  tags (Optional[Dict[str, str]]) – (Optional) Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).
  name_override (Optional[str]) – Unique, human-friendly name override that identifies the FeatureView.
- Returns
An On Demand Feature View.
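
To make the shape of the call easier to picture, here is a minimal sketch of a decorator factory that accepts the same parameter list. It is not Tecton's implementation: `toy_on_demand_feature_view`, `ToyFeatureView`, and `ALLOWED_MODES` are hypothetical names, the mode spellings other than 'pandas' are assumed, and falling back to the function name when `name_override` is not given is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical mode strings for this sketch only. 'pandas' is the value the
# documented example passes; the other spellings are assumptions.
ALLOWED_MODES = {"pipeline", "spark_sql", "pyspark", "pandas"}


@dataclass
class ToyFeatureView:
    """Hypothetical stand-in for the object a declaration returns."""
    name: str
    mode: str
    inputs: Dict[str, object]
    output_schema: object
    func: Callable
    description: str = ""
    owner: str = ""
    family: str = ""
    tags: Dict[str, str] = field(default_factory=dict)


def toy_on_demand_feature_view(output_schema, mode, inputs, description="",
                               owner="", family="", tags=None,
                               name_override=None):
    """Toy decorator factory mirroring the documented parameter list."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")

    def decorator(func):
        # Assumption: without name_override, the function name is used.
        return ToyFeatureView(
            name=name_override or func.__name__,
            mode=mode,
            inputs=dict(inputs),
            output_schema=output_schema,
            func=func,
            description=description,
            owner=owner,
            family=family,
            tags=dict(tags or {}),
        )

    return decorator


@toy_on_demand_feature_view(
    output_schema=["transaction_amount_is_high"],  # placeholder, not a Spark schema
    mode="pandas",
    inputs={"transaction_request": object()},      # placeholder input
    tags={"release": "production"},
)
def amount_is_high(transaction_request):
    return transaction_request


print(amount_is_high.name, amount_is_high.mode, amount_is_high.tags)
```

The point of the sketch is only that the decorator is called with keyword arguments first and then applied to the function, so the decorated name refers to the declared object rather than to a plain function.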
An example declaration of an on-demand feature view:

from tecton import RequestDataSource, Input, on_demand_feature_view
from pyspark.sql.types import DoubleType, StructType, StructField, LongType
import pandas

# Define the request schema
request_schema = StructType()
request_schema.add(StructField('amount', DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

# Define the output schema
output_schema = StructType()
output_schema.add(StructField('transaction_amount_is_high', LongType()))

# This On-Demand Feature View evaluates a transaction amount and declares it "high" if it is 10,000 or more
@on_demand_feature_view(
    inputs={'transaction_request': Input(transaction_request)},
    mode='pandas',
    output_schema=output_schema,
    family='fraud',
    owner='matt@tecton.ai',
    tags={'release': 'production'},
    description='Whether the transaction amount is considered high ($10000 or more)'
)
def transaction_amount_is_high(transaction_request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas as pd

    df = pd.DataFrame()
    df['transaction_amount_is_high'] = (transaction_request['amount'] >= 10000).astype('int64')
    return df
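
For readers who want to see what the transformation computes without a Tecton or pandas environment, the following is a rough plain-Python equivalent of the comparison in the example. `flag_high_amounts` and `HIGH_AMOUNT_THRESHOLD` are hypothetical names introduced here; the sketch only mirrors the `>= 10000` check and the integer output column, not how Tecton evaluates the feature view.

```python
from typing import Dict, List

# Same cut-off the example's comparison uses.
HIGH_AMOUNT_THRESHOLD = 10000


def flag_high_amounts(requests: List[Dict[str, float]]) -> List[Dict[str, int]]:
    """Return one output row per request: 1 if the amount is high, else 0."""
    return [
        {"transaction_amount_is_high": int(row["amount"] >= HIGH_AMOUNT_THRESHOLD)}
        for row in requests
    ]


rows = flag_high_amounts([
    {"amount": 9999.99},
    {"amount": 10000.0},
    {"amount": 25000.0},
])
print(rows)
```

Because the comparison is `>=`, an amount of exactly 10000 is flagged as high, while 9999.99 is not.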