tecton.on_demand_feature_view(output_schema, mode, inputs, description='', owner='', family='', tags=None, name_override=None)

Declare an on-demand feature view. Used as a decorator on the pipeline or transformation function.

  • output_schema (StructType) – Spark schema matching the expected output.

  • mode (str) – Whether the annotated function is a pipeline function (PIPELINE_MODE) or a transformation function (SPARK_SQL_MODE, PYSPARK_MODE or PANDAS_MODE). For a transformation mode, the pipeline function is inferred.

  • inputs (Dict[str, Input]) – The inputs passed into the pipeline.

  • description (str) – (Optional) description.

  • owner (str) – (Optional) Owner name (typically the email of the primary maintainer).

  • family (str) – (Optional) Family of this Feature View, used to group Tecton Primitives.

  • tags (Optional[Dict[str, str]]) – (Optional) Tags associated with this Tecton Primitive (key-value pairs of arbitrary metadata).

  • name_override (Optional[str]) – (Optional) Unique, human-friendly name override that identifies the FeatureView.


Returns an On Demand Feature View.

An example declaration of an on-demand feature view:

from tecton import RequestDataSource, Input, on_demand_feature_view
from pyspark.sql.types import DoubleType, StructType, StructField, LongType
import pandas

# Define the request schema
request_schema = StructType()
request_schema.add(StructField('amount', DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

# Define the output schema
output_schema = StructType()
output_schema.add(StructField('transaction_amount_is_high', LongType()))

# This On-Demand Feature View flags a transaction amount as "high" when it is 10,000 or more
@on_demand_feature_view(
    mode='pandas',
    inputs={'transaction_request': Input(transaction_request)},
    output_schema=output_schema,
    tags={'release': 'production'},
    description='Whether the transaction amount is considered high ($10000 or more)'
)
def transaction_amount_is_high(transaction_request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas as pd

    df = pd.DataFrame()
    df['transaction_amount_is_high'] = (transaction_request['amount'] >= 10000).astype('int64')
    return df
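
The transformation body in the example above is ordinary pandas code, so its logic can be checked locally without Tecton. The following is a minimal sketch under that assumption: amount_is_high is a hypothetical, undecorated copy of the function body, and the sample amounts are made up for illustration. It does not exercise the decorator, the request data source, or the output schema.

```python
import pandas as pd


def amount_is_high(transaction_request: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical undecorated copy of the feature view body shown above.
    df = pd.DataFrame()
    df['transaction_amount_is_high'] = (
        transaction_request['amount'] >= 10000
    ).astype('int64')
    return df


# Made-up request rows: just below, exactly at, and above the threshold.
sample = pd.DataFrame({'amount': [9999.99, 10000.0, 25000.0]})
result = amount_is_high(sample)
print(result['transaction_amount_is_high'].tolist())  # [0, 1, 1]
```

The comparison is inclusive (>=), so an amount of exactly 10000 is flagged. The column is cast to int64, which corresponds to the LongType field declared in the output schema.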