tecton.on_demand_feature_view

tecton.on_demand_feature_view(*, output_schema, mode, inputs, description=None, owner=None, family=None, tags=None, name_override=None)

Declares an on-demand feature view.

Parameters
  • output_schema (StructType) – Spark schema matching the expected output.

  • mode (str) – Whether the annotated function is a pipeline function (PIPELINE_MODE) or a transformation function (SPARK_SQL_MODE, PYSPARK_MODE or PANDAS_MODE). If a transformation mode is given, the pipeline function is inferred from it.

  • inputs (Dict[str, Input]) – The inputs passed into the pipeline.

  • description (Optional[str]) – Human-readable description.

  • owner (Optional[str]) – Owner name (typically the email of the primary maintainer).

  • family (Optional[str]) – Family of this Feature View, used to group Tecton Objects.

  • tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).

  • name_override (Optional[str]) – Unique, human-friendly name override that identifies the FeatureView.

Returns

An object of type tecton.feature_views.OnDemandFeatureView.

An example declaration of an on-demand feature view:

from tecton import RequestDataSource, Input, on_demand_feature_view
from pyspark.sql.types import DoubleType, StructType, StructField, LongType
import pandas

# Define the request schema
request_schema = StructType()
request_schema.add(StructField('amount', DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

# Define the output schema
output_schema = StructType()
output_schema.add(StructField('transaction_amount_is_high', LongType()))

# This On-Demand Feature View flags a transaction amount as "high"
# when it is 10,000 or more (matching the >= comparison in the function body).
@on_demand_feature_view(
    inputs={'transaction_request': Input(transaction_request)},
    mode='pandas',
    output_schema=output_schema,
    family='fraud',
    owner='matt@tecton.ai',
    tags={'release': 'production'},
    description='Whether the transaction amount is considered high (over $10000)'
)
def transaction_amount_is_high(transaction_request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas as pd

    df = pd.DataFrame()
    # Compare each amount to the threshold, then cast the boolean result to 0/1
    df['transaction_amount_is_high'] = (transaction_request['amount'] >= 10000).astype('int64')
    return df
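
The transformation logic in the example above can be sanity-checked locally with plain pandas before it is registered. The following is a minimal sketch, not part of the Tecton API: it copies the function body without the decorator, and the helper name transaction_amount_is_high_logic and the sample amounts are assumptions made for illustration.

```python
import pandas as pd


def transaction_amount_is_high_logic(transaction_request: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical undecorated copy of the feature view body, for local testing only.
    df = pd.DataFrame()
    # Boolean comparison against the threshold, cast to 0/1 integers.
    df['transaction_amount_is_high'] = (transaction_request['amount'] >= 10000).astype('int64')
    return df


# Made-up request data: below, exactly at, and above the threshold.
sample = pd.DataFrame({'amount': [500.0, 10000.0, 25000.0]})
result = transaction_amount_is_high_logic(sample)

# The output should expose exactly the field named in output_schema,
# as 64-bit integers (which corresponds to Spark's LongType).
print(list(result.columns))
print(result['transaction_amount_is_high'].dtype)
print(result['transaction_amount_is_high'].tolist())
```

Because the comparison is >=, an amount of exactly 10000 is flagged as high. Checking the column names and dtype against output_schema in this way can catch mismatches early, although it does not exercise the Tecton decorator itself.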