tecton.on_demand_feature_view

tecton.on_demand_feature_view(*, mode, inputs, output_schema, description=None, owner=None, family=None, tags=None, name_override=None)

Declare an on-demand Feature View.

Parameters
  • mode (str) – Whether the annotated function is a pipeline function (“pipeline” mode) or a transformation function (“python” or “pandas” mode). For the non-pipeline modes (“python” and “pandas”), an inferred transformation will also be registered.

  • inputs (Dict[str, Input]) – The inputs passed into the pipeline. An Input can be a RequestDataSource or a materialized Feature View.

  • output_schema (Union[StructType, List[Field]]) – Spark schema matching the expected output: a dictionary in “python” mode or a Pandas DataFrame in “pandas” mode.

  • description (Optional[str]) – Human readable description.

  • owner (Optional[str]) – Owner name (typically the email of the primary maintainer).

  • family (Optional[str]) – Family of this Feature View, used to group Tecton Objects.

  • tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).

  • name_override (Optional[str]) – Unique, human-friendly name override that identifies the Feature View.

Returns

An object of type tecton.feature_views.OnDemandFeatureView.

An example declaration of an on-demand feature view using Python mode. With Python mode, the function inputs will be dictionaries, and the function is expected to return a dictionary matching the schema from output_schema. Tecton recommends using Python mode for improved online serving performance.

from tecton import RequestDataSource, Input, on_demand_feature_view
from pyspark.sql.types import DoubleType, StructType, StructField, LongType

request_schema = StructType([
    StructField('amount', DoubleType())
])
transaction_request = RequestDataSource(request_schema=request_schema)

output_schema = StructType([
    StructField('transaction_amount_is_high', LongType())
])


# This On-Demand Feature View flags a transaction amount as "high" if it is $10,000 or more
@on_demand_feature_view(
    inputs={'transaction_request': Input(transaction_request)},
    mode='python',
    output_schema=output_schema,
    family='fraud',
    owner='matt@tecton.ai',
    tags={'release': 'production'},
    description='Whether the transaction amount is considered high (over $10000)'
)
def transaction_amount_is_high(transaction_request):
    result = {}
    result['transaction_amount_is_high'] = int(transaction_request['amount'] >= 10000)
    return result
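
For illustration, this transformation simply maps a request dictionary to a result dictionary. The snippet below is a hypothetical stand-in for that logic, shown outside of the decorator; it is not part of the Tecton API and only demonstrates the expected input and output shapes.

# Hypothetical stand-in for the transformation logic above (illustration only).
def amount_is_high(request):
    return {'transaction_amount_is_high': int(request['amount'] >= 10000)}

amount_is_high({'amount': 15000.0})  # {'transaction_amount_is_high': 1}
amount_is_high({'amount': 42.0})     # {'transaction_amount_is_high': 0}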

An example declaration of an on-demand feature view using Pandas mode. With Pandas mode, the function inputs will be Pandas DataFrames, and the function is expected to return a Pandas DataFrame matching the schema from output_schema.

from tecton import RequestDataSource, Input, on_demand_feature_view
from pyspark.sql.types import DoubleType, StructType, StructField, LongType
import pandas

# Define the request schema
request_schema = StructType()
request_schema.add(StructField('amount', DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

# Define the output schema
output_schema = StructType()
output_schema.add(StructField('transaction_amount_is_high', LongType()))

# This On-Demand Feature View flags a transaction amount as "high" if it is $10,000 or more
@on_demand_feature_view(
    inputs={'transaction_request': Input(transaction_request)},
    mode='pandas',
    output_schema=output_schema,
    family='fraud',
    owner='matt@tecton.ai',
    tags={'release': 'production'},
    description='Whether the transaction amount is considered high (over $10000)'
)
def transaction_amount_is_high(transaction_request):
    import pandas as pd

    df = pd.DataFrame()
    df['transaction_amount_is_high'] = (transaction_request['amount'] >= 10000).astype('int64')
    return df
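
As noted under the inputs parameter, an Input can also wrap a materialized Feature View, which lets an on-demand feature view combine request data with precomputed features. The sketch below assumes a previously declared batch Feature View named user_transaction_averages that serves a yearly_average feature; that Feature View and its column name are hypothetical, and the pattern otherwise mirrors the Python mode example above.

from tecton import Input, on_demand_feature_view
from pyspark.sql.types import LongType, StructType, StructField

higher_than_average_schema = StructType([
    StructField('transaction_amount_is_higher_than_average', LongType())
])


# Compares the requested transaction amount against a precomputed feature from a
# hypothetical batch Feature View named user_transaction_averages.
@on_demand_feature_view(
    inputs={
        'transaction_request': Input(transaction_request),
        'user_transaction_averages': Input(user_transaction_averages)
    },
    mode='python',
    output_schema=higher_than_average_schema,
    family='fraud',
    owner='matt@tecton.ai',
    tags={'release': 'production'},
    description="Whether the transaction amount exceeds the user's yearly average"
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_averages):
    amount = transaction_request['amount']
    yearly_average = user_transaction_averages['yearly_average']
    return {'transaction_amount_is_higher_than_average': int(amount > yearly_average)}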