Version: 0.8

On-Demand Feature Views and Struct Types

Note: This feature is not supported in Tecton on Snowflake. If you are interested in this functionality, please file a feature request.

On-Demand Feature Views that Consume Struct Types

An On-Demand Feature View (ODFV) can depend on sources that output a Struct data type, such as a BatchFeatureView or a RequestSource. Keep the following limitations in mind when an ODFV depends on a source with Struct types.

On all Computes

  • Pandas mode ODFVs cannot have a RequestSource with a Struct type as a source.

On Spark Compute

Struct types are immutable in offline queries in python mode

In most cases, Tecton feature view definitions are reusable across offline and online queries. The exception is a python mode ODFV that depends on a source with a Struct when the offline compute is Spark.

When executing a python mode ODFV offline on Spark, the ODFV's transform function is executed as a python UDF. PySpark passes the source's Struct to the transform as a pyspark.sql.Row object, which is immutable. In online queries, however, Tecton passes the source's Struct to the transform as a dict, which is mutable.

As a result, if the ODFV transform mutates a source's Struct, offline queries fail with an error like the following:

Running the transformation resulted in the following error: TypeError: 'Row' object does not support item assignment

You can account for Row immutability by converting Row objects to dicts with Row.asDict() in your transform function before mutating them. The ODFV then works as expected in both online and offline queries.

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Field, String, Struct

request_source = RequestSource(
    [
        Field(
            "struct_field",
            Struct(
                [
                    Field("string_field", String),
                ]
            ),
        ),
    ]
)


@on_demand_feature_view(
    mode="python",
    sources=[request_source],
    ...,
)
def my_odfv(request):
    from pyspark.sql import Row

    # Offline on Spark the Struct arrives as an immutable Row; online it is a dict.
    with_spark = isinstance(request["struct_field"], Row)
    struct_field = request["struct_field"].asDict(recursive=True) if with_spark else request["struct_field"]

    struct_field["string_field"] += "_some_suffix"

    return {"struct_feature": struct_field}
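The conversion can also be factored into a small helper. The following is a minimal sketch, not part of the Tecton API: `to_mutable_struct` is a hypothetical name, and it duck-types on the `asDict` method instead of importing `pyspark`, on the assumption that only Row-like objects expose that method. `FakeRow` is a stand-in for `pyspark.sql.Row`, included only so the snippet runs without Spark.

```python
import copy


def to_mutable_struct(value):
    """Return a mutable dict for a Struct value (hypothetical helper).

    Row-like objects (anything exposing asDict) are converted recursively;
    plain dicts are deep-copied so the caller's input is left untouched.
    """
    if hasattr(value, "asDict"):
        return value.asDict(recursive=True)
    return copy.deepcopy(value)


class FakeRow(tuple):
    """Illustration-only stand-in for pyspark.sql.Row: an immutable tuple
    that exposes asDict(). The `recursive` flag is accepted but ignored."""

    def __new__(cls, **fields):
        row = super().__new__(cls, fields.values())
        row._names = list(fields)
        return row

    def asDict(self, recursive=False):
        return dict(zip(self._names, self))


# The same transform logic now works for both input shapes.
offline_value = FakeRow(string_field="abc")    # what Spark would pass offline
online_value = {"string_field": "abc"}         # what is passed online
for value in (offline_value, online_value):
    struct_field = to_mutable_struct(value)
    struct_field["string_field"] += "_some_suffix"
    print(struct_field)
```

Copying the online dict is a design choice in this sketch: it keeps the helper free of side effects at the cost of one extra copy per call.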

On-Demand Feature Views that Return Struct Features

You can include a Struct data type in the output schema of an On-Demand Feature View (ODFV). A Struct can contain multiple fields with mixed data types.

A Struct can be nested within other complex types. For example, you can have a Struct within a Struct, or an array of Structs.

Using a Struct in the output schema of an ODFV allows you to easily parse the ODFV's output when it contains multiple feature values.
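As a rough illustration of how nested values are shaped, the sketch below builds plain Python values for a hypothetical nested schema. The field names `user`, `address`, `city`, `recent_orders`, and `total` are invented for this example. It assumes, as in the examples on this page, that a python mode ODFV returns a Struct as a `dict` and an Array as a `list`.

```python
# Hypothetical schema this value would correspond to, written with the
# constructors used elsewhere on this page:
#   Field("user", Struct([
#       Field("address", Struct([Field("city", String)])),
#       Field("recent_orders", Array(Struct([Field("total", Float64)]))),
#   ]))


def build_nested_output(city, order_totals):
    """Build a value shaped like a Struct that contains another Struct
    and an array of Structs."""
    return {
        "user": {
            # Struct within a Struct
            "address": {"city": city},
            # Array of Structs
            "recent_orders": [{"total": float(total)} for total in order_totals],
        }
    }


print(build_nested_output("Lisbon", [10, 2.5]))
```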

Example usage: An output Struct containing two fields

The ODFV definition

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Array, Field, Float64, String, Struct

request_source = RequestSource([Field("input_float", Float64)])

output_schema = [
    Field(
        "output_struct",
        Struct([Field("string_field", String), Field("float64_field", Float64)]),
    )
]


@on_demand_feature_view(
    mode="python",
    sources=[request_source],
    schema=output_schema,
    description="Output a struct with two fields.",
)
def simple_struct_example_odfv(request):
    input_float = request["input_float"]
    return {
        "output_struct": {
            "string_field": str(input_float * 2),
            "float64_field": input_float * 2,
        }
    }

Example usage in a notebook

import tecton
import pandas

spine_df = pandas.DataFrame(data={"input_float": [1.23, 3.22]})

simple_struct_example_odfv = tecton.get_workspace("my_workspace").get_feature_view("simple_struct_example_odfv")
simple_struct_example_odfv.get_historical_features(spine_df).to_spark().show(10, False)

Output:

+-----------+-----------------------------------------+
|input_float|simple_struct_example_odfv__output_struct|
+-----------+-----------------------------------------+
|1.23       |{2.46, 2.46}                             |
|3.22       |{6.44, 6.44}                             |
+-----------+-----------------------------------------+

Example HTTP request

$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
    -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "params": {
    "workspace_name": "my_workspace",
    "feature_view_name": "simple_struct_example_odfv",
    "request_context_map": {
      "input_float": 1.23
    },
    "metadata_options": {
      "include_names": true,
      "include_data_types": true
    }
  }
}'

Output:

{
  "result": {
    "features": [["2.46", 2.46]]
  },
  "metadata": {
    "features": [
      {
        "name": "output_struct",
        "dataType": {
          "type": "struct",
          "fields": [
            {
              "name": "string_field",
              "dataType": {
                "type": "string"
              }
            },
            {
              "name": "float64_field",
              "dataType": {
                "type": "float64"
              }
            }
          ]
        }
      }
    ]
  }
}
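The same request can be prepared from Python. The following is a minimal sketch using only the standard library. The helper name `build_get_features_request` and its parameters are assumptions made for this example, while the endpoint path, header format, and payload keys are copied from the curl command above. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request


def build_get_features_request(base_url, api_key, workspace, feature_view, request_context):
    """Build (but do not send) a get-features POST request."""
    payload = {
        "params": {
            "workspace_name": workspace,
            "feature_view_name": feature_view,
            "request_context_map": request_context,
            "metadata_options": {"include_names": True, "include_data_types": True},
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/feature-service/get-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Tecton-key {api_key}"},
        method="POST",
    )


request = build_get_features_request(
    base_url="https://example.tecton.ai",  # placeholder cluster URL
    api_key="my-api-key",  # placeholder; load the real key from a secret store
    workspace="my_workspace",
    feature_view="simple_struct_example_odfv",
    request_context={"input_float": 1.23},
)

# To actually send it (requires a reachable cluster and a valid key):
#   with urllib.request.urlopen(request) as resp:
#       body = json.load(resp)
print(request.get_method(), request.full_url)
```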

Example usage: An output Struct containing an array of Structs with some nulls

The ODFV definition

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Array, Field, Float64, String, Struct

request_source = RequestSource([Field("input_float", Float64)])

output_schema = [
    Field(
        "array_of_structs",
        Array(Struct([Field("string_field", String), Field("float64_field", Float64)])),
    ),
]


@on_demand_feature_view(
    mode="python",
    sources=[request_source],
    schema=output_schema,
    description="Output an array of Structs with some null examples.",
)
def array_of_structs_example_odfv(request):
    input_float = request["input_float"]
    return {
        "array_of_structs": [
            {"string_field": str(input_float * 2), "float64_field": input_float * 2},
            {"string_field": str(input_float * 3), "float64_field": input_float * 3},
            # A Struct that omits one key and sets the other explicitly to None.
            # These are equivalent ways to return a "null" field.
            {
                "string_field": None,
                # "float64_field": ...
            },
            # All Tecton data types are nullable, including Structs.
            None,
        ]
    }

Example usage in a notebook

array_of_structs_example_odfv = tecton.get_workspace("my_workspace").get_feature_view("array_of_structs_example_odfv")
array_of_structs_example_odfv.get_historical_features(spine_df).to_spark().show(10, False)

Output:

+-----------+------------------------------------------------+
|input_float|array_of_structs_example_odfv__array_of_structs |
+-----------+------------------------------------------------+
|1.23       |[{2.46, 2.46}, {3.69, 3.69}, {null, null}, null]|
|3.22       |[{6.44, 6.44}, {9.66, 9.66}, {null, null}, null]|
+-----------+------------------------------------------------+

Example HTTP request

$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
    -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "params": {
    "workspace_name": "my_workspace",
    "feature_view_name": "array_of_structs_example_odfv",
    "request_context_map": {
      "input_float": 1.23
    },
    "metadata_options": {
      "include_names": true,
      "include_data_types": true
    }
  }
}'

Output:

{
  "result": {
    "features": [[["2.46", 2.46], ["3.69", 3.69], [null, null], null]]
  },
  "metadata": {
    "features": [
      {
        "name": "array_of_structs",
        "dataType": {
          "type": "array",
          "elementType": {
            "type": "struct",
            "fields": [
              {
                "name": "string_field",
                "dataType": {
                  "type": "string"
                }
              },
              {
                "name": "float64_field",
                "dataType": {
                  "type": "float64"
                }
              }
            ]
          }
        }
      }
    ]
  }
}

Note that null or missing fields are returned in the JSON response as JSON null, and that there is a difference between a Struct containing all null values and a null Struct. Both are shown in this example.
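Because each Struct in the response is a positional list of values, with field names available only in the metadata, a client usually needs to pair the two. The following is a minimal sketch of such a decoder, assuming the response shape shown in the examples on this page. `decode_value` is a hypothetical helper, not part of any Tecton client, and the `response` literal is a shortened copy of the output above.

```python
def decode_value(value, data_type):
    """Rebuild named Python values from a positional JSON value, guided by
    the `dataType` description returned in the response metadata."""
    if value is None:
        return None  # a null Struct, null array, or null scalar
    kind = data_type["type"]
    if kind == "struct":
        return {
            field["name"]: decode_value(item, field["dataType"])
            for field, item in zip(data_type["fields"], value)
        }
    if kind == "array":
        return [decode_value(item, data_type["elementType"]) for item in value]
    return value  # scalars are passed through unchanged


struct_type = {
    "type": "struct",
    "fields": [
        {"name": "string_field", "dataType": {"type": "string"}},
        {"name": "float64_field", "dataType": {"type": "float64"}},
    ],
}
response = {
    "result": {"features": [[["2.46", 2.46], [None, None], None]]},
    "metadata": {
        "features": [
            {
                "name": "array_of_structs",
                "dataType": {"type": "array", "elementType": struct_type},
            }
        ]
    },
}

decoded = {
    meta["name"]: decode_value(value, meta["dataType"])
    for meta, value in zip(response["metadata"]["features"], response["result"]["features"])
}
print(decoded)
```

In the decoded result, a Struct whose fields are all null becomes a dict of `None` values, while a null Struct stays `None`, which preserves the distinction described above.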
