tecton.interactive.OnDemandFeatureView

class tecton.interactive.OnDemandFeatureView(proto, fco_container)

OnDemandFeatureView class.

To get a FeatureView instance, call tecton.get_feature_view().

Methods

cancel_materialization_job

Cancels the scheduled or running batch materialization job for this Feature View specified by the job identifier.

delete_keys

Deletes any materialized data that matches the specified join keys from the FeatureView.

deletion_status

Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.

get_historical_features

Returns a Tecton TectonDataFrame of historical values for this feature view.

get_materialization_job

Retrieves data about the specified materialization job for this Feature View.

get_online_features

Returns a single Tecton tecton.FeatureVector from the Online Store.

list_materialization_jobs

Retrieves the list of all materialization jobs for this Feature View.

run

Run the OnDemandFeatureView using mock inputs.

summary

Returns various information about this feature definition, including the most critical metadata such as the name, owner, features, etc.

cancel_materialization_job(job_id)

Cancels the scheduled or running batch materialization job for this Feature View specified by the job identifier. Once cancelled, a job will not be retried further.

The job run state will be set to MANUAL_CANCELLATION_REQUESTED. Cancellation is asynchronous, so it may take some time to complete. If the job run is already in MANUAL_CANCELLATION_REQUESTED or in a terminal state, the method simply returns the job.

Parameters

job_id (str) – ID string of the materialization job.

Returns

MaterializationJobData object for the cancelled job.
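
Because cancellation is asynchronous, a caller may want to poll until the job settles. The following is a minimal sketch, not part of the Tecton SDK: the helper name cancel_and_wait is hypothetical, the .state attribute on the returned job object is an assumption, and the set of terminal states must be supplied by the caller because this page does not list them.

```python
import time


def cancel_and_wait(fv, job_id, terminal_states, poll_seconds=5.0, max_polls=60):
    """Request cancellation, then poll until the job reports a terminal state.

    `fv` is any object exposing cancel_materialization_job() and
    get_materialization_job(). The `.state` attribute and the contents of
    `terminal_states` are assumptions; check them against your SDK version.
    """
    job = fv.cancel_materialization_job(job_id)
    for _ in range(max_polls):
        if job.state in terminal_states:
            return job
        time.sleep(poll_seconds)
        job = fv.get_materialization_job(job_id)
    # Gave up waiting; the caller can inspect job.state to decide what to do.
    return job
```

The poll count is bounded so that a job stuck in MANUAL_CANCELLATION_REQUESTED cannot block a notebook indefinitely.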

delete_keys(keys, online=True, offline=True)

Deletes any materialized data that matches the specified join keys from the FeatureView. This method kicks off a job to delete the data in the offline and online stores. If a FeatureView has multiple entities, the full set of join keys must be specified. Only the Delta offline store and the DynamoDB online store are supported (offline_store=DeltaConfig() and online_store left as default). A maximum of 10000 keys can be deleted per request.

Parameters
  • keys (Union[pyspark.sql.DataFrame, pandas.DataFrame]) – DataFrame containing the join keys whose data should be deleted. Must conform to the FeatureView join keys.

  • online (bool) – (Optional, default=True) Whether or not to delete from the online store.

  • offline (bool) – (Optional, default=True) Whether or not to delete from the offline store.

Returns

None if deletion job was created successfully.
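
Since at most 10000 keys can be deleted per request, a larger set of keys has to be split across several calls. The sketch below shows one way to compute the row ranges; key_batches is a hypothetical helper, not an SDK function, and the commented usage assumes keys is a pandas DataFrame.

```python
MAX_KEYS_PER_REQUEST = 10_000  # limit stated in the delete_keys() description


def key_batches(n_rows, batch_size=MAX_KEYS_PER_REQUEST):
    """Yield (start, stop) row ranges that each cover at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)


# Usage with a pandas DataFrame `keys` of join keys (not executed here):
# for start, stop in key_batches(len(keys)):
#     fv.delete_keys(keys.iloc[start:stop])
```

Each call creates its own deletion job, so deletion_status() will list one job per batch.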

deletion_status(verbose=False, limit=1000, sort_columns=None, errors_only=False)

Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.

Parameters
  • verbose – If set to true, method will display additional low level deletion information, useful for debugging.

  • limit – Maximum number of jobs to return.

  • sort_columns – A comma-separated list of column names by which to sort the rows.

  • errors_only – If set to true, method will only return jobs that failed with an error.

get_historical_features(spine, timestamp_key=None, from_source=False, save=False, save_as=None)

Returns a Tecton TectonDataFrame of historical values for this feature view.

Parameters
  • spine (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – The spine to join against, as a dataframe. The returned data frame will contain rollups for all (join key, request data key) combinations that are required to compute a full frame from the spine.

  • timestamp_key (str) – Name of the time column in spine. This method will fetch the latest features computed before the specified timestamps in this column. If unspecified and this feature view has feature view dependencies, timestamp_key will default to the time column of the spine if there is only one present.

  • from_source (bool) – Whether feature values should be recomputed from the original data source. If False, we will read the materialized values from the offline store.

  • save (bool) – Whether to persist the DataFrame as a Dataset object. Default is False.

  • save_as (Optional[str]) – Name to save the DataFrame as. If unspecified and save=True, a name will be generated.

Examples

An OnDemandFeatureView fv that expects request time data for the key amount.

The request time data is defined in the feature definition as such:
request_schema = StructType()
request_schema.add(StructField('amount', DoubleType()))
transaction_request = RequestDataSource(request_schema=request_schema)

1) fv.get_historical_features(spine) where spine=pandas.DataFrame({'amount': [30, 50, 10000]}) Fetch historical features from the offline store with request time data inputs 30, 50, and 10000 for key ‘amount’.

2) fv.get_historical_features(spine, save_as='my_dataset') where spine=pandas.DataFrame({'amount': [30, 50, 10000]}) Fetch historical features from the offline store with request time data inputs 30, 50, and 10000 for key ‘amount’. Save the DataFrame as a dataset with the name ‘my_dataset’.

An OnDemandFeatureView fv that expects request time data for the key amount and has a feature view dependency with join key user_id.

1) fv.get_historical_features(spine) where spine=pandas.DataFrame({'user_id': [1,2,3], 'date_1': [datetime(...), datetime(...), datetime(...)], 'amount': [30, 50, 10000]}) Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps and values for amount in the spine.

Returns

A Tecton TectonDataFrame.

get_materialization_job(job_id)

Retrieves data about the specified materialization job for this Feature View.

This data includes information about job attempts.

Parameters

job_id (str) – ID string of the materialization job.

Returns

MaterializationJobData object for the job.

get_online_features(join_keys=None, include_join_keys_in_response=False, request_data=None)

Returns a single Tecton tecton.FeatureVector from the Online Store. At least one of join_keys or request_data is required.

Parameters
  • join_keys (Optional[Mapping[str, Union[int, int64, str, bytes]]]) – Join keys of the enclosed FeatureViews.

  • include_join_keys_in_response (bool) – Whether to include join keys as part of the response FeatureVector.

  • request_data (Optional[Mapping[str, Union[int, int64, str, bytes, float]]]) –

    Dictionary of request context values used for OnDemandFeatureViews.

    Examples:

    An OnDemandFeatureView fv that expects request time data for the key amount.

    The request time data is defined in the feature definition as such:
    request_schema = StructType()
    request_schema.add(StructField('amount', DoubleType()))
    transaction_request = RequestDataSource(request_schema=request_schema)

    1) fv.get_online_features(request_data={'amount': 50}) Fetch the latest features with input amount=50.

    An OnDemandFeatureView fv that has a feature view dependency with join key user_id and expects request time data for the key amount.

    1) fv.get_online_features(join_keys={'user_id': 1}, request_data={'amount': 50}, include_join_keys_in_response=True) Fetch the latest features from the online store for user 1 with input amount=50. In the returned FeatureVector, include the join key information (user_id=1).

Returns

A tecton.FeatureVector of the results.
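
The requirement that at least one of join_keys or request_data is given can be checked before the request is sent. This is a minimal sketch: fetch_online is a hypothetical wrapper, not an SDK function, and it only forwards the three documented keyword arguments to any object that exposes get_online_features().

```python
def fetch_online(fv, join_keys=None, request_data=None, include_join_keys=False):
    """Call fv.get_online_features() after checking the documented precondition."""
    if join_keys is None and request_data is None:
        raise ValueError("provide at least one of join_keys or request_data")
    return fv.get_online_features(
        join_keys=join_keys,
        include_join_keys_in_response=include_join_keys,
        request_data=request_data,
    )


# Usage (not executed here):
# vector = fetch_online(fv, join_keys={'user_id': 1}, request_data={'amount': 50})
```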

list_materialization_jobs()

Retrieves the list of all materialization jobs for this Feature View.

Returns

List of MaterializationJobData objects.
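
The returned list can be summarized to get a quick overview of a Feature View's materialization history. The sketch below counts jobs per state; count_jobs_by_state is a hypothetical helper, and the .state attribute on each job object is an assumption that should be checked against the MaterializationJobData reference for your SDK version.

```python
from collections import Counter


def count_jobs_by_state(fv):
    """Return a Counter mapping each job state to the number of jobs in it.

    Assumes each MaterializationJobData object exposes a hashable `.state`
    attribute; that attribute name is an assumption, not taken from this page.
    """
    return Counter(job.state for job in fv.list_materialization_jobs())


# Usage (not executed here):
# print(count_jobs_by_state(fv))
```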

run(**mock_inputs)

Run the OnDemandFeatureView using mock inputs.

Parameters

**mock_inputs – Required. Keyword args with the same expected keys as the OnDemandFeatureView’s inputs parameters. For the “python” mode, each input must be a dictionary representing a single row. For the “pandas” mode, each input must be a DataFrame, and all DataFrames must contain the same number of rows with matching row ordering.

Example:

# Given a python on-demand feature view defined in your workspace:
@on_demand_feature_view(
    sources=[transaction_request, user_transaction_amount_metrics],
    mode='python',
    schema=output_schema,
    description='The transaction amount is higher than the 1 day average.'
)
def transaction_amount_is_higher_than_average(request, user_metrics):
    return {'higher_than_average': request['amt'] > user_metrics['daily_average']}
# Retrieve and run the feature view in a notebook using mock data:
import tecton

fv = tecton.get_workspace('prod').get_feature_view('transaction_amount_is_higher_than_average')

result = fv.run(request={'amt': 100}, user_metrics={'daily_average': 1000})

print(result)
# {'higher_than_average': False}

Returns

A Dict object for the “python” mode and a Tecton DataFrame of the results for the “pandas” mode.
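
In “python” mode each mock input is a plain dictionary, so the transformation logic from the example above can also be exercised as an ordinary function, for instance in a unit test that has no access to a Tecton workspace. The sketch below repeats the example's function body without the @on_demand_feature_view decorator; it illustrates the input and output shapes only and does not go through fv.run().

```python
def transaction_amount_is_higher_than_average(request, user_metrics):
    # Same body as the documented example, without the Tecton decorator.
    return {'higher_than_average': request['amt'] > user_metrics['daily_average']}


# One dictionary per input, each representing a single row.
result = transaction_amount_is_higher_than_average(
    request={'amt': 100},
    user_metrics={'daily_average': 1000},
)
print(result)  # {'higher_than_average': False}
```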

summary()

Returns various information about this feature definition, including the most critical metadata such as the name, owner, features, etc.

Attributes

created_at

Returns the creation date of this Tecton Object.

data_source_names

Returns the names of the data sources for this Feature View.

defined_in

Returns filename where this Tecton Object has been declared.

description

The description of this Tecton Object, set by user.

entity_names

Returns the names of entities for this Feature View.

family

Deprecated.

feature_start_time

This represents the time at which features are first available.

features

Returns the names of the (output) features.

id

Returns the ID of this object.

is_on_demand

Deprecated.

is_temporal

Deprecated.

is_temporal_aggregate

Deprecated.

join_keys

Returns the join key column names.

name

The name of this Tecton Object.

online_serving_index

Returns the set of join keys that will be indexed and queryable during online serving.

owner

The owner of this Tecton Object (typically the email of the primary maintainer).

tags

Tags associated with this Tecton Object (key-value pairs of arbitrary metadata set by user).

url

Returns a link to the Tecton Web UI.

wildcard_join_key

Returns the wildcard join key column name if one exists; otherwise returns None.

workspace

Returns the workspace this Tecton Object was created in.