tecton.interactive.FeatureTable

class tecton.interactive.FeatureTable(proto, fco_container)

FeatureTable class.

To get a FeatureTable instance, call tecton.get_feature_table().

Methods

delete_keys

Deletes any materialized data that matches the specified join keys from the FeatureTable.

deletion_status

Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.

get_historical_features

Returns a Tecton TectonDataFrame of historical values for this feature table.

get_online_features

Returns a single Tecton FeatureVector from the Online Store.

ingest

Ingests a DataFrame into the FeatureTable.

materialization_status

Displays materialization information for the FeatureTable, which may include past jobs, scheduled jobs, and job failures.

summary

Returns various information about this feature definition, including the most critical metadata such as the name, owner, features, etc.

delete_keys(keys, online=True, offline=True)

Deletes any materialized data that matches the specified join keys from the FeatureTable. This method starts a job that deletes the data from the offline and online stores. If the FeatureTable has multiple entities, the full set of join keys must be specified. Only the Dynamo online store is supported. A maximum of 10,000 keys can be deleted per request.

Parameters
  • keys (Union[pyspark.sql.DataFrame, pandas.DataFrame]) – A DataFrame of the join keys to delete. Must conform to the FeatureTable join keys.

  • online (bool) – (Optional, default=True) Whether or not to delete from the online store.

  • offline (bool) – (Optional, default=True) Whether or not to delete from the offline store.

Returns

None if deletion job was created successfully.
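Because a single request is limited to 10,000 keys, a larger key set has to be spread over several delete_keys() calls. The following is a minimal sketch of that batching, assuming the keys are held as a list of dicts. The helper name batch_keys and the sample data are hypothetical and not part of the Tecton SDK.

```python
from typing import Dict, Iterator, List

MAX_KEYS_PER_REQUEST = 10_000  # limit stated in the delete_keys() description


def batch_keys(rows: List[Dict], batch_size: int = MAX_KEYS_PER_REQUEST) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `rows`, each holding at most `batch_size` keys."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# 25,000 join keys for a table keyed on user_id (illustrative data).
rows = [{"user_id": i} for i in range(25_000)]
batches = list(batch_keys(rows))
print([len(b) for b in batches])  # [10000, 10000, 5000]

# Hypothetical usage, assuming `ft` is a FeatureTable and pandas is installed:
# for batch in batches:
#     ft.delete_keys(pandas.DataFrame(batch))
```

Each batch starts its own deletion job, so deletion_status() may be worth checking after the loop.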

deletion_status(verbose=False, limit=1000, sort_columns=None, errors_only=False)

Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.

Parameters
  • verbose – If set to true, the method displays additional low-level deletion information, useful for debugging.

  • limit – Maximum number of jobs to return.

  • sort_columns – A comma-separated list of column names by which to sort the rows.

  • errors_only – If set to true, the method only returns jobs that failed with an error.

get_historical_features(spine=None, timestamp_key=None, entities=None, start_time=None, end_time=None, save=False, save_as=None)

Returns a Tecton TectonDataFrame of historical values for this feature table. If no arguments are passed in, all feature values for this feature table will be returned in a TectonDataFrame. Note: The timestamp_key parameter is only applicable when a spine is passed in. Parameters start_time, end_time, and entities are only applicable when a spine is not passed in.

Parameters
  • spine (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – The spine to join against, as a DataFrame. If present, the returned DataFrame will contain rollups for all (join key, temporal key) combinations that are required to compute a full frame from the spine. To distinguish between spine columns and feature columns, feature columns are labeled as feature_view_name.feature_name in the returned DataFrame. If spine is not specified, the method returns a DataFrame of feature values in the specified time range.

  • timestamp_key (str) – Name of the time column in spine. This method will fetch the latest features computed before the specified timestamps in this column. If unspecified, will default to the time column of the spine if there is only one present.

  • entities (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – A DataFrame that is used to filter down feature values. If specified, this DataFrame should only contain join key columns.

  • start_time (Union[pendulum.DateTime, datetime.datetime]) – The interval start time from when we want to retrieve features. If no timezone is specified, will default to using UTC.

  • end_time (Union[pendulum.DateTime, datetime.datetime]) – The interval end time until when we want to retrieve features. If no timezone is specified, will default to using UTC.

  • save (bool) – Whether to persist the DataFrame as a Dataset object. Default is False.

  • save_as (str) – Name under which to save the DataFrame. If unspecified and save=True, a name will be generated.
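To make the timestamp_key behaviour concrete, the following pure-Python sketch looks up, for a single join key, the latest feature value recorded before a spine timestamp. It only illustrates the lookup rule described above and is not how Tecton implements the join. The helper name latest_before and the sample data are hypothetical, and reading "before" as strictly before is an assumption.

```python
from bisect import bisect_left
from datetime import datetime


def latest_before(history, ts):
    """Return the value of the latest (timestamp, value) pair strictly before `ts`.

    `history` must be sorted by timestamp in ascending order.
    Returns None when no entry precedes `ts`.
    """
    times = [t for t, _ in history]
    i = bisect_left(times, ts)  # number of entries with timestamp < ts
    return history[i - 1][1] if i > 0 else None


# Feature values for one user, ordered by the time they were recorded.
history = [
    (datetime(2023, 1, 1), 10),
    (datetime(2023, 1, 5), 20),
]

print(latest_before(history, datetime(2023, 1, 3)))  # 10
print(latest_before(history, datetime(2023, 1, 6)))  # 20
print(latest_before(history, datetime(2023, 1, 1)))  # None
```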

Examples

A FeatureTable ft with join key user_id.

1) ft.get_historical_features(spine) where spine=pandas.DataFrame({'user_id': [1,2,3], 'date': [datetime(...), datetime(...), datetime(...)]}) Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the spine.

2) ft.get_historical_features(spine, save_as='my_dataset') where spine=pandas.DataFrame({'user_id': [1,2,3], 'date': [datetime(...), datetime(...), datetime(...)]}) Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the spine. Save the DataFrame as a dataset named my_dataset.

3) ft.get_historical_features(spine, timestamp_key='date_1') where spine=pandas.DataFrame({'user_id': [1,2,3], 'date_1': [datetime(...), datetime(...), datetime(...)], 'date_2': [datetime(...), datetime(...), datetime(...)]}) Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the 'date_1' column in the spine.

4) ft.get_historical_features(start_time=datetime(...), end_time=datetime(...)) Fetch all historical features from the offline store in the time range specified by start_time and end_time.

Returns

A TectonDataFrame with feature values.
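The note on which parameters apply with and without a spine can be checked on the caller's side before the call is made. The sketch below is one possible client-side guard, not SDK behaviour: whether Tecton itself raises on these combinations is not stated here. The helper names check_historical_args and as_utc are hypothetical. as_utc makes the documented UTC default explicit for naive datetimes.

```python
from datetime import datetime, timezone


def check_historical_args(spine=None, timestamp_key=None, entities=None,
                          start_time=None, end_time=None):
    """Raise ValueError for combinations the note describes as not applicable."""
    if spine is None and timestamp_key is not None:
        raise ValueError("timestamp_key only applies when a spine is passed")
    if spine is not None:
        given = {"entities": entities, "start_time": start_time, "end_time": end_time}
        extra = [name for name, value in given.items() if value is not None]
        if extra:
            raise ValueError(f"not applicable with a spine: {', '.join(extra)}")


def as_utc(dt: datetime) -> datetime:
    """Attach UTC to a naive datetime; leave timezone-aware values unchanged."""
    return dt.replace(tzinfo=timezone.utc) if dt.tzinfo is None else dt


# A time-range query without a spine passes the check.
check_historical_args(start_time=as_utc(datetime(2023, 1, 1)),
                      end_time=as_utc(datetime(2023, 2, 1)))

# Mixing a spine with start_time is flagged (object() stands in for a DataFrame).
try:
    check_historical_args(spine=object(), start_time=datetime(2023, 1, 1))
except ValueError as err:
    print(err)  # not applicable with a spine: start_time
```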

get_online_features(join_keys, include_join_keys_in_response=False)

Returns a single Tecton FeatureVector from the Online Store.

Parameters
  • join_keys (Mapping[str, Union[int, int64, str, bytes]]) – Join keys of the enclosed FeatureTable.

  • include_join_keys_in_response (bool) – Whether to include join keys as part of the response FeatureVector.

Examples

A FeatureTable ft with join key user_id.

1) ft.get_online_features(join_keys={'user_id': 1}) Fetch the latest features from the online store for user 1.

2) ft.get_online_features(join_keys={'user_id': 1}, include_join_keys_in_response=True) Fetch the latest features from the online store for user 1 and include the join key information (user_id=1) in the returned FeatureVector.

Returns

A FeatureVector of the results.
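Since join_keys is typed as a mapping from str to int, int64, str, or bytes, a small pre-flight check can catch mistyped values before a request is sent. This is a sketch only: validate_join_keys is a hypothetical helper, not part of the SDK. numbers.Integral is used so that integer types registered with that ABC are accepted alongside int, which is expected to cover numpy's int64.

```python
from collections.abc import Mapping
from numbers import Integral


def validate_join_keys(join_keys) -> None:
    """Raise TypeError unless join_keys maps str names to integer, str, or bytes values."""
    if not isinstance(join_keys, Mapping):
        raise TypeError("join_keys must be a mapping such as a dict")
    for name, value in join_keys.items():
        if not isinstance(name, str):
            raise TypeError(f"join key name {name!r} is not a str")
        if not isinstance(value, (Integral, str, bytes)):
            raise TypeError(f"value for {name!r} has unsupported type {type(value).__name__}")


validate_join_keys({"user_id": 1})  # passes silently

try:
    validate_join_keys({"user_id": 1.5})  # float is not in the documented Union
except TypeError as err:
    print(err)  # value for 'user_id' has unsupported type float
```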

ingest(df)

Ingests a DataFrame into the FeatureTable. This method starts a materialization job that writes the data to the offline store, the online store, or both, depending on the FeatureTable configuration.

Parameters

  • df (Union[pyspark.sql.DataFrame, pandas.DataFrame]) – The DataFrame to be ingested. Must conform to the FeatureTable schema.
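Because the DataFrame has to conform to the FeatureTable schema, it can help to compare column names before calling ingest(). The sketch below assumes the schema consists of the join keys, the timestamp field, and the feature columns; that composition is an assumption, not something this page states. The helper name missing_columns and the sample values are hypothetical.

```python
from typing import Iterable, List


def missing_columns(columns: Iterable[str], join_keys: Iterable[str],
                    timestamp_field: str, features: Iterable[str]) -> List[str]:
    """Return the expected column names that are absent from `columns`, in schema order."""
    present = set(columns)
    expected = [*join_keys, timestamp_field, *features]
    return [name for name in expected if name not in present]


# Illustrative values; on a real table these might come from ft.join_keys,
# ft.timestamp_field, and ft.features (see Attributes).
missing = missing_columns(
    columns=["user_id", "timestamp", "clicks"],
    join_keys=["user_id"],
    timestamp_field="timestamp",
    features=["clicks", "purchases"],
)
print(missing)  # ['purchases']
```

An empty result only means the column names line up; column types would need a separate check.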

materialization_status(verbose=False, limit=1000, sort_columns=None, errors_only=False)

Displays materialization information for the FeatureTable, which may include past jobs, scheduled jobs, and job failures.

Parameters
  • verbose – If set to true, the method displays additional low-level materialization information, useful for debugging.

  • limit – Maximum number of jobs to return.

  • sort_columns – A comma-separated list of column names by which to sort the rows.

  • errors_only – If set to true, the method only returns jobs that failed with an error.

summary()

Returns various information about this feature definition, including the most critical metadata such as the name, owner, features, etc.

Attributes

created_at

Returns the creation date of this Tecton Object.

data_source_names

Returns the names of the data sources for this Feature View.

defined_in

Returns the filename where this Tecton Object has been declared.

description

The description of this Tecton Object, set by user.

entity_names

Returns the names of entities for this Feature View.

family

Deprecated.

features

Returns the names of the (output) features.

id

Returns the id of this object.

join_keys

Returns the join key column names.

name

The name of this Tecton Object.

online_serving_index

Returns the set of join keys that will be indexed and queryable during online serving.

owner

The owner of this Tecton Object (typically the email of the primary maintainer).

tags

Tags associated with this Tecton Object (key-value pairs of arbitrary metadata set by user).

timestamp_field

Returns the timestamp_field of this FeatureView.

url

Returns a link to the Tecton Web UI.

wildcard_join_key

Returns a wildcard join key column name if it exists; otherwise returns None.

workspace

Returns the workspace this Tecton Object was created in.