tecton.FeatureTable

class tecton.FeatureTable(*, name, entities, schema, ttl, online=False, offline=False, description=None, owner=None, family=None, tags=None, offline_config=DeltaConfig(time_partition_size='24h'), online_config=None, batch_cluster_config=None, online_serving_index=None)

Declare a FeatureTable.

The FeatureTable class represents one or more features that are pushed to Tecton from external feature computation systems.

Methods

__init__

Instantiates a new FeatureTable.

__init__(*, name, entities, schema, ttl, online=False, offline=False, description=None, owner=None, family=None, tags=None, offline_config=DeltaConfig(time_partition_size='24h'), online_config=None, batch_cluster_config=None, online_serving_index=None)

Parameters
  • name (str) – Unique, human friendly name that identifies the FeatureTable.

  • entities (List[Union[Entity, OverriddenEntity]]) – A list of Entity objects, used to organize features.

  • schema (StructType) – A Spark schema definition (StructType) for the FeatureTable. Supported types are: LongType, DoubleType, StringType, BooleanType, and TimestampType (for the inferred timestamp column only).

  • ttl (str) – The TTL (or “look back window”) for features defined by this feature table. This parameter determines how long features will live in the online store and how far to “look back” relative to a training example’s timestamp when generating offline training sets. Shorter TTLs improve performance and reduce costs.

  • online (Optional[bool]) – Enable writing to the online feature store. (Default: False)

  • offline (Optional[bool]) – Enable writing to the offline feature store. (Default: False)

  • description (Optional[str]) – Human readable description.

  • owner (Optional[str]) – Owner name (typically the email of the primary maintainer).

  • family (Optional[str]) – Family of this Feature Table, used to group Tecton Objects.

  • tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).

  • offline_config (DeltaConfig) – Configuration for how data is written to the offline feature store.

  • online_config (Union[DynamoConfig, RedisConfig, None]) – Configuration for how data is written to the online feature store.

  • batch_cluster_config (Union[ExistingClusterConfig, DatabricksClusterConfig, EMRClusterConfig, None]) – Batch materialization cluster configuration. Must be an EMRClusterConfig, a DatabricksClusterConfig, or an ExistingClusterConfig.

  • online_serving_index (Optional[List[str]]) – (Advanced) Defines the set of join keys that will be indexed and queryable during online serving. Defaults to the complete set of join keys. Up to one join key may be omitted. If one key is omitted, online requests to a Feature Service will return all feature vectors that match the specified join keys.
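To make the online_serving_index behavior concrete, here is a minimal, purely illustrative sketch in plain Python. It does not use the Tecton SDK: the in-memory rows, the lookup helper, and the user_id / merchant_id join keys are all hypothetical, and the code only models the semantics described above, namely that when one join key is left out of the index, a request on the remaining keys returns every matching feature vector.

```python
from typing import Dict, List

# Hypothetical feature rows keyed by two join keys (not Tecton's storage format).
ROWS: List[Dict[str, object]] = [
    {"user_id": "u1", "merchant_id": "m1", "txn_count_7d": 3},
    {"user_id": "u1", "merchant_id": "m2", "txn_count_7d": 5},
    {"user_id": "u2", "merchant_id": "m1", "txn_count_7d": 1},
]


def lookup(
    rows: List[Dict[str, object]],
    serving_index: List[str],
    request_keys: Dict[str, object],
) -> List[Dict[str, object]]:
    """Return every row whose indexed join keys match the request (illustrative only)."""
    missing = [key for key in serving_index if key not in request_keys]
    if missing:
        raise KeyError(f"request is missing indexed join keys: {missing}")
    return [
        row for row in rows
        if all(row[key] == request_keys[key] for key in serving_index)
    ]


# Full index (the default): the request names every join key.
full = lookup(ROWS, ["user_id", "merchant_id"], {"user_id": "u1", "merchant_id": "m2"})

# Index that omits merchant_id: the request names only user_id and
# gets back all feature vectors stored for that user.
partial = lookup(ROWS, ["user_id"], {"user_id": "u1"})
```

In this toy model, full holds a single row and partial holds both rows for user u1. How a real Feature Service shapes such a multi-row response is not covered here.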

Returns

A FeatureTable.

An example declaration of a FeatureTable:

from pyspark.sql.types import StructType, StructField, LongType, StringType, TimestampType
from tecton import Entity, FeatureTable

# Declare your user Entity instance here or import it if defined elsewhere in
# your Tecton repo.
user = ...

# Schema for your feature table
schema = StructType([
    StructField('user_id', StringType()),
    StructField('timestamp', TimestampType()),
    StructField('user_login_count_7d', LongType()),
    StructField('user_login_count_30d', LongType())
])

user_login_counts = FeatureTable(
    name='user_login_counts',
    entities=[user],
    schema=schema,
    online=True,
    offline=True,
    ttl='30day'
)
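The declaration above sets ttl='30day'. As a rough mental model of the "look back window" described for the ttl parameter, the sketch below uses only the Python standard library to pick the feature value a training example would see. The latest_within_ttl helper is hypothetical and is not part of the Tecton SDK. The boundary handling (a value exactly ttl old still counts) is an assumption made for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def latest_within_ttl(
    feature_rows: List[Tuple[datetime, int]],
    example_time: datetime,
    ttl: timedelta,
) -> Optional[int]:
    """Return the newest value at or before example_time that is no older than ttl."""
    candidates = [
        (ts, value)
        for ts, value in feature_rows
        if example_time - ttl <= ts <= example_time
    ]
    if not candidates:
        return None  # nothing falls inside the look back window
    return max(candidates, key=lambda row: row[0])[1]


# Hypothetical (timestamp, user_login_count_7d) rows for one user.
rows = [
    (datetime(2021, 1, 1), 4),
    (datetime(2021, 2, 20), 9),
]
ttl = timedelta(days=30)

# The Feb 20 row is 9 days old on Mar 1, so it is inside the window.
fresh = latest_within_ttl(rows, datetime(2021, 3, 1), ttl)

# On Apr 15 the newest row is 54 days old, so no value is returned.
stale = latest_within_ttl(rows, datetime(2021, 4, 15), ttl)
```

Under these assumptions a shorter ttl narrows the window, so fewer historical rows are candidates for each training example.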

Attributes

name

Name of this Tecton Object.