tecton.BatchDataSource

class tecton.BatchDataSource(*, name, description='', family='', tags=None, owner='', batch_ds_config)

Declare a BatchDataSource, used to read batch data into Tecton.

BatchFeatureViews and BatchWindowAggregateFeatureViews ingest data from a BatchDataSource.

Methods

__init__

Creates a new BatchDataSource

__init__(*, name, description='', family='', tags=None, owner='', batch_ds_config)

Creates a new BatchDataSource

Parameters
  • name (str) – A unique name for the DataSource.

  • description (str) – (Optional) Description.

  • family (str) – (Optional) Family of this DataSource, used to group Tecton Objects.

  • tags (Optional[Dict[str, str]]) – (Optional) Tags associated with this Tecton Object (key-value pairs of arbitrary metadata).

  • owner (str) – (Optional) Owner name (typically the email of the primary maintainer).

  • batch_ds_config (Union[FileDSConfig, HiveDSConfig, RedshiftDSConfig, SnowflakeDSConfig]) – BatchDSConfig object containing the configuration of the batch data source to be included in this DataSource.

Returns

A BatchDataSource class instance.

Example of a BatchDataSource declaration:

from tecton import BatchDataSource, HiveDSConfig

# Declare a BatchDataSource with a HiveDSConfig instance as its batch_ds_config parameter.
# Refer to the Configs API documentation for other batch_ds_config types.
credit_scores_batch = BatchDataSource(name='credit_scores_batch',
                                      batch_ds_config=HiveDSConfig(
                                            database='demo_fraud',
                                            table='credit_scores',
                                            timestamp_column_name='timestamp'),
                                      family='fraud_detection',
                                      owner='matt@tecton.ai',
                                      tags={'release': 'staging',
                                            'source': 'nexus'})
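
The leading * in the signature means every argument must be passed by keyword, and batch_ds_config has no default, so it is always required. The sketch below illustrates that calling convention with a hypothetical stand-in function, declare_source, that only copies the parameter list. It is not the Tecton class, and the string 'placeholder-config' stands in for a real config object.

```python
# Hypothetical stand-in that copies the BatchDataSource parameter list.
# It is NOT the Tecton class; it only illustrates keyword-only arguments.
def declare_source(*, name, description='', family='', tags=None, owner='',
                   batch_ds_config):
    return {
        'name': name,
        'description': description,
        'family': family,
        'tags': tags or {},
        'owner': owner,
        'batch_ds_config': batch_ds_config,
    }

# Keyword arguments are accepted in any order.
source = declare_source(batch_ds_config='placeholder-config',
                        name='credit_scores_batch')

# Positional arguments are rejected with a TypeError.
try:
    declare_source('credit_scores_batch', 'placeholder-config')
    positional_rejected = False
except TypeError:
    positional_rejected = True

# batch_ds_config has no default, so omitting it also raises a TypeError.
try:
    declare_source(name='credit_scores_batch')
    missing_config_rejected = False
except TypeError:
    missing_config_rejected = True
```

The same rules should apply to BatchDataSource itself, given the signature shown above: pass name and batch_ds_config by keyword, and add the optional fields as needed.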

Attributes

name

The name of this DataSource.

timestamp_key

The name of the timestamp column or key of this DataSource.