tecton.SnowflakeDSConfig

class tecton.SnowflakeDSConfig(url, database, schema, warehouse, role=None, table=None, query=None, timestamp_key=None, raw_batch_translator=None)

Configuration used to reference a Snowflake table or query.

The SnowflakeDSConfig class is used to create a reference to a Snowflake table. You can also create a reference to a query on one or more tables, which is registered in Tecton much as a view is registered in other data systems.

This class is used as an input to the batch_ds_config parameter of a BatchDataSource. This class is not a Tecton Object: it is a grouping of parameters. Declaring this class alone will not register a data source. Instead, declare it as part of a BatchDataSource that takes an instance of this configuration class as a parameter.

Methods

__init__

Instantiates a new SnowflakeDSConfig.

__init__(url, database, schema, warehouse, role=None, table=None, query=None, timestamp_key=None, raw_batch_translator=None)

Instantiates a new SnowflakeDSConfig. Exactly one of table and query should be specified when creating this object.

Parameters
  • url (str) – The connection URL to Snowflake, which contains account information (e.g. https://xy12345.eu-west-1.snowflakecomputing.com).

  • database (str) – The Snowflake database for this Data source.

  • schema (str) – The Snowflake schema for this Data source.

  • warehouse (str) – The Snowflake warehouse for this Data source.

  • role (Optional[str]) – (Optional) The Snowflake role that should be used for this Data source.

  • table (Optional[str]) – The table for this Data source. Exactly one of table and query must be specified.

  • query (Optional[str]) – The query for this Data source. Exactly one of table and query must be specified.

  • raw_batch_translator – A Python user-defined function f(DataFrame) -> DataFrame that takes the raw PySpark DataFrame read from the data source and translates it into the DataFrame to be consumed by the Feature View. See an example of raw_batch_translator in the User Guide.

  • timestamp_key (Optional[str]) – (Optional) The name of the timestamp column (after the raw_batch_translator has been applied). The column name does not need to be specified if there is exactly one timestamp column after the translator is applied. This is needed for efficient time filtering when materializing batch features.
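
The f(DataFrame) -> DataFrame contract described for raw_batch_translator can be sketched as follows. This is a minimal illustration, not the example from the User Guide: the function name rename_timestamp_column and the column names timestamp and ts are hypothetical, and Tecton may place further requirements on translator functions that are not covered here. The sketch relies only on PySpark's DataFrame.withColumnRenamed, which returns a new DataFrame.

```python
def rename_timestamp_column(df):
    """Hypothetical translator: rename the raw `timestamp` column to `ts`.

    `df` is expected to be a PySpark DataFrame. `withColumnRenamed` returns
    a new DataFrame and leaves the input DataFrame unchanged.
    """
    return df.withColumnRenamed("timestamp", "ts")


# Assumed usage (not executed here; it needs a Tecton environment):
# SnowflakeDSConfig(..., table="CLICK_STREAM_FEATURES",
#                   raw_batch_translator=rename_timestamp_column,
#                   timestamp_key="ts")
```

Because timestamp_key refers to the column name after the translator has run, the assumed usage passes "ts" rather than the raw column name.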

Returns

A SnowflakeDSConfig class instance.

Example of a SnowflakeDSConfig declaration:

from tecton import SnowflakeDSConfig, BatchDataSource

# Declare a SnowflakeDSConfig instance to pass as an argument to BatchDataSource.
# Exactly one of `table` and `query` is given; this example uses `query`.
snowflake_ds_config = SnowflakeDSConfig(
    url="https://<your-cluster>.eu-west-1.snowflakecomputing.com/",
    database="CLICK_STREAM_DB",
    schema="CLICK_STREAM_SCHEMA",
    warehouse="COMPUTE_WH",
    # Adjacent string literals are concatenated as-is, so the first fragment
    # ends with a space to keep "click_rate" and "FROM" apart.
    query="SELECT timestamp as ts, created, user_id, clicks, click_rate "
          "FROM CLICK_STREAM_DB.CLICK_STREAM_FEATURES",
)

# Use in the BatchDataSource
snowflake_ds = BatchDataSource(name="click_stream_snowflake_ds",
                               batch_ds_config=snowflake_ds_config)
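
Since table and query are mutually exclusive, a small pre-flight check in your own code can catch a configuration that sets both or neither before the config is built. The sketch below is plain Python and does not call Tecton; check_table_or_query is a hypothetical helper, and whether SnowflakeDSConfig performs a similar check itself is not stated in this reference.

```python
from typing import Optional


def check_table_or_query(table: Optional[str] = None,
                         query: Optional[str] = None) -> str:
    """Return "table" or "query", whichever one was supplied.

    Hypothetical helper, not part of Tecton. Raises ValueError when both
    arguments are given or when neither is given.
    """
    if (table is None) == (query is None):
        raise ValueError("Specify exactly one of 'table' and 'query'.")
    return "table" if table is not None else "query"


# Only `query` is supplied here, so the check passes.
source_kind = check_table_or_query(
    query="SELECT user_id, clicks FROM CLICK_STREAM_FEATURES"
)
```

The helper compares the two "is None" results so that the both-set and neither-set cases are rejected by a single condition.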