SnowflakeDSConfig(url, database, schema, warehouse, role=None, table=None, query=None, timestamp_key=None, raw_batch_translator=None)
Configuration used to reference a Snowflake table or query.
The SnowflakeDSConfig class is used to create a reference to a Snowflake table. You can also create a reference to a query on one or more tables, which is registered in Tecton much as a view is registered in other data systems.
This class is used as an input to the batch_ds_config parameter of a BatchDataSource. This class is not a Tecton Object: it is a grouping of parameters. Declaring this class alone will not register a data source. Instead, declare it as part of a BatchDataSource that takes this configuration class instance as a parameter.
__init__(url, database, schema, warehouse, role=None, table=None, query=None, timestamp_key=None, raw_batch_translator=None)
Instantiates a new SnowflakeDSConfig. Specify one of table and query when creating this configuration.
database (str) – The Snowflake database for this Data source.
schema (str) – The Snowflake schema for this Data source.
warehouse (str) – The Snowflake warehouse for this Data source.
raw_batch_translator – Python user-defined function f(DataFrame) -> DataFrame that takes in the raw PySpark data source DataFrame and translates it to the DataFrame to be consumed by the Feature View. See an example of raw_batch_translator in the User Guide.
timestamp_key (Optional[str]) – (Optional) The name of the timestamp column (after the raw_batch_translator has been applied). The column name does not need to be specified if there is exactly one timestamp column after the translator is applied. This is needed for efficient time filtering when materializing batch features.
Returns: A SnowflakeDSConfig class instance.
Example of a SnowflakeDSConfig declaration:
    from tecton import SnowflakeDSConfig, BatchDataSource

    # Declare a SnowflakeDSConfig instance that can be used as an argument in BatchDataSource
    snowflake_ds_config = SnowflakeDSConfig(
        url="https://<your-cluster>.eu-west-1.snowflakecomputing.com/",
        database="CLICK_STREAM_DB",
        schema="CLICK_STREAM_SCHEMA",
        warehouse="COMPUTE_WH",
        # One of table and query should be specified; this example uses query.
        # table="CLICK_STREAM_FEATURES",
        # Adjacent string literals are concatenated, so the first one must end with a space.
        query="SELECT timestamp as ts, created, user_id, clicks, click_rate "
              "FROM CLICK_STREAM_DB.CLICK_STREAM_FEATURES")

    # Use in the BatchDataSource
    snowflake_ds = BatchDataSource(name="click_stream_snowflake_ds", batch_ds_config=snowflake_ds_config)
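The raw_batch_translator and timestamp_key parameters described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Tecton implementation: the function name translate_click_stream, the snowflake_kwargs dictionary, and the column names timestamp and ts are hypothetical and chosen for this example only. The translator relies solely on PySpark's DataFrame.withColumnRenamed, and the SnowflakeDSConfig call is left as a comment because it needs the tecton package to be installed.

```python
def translate_click_stream(df):
    """Hypothetical raw_batch_translator: raw PySpark DataFrame in, DataFrame out.

    Renames the raw `timestamp` column to `ts` so that it matches `timestamp_key`.
    """
    # withColumnRenamed returns a new DataFrame; the input is not modified.
    return df.withColumnRenamed("timestamp", "ts")


# Keyword arguments for SnowflakeDSConfig, collected in a plain dict so this
# sketch runs without the tecton package installed. All values are placeholders.
snowflake_kwargs = dict(
    url="https://<your-cluster>.eu-west-1.snowflakecomputing.com/",
    database="CLICK_STREAM_DB",
    schema="CLICK_STREAM_SCHEMA",
    warehouse="COMPUTE_WH",
    table="CLICK_STREAM_FEATURES",
    raw_batch_translator=translate_click_stream,
    # Name of the timestamp column *after* the translator has been applied.
    timestamp_key="ts",
)

# With tecton installed, the dict would be unpacked into the config class:
# from tecton import SnowflakeDSConfig
# snowflake_ds_config = SnowflakeDSConfig(**snowflake_kwargs)
```

Setting timestamp_key explicitly is optional when the translated DataFrame has exactly one timestamp column, as the parameter description notes; naming it here simply makes the dependency on the translator's output visible.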