This feature is currently in Private Preview.
Connect to Snowflake
To run feature pipelines based on data in Snowflake, Tecton needs to be configured with access to your Snowflake account. The following guide shows how to configure these permissions and validate that Tecton is able to connect to your data source.
Prerequisites
To set up Tecton to use a data source on Snowflake, you need the following:
- The URL for your Snowflake account.
- The name of the virtual warehouse Tecton will use for querying data from Snowflake.
- A Snowflake username and password. We recommend creating a new Snowflake user that gives Tecton read-only access. This user must have access to the warehouse named above. See the Snowflake documentation for how to configure this access.
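Before entering these values in Tecton, it can help to sanity-check them locally. The following is a minimal sketch and is not part of the Tecton SDK: the helper name `check_snowflake_prerequisites` is hypothetical, and the check assumes your account uses the standard `snowflakecomputing.com` domain, as in the example URL later in this guide. It does not contact Snowflake or verify permissions.

```python
from typing import List
from urllib.parse import urlparse


def check_snowflake_prerequisites(url: str, warehouse: str, user: str) -> List[str]:
    """Return a list of problems found; an empty list means the values look plausible.

    Hypothetical helper for illustration only. It inspects the strings locally
    and never connects to Snowflake.
    """
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("url should start with https://")
    host = parsed.hostname or ""
    # Assumption: the account lives on the standard Snowflake domain.
    if not host.endswith(".snowflakecomputing.com"):
        problems.append("url host should end with .snowflakecomputing.com")
    if not warehouse.strip():
        problems.append("warehouse name is empty")
    if not user.strip():
        problems.append("username is empty")
    return problems


# Example with made-up values; replace them with your own account details.
issues = check_snowflake_prerequisites(
    url="https://myorg-myaccount.snowflakecomputing.com/",
    warehouse="COMPUTE_WH",
    user="TECTON_READER",
)
print(issues or "prerequisites look plausible")
```

An empty result only means the values are well formed. Whether the user can actually reach the warehouse is confirmed by the read test at the end of this guide.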
Configuring Snowflake Authentication
To enable materialization jobs to authenticate to Snowflake, you will add the username and password as secrets in Tecton Secrets and reference them in your Snowflake configuration block as shown below. If you use a different type of authentication, such as key-pair or OAuth, you can instead use a custom pandas_batch_config and retrieve and inject secrets into a block of code you define there to connect to your Snowflake instance.
Testing a Snowflake Data Source
To validate that Tecton can read your Snowflake data source, create a Tecton Data Source definition and test that you can read data from the Data Source. The following example shows how to define a SnowflakeConfig in your notebook environment using username/password authentication, and validate that Tecton is able to read from your Snowflake data source.
import tecton
from tecton import BatchSource, Secret, SnowflakeConfig

# Follow the prompt to complete your Tecton account sign-in
tecton.login("https://<your-account>.tecton.ai")

# Declare a SnowflakeConfig that can be passed to BatchSource as batch_config
snowflake_config = SnowflakeConfig(
    url="https://<your-cluster>.<your-snowflake-region>.snowflakecomputing.com/",
    database="CLICK_STREAM_DB",
    schema="CLICK_STREAM_SCHEMA",
    warehouse="COMPUTE_WH",
    table="CLICK_STREAM_FEATURES",
    # Credentials are read from Tecton Secrets rather than hard-coded
    user=Secret(scope="your-snowflake-scope", key="your-snowflake-user-key"),
    password=Secret(scope="your-snowflake-scope", key="your-snowflake-password-key"),
)

# Use the config in a BatchSource
snowflake_ds = BatchSource(name="click_stream_snowflake_ds", batch_config=snowflake_config)

# Read sample data to confirm that Tecton can reach the table
snowflake_ds.get_dataframe().to_pandas().head(10)