Connect to Snowflake

To run feature pipelines based on data in Snowflake, Tecton needs to be configured with access to your Snowflake account. The following guide shows how to configure these permissions and validate that Tecton is able to connect to your data source.

Prerequisites

To set up Tecton to use a data source on Snowflake, you need the following:

  • The URL for your Snowflake account.
  • The name of the virtual warehouse Tecton will use for querying data from Snowflake.
  • A Snowflake username and private key. See Snowflake's guide on configuring key-pair authentication.
    • We recommend creating a new Snowflake user configured to give Tecton read-only access. This user needs USAGE on the warehouse; see Snowflake's documentation on how to configure this access.
    • The Snowflake role must have USAGE and SELECT privileges on the relevant database objects (database, schema, and tables) to perform read operations; a sketch of these grants follows this list.
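
As a rough sketch, the grants for such a read-only role might look like the following. The role name TECTON_READER, the admin credentials, and the object names are placeholders, and you can equally run the GRANT statements directly in a Snowflake worksheet:

import snowflake.connector

# All names below are placeholders -- substitute your own.
# Connect with a role that is allowed to manage grants.
conn = snowflake.connector.connect(
    account="<your-account>",
    user="<admin-user>",
    password="<admin-password>",
    role="SECURITYADMIN",
)

# Grant the Tecton role read-only access to the warehouse and data.
for stmt in [
    "GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE TECTON_READER",
    "GRANT USAGE ON DATABASE CLICK_STREAM_DB TO ROLE TECTON_READER",
    "GRANT USAGE ON SCHEMA CLICK_STREAM_DB.CLICK_STREAM_SCHEMA TO ROLE TECTON_READER",
    "GRANT SELECT ON ALL TABLES IN SCHEMA CLICK_STREAM_DB.CLICK_STREAM_SCHEMA TO ROLE TECTON_READER",
]:
    conn.cursor().execute(stmt)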

Configuring Snowflake Authentication

caution

Snowflake previously supported password authentication, and you may have Snowflake data sources that relied on this method. Existing data sources that connect to Snowflake using the password parameter will continue to work for now. However, Snowflake is deprecating password authentication and will disable it later this year; see Snowflake's deprecation notice for a more detailed timeline. We recommend setting up private key authentication as soon as possible.

To enable materialization jobs to authenticate to Snowflake, add the username and private key as secrets in Tecton Secrets and reference them in your Snowflake configuration block, as shown below.

For advanced authentication scenarios (such as OAuth or custom authentication logic), you can use the connection_provider parameter to supply a custom function that returns a Snowflake connection. Alternatively, you can use a custom pandas_batch_config that retrieves the secrets and injects them into a block of code you define to connect to your Snowflake instance, as sketched below.
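
Here is a minimal sketch of the pandas_batch_config approach, reusing the secret scope and key names from the example further down. The exact way secrets are injected into the decorated function may vary by SDK version:

import pandas as pd
from tecton import Secret, pandas_batch_config

@pandas_batch_config(
    secrets={
        "sf_user": Secret(scope="your-snowflake-scope", key="your-snowflake-user-key"),
        "sf_private_key": Secret(scope="your-snowflake-scope", key="your-snowflake-private-key"),
    }
)
def click_stream_snowflake(secrets) -> pd.DataFrame:
    import snowflake.connector
    from cryptography.hazmat.primitives import serialization

    # The Snowflake connector expects the private key as DER-encoded bytes,
    # so deserialize the PEM-formatted secret value first.
    key = serialization.load_pem_private_key(secrets["sf_private_key"].encode(), password=None)
    der_key = key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )

    conn = snowflake.connector.connect(
        account="<your-account>",
        user=secrets["sf_user"],
        private_key=der_key,
        warehouse="COMPUTE_WH",
        database="CLICK_STREAM_DB",
        schema="CLICK_STREAM_SCHEMA",
    )
    return conn.cursor().execute("SELECT * FROM CLICK_STREAM_FEATURES").fetch_pandas_all()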

When adding the private key secret, copy the entire key, including the BEGIN and END delimiter lines.
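
An unencrypted PKCS #8 key, for example, begins and ends with lines like these (the key body below is truncated placeholder text):

-----BEGIN PRIVATE KEY-----
MIIEvQIBADANBgkqhkiG9w0BAQEFAASC...
-----END PRIVATE KEY-----

An encrypted key uses -----BEGIN ENCRYPTED PRIVATE KEY----- and -----END ENCRYPTED PRIVATE KEY----- delimiters instead.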

Testing a Snowflake Data Source

To validate that Tecton can read your Snowflake data source, create a Tecton Data Source definition and test that you can read data from it. The following example shows how to define a SnowflakeConfig in your notebook environment using username/private key authentication and validate that Tecton is able to read from your Snowflake data source.

You can also supply the additional parameter private_key_passphrase if you chose to generate an encrypted private key.

import tecton
from tecton import BatchSource, Secret, SnowflakeConfig

# Follow the prompt to complete your Tecton account sign-in
tecton.login("https://<your-account>.tecton.ai")

# Declare a SnowflakeConfig instance that can be used as an argument to BatchSource
snowflake_config = SnowflakeConfig(
    url="https://<your-cluster>.<your-snowflake-region>.snowflakecomputing.com/",
    database="CLICK_STREAM_DB",
    schema="CLICK_STREAM_SCHEMA",
    warehouse="COMPUTE_WH",
    table="CLICK_STREAM_FEATURES",
    user=Secret(scope="your-snowflake-scope", key="your-snowflake-user-key"),
    private_key=Secret(scope="your-snowflake-scope", key="your-snowflake-private-key"),
    # Add private_key_passphrase only if you generated an encrypted private key
    # private_key_passphrase=Secret(scope="your-snowflake-scope", key="your-snowflake-private-key-passphrase"),
)

# Use the config in a BatchSource
snowflake_ds = BatchSource(name="click_stream_snowflake_ds", batch_config=snowflake_config)

# Read sample data
snowflake_ds.get_dataframe().to_pandas().head(10)

Using a Custom Connection Provider

For advanced authentication scenarios that require custom logic beyond standard username/password or private key authentication, you can use the connection_provider parameter. This parameter accepts a function that returns a snowflake.connector.SnowflakeConnection object.

note

The connection_provider parameter is only supported in Rift compute environments. It is not available for Spark-based materialization.

Here's an example of using a custom connection provider for OAuth authentication:

import snowflake.connector
from tecton import SnowflakeConfig, BatchSource

def create_snowflake_connection():
    """
    Custom function to create a Snowflake connection with OAuth authentication.
    This function must return a snowflake.connector.SnowflakeConnection object.
    """
    # Example OAuth authentication logic.
    # Replace with your actual OAuth token retrieval logic.
    oauth_token = get_oauth_token()  # Your custom OAuth token function

    connection = snowflake.connector.connect(
        account='<your-account>',
        user='<your-username>',
        authenticator='oauth',
        token=oauth_token,
        warehouse='<your-warehouse>',
        database='<your-database>',
        schema='<your-schema>',
    )

    return connection

# Use the custom connection provider in SnowflakeConfig
snowflake_config = SnowflakeConfig(
    table="CLICK_STREAM_FEATURES",
    connection_provider=create_snowflake_connection,
)

# Use in the BatchSource
snowflake_ds = BatchSource(name="click_stream_snowflake_ds", batch_config=snowflake_config)

# Read sample data
snowflake_ds.get_dataframe().to_pandas().head(10)

When using connection_provider, you only need to specify the table parameter (and optionally query) in the SnowflakeConfig. Connection details like url, database, schema, warehouse, and authentication parameters should be handled within your custom connection function.
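
For example, reusing the create_snowflake_connection function from above, a query-based config might look like this sketch (the column names are illustrative placeholders):

snowflake_config = SnowflakeConfig(
    query="SELECT USER_ID, PAGE, CLICKED_AT FROM CLICK_STREAM_FEATURES WHERE PAGE IS NOT NULL",
    connection_provider=create_snowflake_connection,
)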
