Connect Notebooks to Snowflake
You can use any notebook software that supports Python to explore feature values and create training datasets with Tecton on Snowflake. Jupyter is recommended.
Configuration Steps
Authenticate your Tecton user
Log in as your Tecton user in the shell from which you will launch the notebook:
tecton login
This configures the local Python environment that the notebook uses, so commands run in the notebook are authenticated as your user.
Configure your notebook to use a supported Python version
If you are defining transformations using @transformation with mode="snowpark", your notebook must use Python 3.8. Otherwise, your notebook can use Python 3.8 or 3.9.
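The version rule above can be checked from inside a notebook cell before going further. The following is a minimal sketch, not part of Tecton; `is_supported_python` and its `uses_snowpark_mode` parameter are hypothetical names introduced here for illustration.

```python
import sys


def is_supported_python(version_info, uses_snowpark_mode):
    """Apply the rule stated above: 3.8 only for mode="snowpark", else 3.8 or 3.9.

    Hypothetical helper for illustration; it is not a Tecton API.
    """
    major_minor = (version_info[0], version_info[1])
    allowed = {(3, 8)} if uses_snowpark_mode else {(3, 8), (3, 9)}
    return major_minor in allowed


# Check the interpreter behind the current kernel.
# Set uses_snowpark_mode=True if your transformations use mode="snowpark".
if not is_supported_python(sys.version_info, uses_snowpark_mode=False):
    print(
        f"Python {sys.version_info[0]}.{sys.version_info[1]} is outside "
        "the versions listed above; consider switching kernels."
    )
```

Running this in the first cell makes a mismatched kernel visible before any Tecton or Snowflake call fails for a less obvious reason.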
If you are using Jupyter, you can use a notebook with a specific Python version by following these steps:
- Install the ipykernel package for the Python installation you want to use, using the command:
  sudo <root directory of the version to install>/pip install ipykernel
- Make sure you have run tecton login (see previous section).
- Run jupyter notebook to start the notebook server and open the UI in your browser. Then go to Kernel | Change Kernel and select the Python version.
- If you want to use Snowpark DataFrames in a notebook, install Snowpark in the notebook, as follows:
  pip install snowflake-snowpark-python
  For an example of using a Snowpark DataFrame in a notebook, see Constructing Training Data Using a Notebook.
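After the steps above, it can help to confirm that the selected kernel actually sees the packages you installed. The sketch below uses only the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper, and the distribution names are assumed to match the `pip install` names used on this page (with `tecton` assumed for the Tecton SDK).

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; it is not a Tecton API.
    """
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions based on the pip commands on this page.
for name in ("tecton", "ipykernel", "snowflake-snowpark-python"):
    print(name, installed_version(name) or "not installed")
```

If a package shows as not installed here but was installed in the shell, the kernel is most likely running a different Python installation than the one where pip was run.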
Connect to Snowflake
Run the following commands in your notebook to connect to Snowflake.
# Import Tecton and other libraries
import logging
import os
import tecton
import pandas as pd
import snowflake.connector
from datetime import datetime, timedelta
from pprint import pprint
# The following two lines log only warnings to the console. To log all events to the console, remove the two lines.
logging.getLogger('snowflake.connector').setLevel(logging.WARNING)
logging.getLogger('snowflake.snowpark').setLevel(logging.WARNING)
# connection_parameters assumes the Snowflake connection credentials are stored in the environment
# variables `SNOWFLAKE_USER`,`SNOWFLAKE_PASSWORD` and `SNOWFLAKE_ACCOUNT`.
# Uncomment the "authenticator" parameter below, only if authenticating through a browser.
# If the "authenticator" parameter is included, do not include the password parameter below.
connection_parameters = {
    # "authenticator": "externalbrowser",
    "user": os.environ['SNOWFLAKE_USER'],  # Your username in the Snowflake account that you're using with Tecton
    "password": os.environ['SNOWFLAKE_PASSWORD'],  # Your password in the Snowflake account that you're using with Tecton. Not needed if using the authenticator parameter above.
    "account": os.environ['SNOWFLAKE_ACCOUNT'],  # The Snowflake account you're using with Tecton (takes the form <SNOWFLAKE_ACCOUNT>.snowflakecomputing.com)
    "warehouse": "TRIAL_WAREHOUSE",
    # Tecton requires a database and schema in which to create temporary objects
    "database": "TECTON",
    "schema": "PUBLIC"
}
conn = snowflake.connector.connect(**connection_parameters)
tecton.snowflake_context.set_connection(conn) # Tecton will use this Snowflake connection for all interactive queries
# Quick helper function to query snowflake from a notebook
# Make sure to replace with the appropriate connection details for your own account
def query_snowflake(query):
    df = conn.cursor().execute(query).fetch_pandas_all()
    return df
# Set the compute mode to snowflake
tecton.conf.set("TECTON_COMPUTE_MODE", "snowflake")
tecton.version.summary()
If the commands above are successful, tecton.version.summary() will return the version number and other information.
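The connection_parameters dictionary above reads credentials directly from environment variables, so an unset variable surfaces as a bare KeyError. One way to make that failure clearer, and to handle the password and "authenticator" alternatives described in the comments, is sketched below. `build_connection_parameters` and its `use_browser` parameter are hypothetical names introduced for illustration; the warehouse, database, and schema values are copied from the example above and should be replaced with your own.

```python
def build_connection_parameters(env, use_browser=False):
    """Build the Snowflake connection dictionary from a mapping such as os.environ.

    Hypothetical helper for illustration; it is not part of Tecton or the
    Snowflake connector. It only assembles a dictionary and opens no connection.
    """
    required = ["SNOWFLAKE_USER", "SNOWFLAKE_ACCOUNT"]
    if not use_browser:
        # A password is only needed when not authenticating through a browser.
        required.append("SNOWFLAKE_PASSWORD")

    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))

    params = {
        "user": env["SNOWFLAKE_USER"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        "warehouse": "TRIAL_WAREHOUSE",  # value from the example above
        "database": "TECTON",            # value from the example above
        "schema": "PUBLIC",              # value from the example above
    }
    if use_browser:
        # Include the authenticator and omit the password, never both.
        params["authenticator"] = "externalbrowser"
    else:
        params["password"] = env["SNOWFLAKE_PASSWORD"]
    return params
```

In the notebook, `connection_parameters = build_connection_parameters(os.environ)` could stand in for the literal dictionary, with the result passed to `snowflake.connector.connect(**connection_parameters)` as before.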