Version: 0.8

Connect Databricks Notebooks

You can use the Tecton SDK in a Databricks notebook to explore feature values and create training datasets. The following guide covers how to configure your all-purpose cluster for use with Tecton. If you haven't already completed your deployment of Tecton with Databricks, please see the guide for Configuring Databricks.

Supported Databricks Runtimes for Notebooks

See this page for the list of supported Databricks Runtimes in Tecton.

Note that with Databricks Runtime 9.1 LTS or 10.4 LTS, Tecton supports only Service Account credentials. With Databricks Runtime 11.3 LTS or above, Tecton supports both Service Account credentials and User credentials.

As a best practice, use the same Databricks Runtime version for your notebook cluster as the one configured for your Feature View materialization.

Install the Tecton SDK

This step must be done once per notebook cluster.

On the cluster configuration page:

  1. Go to the Libraries tab
  2. Click Install New
  3. Select PyPI under Library Source
  4. Set Package to your desired Tecton SDK version, such as tecton==0.8.0 or tecton==0.8.*.
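After the library is installed, a quick check from a notebook cell can confirm which SDK version the cluster is actually using. The sketch below relies only on Python's standard `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of the Tecton SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "tecton") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("tecton is not installed on this cluster")
else:
    print(f"tecton {found} is installed")
```

If the printed version does not match the version you set in the Package field, the notebook may be attached to a different cluster than the one you configured.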

Authenticate to Tecton Account

You can authenticate to a Tecton instance from a notebook in three ways. They are listed here in the order in which Tecton searches for credentials. For example, credentials set using Option 1 override any credentials set using Options 2 and 3.
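The search order can be pictured as a first-match lookup. The following is an illustration of that ordering only, not Tecton's implementation; the source names and the function `first_configured` are hypothetical.

```python
from typing import Dict, Optional

# Lookup order described above, highest priority first.
SEARCH_ORDER = (
    "user_session",               # Option 1: tecton.login()
    "service_account_session",    # Option 2: tecton.set_credentials()
    "service_account_workspace",  # Option 3: Databricks secret scope
)


def first_configured(configured: Dict[str, bool]) -> Optional[str]:
    """Return the highest-priority credential source that is configured."""
    for source in SEARCH_ORDER:
        if configured.get(source):
            return source
    return None


# Options 2 and 3 are both configured, so Option 2 is chosen.
print(first_configured({
    "service_account_session": True,
    "service_account_workspace": True,
}))
```

In this sketch, adding a `user_session` entry would take precedence over both Service Account sources, mirroring the override behavior described for Option 1.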

Option 1: User Credentials in Notebook Session Scope

User credentials configured using tecton.login() are scoped to the notebook session, and must be reconfigured when a notebook is restarted or its state is cleared. User credentials override any credentials set in both Option 2: Service Account Credentials in Notebook Session Scope and Option 3: Service Account Credentials in Databricks Workspace Scope.

note

tecton.login(interactive=True) requires the cluster to be on Databricks Runtime 11.3 or higher.

To authenticate as a user, run the following in your notebook, replacing "https://example.tecton.ai" with the URL of your Tecton instance:

import tecton
tecton.login("https://example.tecton.ai")

Then follow the directions to open the login link in your browser and sign in to the Tecton instance as your user. Copy the authorization code from the Identity Verified web page and paste it into your notebook's input box. The authorization code can be used only once.

note

get_online_features requires Service Account credentials to call the online store. To use get_online_features, also set Service Account credentials by following Option 2 or Option 3.

Option 2: Service Account Credentials in Notebook Session Scope

Service account credentials configured using tecton.set_credentials() are scoped to the notebook session. They must be reconfigured whenever a notebook is restarted or its state is cleared. They override credentials set in Option 3: Service Account Credentials in Databricks Workspace Scope.

Prerequisites

You need an existing Tecton Service Account and access to its API Key secret value. If you don't have one, create a new one using these instructions.

Set API Key in Session

To authenticate as a Service Account, run the following command in your notebook, replacing <key> with the API Key secret value and https://example.tecton.ai/api with the API URL of your Tecton instance:

import tecton
tecton.set_credentials(tecton_api_key="<key>", tecton_url="https://example.tecton.ai/api")

Option 3: Service Account Credentials in Databricks Workspace Scope

If neither User credentials nor Service Account credentials are found in the notebook session scope, Tecton looks for Service Account credentials in Databricks secret scopes. These are normally pre-configured as part of the Tecton deployment. If needed (for example, to access Tecton from another Databricks workspace), you can create them as described below.

Prerequisites

You need an existing Tecton Service Account and access to its API Key secret value. If you don't have one, create a new one using these instructions.

Set API Key in Secret Scope

First, ensure the Databricks CLI is installed and configured. Next, create a secret scope, then name and populate it as follows:

Naming the Secret Scope

The secret scope name is derived from the Tecton cluster name, which is your deployment name:

  • <deployment-name>, if your deployment name begins with tecton
  • tecton-<deployment-name>, otherwise

<deployment-name> is the first part of the URL used to access the Tecton UI: https://<deployment-name>.tecton.ai

If the above doesn't work, verify that your cluster name is set:

tecton.conf.get_or_raise("TECTON_CLUSTER_NAME")
# If it is not set, run: tecton.conf.set("TECTON_CLUSTER_NAME", "<deployment-name>")

Then check what secret scopes the cluster can read from:

tecton.conf._get_secret_scopes()

This should show two secret scopes: the one derived from the cluster name and one called tecton. The tecton scope is only a fallback, used when the first scope is absent or not populated, so make sure to create the secret scope with the correct name.
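The naming rule from this section can be sketched in a few lines of Python. Both helper names are hypothetical, and `tecton-prod` is a made-up deployment name used only to show the case where no prefix is added.

```python
from urllib.parse import urlparse


def deployment_name_from_url(tecton_url: str) -> str:
    """Return the first label of the host, e.g. 'example' for example.tecton.ai."""
    host = urlparse(tecton_url).hostname or ""
    return host.split(".")[0]


def secret_scope_name(deployment_name: str) -> str:
    """Keep names that begin with 'tecton'; otherwise add the 'tecton-' prefix."""
    if deployment_name.startswith("tecton"):
        return deployment_name
    return f"tecton-{deployment_name}"


name = deployment_name_from_url("https://example.tecton.ai")
print(secret_scope_name(name))           # tecton-example
print(secret_scope_name("tecton-prod"))  # tecton-prod
```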

Populating the Secret Scope

Create the secret scope, then populate it with the API URL of your Tecton instance and the Service Account API key:

databricks secrets create-scope --scope <scope_name>
databricks secrets put --scope <scope_name> \
--key API_SERVICE --string-value https://<deployment-name>.tecton.ai/api
databricks secrets put --scope <scope_name> \
--key TECTON_API_KEY --string-value <TOKEN>

Depending on your Databricks setup, you may need to configure ACLs for the secret scope before it is usable. See Databricks documentation for more information. For example:

databricks secrets put-acl --scope <scope_name> \
--principal your@email.com --permission MANAGE

Additionally, depending on the data sources you use, you may need to add the following secrets:

  • <secret-scope>/REDSHIFT_USER
  • <secret-scope>/REDSHIFT_PASSWORD
  • <secret-scope>/SNOWFLAKE_USER
  • <secret-scope>/SNOWFLAKE_PASSWORD
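To confirm that a scope contains the expected keys, you could compare the keys it holds against the ones named in this section. This is a minimal sketch: the function `missing_secret_keys` is hypothetical, and the commented line assumes the notebook has access to Databricks `dbutils`.

```python
from typing import Iterable, List

# Keys that the populate commands in this section create.
REQUIRED_KEYS = ("API_SERVICE", "TECTON_API_KEY")


def missing_secret_keys(
    present: Iterable[str],
    required: Iterable[str] = REQUIRED_KEYS,
) -> List[str]:
    """Return the required keys that are not in `present`, in order."""
    present_set = set(present)
    return [key for key in required if key not in present_set]


# In a Databricks notebook, the present keys could be read with:
#   present = [s.key for s in dbutils.secrets.list("<scope_name>")]
present = ["API_SERVICE"]  # example input for illustration
print(missing_secret_keys(present))  # ['TECTON_API_KEY']
```

Pass a longer `required` tuple (for example, adding `SNOWFLAKE_USER` and `SNOWFLAKE_PASSWORD`) if your data sources need the optional secrets listed above.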

Authorize Principal To Access Resources

To access objects in a given Tecton workspace, the User or Service Account used to authenticate in the previous section must have at least the Viewer role on that workspace. To test Online Feature Retrieval, grant the Service Account at least the Consumer role.

Grant Authorization Using Tecton CLI

Use the access-control assign-role command to grant your User or Service Account the appropriate role on a workspace (or across all workspaces, if you choose).

For example, to grant a User the Viewer role on a workspace:

tecton access-control assign-role --role viewer \
--workspace <Your-workspace> \
--user <Your-user@example.com>

To grant a Service Account the Consumer role on a workspace:

tecton access-control assign-role --role consumer \
--workspace <Your-workspace> \
--service-account <Your-Service-Account-Id>

Optionally, with Tecton CLI version 0.6.6 or newer, you can grant a role across all workspaces:

tecton access-control assign-role --role consumer \
--all-workspaces \
--service-account <Your-Service-Account-Id>

With a role granted across all workspaces, the principal can automatically access objects in newly created workspaces from your notebooks.

Grant Authorization Using Tecton Web UI

Alternatively, follow these steps in the Tecton Web UI to authorize your user or Service Account:

  1. Locate your workspace by selecting it from the drop-down list at the top.
  2. On the left navigation bar, select Permissions.
  3. Select the Users or Service Accounts tab.
  4. Click Add User or Add Service Account.
  5. In the dialog box that appears, search for the user or Service Account name.
  6. Select the principal's name.
  7. Select a role. You can select any of these roles: Owner, Editor, Consumer, Operator, or Viewer.
  8. Click Add.

Configure permissions for cross-account access

If your Databricks workspace is in a different AWS account from your Tecton data plane, you must configure AWS access so that Databricks can read all of the S3 buckets Tecton uses. These buckets are in the data plane account and are prefixed with tecton-. For full functionality, Databricks also needs access to the underlying data sources that Tecton reads.

Verify the connection

After following the previous steps to authenticate, run these commands in the notebook. If successful, you should see a list of workspaces on the Tecton instance:

tecton.test_credentials()
tecton.list_workspaces()

Appendix

Create a Tecton Service Account

If you need to create a new Tecton Service Account, you can do so using the Tecton CLI or the Tecton Web UI.

Using the CLI

Create a Service Account with the CLI using the tecton service-account create command.

Using the Web UI

Create a Service Account with the Web UI using these instructions.
