
Databricks Unity Catalog

Prerequisites

  • Tecton SDK 0.6+
  • DBR 11+ (with the Premium plan or above)

Limitations

  • Tecton is currently compatible with the SINGLE USER Databricks cluster access mode, but not yet with the SHARED access mode.
  • To read directly from Unity Catalog data sources in a Tecton notebook (e.g. to run FeatureView.get_historical_features(from_source=True)), you must create your notebook cluster with the SINGLE USER access mode. Because such a cluster is tied to a single user, each Databricks user will need a separate notebook cluster. An example read is sketched after this list.
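
For example, a direct-from-source read from a notebook might look like the following minimal sketch; the workspace and Feature View names below are hypothetical placeholders, not values defined in this guide:

    from datetime import datetime

    import tecton

    # Requires a notebook cluster created with the SINGLE USER access mode.
    # "prod" and "test_unity_fv" are hypothetical names.
    ws = tecton.get_workspace("prod")
    fv = ws.get_feature_view("test_unity_fv")

    # from_source=True reads directly from the Unity Catalog data source
    # rather than from the offline store.
    features = fv.get_historical_features(
        start_time=datetime(2023, 5, 1),
        from_source=True,
    ).to_pandas()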

Databricks & AWS Setup

  • Assign the Databricks workspaces used by Tecton to the metastore you plan to use.
  • Add the Databricks Service Principal used by Tecton as a user of the metastore.
  • For the S3 bucket you configured as the Tecton offline store, make sure all of Tecton's AWS IAM requirements are also met, and register the corresponding IAM role ARN as a storage credential in Unity Catalog via the Databricks Data Explorer.
  • Create an external location for this S3 bucket with the above storage credential, and grant the Databricks account used by Tecton at least the READ FILES and WRITE FILES permissions. This can be done by running the following SQL commands in a notebook backed by a Unity-enabled cluster, or in the Databricks SQL editor backed by a SQL warehouse.
    CREATE EXTERNAL LOCATION [IF NOT EXISTS] <location_name>
    URL 's3://<bucket_path>'
    WITH ([STORAGE] CREDENTIAL <storage_credential_name>)
    [COMMENT <comment_string>];
    GRANT READ FILES ON EXTERNAL LOCATION <location_name> TO <tecton_databricks_account>;
    GRANT WRITE FILES ON EXTERNAL LOCATION <location_name> TO <tecton_databricks_account>;
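
    To confirm the grants took effect, you can run a quick check from a notebook attached to a Unity-enabled cluster. This is a minimal sketch; <location_name> is the placeholder used in the commands above:

    # Run on a Unity-enabled cluster; substitute your external location name.
    spark.sql("SHOW GRANTS ON EXTERNAL LOCATION <location_name>").show(truncate=False)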

Configuring Tecton Data Sources & Feature Views to work with Unity

  • Please let Tecton know that you plan to use Unity Catalog, so that we can appropriately configure internal Spark clusters used by Tecton's SDK.
  • No changes are needed for Feature Views that don’t use a Unity data source.
  • Please note that changing a Feature View's Data Source may result in re-materialization.

Tecton SDK Version 0.7+

  • We recommend using UnityConfig as follows:

    from tecton import BatchSource, UnityConfig

    test_unity_batch_source = BatchSource(
        name="test_unity_config_batch_source",
        batch_config=UnityConfig(
            catalog="main",  # <catalog_name>
            schema="default",  # <schema_name>
            table="department",  # <table_name>
        ),
    )
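
    In a notebook, you can sanity-check the source before applying it. A minimal sketch; note that in SDK 0.7, locally defined objects must be validated before interactive use:

    # Validate the locally defined source, then preview a few rows.
    test_unity_batch_source.validate()
    test_unity_batch_source.get_dataframe().to_pandas().head()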

Tecton SDK Version 0.6

  • Customers using SDK version 0.6 can use the existing HiveConfig data source config by setting the database and table params as follows:

    from tecton import BatchSource, HiveConfig

    test_unity_batch_source = BatchSource(
        name="test_unity_batch_source",
        batch_config=HiveConfig(
            database="main.default",  # <catalog_name>.<schema_name>
            table="department",  # <table_name>
        ),
    )
  • For Feature Views that depend on a data source in Unity, materialization jobs must run on DBR 11.3+ using the SINGLE USER cluster access mode. Pin the Spark version to 11.3.x-scala2.12 and set data_security_mode to SINGLE_USER via the batch_compute param in the Feature View declaration. This can be configured via DatabricksJsonClusterConfig as shown here:

    json_config = """
    {
        "new_cluster": {
            "num_workers": 0,
            "spark_version": "11.3.x-scala2.12",
            "data_security_mode": "SINGLE_USER",
            "node_type_id": "m5.large",
            "aws_attributes": {
                "ebs_volume_type": "GENERAL_PURPOSE_SSD",
                "ebs_volume_count": 1,
                "ebs_volume_size": 100,
                "first_on_demand": 0,
                "spot_bid_price_percent": 100,
                "instance_profile_arn": "arn:aws:iam::your_account_id:instance-profile/your-role",
                "availability": "SPOT",
                "zone_id": "auto"
            },
            "spark_conf": {
                "spark.databricks.service.server.enabled": "true",
                "spark.hadoop.fs.s3a.acl.default": "BucketOwnerFullControl",
                "spark.sql.sources.partitionOverwriteMode": "dynamic",
                "spark.sql.legacy.parquet.datetimeRebaseModeInRead": "CORRECTED",
                "spark.sql.legacy.parquet.int96RebaseModeInRead": "CORRECTED",
                "spark.sql.legacy.parquet.int96RebaseModeInWrite": "CORRECTED",
                "spark.master": "local[*]"
            }
        }
    }
    """

    The example Feature View is then configured as follows:

    from datetime import datetime, timedelta

    from tecton import DatabricksJsonClusterConfig, batch_feature_view

    @batch_feature_view(
        sources=[test_unity_batch_source],
        mode="spark_sql",
        entities=[entity],  # an Entity defined elsewhere in the feature repo
        online=False,
        offline=True,
        # batch_compute is only required if you're using HiveConfig to
        # register your Unity data source
        batch_compute=DatabricksJsonClusterConfig(json=json_config),
        feature_start_time=datetime(2023, 5, 1),
        batch_schedule=timedelta(days=1),
        ttl=timedelta(days=30),
        description="Test Unity FV",
    )
    def feature_view(department):  # receives the Unity table registered above
        return ...
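
    Once the Feature View is applied and a materialization job has completed, you can read the offline features back from a notebook. A minimal sketch, assuming the Feature View was applied to a hypothetical workspace named "prod":

    from datetime import datetime

    import tecton

    # "prod" and "feature_view" are placeholder names.
    ws = tecton.get_workspace("prod")
    fv = ws.get_feature_view("feature_view")

    # Reads from the offline store materialized by the job configured above.
    offline_df = fv.get_historical_features(
        start_time=datetime(2023, 5, 1),
        end_time=datetime(2023, 6, 1),
    ).to_pandas()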
