Version: 1.0

Configure Offline Store Access per Workspace

For Tecton on Databricks or Tecton on EMR deployments, offline materialized features are stored in S3.

This guide explains how to limit a Notebook Cluster's access to feature data from specific workspaces.

Offline Store Paths

Feature data in the offline store is organized by subdirectory. For workspaces created after November 7, 2022, the data for each Feature View in a workspace is written to a subdirectory named after that workspace. Each workspace subdirectory can be secured with its own IAM policy.

Creating per-Workspace Policies

Workspace subdirectories let you grant fine-grained read access to materialized features. The following example shows a policy statement that you can add to a Notebook instance profile to scope access to the materialized features of a specific workspace.

{
  "Sid": "S3ReadOnly${YOUR_WORKSPACE_NAME}",
  "Effect": "Allow",
  "Action": [
    "s3:Get*",
    "s3:List*"
  ],
  "Resource": [
    "arn:aws:s3:::tecton-${YOUR_DEPLOYMENT_NAME}/offline-store/ws/${YOUR_WORKSPACE_NAME}/*"
  ]
}
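When a Notebook Cluster needs access to several workspaces, the statement above can be generated rather than edited by hand. The following is a minimal Python sketch, not a Tecton or AWS API: the function names `workspace_read_statement` and `notebook_policy`, along with the sample deployment and workspace names, are hypothetical. It assumes the bucket layout shown in the example above, ends each resource ARN in `/*` so that it matches the objects under the workspace subdirectory, and strips non-alphanumeric characters from the Sid because IAM accepts only letters and digits there.

```python
import json
import re


def workspace_read_statement(deployment: str, workspace: str) -> dict:
    """Build a read-only IAM statement scoped to one workspace subdirectory."""
    # IAM Sids may contain only ASCII letters and digits, so drop everything else.
    sid_suffix = re.sub(r"[^0-9A-Za-z]", "", workspace)
    # Assumed layout, taken from the example in this guide:
    # tecton-<deployment>/offline-store/ws/<workspace>
    prefix = f"offline-store/ws/{workspace}"
    return {
        "Sid": f"S3ReadOnly{sid_suffix}",
        "Effect": "Allow",
        "Action": ["s3:Get*", "s3:List*"],
        "Resource": [f"arn:aws:s3:::tecton-{deployment}/{prefix}/*"],
    }


def notebook_policy(deployment: str, workspaces: list[str]) -> dict:
    """Wrap one statement per workspace in a complete policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            workspace_read_statement(deployment, ws) for ws in workspaces
        ],
    }


if __name__ == "__main__":
    # Hypothetical deployment and workspace names, for illustration only.
    policy = notebook_policy("my-deployment", ["prod", "fraud-dev"])
    print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed before it is attached to the instance profile. Workspace names that differ only in punctuation would produce the same Sid and would need to be disambiguated by hand.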

Overriding Subdirectories

You can override the subdirectory for a Feature View if you want to set up a different organizational structure for your offline store.

The following example shows how to override the subdirectory path used for a Feature View.

from tecton import batch_feature_view, ParquetConfig


@batch_feature_view(
    ...
    offline_store=ParquetConfig(
        subdirectory_override='${YOUR_CUSTOMIZED_SUBDIRECTORY}'
    ),
    ...
)
def my_fv(data_source):
    pass

To check the exact path for your Feature View, call the FeatureView.summary() method and look for the Offline Materialized Data Location item.
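Because an overridden subdirectory may place a Feature View's data outside the workspace subdirectory, it can be worth comparing the reported location with the prefix that a per-workspace IAM policy covers. The sketch below is plain Python with no Tecton calls. `is_under_workspace` is a hypothetical helper, and it assumes that the location copied from the summary is an S3 URI of the form s3://bucket/key and that the bucket follows the tecton-${YOUR_DEPLOYMENT_NAME} naming used earlier in this guide.

```python
from urllib.parse import urlparse


def is_under_workspace(location: str, deployment: str, workspace: str) -> bool:
    """Return True if an S3 location lies inside the workspace subdirectory."""
    parsed = urlparse(location)
    # Assumed bucket and prefix layout, taken from the policy example in this guide.
    expected_bucket = f"tecton-{deployment}"
    expected_prefix = f"offline-store/ws/{workspace}/"

    key = parsed.path.lstrip("/")
    # Add a trailing slash so that "ws/prod" does not match "ws/production".
    if not key.endswith("/"):
        key += "/"

    return (
        parsed.scheme in ("s3", "s3a", "s3n")
        and parsed.netloc == expected_bucket
        and key.startswith(expected_prefix)
    )


if __name__ == "__main__":
    # Hypothetical location string, for illustration only.
    location = "s3://tecton-my-deployment/offline-store/ws/prod/my_fv/data"
    print(is_under_workspace(location, "my-deployment", "prod"))
```

A False result suggests that the Feature View's data sits outside the path granted by the workspace policy, so the Notebook Cluster would need an additional statement to read it.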

Migrating existing Workspaces and Feature Views

If your workspace was created before November 7, 2022, and you want to adopt this subdirectory structure for existing workspaces and feature views, please reach out to Tecton Support to initiate the process.

Note that there might be materialization and historical feature retrieval downtime while we are migrating your data.

Expect the following steps during the migration process:

  1. Pause offline materialization on all your feature views by setting offline=False on each one and then running tecton apply.
  2. Tecton migrates the existing data to the workspace subdirectory.
  3. Re-enable materialization for your feature views by setting offline=True and then running tecton apply.
