Skip to main content
Version: 0.8

Tecton on Databricks in Google Cloud Deployment Model

Overview​

The Google Cloud deployment architecture consists of a control plane, a data plane, and a serving plane:

  • The control plane lives in a single-tenant Tecton AWS account and is operated by Tecton to guarantee core services such as data pipeline orchestration and metadata management.

  • The data plane is where data is processed and lives entirely in the customer-owned Google Cloud Project. It connects to raw data sources and processes them within your account. It does not host any proprietary services from Tecton.

  • The serving plane serves online features to applications running in production. The serving plane is deployed to a single-tenant Google Cloud Project managed by Tecton.

This architecture ensures data security and compliance while leaving operational overhead to Tecton. It enables you to receive constant updates and get access to new product features seamlessly. And it guarantees that Tecton engineers can independently maintain and resolve any urgent issues with your features, 24/7.

The architecture is illustrated in the following diagram.

Tecton on GCP Diagram

Before you get started​

Decide on a name for your deployment​

(e.g. mycompany-production). This name will become part of the URL that is used to access your Tecton UI (mycompany-production.tecton.ai). The name must be less than 22 characters.

Decide on your primary region​

You will select a region (e.g. us-west1).

Confirm your Tecton control plane service account.​

Your Tecton control plane service account is provided by your Tecton deployment specialist. It will look like tecton-<deployment>-control-plane@<tecton-deployment>.iam.gserviceaccount.com.

Configure Compute Platform​

See Connecting Databricks.

Configure Online Storage​

Tecton supports both Redis and Bigtable as options for online storage.

See Connecting Redis or Connecting Bigtable.

Configure Offline Storage and Permissions​

Tecton requires the following steps for setup:

1. Create a Cloud Storage Bucket (where Tecton will write feature data) in the data plane project​

Tecton will use a single Cloud Storage bucket to store all of your offline materialized feature data.

To set this up, create a bucket in your data-plane project called tecton-[DEPLOYMENT_NAME]-dataplane (e.g. tecton-mycompany-production-dataplane).

info

Ensure the bucket's region is the same as the region in which you would like to deploy Tecton (e.g. us-west1)

2. Grant your Spark jobs permission to read your data sources and write to your bucket.​

Create a service account and grant it the "Storage Object Admin" role on the bucket created above.

This service account should also be granted access to your data sources. For example, to read from Bigquery, grant the account the "Bigquery Data Viewer" role.

3. Grant Tecton access to the bucket created above, and to run Spark jobs with your desired service account.​

Tecton needs access to the data plane bucket for ensuring data pipeline integrity, cleaning up unused feature data, and storing some forms of logging of Tecton services in the control plane. Access to this service account for these purposes is restricted to automated processes in the Tecton account. Grant the Tecton control plane service account the following permissions on the data plane bucket created above:

storage.buckets.get
storage.buckets.update
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.getIamPolicy
storage.objects.list

Was this page helpful?