Overview of Transformations
What is a transformation?
A transformation is a function that specifies logic to run against data retrieved from external data sources.
Transformations are central to Tecton: feature pipelines call them, via Feature Views, to compute feature values.
Transformations support multiple modes, including spark_sql. See Transformation Modes for details.
Where transformations can be defined
A transformation can be defined inside or outside of a Feature View.
Compared to defining a transformation inside of a Feature View, the main advantages of defining a transformation outside of a Feature View are:
- Transformations can be reused by multiple Feature Views.
- A Feature View can call multiple transformations.
- Discoverability: Transformations can be searched in the Web UI.
Defining a transformation inside of a Feature View
In this example, a Feature View implements a transformation in the body of the Feature View function my_feature_view. The transformation runs in spark_sql mode and renames columns from the data source: column_a AS feature_one and column_b AS feature_two.
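The renaming logic described here can be sketched without Tecton at all. The snippet below is a minimal illustration, not the original example: it assumes a hypothetical source table named my_data_source with columns user_id, timestamp, column_a, and column_b, and it uses Python's built-in sqlite3 module as a stand-in for Spark SQL. In Tecton, the function would carry a Feature View decorator, which is omitted here.

```python
import sqlite3


def my_feature_view(source_table: str) -> str:
    """Return the SQL that renames source columns to feature names.

    Plain-Python stand-in: the Tecton Feature View decorator is omitted.
    """
    return f"""
        SELECT
            user_id,
            timestamp,
            column_a AS feature_one,
            column_b AS feature_two
        FROM {source_table}
    """


def run_sketch() -> list:
    """Run the query on an in-memory SQLite table (stand-in for Spark SQL)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical source table and row, for illustration only.
    conn.execute(
        "CREATE TABLE my_data_source "
        "(user_id TEXT, timestamp TEXT, column_a REAL, column_b REAL)"
    )
    conn.execute(
        "INSERT INTO my_data_source VALUES ('u1', '2024-01-01T00:00:00', 1.5, 2.5)"
    )
    cursor = conn.execute(my_feature_view("my_data_source"))
    # The output columns carry the feature names, not the source names.
    column_names = [description[0] for description in cursor.description]
    conn.close()
    return column_names
```

Calling run_sketch() returns the output column names, which makes it easy to confirm that column_a and column_b appear under their feature names while the join key and timestamp pass through unchanged.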
Defining a transformation function outside of a Feature View
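As a rough sketch of the idea, and not Tecton's actual API, the snippet below defines the renaming logic once as a standalone function and calls it from two hypothetical Feature View functions. All names here (rename_columns, keep_recent, feature_view_one, feature_view_two, and the source table names) are made up for illustration, and the Tecton decorators are omitted.

```python
def rename_columns(source_table: str) -> str:
    """Standalone transformation: returns SQL that renames source columns."""
    return (
        "SELECT user_id, timestamp, "
        "column_a AS feature_one, column_b AS feature_two "
        f"FROM {source_table}"
    )


def keep_recent(inner_sql: str, since: str) -> str:
    """Second standalone transformation: keeps rows at or after `since`."""
    return f"SELECT * FROM ({inner_sql}) WHERE timestamp >= '{since}'"


def feature_view_one() -> str:
    # Reuses rename_columns on one hypothetical source.
    return rename_columns("source_one")


def feature_view_two() -> str:
    # Reuses rename_columns and chains a second transformation after it.
    return keep_recent(rename_columns("source_two"), since="2024-01-01")
```

Both functions share rename_columns, which mirrors the reuse advantage, and feature_view_two shows one Feature View calling more than one transformation.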
Transformation input and output
The input to a transformation contains the columns in the data source.
When a transformation is defined inside of a Feature View, the output of the transformation is a DataFrame that must include:
- The join keys of all entities included in the Feature View.
- A timestamp column. If there is more than one timestamp column, the timestamp_key parameter must be set to specify which column holds the timestamp of the feature values.
- Feature value columns. All columns other than the join keys and timestamp will be considered features in a Feature View.
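The three requirements above can be expressed as a small check. This is a sketch under stated assumptions, not part of Tecton: split_output_columns is a hypothetical helper that takes the output column names, the join keys, and the timestamp column, raises an error if a required column is missing, and returns the columns that would be treated as features.

```python
def split_output_columns(columns, join_keys, timestamp_key):
    """Check required columns and return the remaining feature columns.

    Hypothetical helper for illustration; it is not a Tecton function.
    """
    required = list(join_keys) + [timestamp_key]
    missing = [name for name in required if name not in columns]
    if missing:
        raise ValueError(f"output is missing required columns: {missing}")
    # Everything that is not a join key or the timestamp counts as a feature.
    reserved = set(required)
    return [name for name in columns if name not in reserved]


feature_columns = split_output_columns(
    ["user_id", "timestamp", "feature_one", "feature_two"],
    join_keys=["user_id"],
    timestamp_key="timestamp",
)
```

With the example columns used earlier, feature_columns contains feature_one and feature_two, while an output without the user_id join key would raise a ValueError.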