Using Plan Hooks to Automate Testing
Overview
Plan Hooks provide a framework for writing general-purpose hooks in Python that run automatically every time tecton plan or apply is run. Plan Hooks let you trigger customizable behavior at key points in the Tecton CLI workflow.
Plan Hooks are written in Python and are therefore completely customizable. Example use cases include enforcing a commit policy or running basic tests against your code (see the Examples section).
Enabling Plan Hooks
When tecton init is run to configure a feature repository in a new directory, it creates a folder called .tecton containing the file .tecton/hooks/plan.py.example. To enable Plan Hooks, rename this file to .tecton/hooks/plan.py.
How Plan Hooks Work
Arbitrary logic can be defined in plan.py as long as it adheres to the return code contract for run(). Each time tecton plan or apply is run, it will execute the run() method in plan.py. tecton expects the following return codes from run():
- 0 if all tests pass
- None if no tests were run
- Non-zero integer in the case of test failures
If a non-zero value is returned from run(), stdout will be printed to stderr. If a 0 or None is returned, all hook output will be suppressed.
In summary, plan hooks must meet the following requirements:
- Must be defined in .tecton/hooks/plan.py
- Must contain a run() method that accepts no arguments. run() must return either 0 (tests pass), None (no tests run), or a non-zero integer return code (test failures).
To configure multiple plan hooks, it's recommended to define them in separate functions in plan.py and call each function from run(), as sketched below.
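For illustration only, a minimal sketch of such a plan.py might look like the following. The check_file_names and run_unit_tests helpers are hypothetical placeholders, not part of the Tecton API:
### plan.py (illustrative sketch) ###
from typing import Optional


def check_file_names() -> Optional[int]:
    # Hypothetical hook: return 1 on a policy violation, None if there was nothing to check.
    return None


def run_unit_tests() -> Optional[int]:
    # Hypothetical hook: return 0 if tests pass, a non-zero code on failure.
    return 0


def run() -> Optional[int]:
    # Call each hook in turn and propagate the first failing return code.
    results = [check_file_names(), run_unit_tests()]
    for code in results:
        if code not in (0, None):
            return code
    # Report 0 if any hook actually ran tests, otherwise None.
    return 0 if any(code == 0 for code in results) else None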
Default Plan Hook: plan.py.example
The default contents of plan.py.example contain a no-op hook that returns None.
### plan.py.example ###
from typing import Optional


def run() -> Optional[int]:
    # No-op plan hook that returns None, indicating no tests were run.
    # See https://docs.tecton.ai/how-to/using_plan_hooks.html for more info.
    return None
If you rename plan.py.example to plan.py and run tecton plan, you'll see ✅ [Plan Hooks] No tests run in the output. For example:
$ tecton plan
Using workspace "prod"
✅ Imported 4 Python modules from the feature repository
✅ [Plan Hooks] No tests run
✅ Collecting local feature declarations
✅ Performing server-side validation of feature declarations
Examples
Configuring pytest
The following plan.py configures a test harness for running pytest against all files in the feature repo matching the pattern *_test.py, test_*.py, or test.py.
### plan.py ###
from pathlib import Path
from typing import Optional

import pytest


def run() -> Optional[int]:
    # Run pytest on all *_test.py, test_*.py, and test.py files and return:
    # - 0 if all tests pass
    # - None if no tests were run
    # - Non-zero exit code indicating test failures
    root_path = str(Path().resolve())
    tests = []
    tests.extend([str(p.resolve()) for p in Path(root_path).glob("**/*_test.py")])
    tests.extend([str(p.resolve()) for p in Path(root_path).glob("**/test_*.py")])
    tests.extend([str(p.resolve()) for p in Path(root_path).glob("**/test.py")])
    exitcode = pytest.main(tests)
    if exitcode == 5:
        # No tests were run:
        # https://docs.pytest.org/en/stable/usage.html#possible-exit-codes
        return None
    return exitcode
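Note that this hook is executed by the Tecton CLI, so it assumes pytest is importable in the same Python environment where the tecton CLI runs.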
OnlineTransformation Unit Test
Using the test harness provided in the example above, it's possible to define unit tests on OnlineTransformation functions. Suppose you have an OnlineTransformation that simply doubles the values passed to it:
### my_transformations.py ###
import pandas
from tecton import online_transformation, RequestContext
from pyspark.sql.types import LongType, StructType, StructField

out_schema = StructType()
out_schema.add(StructField("output", LongType()))

online_rc = RequestContext(schema={"input": LongType()})


@online_transformation(request_context=online_rc, output_schema=out_schema)
def transformation_double(input: pandas.Series):
    import pandas

    series = []
    for a in input:
        features = {}
        features["output"] = a * 2
        series.append(features)
    return pandas.DataFrame(series)
Now, we want to write a test that asserts the output DataFrame's values are in fact double the values provided in the input Series. Transformations have a transformer attribute that returns the transformer function (i.e., the transformation_double defined above), which can be used for testing.
### my_transformation_test.py ###
import pandas as pd
from pandas.testing import assert_frame_equal

from .my_transformations import transformation_double


def test_my_favorite_doubling_transformation():
    input = pd.Series([1, 2, 3])
    actual = transformation_double.transformer(input)
    expected = pd.DataFrame({"output": [2, 4, 6]})
    assert_frame_equal(actual, expected)
After renaming plan.py.example to plan.py and adding my_transformation_test.py to your repo, you'll see the message ✅ [Plan Hooks] Tests passed! when running tecton plan or apply if all tests pass.
$ tecton plan
Using workspace "prod"
✅ Imported 4 Python modules from the feature repository
✅ [Plan Hooks] Tests passed!
✅ Collecting local feature declarations
✅ Performing server-side validation of feature declarations
↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓
+ Create FeaturePackage
name: my_feature_package
owner: sally
↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑
If tests fail, you'll see ⛔ [Plan Hooks] Tests failed along with test failure messages.
$ tecton plan
Using workspace "prod"
✅ Imported 4 Python modules from the feature repository
⛔ [Plan Hooks] Tests failed :(
E AssertionError: DataFrame.iloc[:, 0] (column name="bar") are different
E
E DataFrame.iloc[:, 0] (column name="bar") values are different (33.33333 %)
E [left]: [2, 4, 7]
E [right]: [2, 4, 6]
pandas/_libs/testing.pyx:174: AssertionError
========================================================================================================= short test summary info ==========================================================================================================
FAILED my_transform_test.py::test_answer - AssertionError: DataFrame.iloc[:, 0] (column name="bar") are different
File Naming Policy Test
Suppose you would like to create a naming policy that ensures all Python files are prefixed with "ml_ops_". The example below performs this check on all Python files in the feature repository and returns 0 if all names adhere to the policy, or 1 if some names do not.
### plan.py ###
from pathlib import Path
from typing import Optional


def run() -> Optional[int]:
    # Run a naming policy check verifying that every Python file name
    # in the repository begins with "ml_ops_". Return:
    # - 0 if all names adhere to the policy.
    # - 1 (or any non-zero code) if names do not meet the policy.
    root_path = str(Path().resolve())
    py_files = []
    py_files.extend([p.resolve() for p in Path(root_path).glob("**/*.py")])
    bad_names = [p for p in py_files if not p.name.startswith("ml_ops_")]
    if len(bad_names) > 0:
        print("Invalid names:")
        for n in bad_names:
            print(str(n))
        return 1
    return 0
Skip Plan Hooks
Specifying the --skip-tests flag when running tecton plan or apply will skip execution of Plan Hooks.
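For example, to produce a plan without executing Plan Hooks:
$ tecton plan --skip-tests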
Reset Plan Hooks
If you get carried away writing customized Plan Hook behavior and want to revert to the default, simply run tecton init --reset-hooks. This will delete the contents of .tecton/ and recreate the default plan.py.example.
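For example:
$ tecton init --reset-hooks
To re-enable Plan Hooks afterwards, rename the regenerated plan.py.example to plan.py as described above.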