Data Quality Monitoring (Coming Soon)

Models can only be as good as the data that powers them. Broken data pipelines and data drift are common causes of poor model quality in production. To address this, Tecton provides native data quality monitoring using Great Expectations.

Users can easily enable profiling on any Tecton Feature Service in order to monitor a broad set of statistics for both training and serving data. Then, directly inside of the Tecton Feature Repo, users can define expectations regarding the values and distributions of data being served to models. Tecton will alert on any issues found during training or serving.

Configuring Profiling on a Feature Service

Data profiling generates robust documentation of statistics for a given dataset.

To enable profiling of training data, simply add a DataMonitoringConfig to any Feature Service and configure an alert email as shown below.

In order to profile the data that is served in production, add an OnlineLoggingConfig to the Feature Service. Tecton will log online requests and responses for the Feature Service and run data profiling according to the feature_log_monitoring_schedule.

Tecton will automatically generate data documentation on each scheduled run, which can be found in the web UI on the corresponding Feature Service page. This profiling documentation will also be generated any time a training dataset is saved (see Datasets).

  from tecton import FeatureService, DataMonitoringConfig, OnlineLoggingConfig, GreatExpectationsProfiler

  ctr_prediction_service = FeatureService(
      name='ctr_prediction_service',
      features=[
          ad_ground_truth_ctr_performance_7_days,
          user_partner_impression_count_7_days,
          user_total_ad_frequency_counts,
          ad_group_ctr_performance,
      ],
      # Profile training and logged serving data daily and alert on any issues.
      data_monitoring=DataMonitoringConfig(
          alert_email="matt@tecton.ai",
          profiler=GreatExpectationsProfiler(),
          feature_log_monitoring_schedule="1d",
      ),
      # Log a sample of online requests and responses for profiling.
      online_logging=OnlineLoggingConfig(
          sample_rate=0.8
      )
  )
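
Profiling will also run whenever a training dataset is saved from the Feature Service. Below is a minimal sketch of saving such a dataset, following the same SDK pattern used later on this page (the spine DataFrame and dataset name are hypothetical):

  import tecton

  # Generating and saving a training dataset also triggers profiling documentation.
  training_data = tecton.get_historical_features(
      spine, ctr_prediction_service, save_as="ctr_training_data"
  ).to_pandas()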

Configuring Validations on a Feature Service

To define custom validations for a Feature Service, write a function decorated with great_expectations_validator that contains the validations (also referred to as "expectations") to run, and set it as the validator in the Feature Service's DataMonitoringConfig as shown below.

In this function, you can use any expectation function from Great Expectations' Glossary of Expectations. These expectation functions can check for a wide variety of data quality issues, including missing values, unexpected distributions, and unexpected category values.

  from tecton import FeatureService, DataMonitoringConfig, OnlineLoggingConfig, GreatExpectationsProfiler, great_expectations_validator

  @great_expectations_validator
  def ctr_prediction_service_validations(df: SparkDataset):
      # Put all of your expectations here.
      df.expect_column_min_to_be_between("ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days", 341, 343)
      df.expect_column_mean_to_be_between("ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days", 762, 924)


  ctr_prediction_service = FeatureService(
      name='ctr_prediction_service',
      features=[
          ad_ground_truth_ctr_performance_7_days,
          user_partner_impression_count_7_days,
          user_total_ad_frequency_counts,
          ad_group_ctr_performance,
      ],
      data_monitoring=DataMonitoringConfig(
          alert_email="matt@tecton.ai",
          profiler=GreatExpectationsProfiler(),
          feature_log_monitoring_schedule="1d",
          # Run the custom validations defined above on each monitoring run.
          validator=ctr_prediction_service_validations
      ),
      online_logging=OnlineLoggingConfig(
          sample_rate=0.8
      )
  )
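
Checks for missing values or unexpected category values, as mentioned above, can be expressed with other expectation functions from the same glossary. A brief sketch (the column names and category values are hypothetical):

  @great_expectations_validator
  def ctr_prediction_service_additional_validations(df: SparkDataset):
      # No missing values allowed in this feature column (column name hypothetical).
      df.expect_column_values_to_not_be_null("ad_group_ctr_performance.ad_group_ctr")
      # Category values must come from a known set (column and values hypothetical).
      df.expect_column_values_to_be_in_set(
          "user_total_ad_frequency_counts.device_type", ["mobile", "desktop", "tablet"]
      )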

Iterating on Validations Interactively

Expectations can be iterated on and tested interactively inside of a notebook. Simply fetch a training dataset from the Feature Service (either by generating it on the fly or by fetching a Saved Dataset), or fetch an existing online-logged dataset for the Feature Service (see Datasets).

Then, using the Great Expectations Python SDK, you can interactively run validations as shown below. Once you are satisfied with the expectations, simply copy them to the great_expectations_validator for your Feature Service and run tecton apply to apply your changes to your cluster.

  import tecton
  import great_expectations as ge

  training_data = tecton.get_historical_features(spine, my_feature_service, save_as="my_training_data").to_pandas()
  # or fetch a previously saved dataset
  training_data = tecton.get_dataset("my_training_data").to_pandas()

  # Test expectations interactively
  df = ge.from_pandas(training_data)
  df.expect_column_min_to_be_between("ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days", 341, 343)
  df.expect_column_mean_to_be_between("ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days", 762, 924)
  ... # more expectations
  df.validate() # view JSON validation output and iterate until satisfied

Automatically Creating a Base Suite of Expectations

Great Expectations offers the ability to automatically generate an initial set of expectations. You can use this set to get started with the code below and then fine-tune your expectations as necessary.

  import tecton
  import great_expectations as ge
  from great_expectations.profile import BasicSuiteBuilderProfiler

  training_data = tecton.get_historical_features(spine, my_feature_service, save_as="my_training_data").to_pandas()
  # or fetch a previously saved dataset
  training_data = tecton.get_dataset("my_training_data").to_pandas()

  # Scaffold an initial expectation suite over all columns of the dataset.
  df = ge.from_pandas(training_data)
  scaffold_config = {"included_columns": list(df.columns)}
  suite, evr = BasicSuiteBuilderProfiler().profile(df, profiler_configuration=scaffold_config)

  ge.get_python_expectations(suite)

Monitoring for Data Drift Between Training and Serving

A common cause of poor model quality in production is data drift between training and serving data. Data distributions often change over time in production for natural reasons, such as shifts in user behavior. If the data no longer resembles what the model was trained on, performance will decay. This is a strong sign that the model should be retrained.

Tecton's Data Quality Monitoring capability can be used to guard against this situation. Follow the steps below to properly set this up:

  1. Inside your notebook, generate or fetch the saved training dataset that your production model was trained on.
  2. Iterate on expectations using the example above and set distribution expectations that make sense for your data. Each time you run an expectation interactively, Great Expectations will report the actual value observed in the dataset. You can use these values to set comfortable error bounds on metrics such as min, max, median, mean, and standard deviation (see the sketch after this list).
  3. Add these fitted expectations to the Feature Service that generated the training dataset and is being used to serve online data to the model. Refer to the configuration steps in the sections above.
  4. Tecton will now use these expectations to monitor online data. Because these expectations were fine-tuned to the dataset that the model was trained on, alerts will be generated when metrics such as the average feature value drift outside of your defined error bounds.
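
As referenced in step 2, here is a minimal sketch of fitting distribution expectations around statistics observed in the training data (the observed values and the 10% tolerance are hypothetical; training_data is fetched as shown earlier):

  import great_expectations as ge

  df = ge.from_pandas(training_data)

  # Statistics observed in the training dataset (hypothetical values).
  observed_mean = 843.0
  observed_stdev = 112.0

  # Set comfortable error bounds (+/- 10%) around the observed values. If serving
  # data drifts outside these bounds, Tecton will alert.
  df.expect_column_mean_to_be_between(
      "ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days",
      observed_mean * 0.9, observed_mean * 1.1,
  )
  df.expect_column_stdev_to_be_between(
      "ad_ground_truth_ctr_performance_7_days.ad_total_impressions_7days",
      observed_stdev * 0.9, observed_stdev * 1.1,
  )

Once these fitted bounds look reasonable, copy the expectations into the Feature Service's great_expectations_validator and run tecton apply, as described above.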