Welcome to Tecton

Tecton is the easiest way to transform data into features, and to serve those features to models for training or prediction. We provide a platform that enables:

  • Rapid engineering of production batch, streaming, and on-demand data pipelines using Python and SQL
  • Cataloging and reuse of features across ML projects through our discovery UI and Git-centric tooling
  • Automatic management of data processing for tasks like backfilling, joining features from different sources, and moving features to production databases
  • An industrial-grade online serving API that allows models to read the latest features at low latency

API Reference | How-to Guides | FAQ | What’s New


5-min Tecton Overview

Imagine you are developing a fraud detection model that catches fraudulent transactions. The model needs to access data in two ways:

  1. To quickly make predictions at transaction time, the application must retrieve and compute relevant data for each transaction at low latency.
  2. To train the model, the model trainer must efficiently access relevant data for millions of historical examples.
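To make the two access patterns concrete, here is an illustrative sketch (not Tecton code; all names are hypothetical) of the same feature served both ways: computed in bulk over history for training, and read from a precomputed key-value store at request time.

```python
# Offline path: full transaction history, scanned in bulk to build training rows.
transaction_history = [
    {"user_id": "u1", "timestamp": 1, "amount": 50.0},
    {"user_id": "u1", "timestamp": 2, "amount": 20.0},
    {"user_id": "u2", "timestamp": 3, "amount": 99.0},
]

def training_examples(history):
    """Compute the running transaction count for every historical row (batch-oriented)."""
    counts = {}
    rows = []
    for event in history:
        uid = event["user_id"]
        counts[uid] = counts.get(uid, 0) + 1
        rows.append({"user_id": uid, "user_transaction_counts": counts[uid]})
    return rows

# Online path: a precomputed lookup table, fast enough to hit at transaction time.
online_store = {
    "u1": {"user_transaction_counts": 2},
    "u2": {"user_transaction_counts": 1},
}

def get_online_features(user_id):
    """Fetch the latest feature vector for one user (point lookup)."""
    return online_store[user_id]
```

Keeping these two paths consistent, so the model sees the same feature values in training and in production, is the core problem a feature platform solves.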

Where does Tecton fit in? Tecton orchestrates complex data processing systems with simple APIs so you can train and serve models with one integrated feature platform. Here is how Tecton plugs into your ML Engineering workflow.

Tecton Data Flow

1. Connect to Data

Register batch and streaming data as Data Sources.

# data_sources/transactions_batch.py
transactions_batch = BatchDataSource(
    name='transactions_batch',
    batch_ds_config=HiveDSConfig(
        database='fraud',
        table='fraud_transactions',
        timestamp_column_name='timestamp',
    )
)

2. Write Feature Definitions

Define your features as Python functions using one of Tecton's Feature Views or Feature Tables.

# features/user_transaction_counts.py
@stream_window_aggregate_feature_view(
    inputs={'transactions': Input(transactions_batch)},
    ...
)
def user_transaction_counts(transactions):
    return f'''
        SELECT
            user_id, COUNT(*)
        FROM
            {transactions}
        GROUP BY
            user_id
        '''

3. Define a Feature Service

Group together the features your model needs into a Feature Service.

# feature_services/fraud_detection.py
fraud_detection_feature_service = FeatureService(
    name='fraud_detection_feature_service',
    features=[
        user_transaction_counts,
        ...
    ]
)

4. Train Models

Get model-ready feature data frames from the Feature Service and feed them to your model trainer.

# model_trainer.py
df = tecton.get_historical_data(fraud_detection_feature_service, ...)

model.fit(df)
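Before fitting, the returned frame is typically split into features and a label. A minimal sketch, assuming pandas and a hypothetical `is_fraud` label column (the column names here are illustrative, not part of Tecton's API):

```python
import pandas as pd

# Hypothetical shape of a training frame: one row per historical example,
# feature columns plus a label column.
df = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "user_transaction_counts": [12, 3],
    "is_fraud": [0, 1],
})

# Separate features from the label; join keys like user_id are identifiers,
# not model inputs, so they are dropped as well.
X = df.drop(columns=["user_id", "is_fraud"])
y = df["is_fraud"]
```

With that split, training becomes `model.fit(X, y)` for any scikit-learn-style estimator.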

5. Serve Features to Models in Production

Serve model-ready feature vectors from the Feature Service at low latency, backed by automatically materialized feature values.

$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
     -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "params": {
    "feature_service_name": "fraud_detection_feature_service",
    "join_key_map": {
      "user_id": "$USER_ID_VALUE"
    },
    ...
  }
}'
{
  "result": {
    "features": [
      "3"
    ],
    "metadata": {
      "features": [
        {
          "name": "user_transaction_counts"
        }
      ]
    }
  }
}
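In application code, the request body and response parsing around this endpoint can be sketched as plain functions. The endpoint path and field names mirror the example above; everything else (function names, the client you use to POST) is an assumption:

```python
import json

def build_get_features_payload(feature_service_name, join_keys):
    """Assemble the JSON body sent to /api/v1/feature-service/get-features.

    Illustrative helper, not part of any Tecton SDK.
    """
    return json.dumps({
        "params": {
            "feature_service_name": feature_service_name,
            "join_key_map": join_keys,
        }
    })

def parse_feature_vector(response_body):
    """Pair each value in result.features with its name from result.metadata."""
    result = json.loads(response_body)["result"]
    names = [f["name"] for f in result["metadata"]["features"]]
    return dict(zip(names, result["features"]))
```

Any HTTP client (e.g. `requests`) would then POST the payload with the `Authorization: Tecton-key ...` header shown in the curl example and pass the response body to `parse_feature_vector`.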

With that, you have a single platform serving both training and production ML systems, all from a unified set of feature definitions. Data processing management, feature cataloging, and industrial-grade serving all come out of the box.

Ready to dive in? Get started with the tutorial →

Want an in-depth overview? Check out the guide to Feature Stores →