API Resources
This feature is currently in Private Preview.
- Must be enabled by Tecton Support.
- Available for Rift-based Feature Views.
- Realtime Feature Views must use Transform Server Groups.
Building powerful features often requires accessing external data. Some examples of such data include:
- External APIs: For enriching features with financial data (Plaid), fraud signals (Socure, Sift), environmental data (OpenWeatherMap), and many more APIs
- AI/ML Services: For leveraging state-of-the-art models from OpenAI and AWS Bedrock to generate embeddings and other ML-powered features
- Operational Data: For querying mission-critical data directly from your production databases, like PostgreSQL, for real-time feature computation
API Resources enable seamless retrieval of such data for Batch and Realtime Feature Views. API Resources optimize both security and performance by maintaining persistent connections to external data sources. By handling the heavy lifting of initialization just once and reusing the established connection across multiple Feature View transformations, API Resources eliminate redundant setup costs and deliver consistently fast data retrieval.
Unlike Batch and Stream Sources, which process large volumes of data from data warehouses or streams on a schedule, API Resources enable on-demand access to small amounts of dynamic data from fast-serving endpoints. API Resources can be combined with Batch or Stream Sources within feature transformations, allowing you to enrich historical data with real-time signals or make conditional API calls based on batch/stream data.
The example below shows how to retrieve embeddings with the OpenAI API using a `@resource_provider` in Batch and Realtime Feature Views.
This reference doc focuses on API Resources and assumes familiarity with other Tecton concepts such as Environments and Secrets.
Define Your Resource Provider
The first step in using API Resources for feature calculation is defining a `@resource_provider` that initializes a stateful client. Tecton will reuse that initialized client across repeated calls from your Feature View. Resource Providers are augmented with `ResourceProviderContext` objects, which give users access to secrets that are consistent across both online and offline query paths. In this example, we're creating a resource provider that instantiates and returns the OpenAI client.
```python
from tecton import resource_provider, Secret


@resource_provider(
    tags={"environment": "staging"},
    owner="tom@tecton.ai",
    secrets={
        "open_ai_key": Secret(scope="openai_embeddings", key="open_ai"),
    },
)
def open_ai_client(context):
    from openai import OpenAI

    # The client is initialized once and reused across transformation calls.
    client = OpenAI(api_key=context.secrets["open_ai_key"])
    return client
```
To use API credentials safely in the client, you'll first need to set up a scope and secret using Tecton Secrets, and then access the secret within the resource function via `context.secrets[<KEY>]`.
By default, resources are limited to making HTTP/HTTPS requests. If you need access beyond HTTP/HTTPS, please file a support ticket.
Use an API Resource in your Batch Feature View
For generating batch features with external data, resources are passed to Batch Feature Views through the `resource_providers` parameter. Batch Feature Views are augmented with `MaterializationContext` objects, which give users access to secrets and resources that are consistent across both online and offline query paths. The resource is then accessed in the Batch Feature View's transformation to make an API call, as shown below.
```python
from datetime import datetime, timedelta

from tecton import batch_feature_view, Entity, Attribute
from tecton.types import Field, String

entity = Entity(name="user_id", join_keys=[Field("user_id", String)])


@batch_feature_view(
    name="bfv",
    mode="python",
    sources=[ds],  # "ds" is a Batch Source defined elsewhere
    entities=[entity],
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2024, 10, 26),
    timestamp_field="timestamp",
    features=[
        # The feature name must match the key returned by the transformation.
        Attribute("embedding", String),
    ],
    resource_providers={
        "openai": open_ai_client,
    },
    environment="openai_env",
    offline=True,
)
def batch_embedding(input_map, context):
    model = "text-embedding-3-small"
    # Reuse the OpenAI client initialized by the resource provider.
    openai = context.resources["openai"]
    response = openai.embeddings.create(input=input_map["text"], model=model)
    return {"embedding": response.data[0].embedding}
```
Ensure that the Feature View's materialization environment has all the packages needed for your resource provider's function body. In this example, the `openai_env` environment must contain the `openai` SDK. For more information about environments, see Environments in Rift.
Use an API Resource in your Realtime Feature View
For generating realtime features with external data, resources are passed to Realtime Feature Views through the `resource_providers` parameter. Realtime Feature Views are augmented with `RealtimeContext` objects, which give users access to resources, secrets, and a `request_timestamp` that are consistent across both online and offline query paths. The resource is then accessed in the Realtime Feature View's transformation to make an API call, as shown below.
```python
from tecton import Attribute, RequestSource, realtime_feature_view
from tecton.types import Field, String

input_request = RequestSource(schema=[Field("input", String)])


@realtime_feature_view(
    sources=[input_request],
    mode="python",
    features=[Attribute("embedding", String)],
    resource_providers={"openai": open_ai_client},
)
def realtime_embedding(input_request, context):
    # Reuse the OpenAI client initialized by the resource provider.
    openai = context.resources["openai"]
    response = openai.embeddings.create(input=input_request["input"], model="text-embedding-ada-002")
    return {"embedding": response.data[0].embedding}
```
Create a Feature Service
To query a Realtime Feature View that uses a Resource Provider, pass the Feature View into a Feature Service. Realtime Feature Views that use an API Resource require that their Feature Service use a Transform Server Group.
Ensure that the Transform Server Group used by this Feature View's Feature Service contains all the packages needed for your resource provider's function body; in this example, the `openai` SDK.
```python
from tecton import FeatureService

open_ai_feature_service = FeatureService(
    name="open_ai_feature_service",
    online_serving_enabled=True,
    transform_server_group="<tsg-reference>",
    features=[realtime_embedding],
)
```
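Once applied, the Feature Service can be queried online. As a rough sketch, a request over Tecton's HTTP API might look like the following; the cluster URL, workspace name, and API key here are placeholders you would replace with your own values:

```python
import requests

# Placeholders: substitute your cluster URL, workspace, and a Service
# Account API key with read access to this workspace.
url = "https://<your-cluster>.tecton.ai/api/v1/feature-service/get-features"
headers = {"Authorization": "Tecton-key <your-api-key>"}
payload = {
    "params": {
        "feature_service_name": "open_ai_feature_service",
        "workspace_name": "<your-workspace>",
        # "input" matches the RequestSource field used by realtime_embedding.
        "request_context_map": {"input": "Example text to embed"},
    }
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json()["result"]["features"])
```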
Best Practices​
Here are some best practices to follow with API Resources:
- Implement Resilient Data Processing with Resources: Use retry logic and batch processing in transformation functions. This is essential for handling high-volume requests and preventing API throttling (a minimal retry sketch follows this list).
- Ensure Thread Safety: For Realtime Feature Views, design resources to be thread-safe. Each resource instance must handle multiple concurrent requests reliably.
- Optimize Latency: Keep transformations (including API call time) in Realtime Feature Views fast and efficient. API latency directly impacts overall Feature Service performance.
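As a rough illustration of the first point, here is a minimal retry helper with exponential backoff that a transformation could call instead of invoking the client directly. The helper name, `max_retries` value, and backoff schedule are assumptions for the sketch, not part of the Tecton API:

```python
import time


def embed_with_retry(openai_client, text, model="text-embedding-3-small", max_retries=3):
    # Hypothetical helper: wraps the embeddings call in retry logic with
    # exponential backoff (1 s, then 2 s, ...). Tune max_retries and the
    # delays to match your provider's rate limits.
    for attempt in range(max_retries):
        try:
            response = openai_client.embeddings.create(input=text, model=model)
            return response.data[0].embedding
        except Exception:
            if attempt == max_retries - 1:
                raise  # Give up after the final attempt.
            time.sleep(2**attempt)
```

Inside a transformation, this would be called as `embed_with_retry(context.resources["openai"], input_map["text"])`, keeping the resource provider itself unchanged.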
Limitations
API Resources currently have the following limitations:
- API Rate Limiting: API Resources do not include built-in rate limiting for external API calls. You must implement rate-limiting logic in your transformation code to prevent API quota exhaustion (a minimal limiter sketch follows this list).
- Resource Initialization: If resource provider initialization fails, Tecton will not automatically retry the initialization. Ensure your implementation includes proper error handling for initialization failures.
- Resource Re-Instantiation: For Realtime Feature Views, resource providers are not automatically reinitialized when a resource encounters failures, such as server-side issues or dropped connections.
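To illustrate the rate-limiting point, a minimal client-side limiter might look like the sketch below. The class and its parameters are hypothetical, not part of Tecton; production code would more likely use a proper token bucket or a provider-aware library:

```python
import threading
import time


class MinIntervalLimiter:
    # Hypothetical sketch: enforces a minimum delay between successive API
    # calls, shared across threads. min_interval=0.1 allows roughly 10
    # calls per second.
    def __init__(self, min_interval=0.1):
        self.min_interval = min_interval
        self.lock = threading.Lock()
        self.last_call = 0.0

    def wait(self):
        # Block until at least min_interval has elapsed since the last call.
        with self.lock:
            elapsed = time.monotonic() - self.last_call
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
            self.last_call = time.monotonic()
```

A transformation would call `limiter.wait()` immediately before each API request. Because the lock is held while sleeping, concurrent requests simply queue behind one another, which is crude but sufficient for a sketch.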