API Resources
This feature is currently in Private Preview.
- Must be enabled by Tecton Support.
- Available for Rift-based Feature Views.
- Realtime Feature Views must use Transform Server Groups.
Building powerful features often requires accessing external data. Some examples of such data include:
- External APIs: For enriching features with financial data (Plaid), fraud signals (Socure, Sift), environmental data (OpenWeatherMap), and many more APIs
- AI/ML Services: For leveraging state-of-the-art models from OpenAI and AWS Bedrock to generate embeddings and other ML-powered features
- Operational Data: For querying mission-critical data directly from your production databases, like PostgreSQL, for real-time feature computation
API Resources enable seamless retrieval of such data for Batch and Realtime Feature Views. API Resources optimize both security and performance by maintaining persistent connections to external data sources. By handling the heavy lifting of initialization just once and reusing the established connection across multiple Feature View transformations, API Resources eliminate redundant setup costs and deliver consistently fast data retrieval.
Unlike Batch and Stream Sources, which process large volumes of data from data warehouses or streams on a schedule, API Resources enable on-demand access to small amounts of dynamic data from fast-serving endpoints. API Resources can be combined with Batch or Stream Sources within feature transformations, allowing you to enrich historical data with real-time signals or make conditional API calls based on batch/stream data.
The example below shows how to retrieve embeddings with the OpenAI API using a `@resource_provider` in Batch and Realtime Feature Views.
This reference doc focuses on API Resources and assumes familiarity with other Tecton concepts such as Environments and Secrets.
Define Your Resource Provider
The first step in using API Resources for feature calculation is defining a `@resource_provider` that initializes a stateful client. Tecton will reuse that initialized client across repeated calls from your Feature View. Resource Providers are augmented with `ResourceProviderContext` objects, which give users access to secrets that are consistent across both online and offline query paths. In this example, we're creating a resource provider that instantiates and returns the OpenAI client.
```python
from tecton import resource_provider, Secret


@resource_provider(
    tags={"environment": "staging"},
    owner="tom@tecton.ai",
    secrets={
        "open_ai_key": Secret(scope="openai_embeddings", key="open_ai"),
    },
)
def open_ai_client(context):
    from openai import OpenAI

    # The client is initialized once and reused across transformation calls.
    client = OpenAI(api_key=context.secrets["open_ai_key"])
    return client
```
To use API credentials safely in the client, you'll first need to set up a scope and secret using Tecton Secrets, and then access the secret within the resource function via `context.secrets[<KEY>]`.
By default, resources are limited to making HTTP/HTTPS requests. If you need access beyond HTTP/HTTPS, please file a support ticket.
Use an API Resource in your Batch Feature View
For generating batch features with external data, resources are passed to Batch Feature Views through the `resource_providers` parameter. Batch Feature Views are augmented with `MaterializationContext` objects, which give users access to secrets and resources that are consistent across both online and offline query paths. The resource is then accessed in the Batch Feature View's transformation to make an API call, as shown below.
```python
from datetime import datetime, timedelta

from tecton import batch_feature_view, Entity, Attribute
from tecton.types import Field, String

entity = Entity(name="user_id", join_keys=[Field("user_id", String)])


@batch_feature_view(
    name="bfv",
    mode="python",
    sources=[ds],  # "ds" is a Batch Source defined elsewhere
    entities=[entity],
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2024, 10, 26),
    timestamp_field="timestamp",
    features=[
        # The feature name must match the key returned by the transformation.
        Attribute("embedding", String),
    ],
    resource_providers={
        "openai": open_ai_client,
    },
    environment="openai_env",
    offline=True,
)
def batch_embedding(input_map, context):
    model = "text-embedding-3-small"
    # Reuse the OpenAI client initialized by the resource provider.
    openai = context.resources["openai"]
    response = openai.embeddings.create(input=input_map["text"], model=model)
    return {"embedding": response.data[0].embedding}
```
Ensure that the Feature View's materialization environment has all the packages needed for your resource provider's function body. In this example, the `openai_env` environment must contain the `openai` SDK. For more information about environments, see Environments in Rift.
Use an API Resource in your Realtime Feature View
For generating realtime features with external data, resources are passed to Realtime Feature Views through the `resource_providers` parameter. Realtime Feature Views are augmented with `RealtimeContext` objects, which give users access to resources, secrets, and a `request_timestamp` that are consistent across both online and offline query paths. The resource is then accessed in the Realtime Feature View's transformation to make an API call, as shown below.
```python
from tecton import Attribute, RequestSource, realtime_feature_view
from tecton.types import Field, String

input_request = RequestSource(schema=[Field("input", String)])


@realtime_feature_view(
    sources=[input_request],
    mode="python",
    features=[Attribute("embedding", String)],
    resource_providers={"openai": open_ai_client},
)
def realtime_embedding(input_request, context):
    # Reuse the OpenAI client initialized by the resource provider.
    openai = context.resources["openai"]
    response = openai.embeddings.create(input=input_request["input"], model="text-embedding-ada-002")
    return {"embedding": response.data[0].embedding}
```
Create a Feature Service
To query a Realtime Feature View that uses a Resource Provider, pass the Feature View into a Feature Service. Realtime Feature Views that use an API Resource require that their Feature Service use a Transform Server Group.
Ensure that the Transform Server Group used by this Feature View's Feature Service contains all the packages needed for your resource provider's function body; in this example, the `openai` SDK.
```python
from tecton import FeatureService

open_ai_feature_service = FeatureService(
    name="open_ai_feature_service",
    online_serving_enabled=True,
    transform_server_group="<tsg-reference>",
    features=[realtime_embedding],
)
```
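Once applied, the Feature Service can be queried online. As a rough sketch, a request over Tecton's HTTP API might look like the following; the cluster URL, workspace name, and API key here are placeholders you would replace with your own values:

```python
import requests

# Placeholders: substitute your cluster URL, workspace, and a Service
# Account API key with read access to this workspace.
url = "https://<your-cluster>.tecton.ai/api/v1/feature-service/get-features"
headers = {"Authorization": "Tecton-key <your-api-key>"}
payload = {
    "params": {
        "feature_service_name": "open_ai_feature_service",
        "workspace_name": "<your-workspace>",
        # "input" matches the RequestSource field used by realtime_embedding.
        "request_context_map": {"input": "Example text to embed"},
    }
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json()["result"]["features"])
```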
Best Practices​
Here are some best practices to follow with API Resources:
- Implement Resilient Data Processing with Resources: Use retry logic and batch processing in transformation functions. This is essential for handling high-volume requests and preventing API throttling (a minimal retry sketch follows this list).
- Ensure Thread Safety: For Realtime Feature Views, design resources to be thread-safe. Each resource instance must handle multiple concurrent requests reliably.
- Optimize Latency: Keep transformations (including API call time) in Realtime Feature Views fast and efficient. API latency directly impacts overall Feature Service performance.
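As a rough illustration of the first point, here is a minimal retry helper with exponential backoff that a transformation could call instead of invoking the client directly. The helper name, `max_retries` value, and backoff schedule are assumptions for the sketch, not part of the Tecton API:

```python
import time


def embed_with_retry(openai_client, text, model="text-embedding-3-small", max_retries=3):
    # Hypothetical helper: wraps the embeddings call in retry logic with
    # exponential backoff (1 s, then 2 s, ...). Tune max_retries and the
    # delays to match your provider's rate limits.
    for attempt in range(max_retries):
        try:
            response = openai_client.embeddings.create(input=text, model=model)
            return response.data[0].embedding
        except Exception:
            if attempt == max_retries - 1:
                raise  # Give up after the final attempt.
            time.sleep(2**attempt)
```

Inside a transformation, this would be called as `embed_with_retry(context.resources["openai"], input_map["text"])`, keeping the resource provider itself unchanged.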
Limitations
API Resources currently have the following limitations:
- API Rate Limiting: API Resources do not include built-in rate limiting for external API calls. You must implement rate-limiting logic in your transformation code to prevent API quota exhaustion (a minimal limiter sketch follows this list).
- Resource Initialization: If resource provider initialization fails, Tecton will not automatically retry the initialization. Ensure your implementation includes proper error handling for initialization failures.
- Resource Re-Instantiation: For Realtime Feature Views, resource providers are not automatically reinitialized when a resource encounters failures, such as server-side issues or dropped connections.
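To illustrate the rate-limiting point, a minimal client-side limiter might look like the sketch below. The class and its parameters are hypothetical, not part of Tecton; production code would more likely use a proper token bucket or a provider-aware library:

```python
import threading
import time


class MinIntervalLimiter:
    # Hypothetical sketch: enforces a minimum delay between successive API
    # calls, shared across threads. min_interval=0.1 allows roughly 10
    # calls per second.
    def __init__(self, min_interval=0.1):
        self.min_interval = min_interval
        self.lock = threading.Lock()
        self.last_call = 0.0

    def wait(self):
        # Block until at least min_interval has elapsed since the last call.
        with self.lock:
            elapsed = time.monotonic() - self.last_call
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
            self.last_call = time.monotonic()
```

A transformation would call `limiter.wait()` immediately before each API request. Because the lock is held while sleeping, concurrent requests simply queue behind one another, which is crude but sufficient for a sketch.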