API Resources
This feature is currently in Private Preview.
- Must be enabled by Tecton Support.
- Available for Rift-based Feature Views.
- Realtime Feature Views must use Transform Server Groups.
Building powerful features often requires accessing external data. Some examples of such data include:
- External APIs: For enriching features with financial data (Plaid), fraud signals (Socure, Sift), environmental data (OpenWeatherMap), and many more APIs
- AI/ML Services: For leveraging state-of-the-art models from OpenAI and AWS Bedrock to generate embeddings and other ML-powered features
- Operational Data: Query mission-critical data directly from your production databases like PostgreSQL for real-time feature computation
API Resources enable retrieval of such data seamlessly for Batch and Realtime Feature Views. API Resources optimize both security and performance by maintaining persistent connections to external data sources. By handling the heavy lifting of initialization just once and reusing the established connection across multiple Feature View transformations, API Resources eliminate redundant setup costs and deliver consistently fast data retrieval.
The example below shows how to retrieve embeddings with the OpenAI API using a `@resource_provider` in Batch and Realtime Feature Views.
This reference doc focuses on API Resources and assumes familiarity with other Tecton concepts such as Environments and Secrets.
Define Your Resource Provider​
The first step in using API Resources for feature calculation is defining a `@resource_provider` that initializes a client. In this example, we create a resource provider that instantiates and returns the OpenAI client.
```python
@resource_provider(
    tags={"environment": "staging"},
    owner="tom@tecton.ai",
    secrets={
        "open_ai_key": Secret(scope="openai_embeddings", key="open_ai"),
    },
)
def open_ai_client(context):
    from openai import OpenAI

    client = OpenAI(api_key=context.secrets["open_ai_key"])
    return client
```
To use API credentials safely in the client, first set up a scope and secret using Tecton Secrets, then read the secret within the resource function using `context.secrets[<KEY>]`.
Resource providers are limited to HTTP/HTTPS access. If you'd like access beyond HTTP/HTTPS, please file a support ticket.
Create a Batch Feature View​
For generating batch features with external data, resources are passed to Batch Feature Views through the `resource_providers` parameter. The resource is then accessed in the transformation of the Batch Feature View to make an API call, as shown below.
```python
entity = Entity(name="user_id", join_keys=[Field("user_id", String)])


@batch_feature_view(
    name="bfv",
    mode="python",
    sources=[ds],
    entities=[entity],
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2024, 10, 26),
    timestamp_field="timestamp",
    features=[
        Attribute("embedding", String),
    ],
    resource_providers={
        "openai": open_ai_client,
    },
    environment="openai_env",
    offline=True,
)
def batch_embedding(input_map, context):
    model = "text-embedding-3-small"
    openai = context.resources["openai"]
    response = openai.embeddings.create(input=input_map["text"], model=model)
    return {"embedding": response.data[0].embedding}
```
Ensure that the Feature View materialization environment has all the packages needed for your resource provider's function body. In this example, the `openai_env` environment must contain the `openai` SDK. See more information about environments: Environments in Rift.
Create a Realtime Feature View​
For generating realtime features with external data, resources are passed to Realtime Feature Views through the `resource_providers` parameter. The resource is then accessed in the transformation of the Realtime Feature View to make an API call, as shown below.
```python
input_request = RequestSource(schema=[Field("input", String)])


@realtime_feature_view(
    sources=[input_request],
    mode="python",
    features=[Attribute("embedding", String)],
    resource_providers={"openai": open_ai_client},
)
def realtime_embedding(input_request, context):
    openai = context.resources["openai"]
    response = openai.embeddings.create(
        input=input_request["input"], model="text-embedding-ada-002"
    )
    return {"embedding": response.data[0].embedding}
```
Realtime Feature Views that use an API Resource require that the Feature View's Feature Service uses a Transform Server Group.
Ensure the Transform Server Group used for this Feature View's Feature Service contains all the packages needed for your resource provider's function body; in this example, the `openai` SDK.
Best Practices​
Here are some best practices to follow with API Resources:
- Implement Resilient Data Processing with Resources: Use retry logic and batch processing in transformation functions. This is essential for handling high-volume requests and preventing API throttling.
- Ensure Thread Safety: For Realtime Feature Views, design resources to be thread-safe. Each resource instance must handle multiple concurrent requests reliably.
- Optimize Latency: Keep transformations (including API call time) in Realtime Feature Views fast and efficient. API latency directly impacts overall Feature Service performance.
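As a concrete illustration of the first practice, the sketch below wraps an API call in retry logic with exponential backoff and jitter. It is a minimal, framework-agnostic example; `call_api` and its failure behavior are hypothetical stand-ins for your external API call, not part of the Tecton API:

```python
import random
import time


def with_retries(fn, max_attempts=3, base_delay=0.05):
    """Call fn(); on transient failure, retry with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Sleep base_delay * 2^(attempt-1), plus jitter to avoid retry storms.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.01))


# Hypothetical usage inside a transformation: the call fails twice, then succeeds.
calls = {"count": 0}

def call_api():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "embedding-vector"

result = with_retries(call_api)
```

In a real transformation, you would also restrict the `except` clause to the retryable error types your API client raises, so that permanent errors (e.g. invalid credentials) fail fast instead of being retried.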
Limitations​
API Resources currently have the following limitations:
- API Rate Limiting: API Resources do not include built-in rate limiting for external API calls. You must implement rate limiting logic in your transformation code to prevent API quota exhaustion.
- Resource Initialization: If resource provider initialization fails, Tecton will not automatically retry the initialization. Ensure your implementation includes proper error handling for initialization failures.
- Resource Re-Instantiation: For Realtime Feature Views, resource providers are not automatically reinitialized when a resource encounters failures, such as server-side issues or dropped connections.
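Because rate limiting is left to your transformation code (the first limitation above), one common client-side approach is a token bucket. The sketch below is a hypothetical, single-process example under that assumption, not a built-in Tecton feature:

```python
import time


class TokenBucket:
    """Allow up to `rate` calls per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self):
        """Block until a token is available, then consume it."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)


bucket = TokenBucket(rate=50, capacity=5)  # ~50 requests/sec, bursts of 5
start = time.monotonic()
for _ in range(10):
    bucket.acquire()  # make the external API call after each acquire
elapsed = time.monotonic() - start  # first 5 pass immediately; the rest are throttled
```

Note that this limits a single process only; if many transform workers share one API quota, size the per-worker rate accordingly.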