Stream Feature View with Stream Ingest API
This feature is currently in Public Preview.
- Available for Tecton on Databricks and EMR.
- The first 10 million writes through the Stream Ingest API are free for existing Tecton customers. To increase this limit, contact Tecton Support.
The Stream Ingest API is an endpoint to update features in the Tecton Feature Store with sub-second latency. Records sent to the Stream Ingest API are immediately written to the Tecton Feature Store and made available for training and inference.
By using the Stream Ingest API to send data to the Tecton Feature Platform, ML teams can:
- Integrate Tecton with any existing streaming feature pipeline without migrating feature code. The Stream Ingest API gives teams the data management, serving, governance, and monitoring benefits of Tecton’s Feature Platform on top of their existing feature pipelines, without rewriting any feature code. Feature pipelines that already work do not need to be migrated to adopt an enterprise feature platform, making it faster and easier for ML and DS teams to get their features centrally managed for trusted and reliable training and serving.
- Easily build powerful streaming features on event data using Python and performant aggregations: Tecton’s Serverless Python and Aggregations Engines enable Data Scientists and ML Engineers to author and manage transformations in familiar Python, allowing you to skip the complicated code and heavy stream processing infrastructure required by other solutions.
- Bring read-after-write consistency to your feature infrastructure: The Stream Ingest API can block until input data has been fully processed and corresponding features are updated, making it easy for your application to push event data to the feature platform and quickly retrieve up-to-date feature vectors — something very useful for event-driven decisioning applications like loan approvals and fraud monitoring.
Push Source
A Push Source is a Data Source that represents data ingested into the Tecton Feature Platform through records pushed to the Stream Ingest API. Push Sources require an explicitly defined schema, and records pushed to the Stream Ingest API must conform to that schema. A Push Source may optionally be backed by a Batch Source, which is used to backfill data into the online and/or offline store.
Stream Feature View
A Stream Feature View is a Feature View backed by a Stream or Push Source. When a Stream Feature View is backed by a Push Source, it is updated in real time as records are pushed to the Stream Ingest API. A Stream Feature View can optionally define a transformation, which is applied to the data before it is written to the online and/or offline store. If the backing Push Source has a Batch Source defined, the transformation is also applied to the batch data before it is written to the online and/or offline store.
Using the Stream Ingest API
Creating a simple Stream Feature View with a Push Source
To write features with the Stream Ingest API, first create a Push Source. Then, define a Stream Feature View using that Push Source as the data source.
The following example declares a Push Source object. The schema fields shown here (`user_id`, `timestamp`, `clicked`) are illustrative.

```python
from tecton import PushSource, HiveConfig
from tecton.types import String, Int64, Timestamp, Field

# Example schema; field names are illustrative.
input_schema = [
    Field(name="user_id", dtype=String),
    Field(name="timestamp", dtype=Timestamp),
    Field(name="clicked", dtype=Int64),
]

impressions_event_source = PushSource(
    name="impressions_event_source",
    schema=input_schema,
    description="A push source for synchronous, online ingestion of ad-click events with user info.",
)
```

- `schema`: The schema of records that will be sent to the Stream Ingest API. Tecton uses this schema to validate the JSON payload. See the Data Types documentation for the list of supported types.
You can then create a Stream Feature View with the Push Source you created. In this example, we want to simply serve the records sent to the Stream Ingest API as-is, so no transformation is defined.
```python
from datetime import datetime, timedelta

from tecton import StreamFeatureView
from ads.entities import user
from ads.data_sources.ad_impressions import impressions_event_source

# Parameter values below (entities, ttl, online/offline flags) are illustrative.
click_events_fv = StreamFeatureView(
    name="click_events_fv",
    source=impressions_event_source,
    entities=[user],
    online=True,
    offline=True,
    feature_start_time=datetime(2022, 1, 1),
    ttl=timedelta(days=7),
    description="The count of ad clicks for a user",
)
```

- When the parameter `online` is set to `True`, records sent to the Stream Ingest API are immediately written to the Online Store.
- When the parameter `offline` is set to `True`, records sent to the Stream Ingest API are asynchronously written to the Offline Store.
Once applied successfully, the Push Source and Stream Feature View are ready to receive records from the Stream Ingest API. There may be a delay of a few seconds after `tecton apply` before the API can accept records for the Push Source.
Authenticating requests with an API key
To authenticate your requests to the Stream Ingest API, you will need to create a Service Account, and grant that Service Account the Editor role for your workspace:
- Create the Service Account to obtain your API key.
```shell
tecton service-account create \
  --name "stream-ingest-service-account" \
  --description "A Stream Ingest API sample"
```

Output:

```
Save this API Key - you will not be able to get it again.
API Key: <Your-api-key>
Service Account ID: <Your-Service-Account-Id>
```
- Assign the Editor role to that Service Account.
```shell
tecton access-control assign-role --role editor \
  --workspace <Your-workspace> \
  --service-account <Your-Service-Account-Id>
```

Output:

```
Successfully updated role.
```
- Export the API key as an environment variable named `TECTON_API_KEY`, or add the key to your secret manager.
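For local testing, the key can be exported in the shell before calling the API. The key value below is a placeholder, not a real credential:

```shell
# Placeholder value - substitute the key printed by `tecton service-account create`.
export TECTON_API_KEY="my-secret-api-key"

# Confirm the variable is set without printing the secret itself.
[ -n "$TECTON_API_KEY" ] && echo "TECTON_API_KEY is set"
```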
Sending Data to the Stream Ingest API
Note: The following example uses cURL as the HTTP client and can be executed from the command line, but the HTTP call is the same for any client.
The Stream Ingest API accepts data in JSON format via an HTTP API. To ingest events into Tecton, send a POST request to the `/ingest` endpoint.
The example below shows a cURL query that sends a batch of 2 records to the Stream Ingest API. In this example:
- `workspace_name`: The name of the workspace where the Push Source(s) are defined.
- `dry_run`: When set to `True`, the request is validated but no events are written to the Online Store. This parameter can be used for debugging and/or verifying the correctness of the request.
- `records`: A map of Push Source names to an array of records to be ingested. Note that a single Stream Ingest API request can carry a batch of records for multiple sources.
The JSON body below is an example; the field values are illustrative.

```shell
$ curl -X POST https://preview.<your_cluster>.tecton.ai/ingest \
  -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "workspace_name": "<Your-workspace>",
  "dry_run": false,
  "records": {
    "impressions_event_source": [
      {"record": {"user_id": "user_1", "timestamp": "2023-01-04T00:25:06Z", "clicked": 1}},
      {"record": {"user_id": "user_2", "timestamp": "2023-01-04T00:25:07Z", "clicked": 0}}
    ]
  }
}'
```
A successful response from the Stream Ingest API includes metrics on the number of records ingested per Feature View. If there are multiple Stream Feature Views using the same Push Source, any record sent to the Stream Ingest API for the Push Source will be written to each dependent Feature View.
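If you are calling the API from Python rather than cURL, the same request body can be assembled with the standard library. This is a minimal sketch: the helper name and the record fields are illustrative, and the payload shape follows the `workspace_name` / `dry_run` / `records` structure described above.

```python
import json


def build_ingest_payload(workspace_name, source_name, records, dry_run=False):
    """Assemble a Stream Ingest API request body for a single Push Source."""
    return {
        "workspace_name": workspace_name,
        "dry_run": dry_run,
        "records": {source_name: [{"record": r} for r in records]},
    }


payload = build_ingest_payload(
    "my-workspace",
    "impressions_event_source",
    [{"user_id": "user_1", "timestamp": "2023-01-04T00:25:06Z", "clicked": 1}],
    dry_run=True,  # validate without writing to the Online Store
)

# Sending the request requires a reachable cluster and TECTON_API_KEY, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "https://preview.<your_cluster>.tecton.ai/ingest",
#     data=json.dumps(payload).encode(),
#     headers={"Authorization": "Tecton-key <key>", "Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)

print(json.dumps(payload, indent=2))
```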
Reading Ingested Records from the Online Store
To read ingested data for Feature Views, you must define a Feature Service that includes the Feature View with online serving enabled, as shown below.
```python
from tecton import FeatureService
from ads.features.stream_features.click_events_fv import click_events_fv

# The service name and features list are reconstructed from context.
ad_ctr_feature_service = FeatureService(
    name="ad_ctr_feature_service",
    features=[click_events_fv],
    description="A FeatureService providing features for a model that predicts if a user will click an ad.",
)
```
Once applied successfully, you can then retrieve features from the online store using the FeatureService HTTP API. The JSON body below is an example; the join key value is illustrative.

```shell
$ curl -X POST https://<your_cluster>.tecton.ai/api/v1/feature-service/get-features \
  -H "Authorization: Tecton-key $TECTON_API_KEY" -d \
'{
  "params": {
    "workspace_name": "<Your-workspace>",
    "feature_service_name": "ad_ctr_feature_service",
    "join_key_map": {"user_id": "user_1"}
  }
}'
```
Reading Ingested Records from the Offline Store
Stream Ingest API Error Handling
An error response from the Stream Ingest API contains either a `RequestError` or a `RecordError`, depending on the type of error encountered. Request Errors include errors such as a missing required field in the request, an incorrectly formatted request, or an invalid workspace name:

```json
{
  "errorMessage": "Required field `workspace_name` is missing in the request"
}
```

Record Errors include any error encountered when processing the batch of records in the request. Some examples are a missing event timestamp column, an invalid timestamp format, or an event timestamp outside of the serving TTL for the Feature View:

```json
{
  "errorMessage": "timestamp value is outside of the serving TTL defined for Feature View keyword_push_fv"
}
```
HTTP Response Status Codes
Here is a list of the possible HTTP response codes:

| Status Code | Description |
| --- | --- |
| 200 OK | Success. Features have been ingested into the online store. |
| 400 Bad Request | Incorrectly formatted request body (JSON syntax errors); missing required fields (e.g. `workspace_name`, `records`); schema mismatch between a pushed record and the schema declared in the Push Source definition. |
| 401 Unauthorized | Missing or invalid API key. |
| 403 Forbidden | The Service Account associated with the API key is not authorized to perform a write operation for the workspace. |
| 404 Not Found | The requested URL or resource was not found. |
| 500 Internal Server Error | An internal error occurred. Please contact Tecton Support for assistance. |
Stream Ingest API Expectations
- If `dry_run` is set to `False` in the Stream Ingest API request, the workspace must be a live workspace with at least one Stream Feature View subscribed to the corresponding Push Source in the request, with online writes enabled (`online=True`).
- The schema of each record in the request must match the schema of the corresponding Push Source, as defined during `tecton apply`.
- The event timestamp in the record must be within the serving TTL defined for the Stream Feature View.
- Fields with type `Timestamp` in the Push Source schema must be provided as an RFC 3339 formatted string in the JSON payload. If using Java, the `DateTimeFormatter.ISO_OFFSET_DATE_TIME` formatter produces a correctly formatted string from a `DateTime` instance. If using Python, `datetime.isoformat()` produces a correctly formatted string from a timezone-aware `datetime` instance. For example:

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc)
print(now.isoformat())  # e.g. "2023-01-04T00:25:06.123456+00:00"
```
Stream Ingest API Service Level Objectives
The Service Level Objective (SLO) for the Stream Ingest API at
https://preview.<your_cluster>.tecton.ai/ingest is 99%, measured over the
trailing 30 days.
This SLO measurement does not include requests with a 4xx response code. Note that during sudden increases in request volume, there may be an increase in 429 response codes until the service has scaled to the appropriate capacity.
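Since transient 429 responses are expected while the service scales up, clients may want to retry with backoff. This is a generic client-side sketch, not a Tecton SDK feature; the attempt count and delays are arbitrary choices:

```python
import time


def send_with_backoff(send_fn, max_attempts=5, base_delay=0.5):
    """Call send_fn(); retry on HTTP 429 with exponential backoff.

    send_fn is any callable returning an HTTP status code (int).
    Returns the final status code.
    """
    for attempt in range(max_attempts):
        status = send_fn()
        if status != 429:
            return status
        # Exponential backoff: base_delay, 2x, 4x, ... between attempts.
        time.sleep(base_delay * (2 ** attempt))
    return status


# Example with a fake sender that rate-limits the first two calls.
responses = iter([429, 429, 200])
result = send_with_backoff(lambda: next(responses), base_delay=0.0)
print(result)  # 200
```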