Version: 1.2

Sources of Data

Tecton lets you build production-grade machine learning features from many kinds of data. Knowing which kinds of sources are available is key to designing robust, scalable feature pipelines. Tecton has three primary kinds of source:

Data Sources
API Resources
Feature Tables

This guide introduces each type, explains their use cases, and provides examples of how to use them in your feature definitions.

Data Sources

Data Sources are the foundational way to bring raw data into Tecton. They represent connections to external storage systems or data streams, such as:

Data warehouses (e.g., Snowflake, Redshift, BigQuery)
Data lakes (e.g., S3, Delta Lake, Hive)
Streaming platforms (e.g., Kafka, Kinesis)
Files (e.g., Parquet, CSV)

Data Sources are used as inputs to Feature Views (Batch or Stream) and are defined using Tecton's configuration classes like BatchSource and StreamSource.

Example:

from tecton import BatchSource, FileConfig

# Batch source backed by a Parquet file in S3.
transactions_batch = BatchSource(
    name="transactions_batch",
    batch_config=FileConfig(
        uri="s3://my-bucket/transactions.parquet",
        file_format="parquet",
        timestamp_field="timestamp",  # column holding each record's event time
    ),
)
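The timestamp_field setting names the column that carries each record's event time. As a rough, Tecton-independent illustration of why a batch source needs one, the sketch below selects the rows of an in-memory dataset that fall inside a time window. The function select_window and the sample records are hypothetical and are not part of Tecton's API; a real source would read from the configured URI instead.

```python
from datetime import datetime, timezone


def select_window(records, timestamp_field, start, end):
    """Return records whose event time lies in the half-open window [start, end)."""
    return [r for r in records if start <= r[timestamp_field] < end]


# Hypothetical sample rows standing in for the contents of transactions.parquet.
records = [
    {"user_id": "u1", "amount": 20.0, "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"user_id": "u2", "amount": 35.5, "timestamp": datetime(2024, 1, 2, tzinfo=timezone.utc)},
    {"user_id": "u1", "amount": 12.0, "timestamp": datetime(2024, 1, 3, tzinfo=timezone.utc)},
]

# Keep only the rows with an event time on January 2 or January 3.
window = select_window(
    records,
    timestamp_field="timestamp",
    start=datetime(2024, 1, 2, tzinfo=timezone.utc),
    end=datetime(2024, 1, 4, tzinfo=timezone.utc),
)
```

Using a half-open window means that consecutive windows never count the same record twice.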

API Resources

API Resources allow Tecton to ingest data from operational sources such as arbitrary APIs or databases. This is especially useful for:

Real-time event ingestion (e.g., user actions, sensor data)
Synchronous feature computation at request time

API Resources are typically used with Push Sources or Request Sources in Tecton, and are often paired with @realtime_feature_view or @stream_feature_view.

Example:

from tecton import RequestSource
from tecton.types import Field, String

request_schema = [Field("user_id", String), Field("item_id", String)]

user_request = RequestSource(schema=request_schema)
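A request schema like the one above declares which fields a caller must supply at request time. The plain-Python sketch below mimics that contract outside of Tecton. REQUEST_SCHEMA and validate_request are hypothetical names, and the check is only an illustration of the idea, not how Tecton itself validates requests.

```python
# Hypothetical stand-in for the request schema: field name -> expected Python type.
REQUEST_SCHEMA = {"user_id": str, "item_id": str}


def validate_request(payload, schema=REQUEST_SCHEMA):
    """Return a list of problems found in payload; an empty list means it conforms."""
    problems = []
    for name, expected_type in schema.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected_type):
            problems.append(f"wrong type for {name}: expected {expected_type.__name__}")
    return problems


ok = validate_request({"user_id": "u1", "item_id": "i9"})  # conforming payload
bad = validate_request({"user_id": 42})  # wrong type and a missing field
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a payload at once.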

Feature Tables

Feature Tables are managed tables within Tecton that store precomputed features. They can be used as sources for new feature views, enabling feature reuse and modularity. Feature Tables are especially useful for:

Sharing features across teams or projects
Decoupling feature computation from feature consumption
Serving features at low latency

Feature Tables can be referenced in new feature views, allowing you to build on top of existing features.

Example:

from tecton import FeatureTable

user_features = FeatureTable(name="user_features", ...)
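To make the decoupling of feature computation from feature consumption concrete, here is a toy in-memory model of a table of precomputed features keyed by entity. ToyFeatureTable and its methods are hypothetical and unrelated to the FeatureTable class; a real Feature Table is managed by Tecton, and this sketch only assumes that consumers want the most recent row per key.

```python
from datetime import datetime, timezone


class ToyFeatureTable:
    """In-memory stand-in that stores the latest precomputed feature row per key."""

    def __init__(self):
        self._rows = {}  # key -> (timestamp, features)

    def ingest(self, key, timestamp, features):
        """Producer side: keep a row only if it is newer than the stored one."""
        current = self._rows.get(key)
        if current is None or timestamp > current[0]:
            self._rows[key] = (timestamp, dict(features))

    def get(self, key):
        """Consumer side: return the latest features for key, or None if absent."""
        row = self._rows.get(key)
        return None if row is None else dict(row[1])


table = ToyFeatureTable()
table.ingest("u1", datetime(2024, 1, 2, tzinfo=timezone.utc), {"txn_count_7d": 5})
# An older row arriving late does not overwrite the newer one.
table.ingest("u1", datetime(2024, 1, 1, tzinfo=timezone.utc), {"txn_count_7d": 3})
latest = table.get("u1")
```

The producer and the consumer share only the table, so either side can change without the other needing to know how the features were computed.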

Summary

Source Type	Typical Use Case	Example Classes
Data Source	Raw data ingestion (batch/stream)	BatchSource, StreamSource
API Resource	Real-time or request-time features	RequestSource, PushConfig
Feature Table	Reusing and serving precomputed features	FeatureTable

What's Next

Define your features: Learn about the types of features you can create in Tecton.
Read more about Feature Views: Learn about the three types of feature views: Batch, Stream, and Realtime.