Glossary
A
Aggregation Engine
A distributed computation framework built into Tecton that enables efficient
creation and serving of time-windowed aggregation features. It handles complex
calculations over time windows while ensuring online/offline consistency and
optimizing for performance.
Attribute Feature
A simple feature type in Tecton that represents a direct value from a column in
transformed data. Attribute features are used for storing and serving individual
data points like user properties or metadata.
B
Backfill
The process of computing historical feature values for a feature view, ensuring
data completeness before serving features in production. This is typically done
in batch to populate past timestamps.
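As a rough illustration, a backfill can be pictured as splitting a historical
range into schedule-sized windows and running one batch job per window. The
sketch below is plain Python, not Tecton's API; the function name
backfill_windows and the dates are hypothetical.

```python
from datetime import datetime, timedelta


def backfill_windows(start, end, batch_schedule):
    """Yield (window_start, window_end) pairs that cover [start, end)."""
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(cursor + batch_schedule, end)
        yield cursor, window_end
        cursor = window_end


# Hypothetical range: three days of history with a daily schedule.
windows = list(
    backfill_windows(datetime(2024, 1, 1), datetime(2024, 1, 4), timedelta(days=1))
)
for window_start, window_end in windows:
    print(window_start.date(), "->", window_end.date())
```

Each window would correspond to one batch job that populates feature values for
that slice of the past.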
Batch Feature View
A type of Feature View that transforms data from batch sources (like data
warehouses or data lakes) on a schedule. Batch Feature Views pre-compute feature
values and store them in Tecton's online and offline stores for later retrieval.
Batch Source
A Data Source that specifies how to connect to and read from a batch data
repository like Snowflake, BigQuery, S3, or a Hive table.
Batch Schedule
A configuration parameter that defines how frequently the offline
materialization jobs of a Batch Feature View or Stream Feature View run,
typically specified as a time interval (e.g., timedelta(days=1)).
C
Cache
A temporary storage layer that speeds up feature retrieval by storing recently
accessed data, reducing the need for repeated computations or database queries.
In Tecton, caching improves low-latency access to online features.
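The general idea can be sketched as a read-through cache with a maximum entry
age. This is a generic illustration in plain Python and does not describe how
Tecton implements caching; ReadThroughCache, slow_lookup, and the feature name
are hypothetical.

```python
import time


class ReadThroughCache:
    """Serve recent values from memory; fall back to `fetch` on a miss."""

    def __init__(self, fetch, max_age_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, stored_at)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] < self._max_age:
            return hit[0]  # fresh enough: skip the expensive lookup
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value


calls = []


def slow_lookup(key):
    # Stand-in for a database query or an on-demand computation.
    calls.append(key)
    return {"user_id": key, "txn_count_7d": 3}


cache = ReadThroughCache(slow_lookup, max_age_seconds=60)
first = cache.get("u1")   # miss: calls slow_lookup
second = cache.get("u1")  # hit: served from memory
```

The second read returns the stored value without repeating the lookup, which is
where the latency saving comes from.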
Cluster
A group of computing resources (e.g., Spark, Databricks, or Kubernetes nodes)
used to process, store, or serve feature data efficiently.
Compaction
A process that optimizes stored feature data by reducing redundancy and merging
smaller data segments, improving performance and reducing storage costs.
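A minimal sketch of the idea, assuming segments are lists of
(entity, timestamp, value) rows and that a later segment overrides an earlier
one for the same entity and timestamp. Both assumptions are illustrative and
are not taken from Tecton's storage format.

```python
def compact(segments):
    """Merge small segments into one sorted segment without duplicate rows."""
    merged = {}
    for segment in segments:  # later segments overwrite earlier ones
        for entity, ts, value in segment:
            merged[(entity, ts)] = value
    return sorted((entity, ts, value) for (entity, ts), value in merged.items())


segments = [
    [("u1", 1, 10.0), ("u2", 1, 5.0)],
    [("u1", 2, 12.0)],
    [("u1", 1, 11.0)],  # rewrites the value for ("u1", 1)
]
compacted = compact(segments)
```

Three small segments with four rows become a single segment with three rows.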
Compute Engine
The processing system responsible for executing feature transformations and
aggregations. This can include batch processing frameworks (e.g., Spark, Rift)
or real-time streaming engines, depending on the feature pipeline
requirements.
Control Plane
The component of a system responsible for managing configurations,
orchestration, and metadata, ensuring proper coordination of data and compute
operations. In Tecton, the control plane defines feature views, materialization
schedules, and infrastructure settings.
D
Data Plane
The layer responsible for executing data processing tasks, including feature
transformation, storage, and retrieval. It handles the movement and computation
of data based on configurations from the control plane.
Data Source
An object that defines how Tecton connects to and reads from external data
systems. Data Sources abstract away connection details and ensure consistent
interpretation of data both online and offline.
F
Feature
A measurable property or attribute of data used as input for machine learning
models. Features are derived from raw data and can be transformed, aggregated,
or stored for inference.
Feature Table
A structured storage format that holds computed feature values, indexed by
entity keys and timestamps. Feature tables enable efficient retrieval of feature
data for training and inference.
Feature View
The core building block in Tecton that defines transformations to convert raw
data into features. Feature Views encapsulate logic for computing features
consistently in both online and offline environments.
Freshness
The time delay between when raw data is generated and when the corresponding
feature value is available for model inference. A shorter delay means fresher
features, which generally improves real-time predictions.
M
Model
A machine learning algorithm trained on features to make predictions or
classifications based on new input data.
P
Pipeline
A sequence of processes that transform raw data into features, including
ingestion, transformation, materialization, and serving for model training or
inference.
S
Skew/Drift
The deviation between training and inference feature distributions (skew) or
gradual changes in data patterns over time (drift), which can degrade model
performance.
Streaming
A data processing paradigm where data is continuously ingested and transformed
in real time, enabling low-latency feature updates for online predictions.
T
Tile
A precomputed, time-bucketed feature aggregation unit used to optimize storage
and retrieval efficiency in Tecton's feature computation framework.
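To make the idea concrete, the sketch below buckets raw events into hourly
tiles of partial aggregates and then answers a windowed sum by combining tiles
instead of rescanning events. It is a simplified illustration in plain Python,
not Tecton's implementation; the one-hour tile size, the function names, and
the tile-aligned window are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

TILE = timedelta(hours=1)   # assumed tile size
EPOCH = datetime(1970, 1, 1)


def tile_start(ts):
    """Round a timestamp down to the start of its tile."""
    return ts - (ts - EPOCH) % TILE


def build_tiles(events):
    """Pre-aggregate (timestamp, amount) events into per-tile sums and counts."""
    tiles = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for ts, amount in events:
        tile = tiles[tile_start(ts)]
        tile["sum"] += amount
        tile["count"] += 1
    return dict(tiles)


def window_sum(tiles, window_end, window):
    """Sum over [window_end - window, window_end), assuming tile-aligned bounds."""
    window_start = window_end - window
    return sum(t["sum"] for start, t in tiles.items() if window_start <= start < window_end)


events = [
    (datetime(2024, 1, 1, 9, 15), 10.0),
    (datetime(2024, 1, 1, 9, 45), 5.0),
    (datetime(2024, 1, 1, 10, 30), 2.5),
    (datetime(2024, 1, 1, 12, 5), 4.0),
]
tiles = build_tiles(events)
total = window_sum(tiles, datetime(2024, 1, 1, 12, 0), timedelta(hours=3))
```

Four events collapse into three tiles, and the three-hour window ending at
12:00 is served by adding two precomputed sums.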
Time Travel
The ability to query historical feature values at specific timestamps, allowing
models to reconstruct past feature states for training and debugging.
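Conceptually, this is a point-in-time lookup: for a given timestamp, return the
most recent value written at or before it. The sketch below shows that lookup
in plain Python with the standard bisect module; value_as_of and the sample
history are hypothetical, and this is not Tecton's retrieval API.

```python
from bisect import bisect_right


def value_as_of(history, ts):
    """Return the latest value with timestamp <= ts, or None if there is none.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [t for t, _ in history]
    index = bisect_right(timestamps, ts)
    return history[index - 1][1] if index else None


# Hypothetical feature history for one entity.
history = [(10, 1), (20, 3), (30, 7)]

print(value_as_of(history, 25))  # value written at t=20
print(value_as_of(history, 5))   # nothing written yet
```

Using only values known at or before the requested time is what keeps training
data free of information from the future.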
Training
The process of feeding historical feature data into a machine learning model to
learn patterns and optimize predictive accuracy.
TTL (Time-to-Live)
A retention policy that defines how long feature data is stored before being
automatically deleted, balancing storage costs and data relevance.
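A minimal sketch of a TTL check, assuming each stored row carries the time it
was written. The row layout and the names is_expired and prune are hypothetical
and only illustrate the retention rule described above.

```python
from datetime import datetime, timedelta


def is_expired(written_at, ttl, now):
    """A row expires once it is older than the TTL."""
    return now - written_at > ttl


def prune(rows, ttl, now):
    """Keep only rows that are still within their TTL."""
    return [row for row in rows if not is_expired(row["written_at"], ttl, now)]


now = datetime(2024, 1, 10)
rows = [
    {"user_id": "u1", "value": 4, "written_at": datetime(2024, 1, 1)},  # 9 days old
    {"user_id": "u2", "value": 9, "written_at": datetime(2024, 1, 5)},  # 5 days old
]
kept = prune(rows, ttl=timedelta(days=7), now=now)
```

With a seven-day TTL, only the row written five days ago is retained.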
W
Workspace
A dedicated environment for managing and deploying Tecton feature pipelines.
Workspaces provide isolation between different stages of development (e.g., dev,
test, prod) or between teams sharing a Tecton deployment.