# Sequence Features
Sequence features represent a chronological list of recent events for each entity, such as logins, transactions, or user actions. They are especially useful for deep learning models like LSTMs and Transformers that learn patterns over time and require ordered, contextual input.
These models can leverage sequence features to understand not just what happened, but how it happened over time, enabling more accurate real-time predictions in domains like fraud detection and personalization.
Coinbase has demonstrated that, with Tecton, sequence features can be kept fresh within 1 second and served online in under 5 ms.
## Common Use Cases
Sequence features are ideal for:
- Fraud detection – spotting suspicious activity across login and transaction patterns
- Personalized recommendations – analyzing sequences of product views or interactions
## Defining Sequence Features in Tecton
Sequence features in Tecton are implemented using the `last` (last-n) aggregation function inside a `stream_feature_view`.
This approach materializes ordered lists of raw events (e.g., event names and timestamps) for each entity in real time.
### Example: Last 100 User Events
```python
from tecton import stream_feature_view, Aggregate, StreamProcessingMode
from tecton.aggregation_functions import last
from tecton.types import String, Timestamp, Field, Array
from datetime import timedelta, datetime

LAST_N = 100


@stream_feature_view(
    source=user_events_stream,
    entities=[user],
    mode="spark_sql",
    timestamp_field="event_time",
    stream_processing_mode=StreamProcessingMode.CONTINUOUS,
    features=[
        Aggregate(
            function=last(n=LAST_N),
            input_column=Field("event_time_str", String),
            time_window=timedelta(hours=1),
            name="event_times",
        ),
        Aggregate(
            function=last(n=LAST_N),
            input_column=Field("event_name", String),
            time_window=timedelta(hours=1),
            name="event_names",
        ),
    ],
    feature_start_time=datetime(2022, 5, 1),
    online=True,
    offline=True,
    description=f"Last {LAST_N} event times and event names per user in the past hour.",
)
def user_event_sequence(user_events_stream):
    return f"""
        SELECT
            user_id,
            event_name,
            event_time_str,
            -- Truncate to second precision and cast to timestamp for consistency
            CAST(DATE_FORMAT(event_time_str, 'yyyy-MM-dd HH:mm:ss') AS TIMESTAMP) AS event_time
        FROM
            {user_events_stream}
    """
```
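To build intuition for what the last-n aggregation materializes, the sketch below hand-rolls the same semantics in plain Python: keep only events inside the trailing time window, then take the last N in order. This is an illustrative approximation, not Tecton code; Tecton performs this aggregation server-side, and `last_n_in_window` and the toy events are invented names for this example.

```python
from datetime import datetime, timedelta

LAST_N = 100
WINDOW = timedelta(hours=1)

def last_n_in_window(events, now, n=LAST_N, window=WINDOW):
    """Return the last n (timestamp, name) events inside the trailing window, oldest first."""
    in_window = [e for e in events if now - e[0] <= window]
    return in_window[-n:]

# Toy events: (timestamp, event_name), already in time order.
now = datetime(2022, 5, 1, 12, 0, 0)
events = [
    (now - timedelta(minutes=90), "login"),      # outside the 1-hour window
    (now - timedelta(minutes=30), "view_item"),  # inside
    (now - timedelta(minutes=5), "purchase"),    # inside
]
print(last_n_in_window(events, now))  # only view_item and purchase survive the window
```

Note that both the ordering and the window boundary matter: an event just outside the window is dropped entirely, while everything inside is preserved in sequence.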
## Sequence Post-Processing
Before feeding sequence features into a model, it can be helpful to do some post-processing. For example, you may want to insert explicit temporal markers to indicate gaps in time.
### Example: Insert `<DAY>` Tokens
```python
from tecton import realtime_feature_view, Attribute
from tecton.types import Array, String
from datetime import datetime


@realtime_feature_view(
    sources=[user_event_sequence],
    mode="python",
    description="Inserts <DAY> tokens between events that occur on different days.",
    features=[
        Attribute("event_sequence_with_day_tokens", Array(String)),
    ],
)
def processed_event_sequence(user_event_sequence):
    event_names = user_event_sequence.get("event_names", [])
    event_times = user_event_sequence.get("event_times", [])

    processed_sequence = []
    last_date = None
    DAY_MARKER = "<DAY>"

    for name, time_str in zip(event_names, event_times):
        try:
            timestamp = datetime.fromisoformat(time_str)
            current_date = timestamp.date()
            if last_date and current_date != last_date:
                processed_sequence.append(DAY_MARKER)
            last_date = current_date
            processed_sequence.append(name)
        except Exception:
            continue  # Skip malformed timestamps

    return {"event_sequence_with_day_tokens": processed_sequence}
```
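The token-insertion logic above can be exercised locally on sample data. The sketch below reimplements the loop body as a plain function so its behavior is easy to inspect; `insert_day_tokens` is an illustrative helper for this example, not part of the Tecton API.

```python
from datetime import datetime

DAY_MARKER = "<DAY>"

def insert_day_tokens(event_names, event_times):
    """Insert a <DAY> marker between consecutive events that fall on different days."""
    out, last_date = [], None
    for name, time_str in zip(event_names, event_times):
        try:
            ts = datetime.fromisoformat(time_str)
        except ValueError:
            continue  # skip malformed timestamps
        if last_date is not None and ts.date() != last_date:
            out.append(DAY_MARKER)
        last_date = ts.date()
        out.append(name)
    return out

names = ["login", "view_item", "purchase"]
times = ["2022-05-01 23:50:00", "2022-05-02 00:05:00", "2022-05-02 09:00:00"]
print(insert_day_tokens(names, times))
# → ['login', '<DAY>', 'view_item', 'purchase']
```

The marker lands between the midnight boundary and the next event, giving the downstream model an explicit signal that a gap in days occurred.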
## Modeling with Sequence Features
Sequence features retain full temporal detail and are best suited for models that can leverage ordered input:
| Model Type | Benefit |
|---|---|
| LSTMs / GRUs | Capture short- and long-term temporal dependencies |
| Transformers | Learn global context across long sequences |
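Sequence models of either kind typically expect fixed-length integer input, so event-name sequences are usually tokenized against a vocabulary and padded (or truncated) before batching. A minimal sketch, assuming a small hand-built vocabulary; `vocab`, `encode_and_pad`, and the token names are illustrative, not part of Tecton.

```python
PAD, UNK = "<PAD>", "<UNK>"

def encode_and_pad(sequence, vocab, max_len):
    """Map event names to integer ids, keep the most recent max_len, left-pad with PAD."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in sequence[-max_len:]]
    return [vocab[PAD]] * (max_len - len(ids)) + ids

vocab = {PAD: 0, UNK: 1, "login": 2, "view_item": 3, "purchase": 4, "<DAY>": 5}
print(encode_and_pad(["login", "<DAY>", "view_item", "purchase"], vocab, 6))
# → [0, 0, 2, 5, 3, 4]
```

Left-padding keeps the most recent events at the end of the tensor, which is a common convention for recurrent models; Transformers would pair this with an attention mask over the padded positions.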
## Design Considerations
| Topic | Notes |
|---|---|
| Freshness | Tecton can update sequence features within 1 second of event time |
| Latency | Online retrieval typically takes < 5ms |
## When to Use Sequence Features
✅ Use when:
- Event order matters
- You're using sequence-aware deep learning models
- Aggregated features lose too much temporal signal
🚫 Avoid when:
- You're using only linear models or tree-based models
- Aggregates (e.g., counts) capture sufficient signal
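For contrast, when event order carries no signal, a simple windowed count is often enough; Tecton computes this kind of aggregate natively, and the hand-rolled sketch below (`hourly_count` is an illustrative name) just shows how much less information it retains than a full sequence.

```python
from datetime import datetime, timedelta

def hourly_count(event_times, now, window=timedelta(hours=1)):
    """A simple aggregate: the number of events in the trailing window."""
    return sum(1 for t in event_times if now - t <= window)

now = datetime(2022, 5, 1, 12, 0)
times = [now - timedelta(minutes=m) for m in (90, 30, 5)]
print(hourly_count(times, now))  # → 2
```

A count like this collapses ordering and spacing entirely, which is exactly the temporal signal sequence features preserve.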