A Tecton BatchSource, used to read batch data into Tecton for use in a BatchFeatureView.
Attributes and methods:

- Returns the duration that materialization jobs wait after the …
- Returns the description of the Tecton object.
- Returns the unique id of the Tecton object.
- Returns the name of the Tecton object.
- Returns the owner of the Tecton object.
- Returns the tags of the Tecton object.
- Returns the workspace that this Tecton object belongs to.
- Creates a new BatchSource.
- Returns the column names of the Data Source's schema.
- Returns the data in this Data Source as a Tecton DataFrame.
- Displays a human-readable summary of this Data Source.
- Validates this Tecton object and its dependencies (if any).
Creates a new BatchSource.
name (str) – A unique name of the DataSource.
prevent_destroy (bool) – If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply. To remove or update this object, prevent_destroy must first be set to False via a separate tecton apply. prevent_destroy can be used to prevent accidental changes, such as inadvertently deleting a Feature Service used in production or recreating a Feature View that triggers expensive rematerialization jobs. prevent_destroy also blocks changes to dependent Tecton objects that would trigger a re-create of the tagged object; e.g. if prevent_destroy is set on a Feature Service, that will also prevent deletions or re-creates of Feature Views used in that service. prevent_destroy is only enforced in live (i.e. non-dev) workspaces. (Default: False)
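As a sketch of how this parameter is used (the source name and HiveConfig values here are illustrative placeholders, not values from a real workspace):

```python
from tecton import BatchSource, HiveConfig

# Guard a production data source against destructive updates.
# To later delete or re-create it, first set prevent_destroy=False
# in one `tecton apply`, then make the destructive change in a second.
protected_source = BatchSource(
    name="credit_scores_batch",  # placeholder name
    batch_config=HiveConfig(
        database="demo_fraud",        # placeholder database
        table="credit_scores",        # placeholder table
        timestamp_field="timestamp",
    ),
    prevent_destroy=True,  # enforced only in live (non-dev) workspaces
)
```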
batch_config (Union[…, SparkBatchConfig]) – BatchConfig object containing the configuration of the Batch Data Source to be included in this Data Source.
# Declare a BatchSource with a HiveConfig instance as its batch_config parameter.
# Refer to the "Configs Classes and Helpers" section for other batch_config types.
from tecton import HiveConfig, BatchSource

credit_scores_batch = BatchSource(
    name="credit_scores_batch",
    batch_config=HiveConfig(
        database="demo_fraud",
        table="credit_scores",
        timestamp_field="timestamp",
    ),
)
Returns the column names of the Data Source’s schema.
Returns the data in this Data Source as a Tecton DataFrame.
start_time (Optional[datetime]) – The interval start time from when we want to retrieve source data. If no timezone is specified, defaults to UTC. Can only be defined if apply_translator is True. (Default: None)
end_time (Optional[datetime]) – The interval end time until when we want to retrieve source data. If no timezone is specified, defaults to UTC. Can only be defined if apply_translator is True. (Default: None)
apply_translator (bool) – If True, the transformation specified by post_processor will be applied to the dataframe for the data source. apply_translator is not applicable to batch sources configured with spark_batch_config because it does not have a post_processor.
A Tecton DataFrame containing the data source’s raw or translated source data.
Raises an error if apply_translator is False but start_time or end_time filters are passed in.
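A usage sketch, assuming a BatchSource like the credit_scores_batch declared above has already been validated (the time range is illustrative, and to_pandas is shown only as one way to inspect the returned Tecton DataFrame):

```python
from datetime import datetime, timezone

# Timestamps without a timezone default to UTC, per the parameter docs above.
# start_time/end_time filtering requires apply_translator=True.
df = credit_scores_batch.get_dataframe(
    start_time=datetime(2022, 1, 1, tzinfo=timezone.utc),
    end_time=datetime(2022, 2, 1, tzinfo=timezone.utc),
    apply_translator=True,
)
pandas_df = df.to_pandas()  # materialize locally for inspection
```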
Displays a human-readable summary of this Data Source.
Validate this Tecton object and its dependencies (if any).
Validation performs most of the same checks and operations as tecton plan:
Check for invalid object configurations, e.g. setting conflicting fields.
For Data Sources and Feature Views, test query code and derive schemas, e.g. test that a Data Source's specified S3 path exists or that a Feature View's SQL code executes and produces supported feature data types.
Objects already applied to Tecton do not need to be re-validated on retrieval (e.g.
my_workspace.get_feature_view('my_fv')) since they have already been validated during tecton plan/apply.
Locally defined objects (e.g.
my_ds = BatchSource(name="my_ds", ...)) may need
to be validated before some of their methods can be called (e.g. my_ds.get_dataframe()).
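The local workflow can be sketched as follows (my_ds and the HiveConfig values are placeholders; validate() performs the checks described above before data-access methods are used):

```python
from tecton import BatchSource, HiveConfig

# A locally defined object: declared in code, not yet applied to a workspace.
my_ds = BatchSource(
    name="my_ds",
    batch_config=HiveConfig(
        database="my_db",            # placeholder database
        table="my_table",            # placeholder table
        timestamp_field="timestamp",
    ),
)

# Validate first: checks the configuration and derives the schema,
# much like `tecton plan` would.
my_ds.validate()

# Data-access methods can then be called on the validated object.
df = my_ds.get_dataframe()
```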