Version: 0.6

Creating and Training a Model

Creating the model

In your notebook, create a pipeline that chains the full_pipe pipeline you created earlier with a RandomForestRegressor model:

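# Skip these imports if your notebook already has them
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

# Hyperparameters for the random forest
n_estimators = 100
max_depth = 6
max_features = 3

# Apply the full_pipe transforms, then pass the result to the model
rf = make_pipeline(
    full_pipe,
    RandomForestRegressor(n_estimators=n_estimators, max_depth=max_depth, max_features=max_features),
)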

Training the model

To train the model, call its fit() method in your notebook. The pipeline first fits the transforms on the training data and applies them, then trains the model on the transformed data:

rf.fit(X_train, y_train)

Output:

Pipeline(steps=[('columntransformer',
...
...
...
)]
)
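
The output begins with 'columntransformer' because make_pipeline names each step after its lowercased class name, which suggests that full_pipe is a ColumnTransformer. As a minimal, self-contained sketch of the create-and-train steps, the following uses a stand-in for full_pipe and a few made-up rows. The column names (amount, merchant_category), the data values, the choice of StandardScaler and OneHotEncoder, and the random_state argument are assumptions for illustration; they are not taken from this tutorial's dataset or pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical training data (made up for this sketch).
X_train = pd.DataFrame(
    {
        "amount": [12.0, 250.0, 8.5, 990.0, 43.0, 5.0],
        "merchant_category": ["food", "travel", "food", "electronics", "travel", "food"],
    }
)
y_train = [0, 0, 0, 1, 0, 0]  # 1 = fraudulent, 0 = not fraudulent

# Stand-in for full_pipe (assumption): scale the numeric column and
# one-hot encode the categorical column.
full_pipe = ColumnTransformer(
    [
        ("num", StandardScaler(), ["amount"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["merchant_category"]),
    ]
)

rf = make_pipeline(
    full_pipe,
    RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3, random_state=0),
)

# fit() fits the transforms, applies them to X_train, then trains the forest.
rf.fit(X_train, y_train)

# Step names come from the lowercased class names.
print([name for name, _ in rf.steps])

# The forest never sees the raw strings: it is trained on the transformed
# matrix (1 scaled numeric column + 3 one-hot columns).
transformed = rf.named_steps["columntransformer"].transform(X_train)
print(transformed.shape)
```

Without the transform step, fitting RandomForestRegressor directly on this DataFrame would fail on the string column, which is why the model is wrapped in a pipeline rather than trained on the raw features.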

Make predictions using the testing data

To get a prediction, call the predict() method of the model (in your notebook):

predictions = rf.predict(X_test)
print(predictions)

Sample Output:

[0.32 0.36]
Note

Each prediction value will be between 0 and 1, inclusive. The higher the value, the higher the probability of the transaction being fraudulent. The value can be exactly 0 or 1, because the RandomForestRegressor model that made the prediction averages the outputs of its trees, and all of the trees can output the same extreme value.
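
To illustrate why the prediction stays between 0 and 1, here is a simplified sketch in plain Python (not scikit-learn's implementation). It assumes the default squared-error behavior, where each tree's leaf predicts the mean of the training labels that fall in it and the forest averages the trees. The function names and the leaf contents are hypothetical.

```python
from statistics import mean


def leaf_value(labels_in_leaf):
    """A regression tree leaf predicts the mean of the training labels in it."""
    return mean(labels_in_leaf)


def forest_prediction(leaves_reached):
    """The forest averages the value predicted by each of its trees."""
    return mean(leaf_value(labels) for labels in leaves_reached)


# Hypothetical leaves reached by one transaction in a three-tree forest.
# Labels are 0 (not fraudulent) or 1 (fraudulent).
mixed = forest_prediction([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0], [1.0, 0.0]])

# When every tree lands in a leaf that holds only 0s, the average is exactly 0.
all_zero = forest_prediction([[0.0, 0.0], [0.0], [0.0, 0.0, 0.0]])

print(mixed, all_zero)
```

Because every leaf value is a mean of 0s and 1s, and the forest output is a mean of leaf values, the result cannot leave the range from 0 to 1.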

Get the error in the prediction

Compute the error between the predictions that the model (trained on the training data) made for X_test and the actual values in the testing data (y_test):

from sklearn.metrics import mean_squared_error  # skip if already imported

mse = mean_squared_error(y_test, predictions)
print(mse)

Sample Output:

0.11599999999999999
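
As a worked check of this number, the mean squared error is the average of the squared differences between each actual value and its prediction. The sketch below computes it by hand for the sample predictions [0.32 0.36]. The tutorial does not show y_test, so the labels [0, 0] are an assumption, chosen because they are consistent with the sample output; mean_squared_error_manual is a hypothetical helper, not a scikit-learn function.

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


predictions = [0.32, 0.36]  # sample predictions from this page
y_test = [0, 0]             # assumed labels (not shown in the tutorial)

# (0.32**2 + 0.36**2) / 2 = (0.1024 + 0.1296) / 2
mse = mean_squared_error_manual(y_test, predictions)
print(round(mse, 3))
```

The trailing digits in the sample output (0.11599999999999999 rather than 0.116) are ordinary floating-point rounding.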

Iterating on your model

Based on the prediction error of a trained model, you may choose to iterate on the model by updating your features, using different model parameters, or choosing a different model. Iterating on the model is not covered in this tutorial.
