I am fitting three different regression models, and to minimize the coding I have created a function that handles fitting, training, testing, etc.
This all works fine, but after running the function against the three models (LR, DTR, and RFR), I am trying to find a way to capture the results in an initially empty DataFrame so that I can print them in a visually appealing comparison table. I only want to see the testing results (RMSE and MAE) for each model, preferably with a color gradient applied.
MY FUNCTION
import numpy as np
import plotly.express as px
from sklearn.metrics import mean_absolute_error, mean_squared_error

def evaluate_model(model, X_train, y_train, X_test, y_test):
    # fit the model
    model.fit(X_train, y_train)
    # print the parameters
    print(model.get_params(), end="\n\n")
    # predict the values using training data
    train_pred = model.predict(X_train)
    # evaluate using training data
    train_mae = mean_absolute_error(y_train, train_pred)
    train_rmse = np.sqrt(mean_squared_error(y_train, train_pred))
    # print the results of the training data
    print("Results of the training data\n")
    print("Mean Absolute Error: {:.2f}".format(train_mae))
    print("Root Mean Squared Error: {:.2f}\n".format(train_rmse))
    # visualize training data
    fig_train = px.scatter(x=y_train, y=train_pred,
                           labels={'x': 'Actual Values', 'y': 'Predicted Values'},
                           title='Visualization of Actual Data vs. Prediction of Training Data')
    fig_train.add_scatter(x=y_train, y=y_train, mode='lines', line=dict(color='#e6981c', width=4))
    fig_train.show()
    # predict the values using testing data
    test_pred = model.predict(X_test)
    # evaluate using testing data
    test_mae = mean_absolute_error(y_test, test_pred)
    test_rmse = np.sqrt(mean_squared_error(y_test, test_pred))
    # print the results of the testing data
    print("Results of the testing data\n")
    print("Mean Absolute Error: {:.2f}".format(test_mae))
    print("Root Mean Squared Error: {:.2f}\n".format(test_rmse))
    # visualize testing data
    fig_test = px.scatter(x=y_test, y=test_pred,
                          labels={'x': 'Actual Values', 'y': 'Predicted Values'},
                          title='Visualization of Actual Data vs. Prediction of Testing Data')
    fig_test.add_scatter(x=y_test, y=y_test, mode='lines', line=dict(color='#1ce658', width=4))
    fig_test.show()
Then in the next three blocks I have this…
lr = LinearRegression()
evaluate_model(lr, X_train, y_train, X_test, y_test)
then
dtr = DecisionTreeRegressor(random_state=42)
evaluate_model(dtr, X_train, y_train, X_test, y_test)
… and finally
rf = RandomForestRegressor(random_state=42)
evaluate_model(rf, X_train, y_train, X_test, y_test)
Now, from this point I would like to have a comparison table showing the test-data MAE and RMSE for all three regressions, with a gradient applied. I can't figure out how to do it.