# ML algorithm: poor predictions on unseen data

The dataset describes a metallic object: the first three columns hold its geometric parameters, and the fourth holds a coefficient related to the object's speed. These four columns form the dataframe X. The last column holds an efficiency coefficient, which is the response, denoted by y.

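| Geometric parameter 1 | Geometric parameter 2 | Geometric parameter 3 | Speed coefficient | Efficiency coefficient (y) |
| --- | --- | --- | --- | --- |
| a | b | c | 0 | r1 |
| a | b | c | 0.1 | r2 |
| a | b | c | 0.2 | r3 |
| e | f | g | 0 | s1 |
| e | f | g | 0.1 | s2 |
| e | f | g | 0.2 | s3 |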

The initial dataframe is split using the following Python command:

    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

The model is then trained using linear regression:

    from sklearn.linear_model import LinearRegression
    model = LinearRegression().fit(X_train, y_train)

The values for X_test are then predicted:

    y_pred = model.predict(X_test)

The model's performance is then evaluated with the mean absolute error (MAE) and the coefficient of determination. The results are very good: the error is 3% and the coefficient of determination is 0.99346.
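For reference, the workflow described so far can be sketched end to end as follows. This is only a minimal sketch under stated assumptions: the data is synthetic placeholder data (not the real dataset), the column names `g1`, `g2`, `g3` and `speed` are hypothetical, and the response is an exactly linear function invented so that the example runs. How the 3% figure is derived from the MAE is not shown in the post, so the sketch reports the raw MAE only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (NOT the real dataset): 20 hypothetical geometries,
# each evaluated at the three speed coefficients 0, 0.1 and 0.2.
rng = np.random.default_rng(0)
geometries = rng.uniform(1.0, 5.0, size=(20, 3))
rows = [(g1, g2, g3, s) for g1, g2, g3 in geometries for s in (0.0, 0.1, 0.2)]
X = pd.DataFrame(rows, columns=["g1", "g2", "g3", "speed"])  # hypothetical names

# Hypothetical, exactly linear response, used only so the sketch runs end to end.
y = 0.5 * X["g1"] - 0.2 * X["g2"] + 0.1 * X["g3"] + 2.0 * X["speed"]

# Same split as in the post: 70% training rows, 30% test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit on the training rows, then predict the held-out rows.
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate with the two metrics named in the post.
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE: {mae:.4f}, R^2: {r2:.5f}")
```

Note that a random row-wise split like this one can place rows of the same geometry (same first three columns, different speed coefficient) in both the training and the test set.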

The problem arises when I create a new matrix to predict on, that is, a matrix that the algorithm has never seen before but whose values stay within the range of the training parameters:

| Geometric parameter 1 | Geometric parameter 2 | Geometric parameter 3 | Speed coefficient |
| --- | --- | --- | --- |
| m | n | p | 0 |
| m | n | p | 0.1 |
| m | n | p | 0.2 |
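The step above can be sketched as follows, again with synthetic placeholder data and hypothetical column names (`g1`, `g2`, `g3`, `speed`). The numeric values standing in for m, n and p are made up. The sketch builds the new matrix with the same columns as the training frame, checks that every value lies within the range seen during training, and then predicts.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic placeholder training data (NOT the real dataset).
rng = np.random.default_rng(0)
cols = ["g1", "g2", "g3", "speed"]  # hypothetical column names
X_train = pd.DataFrame(rng.uniform(0.0, 1.0, size=(50, 4)), columns=cols)
X_train["speed"] = rng.choice([0.0, 0.1, 0.2], size=50)
y_train = X_train @ np.array([0.5, -0.2, 0.1, 2.0])  # hypothetical linear response

model = LinearRegression().fit(X_train, y_train)

# New, unseen geometry (made-up values standing in for m, n, p),
# evaluated at the three speed coefficients.
m, n, p = 0.4, 0.6, 0.5
X_new = pd.DataFrame(
    [[m, n, p, 0.0], [m, n, p, 0.1], [m, n, p, 0.2]], columns=cols
)

# Check, column by column, that each new value lies within the training range.
in_range = (X_new >= X_train.min()) & (X_new <= X_train.max())
print(in_range.all())

# Predict the response for the new matrix.
y_new = model.predict(X_new)
print(y_new)
```

With this exactly linear placeholder response the prediction is accurate by construction, so the sketch only illustrates the mechanics of the step, not the poor results observed on the real data.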

The prediction error is very high and the coefficient of determination is very low.
I cannot understand what the problem is.