 # ValueError: Found input variables with inconsistent numbers of samples: [10, 1]

Hello,
Please help me, I am a newbie. I tried to write the data for X and y directly in the code, because I don't know how to do it if the data is in Excel and then saved as CSV. Here's the code, which I modified from an example.

import time

import numpy as np

import matplotlib.pyplot as plt

from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, ExpSineSquared

rng = np.random.RandomState(0)

# Generate sample data (original)

#X = 15 * rng.rand(100, 1)
#y = np.sin(X).ravel()
#y += 3 * (0.5 - rng.rand(X.shape[0]))  # add noise
# The above is the original from the example, and it works.

# (I tried to make my own data: not random numbers, I typed the values in myself)
X = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
X = np.reshape(X, (10, -1))
print(X)

y = 1, 5, 3, 9, 8, 13, 10, 15, 7,
y = np.reshape(y, (1, 10))
print(y)

# Fit KernelRidge with parameter selection based on 5-fold cross validation

param_grid = {"alpha": [1e0, 1e-1, 1e-2, 1e-3],
              "kernel": [ExpSineSquared(l, p)
                         for l in np.logspace(-2, 2, 10)
                         for p in np.logspace(0, 2, 10)]}
kr = GridSearchCV(KernelRidge(), param_grid=param_grid)
stime = time.time()
kr.fit(X, y)
print("Time for KRR fitting: %.3f" % (time.time() - stime))

gp_kernel = ExpSineSquared(1.0, 5.0, periodicity_bounds=(1e-2, 1e1)) \
    + WhiteKernel(1e-1)
gpr = GaussianProcessRegressor(kernel=gp_kernel)
stime = time.time()
gpr.fit(X, y)
print("Time for GPR fitting: %.3f" % (time.time() - stime))

# Predict using kernel ridge

X_plot = np.linspace(0, 20, 10000)[:, None]
stime = time.time()
y_kr = kr.predict(X_plot)
print("Time for KRR prediction: %.3f" % (time.time() - stime))

# Predict using gaussian process regressor

stime = time.time()
y_gpr = gpr.predict(X_plot, return_std=False)
print("Time for GPR prediction: %.3f" % (time.time() - stime))

stime = time.time()
y_gpr, y_std = gpr.predict(X_plot, return_std=True)
print("Time for GPR prediction with standard-deviation: %.3f"
      % (time.time() - stime))

# Plot results

plt.figure(figsize=(10, 5))
lw = 2
plt.scatter(X, y, c='k', label='data')
plt.plot(X_plot, np.sin(X_plot), color='navy', lw=lw, label='True')
plt.plot(X_plot, y_kr, color='turquoise', lw=lw,
         label='KRR (%s)' % kr.best_params_)
plt.plot(X_plot, y_gpr, color='darkorange', lw=lw,
         label='GPR (%s)' % gpr.kernel_)
plt.fill_between(X_plot[:, 0], y_gpr - y_std, y_gpr + y_std, color='darkorange',
                 alpha=0.2)
plt.xlabel('data')
plt.ylabel('target')
plt.xlim(0, 20)
plt.ylim(-4, 4)
plt.title('GPR versus Kernel Ridge')
plt.legend(loc="best", scatterpoints=1, prop={'size': 8})
plt.show()

## Here’s the error

ValueError Traceback (most recent call last)
in ()
39 kr = GridSearchCV(KernelRidge(), param_grid=param_grid)
40 stime = time.time()
---> 41 kr.fit(X, y)
42 print("Time for KRR fitting: %.3f" % (time.time() - stime))
43

2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
210 if len(uniques) > 1:
211 raise ValueError("Found input variables with inconsistent numbers of"
--> 212 " samples: %r" % [int(l) for l in lengths])
213
214

ValueError: Found input variables with inconsistent numbers of samples: [10, 1]

I think the problem is with the array shapes. I printed the original X and y and then wrote my own, but I get an error at line 41, kr.fit(X, y).
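
The two numbers in the message are the sample counts scikit-learn sees in each input: 10 for X and 1 for y. It reads the first dimension as the number of samples, so a y of shape (1, 10) counts as a single sample. A minimal sketch of the shapes involved, using placeholder values rather than the data above (the name `y_row` and the numbers are assumptions for illustration only):

```python
import numpy as np

# Hypothetical data: 10 samples with 1 feature each, plus 10 target values.
X = np.arange(1, 11).reshape(-1, 1)     # shape (10, 1): 10 samples, 1 feature

y_row = np.arange(10.0).reshape(1, 10)  # shape (1, 10): first dimension is 1
y = y_row.ravel()                       # shape (10,): one target per sample

print(X.shape, y_row.shape, y.shape)

# The first dimensions must agree before calling fit(X, y).
print(len(X) == len(y_row))  # False: 10 vs 1, the mismatch from the error
print(len(X) == len(y))      # True: 10 vs 10
```

Passing a flattened y like this (or building it with `np.array(...)` and skipping the reshape) to `kr.fit(X, y)` and `gpr.fit(X, y)` should give matching sample counts.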

When I kept the original example (100 data points) and only changed it to 10 data points, it worked. The error only appears when I write the X and y values (10 each) directly.

If I put the data in a CSV file, with the X and y data in rows, how do I code that in Google Colab?
Assume that later there will be 50 values in X and the same number, 50, in y.
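
One possible way to load such a file is sketched below, assuming the CSV has a header line and one row per sample, with the X value in the first column and the y value in the second. The column names and numbers are made up, and an in-memory string stands in for the real file. In Colab the file would first need to be uploaded (for example through the Files panel), and its path passed to `np.loadtxt` instead.

```python
import io

import numpy as np

# Stand-in for the real CSV file; header names and values are hypothetical.
csv_text = """x,y
1,1.0
2,5.0
3,3.0
4,9.0
5,8.0
"""

# skiprows=1 skips the header line; each remaining row is one sample.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

X = data[:, [0]]  # list index keeps X two-dimensional: (n_samples, 1)
y = data[:, 1]    # plain index gives a one-dimensional y: (n_samples,)

print(X.shape, y.shape)
```

The same code should work unchanged for 50 rows, since both shapes follow from the number of rows in the file.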

Thank you in advance for your help.