Help on OLS regression homework problem

I need help with an OLS regression homework problem. I tried to complete this task on my own, but unfortunately it didn't work. I'd appreciate your help.

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

Assign the values of column "RM" (average number of rooms per dwelling) to variable X.
Similarly, assign the values of the 'target' (housing price) column to variable Y.

###Start code here
X =
Y =
###End code(approx 2 lines)

Initialise the OLS model by passing the target (Y) and the attribute (X). Assign the model to variable 'statsModel'.
Fit the model and assign it to variable 'fittedModel'. Make sure you add a constant term to the input X.
Sample code for initialization: sm.OLS(target, attribute)

###Start code here

###End code(approx 2 lines)

Print the summary of fittedModel using the summary() function.

###Start code here

###End code(approx 1 line)

From the summary report, note down the R-squared value and assign it to variable 'r_squared' in the cell below.

###Start code here
r_squared =
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)


from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())
# Predictor (average rooms per dwelling) and response (housing price)
X = dataset['RM']
Y = dataset['target']
import statsmodels.api as sm
# Add the intercept column before fitting
X = sm.add_constant(X)
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()
print(fittedModel.summary())
r_squared = fittedModel.rsquared
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)
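
To make clear what the steps above compute, here is a minimal sketch in plain Python of ordinary least squares with one predictor and an intercept (the role played by sm.add_constant), together with the R-squared value. The function name simple_ols and the toy data are made up for illustration. This is not the Boston data and not how statsmodels is implemented, only the same closed-form arithmetic for the one-predictor case.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical toy data, standing in for the RM and target columns
rooms = [1, 2, 3]
price = [1, 3, 2]
intercept, slope, r_squared = simple_ols(rooms, price)
print("intercept=%f slope=%f rsquared=%f" % (intercept, slope, r_squared))
# prints: intercept=1.000000 slope=0.500000 rsquared=0.250000
```

Without the intercept term the fitted line is forced through the origin, which is why the exercise asks for a constant to be added to X.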


I executed this piece of code as suggested, but I am unable to complete the hands-on. Any suggestions, please?

from sklearn.datasets import load_boston
import pandas as pd
boston = load_boston()
dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)
dataset['target'] = boston.target
print(dataset.head())

X = dataset["RM"]
Y = dataset["target"]

import statsmodels.api as sm
X = sm.add_constant(X)
statsModel = sm.OLS(Y, X)
fittedModel = statsModel.fit()

print(fittedModel.summary())

r_squared = fittedModel.rsquared
with open("output.txt", "w") as text_file:
    text_file.write("rsquared= %f\n" % r_squared)

###Start code here
X = dataset[['RM']]
Y = dataset['target']
###End code(approx 2 lines)

r_squared = <type the value here, taken from the output of the previous step>

I got 0.484.


Thank you so much, Rizwan. I was able to complete the hands-on.

Hi RV,
What change did you make?


Hi Varadh, I executed the code as advised above, but I am not able to complete the hands-on. I am a newbie to Python. Please help with a complete code snippet.

While executing the OLS regression hands-on, I am getting the error below.

###Start code here
statsModel =sm.OLS(Y,X)
fittedModel = statsModel.fit()
###End code(approx 2 lines)

The error I am getting is:


NameError Traceback (most recent call last)
in
1 ###Start code here
----> 2 statsModel =sm.OLS(Y,X)
3 fittedModel = statsModel.fit()
4 ###End code(approx 2 lines)

NameError: name 'Y' is not defined
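
A NameError like this one usually means the cell that assigns X and Y was not run in the current session, or its lines were left blank (an assumption, since that cell is not shown here). A small helper such as the one below can list which names are still undefined before calling sm.OLS(Y, X). The name missing_names is hypothetical, made up for illustration, and is not part of statsmodels or the exercise.

```python
def missing_names(required, namespace):
    """Return the names from `required` that are not defined in `namespace`."""
    return [name for name in required if name not in namespace]


# Simulated notebook state: only X has been assigned so far.
notebook_state = {"X": [6.5, 7.1]}
print(missing_names(["X", "Y"], notebook_state))  # prints ['Y']
```

In a notebook, calling missing_names(["sm", "X", "Y"], globals()) would show which earlier cells still need to be run.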

Another error I am getting:

###Start code here
r_squared = 0.484
###End code(approx 1 line)
with open("output.txt", "w") as text_file:
text_file.write("rsquared= %f\n" % r_squared)

File "", line 5
text_file.write("rsquared= %f\n" % r_squared)
^
IndentationError: expected an indented block
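
The IndentationError comes from the last line: the body of a with statement must be indented under it. A minimal sketch of the corrected cell follows, wrapped in a hypothetical function write_r_squared so it can be reused. The value 0.484 is the one reported earlier in this thread and is typed in by hand here.

```python
def write_r_squared(r_squared, path="output.txt"):
    """Write the R-squared value to `path` in the format the exercise uses."""
    # The write call is indented, so it belongs to the `with` block.
    with open(path, "w") as text_file:
        text_file.write("rsquared= %f\n" % r_squared)


r_squared = 0.484  # value read from the model summary, entered manually
write_r_squared(r_squared)

# Read the file back to confirm what was written.
with open("output.txt") as text_file:
    print(text_file.read(), end="")  # prints: rsquared= 0.484000
```

Four spaces per level is the usual Python convention; what matters to the interpreter is that the indented line sits deeper than the with line above it.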