Run the code in the cell below to load the iris data from sklearn's datasets module.
- The features are the sepal and petal dimensions (length and width, in cm) of flowers belonging to iris species.
- The target names are the species to which each flower belongs. They are mapped to 0, 1 and 2.
- In this exercise you will use logistic regression to predict the species of a flower given its sepal and petal dimensions as features.
- (Optional) View the data by printing iris_X and iris_y.
#step 1
from sklearn import datasets
iris = datasets.load_iris()
iris_X = iris.data
iris_y = iris.target
print(iris.feature_names)
print(iris.target_names)
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
['setosa' 'versicolor' 'virginica']
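The optional inspection of the data mentioned above could look like the following minimal sketch. The names `n_samples`, `n_features` and `class_counts` are introduced here for illustration only and are not part of the exercise.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
iris_X = iris.data
iris_y = iris.target

# Shape of the feature matrix: (number of flowers, number of measurements)
n_samples, n_features = iris_X.shape
print(n_samples, n_features)

# First three rows of features and their integer species labels
print(iris_X[:3])
print(iris_y[:3])

# Number of flowers per species label (0, 1, 2), paired with the species names
class_counts = np.bincount(iris_y)
print(dict(zip(iris.target_names, class_counts)))
```

Checking the class counts before splitting is a quick way to see whether the classes are balanced.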
#step 2
- import train_test_split function from sklearn.model_selection
- split the data into train and test set with test_size = 0.33 and random_state = 101
#code
###Start code
import numpy as np
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(iris_X, iris_y, test_size=0.33, random_state=101)
###End code(approx 2 lines)
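To confirm that the split behaves as intended, a small check such as the sketch below may help. It reloads the data so that it runs on its own; `X_test_again` is a hypothetical name used only to show that a fixed random_state reproduces the same split.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Same call as in step 2: hold out 33% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=101
)

# Every row ends up in exactly one of the two sets
print(X_train.shape, X_test.shape)

# Repeating the call with the same random_state gives the same test rows
_, X_test_again, _, _ = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=101
)
print((X_test == X_test_again).all())
```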
#step 3
- import LogisticRegression from sklearn.linear_model
- initialise a logistic regression model and assign it to the variable 'model'
- fit the model with the train data (X_train and y_train)
#code
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
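After fitting, the model exposes what it has learned. The sketch below is one way to look at it. Note that `max_iter=200` is an assumption added here, not part of the exercise: depending on the scikit-learn version, the default settings may emit a convergence warning on unscaled data, and raising the iteration limit is one common workaround.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=101
)

# max_iter=200 is an assumption for this sketch, not required by the exercise
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Class labels the model learned to distinguish
print(model.classes_)

# One row of coefficients per class, one column per feature
print(model.coef_.shape)

# Predicted class probabilities for the first training flower
proba = model.predict_proba(X_train[:1])
print(proba)
```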
#step 4
- Using the model, predict the output for the test data, i.e. X_test
#code
y_pred = model.predict(X_test)
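A quick sanity check on the predictions is to compare them with the true test labels. The following is a minimal sketch; `accuracy` and `manual_accuracy` are illustrative names, and `max_iter=200` is again an assumption rather than part of the exercise.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=101
)
model = LogisticRegression(max_iter=200)  # max_iter is an assumption
model.fit(X_train, y_train)

# One predicted label per test row
y_pred = model.predict(X_test)
print(y_pred[:10])

# Fraction of test flowers whose species was predicted correctly
accuracy = accuracy_score(y_test, y_pred)

# The same number computed by hand: mean of the element-wise matches
manual_accuracy = (y_pred == y_test).mean()
print(accuracy, manual_accuracy)
```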
#step 5
- import classification_report from sklearn.metrics
- pass y_test and y_pred to classification_report()
- print the output of classification_report
#code
from sklearn.metrics import classification_report
# Use the y_test and y_pred computed in the previous steps; do not overwrite them
print(classification_report(y_test, y_pred, target_names=iris.target_names))
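The per-class recall shown in the report can be traced back to a confusion matrix. The sketch below is offered as a complement to the report, not as part of the exercise; `cm` and `recall_per_class` are illustrative names and `max_iter=200` is an assumption.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=101
)
model = LogisticRegression(max_iter=200)  # max_iter is an assumption
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true species, columns are predicted species
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Recall per class: correctly predicted flowers divided by all flowers of that class
recall_per_class = cm.diagonal() / cm.sum(axis=1)
for name, recall in zip(iris.target_names, recall_per_class):
    print(name, round(float(recall), 2))
```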
#step 6
Time to predict on new data
- Predict the labels for the data stored in data/test_iris.csv.
- Store the predictions as a list in the variable list_ans.
Note: The list entries must be of integer type.
#code
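import pandas as pd
# Assumption: data/test_iris.csv has a header row and holds only the four
# feature columns, in the same order as iris.feature_names
test_data = pd.read_csv('data/test_iris.csv')
# Predict with the model fitted in step 3; tolist() yields plain Python ints
list_ans = model.predict(test_data.to_numpy()).tolist()
print(list_ans)
print(len(list_ans))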
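The note about integer types matters because predict() returns a NumPy array, and calling list() on it keeps NumPy integer scalars rather than Python ints. The sketch below illustrates the difference. Since the contents of data/test_iris.csv are not shown here, `new_rows` is a hypothetical stand-in taken from the iris data itself, and `max_iter=200` is an assumption.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
model = LogisticRegression(max_iter=200)  # max_iter is an assumption
model.fit(iris.data, iris.target)

# Hypothetical stand-in for the rows that would be read from the CSV file
new_rows = iris.data[:5]
predictions = model.predict(new_rows)

# list() keeps NumPy integer scalars as elements
as_numpy_list = list(predictions)
print(type(as_numpy_list[0]))

# tolist() converts every element to a plain Python int
list_ans = predictions.tolist()
print(type(list_ans[0]))
print(list_ans, len(list_ans))
```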