I am trying to read a .csv file (a 7 x 7 matrix of data) in Python.
I was able to import and plot the data using seaborn. I fitted a straight line through the data, but I also want statistical information about the fit. I learnt that seaborn does not expose that information. I have looked at sklearn.linear_model.LinearRegression (scikit-learn 1.0.2 documentation).
But my problem is that my data is a matrix, and I am unable to use linear regression to fit a line through the combined set of data. Can anybody help me?
import matplotlib.pyplot as pyplot
import pandas
import seaborn
from sklearn.linear_model import LinearRegression
DATA = pandas.read_csv('pH1urea1010.csv')
DATA = DATA.melt(id_vars='Time', var_name='pH', value_name='Intensity (a.u.)')
DATA['pH'] = DATA['pH'].astype('int8')
FIGURE, AXES = pyplot.subplots(figsize=(10, 12))
seaborn.regplot(data=DATA, x='pH', y='Intensity (a.u.)', ax=AXES)
seaborn.scatterplot(data=DATA, x='pH', y='Intensity (a.u.)', hue='Time', ax=AXES)
reg = LinearRegression().fit(Data['pH'], Data['Intensity (a.u.)'])  # This is where I feel my code goes wrong and where I need help.
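A possible reason the fit call fails, offered as a sketch rather than a confirmed diagnosis: the frame is named DATA (not Data), and LinearRegression.fit expects X as a 2-D array of shape (n_samples, n_features), so a single column has to be selected with double brackets. The small frame below is a stand-in for the melted data, built from the 2 min and 5 min rows of the posted table. The integer pH values 1 to 7 are an assumption about how the column headers are parsed.

```python
import pandas
from sklearn.linear_model import LinearRegression

# Stand-in for the melted frame (assumption: pH headers parse to 1..7).
# Values are the 2 min and 5 min rows of the posted table.
DATA = pandas.DataFrame({
    'pH': [1, 2, 3, 4, 5, 6, 7] * 2,
    'Intensity (a.u.)': [
        0.10119, 0.1072, 0.12321, 0.13099, 0.1672, 0.15035, 0.14725,
        0.10777, 0.10166, 0.12039, 0.13197, 0.1518, 0.14225, 0.15306,
    ],
})

X = DATA[['pH']]              # double brackets keep X 2-D: shape (14, 1)
y = DATA['Intensity (a.u.)']  # the target can stay 1-D

reg = LinearRegression().fit(X, y)

slope = reg.coef_[0]          # one coefficient per feature column
intercept = reg.intercept_
r_squared = reg.score(X, y)   # coefficient of determination on the same data

print('slope:', slope)
print('intercept:', intercept)
print('R^2:', r_squared)
```

Note that LinearRegression reports the coefficients and R^2 but not p-values or standard errors, so it may not cover everything meant by "statistical information".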
The data in the .csv file is below (it is not part of my code):
Time     pH 1     pH 2     pH 3     pH 4     pH 5     pH 6     pH 7
2 min    0.10119  0.1072   0.12321  0.13099  0.1672   0.15035  0.14725
5 min    0.10777  0.10166  0.12039  0.13197  0.1518   0.14225  0.15306
10 min   0.11221  0.11587  0.12266  0.12893  0.15028  0.14048  0.15383
15 min   0.10298  0.11139  0.11734  0.12721  0.15196  0.14164  0.16004
20 min   0.10445  0.10541  0.12057  0.12365  0.15198  0.1325   0.15735
25 min   0.10766  0.10603  0.116    0.12537  0.14608  0.13473  0.14655
30 min   0.11216  0.10999  0.11617  0.12242  0.14684  0.1344   0.15003
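For the statistical information itself, one option is to reshape the matrix into long form, as the melt call in the code already does, and pass the two resulting columns to scipy.stats.linregress. This is a different tool from the sklearn class mentioned in the question, used here because it returns the p-value and the standard error of the slope directly. The sketch below types the posted table in by hand. The column names 'Time' and '1' to '7' are assumptions about the real CSV header, and `wide` and `tidy` are illustrative names.

```python
import pandas
from scipy import stats

# The posted 7 x 7 table, entered by hand.
# Assumption: the real CSV has a 'Time' column and pH columns named '1'..'7'.
times = [2, 5, 10, 15, 20, 25, 30]  # minutes
rows = [
    [0.10119, 0.1072, 0.12321, 0.13099, 0.1672, 0.15035, 0.14725],
    [0.10777, 0.10166, 0.12039, 0.13197, 0.1518, 0.14225, 0.15306],
    [0.11221, 0.11587, 0.12266, 0.12893, 0.15028, 0.14048, 0.15383],
    [0.10298, 0.11139, 0.11734, 0.12721, 0.15196, 0.14164, 0.16004],
    [0.10445, 0.10541, 0.12057, 0.12365, 0.15198, 0.1325, 0.15735],
    [0.10766, 0.10603, 0.116, 0.12537, 0.14608, 0.13473, 0.14655],
    [0.11216, 0.10999, 0.11617, 0.12242, 0.14684, 0.1344, 0.15003],
]
wide = pandas.DataFrame(rows, columns=[str(p) for p in range(1, 8)])
wide.insert(0, 'Time', times)

# Wide -> long: one row per (Time, pH) pair, 49 rows in total.
tidy = wide.melt(id_vars='Time', var_name='pH', value_name='Intensity (a.u.)')
tidy['pH'] = tidy['pH'].astype('int8')

# Ordinary least-squares line through all 49 points at once.
result = stats.linregress(tidy['pH'], tidy['Intensity (a.u.)'])

print('slope:', result.slope)
print('intercept:', result.intercept)
print('r:', result.rvalue, 'R^2:', result.rvalue ** 2)
print('p-value (slope = 0):', result.pvalue)
print('std. error of slope:', result.stderr)
```

This pools every time point into a single regression of intensity on pH, which matches what seaborn.regplot draws from the melted frame. Whether pooling across time is appropriate for the experiment is a separate question.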