shomikc
(Shomik Chakraborty)
July 10, 2023, 5:51pm
1
Hi. I have a csv file for sales. I want to find the customer name and total sales.
This is my code.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('I:\sample-salesv2.csv',delimiter=',')
df2 = df[['name', 'net_price','date']].copy()
df2 = df2.groupby('name')['net_price'].sum()
df1 = df2.rename(columns={'name': 'EmpName', '': 'Sales'})
print(df2)
The first column is ‘name’ and has to be on the x axis, but the second column, which has to be the y axis, has no name.
How can I make the matplotlib plot if the column has no name?
Please help. Thank you.
shomikc
(Shomik Chakraborty)
July 10, 2023, 7:59pm
2
Hello experts,
I got a column name.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('I:\sample-salesv2.csv',delimiter=',')
df2 = df[['name', 'net_price','date']].copy()
df2 = df2.groupby('name').agg({'net_price': 'sum'}, inplace=True)
#print(df2['name'])
#df2 = df2.rename(columns={'name': 'EmpName'}, inplace=True)
print(df2)
#print(df2['EmpName'])
#print(df2.iloc[:, :1])
#name = df2.iloc[:,0]
#print(name)
#sales = df2['sum']
#print(sales)
#fig = plt.figure(figsize =(10, 7))
# Horizontal Bar Plot
#plt.bar(df2['name'], df2['net_price'])
# Show Plot
#plt.show()
But now the new dataframe, df2, is showing as None when I try to plot it. Could you point me in the right direction? What should I be looking at?
Thank you.
"name" is in the index, so I don’t see how you are able to slice it out on the plt.bar line.
I think you need to call the reset_index method on df2 first. Also, you might have better luck using this method off of the dataframe directly.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.bar.html
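A minimal sketch of that suggestion, assuming a small made-up table in place of sample-salesv2.csv (the rows below are hypothetical; only the column names name and net_price come from the thread). It calls reset_index after the groupby so that "name" is a regular column again, then plots with DataFrame.plot.bar:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for sample-salesv2.csv (hypothetical data).
df = pd.DataFrame({
    "name": ["Acme", "Bolt", "Acme", "Bolt", "Cole"],
    "net_price": [10.0, 5.0, 2.5, 1.5, 4.0],
})

# groupby moves "name" into the index; reset_index turns it back into a column.
totals = df.groupby("name")["net_price"].sum().reset_index()

# Plot straight from the DataFrame, naming the x and y columns explicitly.
ax = totals.plot.bar(x="name", y="net_price", legend=False)
ax.set_xlabel("Customers")
ax.set_ylabel("Sales ($)")
plt.tight_layout()
# In an interactive session, drop the matplotlib.use("Agg") line and call plt.show().
```

After reset_index, totals has two named columns, so either plt.bar(totals["name"], totals["net_price"]) or the DataFrame method above should work.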
shomikc
(Shomik Chakraborty)
July 11, 2023, 9:10am
4
Hello.
I wrote different code and it works.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
sales=pd.read_csv("I:\sample-salesv2.csv",parse_dates=['date'])
#sales.head()
#sales.describe()
#sales['unit price'].describe()
#sales.dtypes
customers = sales[['name','net_price','date']]
#customers.head()
customer_group = customers.groupby('name')
customer_group.size()
sales_totals = customer_group.sum()
#print(sales_totals)
fig = plt.figure(figsize = (10, 5))
# Keep the returned Axes in its own name so the plt module is not overwritten.
ax = sales_totals.plot(kind='bar')
ax.set_xlabel("Customers")
ax.set_ylabel("Sales ($)")
ax.set_title("Total Sales by Customer")
Looking back I think it was easier than it seemed.
Thanks for your help.
Rosuav
(Chris Angelico)
July 11, 2023, 9:55am
5
Suggestion:
sales=pd.read_csv("I:/sample-salesv2.csv",parse_dates=['date'])
Current versions of Python will give a warning on "\s" and future versions will give an error. Use forward slashes for reliability, or double your backslashes (which looks ugly, especially with UNC names).
Or use raw strings like r"I:\sample-salesv2.csv".
(But there’s a pitfall: a \ at the end (r"I:\dir\") doesn’t work.)
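For comparison, here is a short sketch of the three spellings mentioned in this thread. It uses PureWindowsPath only to show that they name the same file on Windows; the I:\dir folder and the workaround on the last line are illustrative assumptions, not something from the original posts:

```python
from pathlib import PureWindowsPath

forward = "I:/sample-salesv2.csv"     # forward slashes: nothing to escape
doubled = "I:\\sample-salesv2.csv"    # doubled backslashes
raw = r"I:\sample-salesv2.csv"        # raw string: backslashes are kept literally

# The doubled and raw spellings produce exactly the same string.
assert doubled == raw

# On Windows, the forward-slash spelling names the same file as the other two.
assert PureWindowsPath(forward) == PureWindowsPath(raw)

# A raw string cannot end in a single backslash, so r"I:\dir\" is a syntax error.
# One possible workaround is to add the trailing separator as a normal string.
folder = r"I:\dir" + "\\"
```

PureWindowsPath does no file access, so this runs on any operating system.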
Rosuav
(Chris Angelico)
July 11, 2023, 11:40am
7
Yeah, which is why I always recommend forward slashes.