Is there any concise way to create a column of groupby mean in a pandas df?

john316 · May 19, 2023, 11:59pm

Hi, I have a pandas DataFrame:

import pandas as pd
df = pd.DataFrame({'a': [0,0,2,2,3,3], 'b':[4,5,2,1,8,6]})
df_temp = df.groupby('a')[['b']].mean()
df_temp.rename(columns={"b": "d"}, inplace = True)
df1 = df.join(df_temp, on = 'a')

In df1, columns a and b are inherited from df, and d shows the mean of b for each a category.

I was wondering if there is a more concise way to create the d column?

CAM-Gerlach · May 23, 2023, 10:59pm

Thanks for the minimal but complete reproducible example.

In this case, after your initial setup code:

import pandas as pd

df = pd.DataFrame({'a': [0,0,2,2,3,3], 'b':[4,5,2,1,8,6]})

You can use the transform method of the groupby object, which computes the mean of each group and broadcasts it back onto the rows of the original dataframe, and then assign the result to a new column:

df['d'] = df.groupby('a')['b'].transform('mean')

This yields your desired result:

   a  b    d
0  0  4  4.5
1  0  5  4.5
2  2  2  1.5
3  2  1  1.5
4  3  8  7.0
5  3  6  7.0
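
As a quick sanity check, here is a minimal sketch comparing the transform approach with the join approach from the question. Selecting column b explicitly before calling transform is an assumption added here; it keeps the result a Series even if the dataframe later gains more columns. The names joined, group_mean, and transformed are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 2, 2, 3, 3], 'b': [4, 5, 2, 1, 8, 6]})

# Approach from the question: aggregate, rename, then join back on the group key.
df_temp = df.groupby('a')[['b']].mean().rename(columns={'b': 'd'})
joined = df.join(df_temp, on='a')

# transform approach: each group's mean is broadcast to every row of that group,
# so the result is aligned with the original index and can be assigned directly.
group_mean = df.groupby('a')['b'].transform('mean')
transformed = df.copy()
transformed['d'] = group_mean

# Raises an AssertionError if the two results differ.
pd.testing.assert_frame_equal(joined, transformed)
```

If the final line runs without raising, both routes produced the same dataframe.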

This Stack Overflow answer has more details:

(Optionally, if you want the result in a new dataframe instead of the original, as in your code above, first create it with df1 = df.copy() and then assign the new column to df1['d'] rather than df['d'].)
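
If you would rather get the new dataframe in a single step, one possible alternative (not part of the suggestion above, just a sketch) is DataFrame.assign, which returns a new dataframe and leaves the original untouched:

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 2, 2, 3, 3], 'b': [4, 5, 2, 1, 8, 6]})

# assign returns a new DataFrame with the extra column;
# df itself keeps only columns a and b.
df1 = df.assign(d=df.groupby('a')['b'].transform('mean'))
```

This avoids the explicit copy, at the cost of putting the groupby expression inside the assign call.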