I'm learning Python and need help with a topic

stl2lawton · September 25, 2023, 8:00am

I’ve got an assignment to use a lambda function in agg to aggregate an imported DataFrame, returning the number of rows with a value greater than 1.5. So far I have the following, but I don’t think it’s exactly what I’m supposed to produce, and furthermore it doesn’t use a built-in function.

print(df.iloc[:,:-1].agg([lambda x : [i for i in x if i> 1.5]]))

abessman · September 25, 2023, 9:33am

Throughout the DataFrame or per column?

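Your attempt is pretty close if you want it per column. As written, your lambda returns the matching values themselves rather than how many there are, so wrap the list in the built-in len. Your lambda should be lambda x: len([i for i in x if i > 1.5]).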
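For illustration, here is a minimal sketch of the per-column count. The sample DataFrame is made up (the original poster's data is not shown), and the variant using the built-in sum is just one other possible spelling, not necessarily what the assignment expects.

```python
import pandas as pd

# Hypothetical sample data; the original poster's DataFrame is not shown.
df = pd.DataFrame({"A": [2, 2, 2, 1, 1, 0, 0],
                   "B": [3, 3, 3, 2, 2, 1, 1]})

# agg passes each column to the lambda as a Series.
# Build a list of the values above 1.5 and count them with the built-in len.
counts_len = df.agg(lambda x: len([i for i in x if i > 1.5]))

# Alternative: x > 1.5 is a boolean Series, and the built-in sum
# counts its True entries (True counts as 1).
counts_sum = df.agg(lambda x: sum(x > 1.5))

print(counts_len)
print(counts_sum)
```

Both calls return a Series indexed by column name, with one count per column.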

However, I don’t understand why the assignment says to use agg. It’s typically used for applying multiple independent operations to a DataFrame and aggregating the results. If all you want is a single result (“number of rows containing a value greater than 1.5”), apply is a better choice.
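To make the distinction concrete, here is a rough sketch on a made-up DataFrame: agg combining several independent operations in one call, next to apply producing a single per-column result. The data and variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"A": [2, 2, 2, 1, 1, 0, 0],
                   "B": [3, 3, 3, 2, 2, 1, 1]})

# agg with a list of operations: one result row per operation.
summary = df.agg(["min", "max"])

# apply with a single function: one value per column.
above = df.apply(lambda col: (col > 1.5).sum())

print(summary)
print(above)
```

With a single callable, agg and apply give the same per-column result, so the choice between them is mostly about intent.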

stl2lawton · September 25, 2023, 9:46am

Thank you very much. It is a training exercise, mostly about learning to work with agg. That’s why it’s asking for it specifically.

stl2lawton · September 25, 2023, 9:48am

The question is written as follows:

Aggregate the dataframe to show the number of rows greater than 1.5 in each column using a lambda expression in agg(), and built-in functions.

hansgeunsmeyer · September 26, 2023, 5:05pm

If you read it as “show, for each column, the number of rows where …”, then @abessman’s suggestion works very nicely.
If you instead want to count the number of rows where every column is > 1.5, then it can be simplified:

>>> df
   A  B
0  2  3
1  2  3
2  2  3
3  1  2
4  1  2
5  0  1
6  0  1

>>> sum(df.agg(lambda row: all(row > 1.5), axis=1))
3
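
As a side note, the same row-wise count can also be sketched without a lambda, using vectorized comparisons. This is an alternative outside the assignment's "use agg" requirement; the DataFrame below just rebuilds the example above, and the any() variant covers the "rows containing a value greater than 1.5" reading.

```python
import pandas as pd

# Same data as the example above.
df = pd.DataFrame({"A": [2, 2, 2, 1, 1, 0, 0],
                   "B": [3, 3, 3, 2, 2, 1, 1]})

# Rows where every column is > 1.5.
rows_all = int((df > 1.5).all(axis=1).sum())

# Rows where at least one column is > 1.5.
rows_any = int((df > 1.5).any(axis=1).sum())

print(rows_all)  # 3
print(rows_any)  # 5
```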