Could anyone please describe in words what this code does? Thanks.

i200yrs · May 16, 2023, 8:07am

def paper_assignment(arr, reviewer_df, opt):
     body of the code
return reviewer


df["R1"] = df[["Author", "Theme", "Sub_dept", "R1", "R2"]].apply(paper_assignment, args=(reviewer_df, "R1"), axis=1)

abessman · May 16, 2023, 9:14am

As written, it raises SyntaxError, because the return statement is not indented and so sits outside the function. The return should presumably be inside the paper_assignment function.
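For reference, a minimal sketch of what the definition presumably looks like once the return is indented. The body is a placeholder of my own, since the original post elides the real logic:

```python
def paper_assignment(arr, reviewer_df, opt):
    # Placeholder body (hypothetical): the original post only says
    # "body of the code", so the real logic is unknown.
    reviewer = None
    # Indented, so the return now belongs to the function.
    return reviewer
```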

The expression df[["Author", "Theme", "Sub_dept", "R1", "R2"]] creates a new dataframe containing the "Author", "Theme", "Sub_dept", "R1", and "R2" columns from df.

The .apply(paper_assignment, args=(reviewer_df, "R1"), axis=1) call applies the function paper_assignment to that new dataframe. Because axis=1, the function is called once per row, and each row is passed to it as a pandas.Series with the column names as its index. On every call, the first argument is the Series for the current row, the second argument is reviewer_df, and the third is "R1".

The results from these function calls are put into another Series object, which is assigned to the "R1" column of the original dataframe.
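Here is a small sketch that makes the mechanics visible. The function body, the sample data, and the "Reviewer" column are all made up for illustration (the original post shows none of them); only the shape of the apply call matches the question.

```python
import pandas as pd

calls = []  # records what the function receives on each call

def paper_assignment(arr, reviewer_df, opt):
    # Hypothetical body: the real logic is not shown in the original post.
    calls.append((type(arr).__name__, list(arr.index), opt))
    # Assumed rule: pick the first reviewer whose theme matches the row's theme.
    match = reviewer_df.loc[reviewer_df["Theme"] == arr["Theme"], "Reviewer"]
    return match.iloc[0] if not match.empty else None

# Made-up sample data with the column names used in the question.
df = pd.DataFrame({
    "Author": ["A", "B"],
    "Theme": ["ML", "DB"],
    "Sub_dept": ["x", "y"],
    "R1": [None, None],
    "R2": [None, None],
})
reviewer_df = pd.DataFrame({"Reviewer": ["Rev1", "Rev2"], "Theme": ["DB", "ML"]})

cols = ["Author", "Theme", "Sub_dept", "R1", "R2"]
# axis=1 means "call the function once per row"; args supplies the extra arguments.
result = df[cols].apply(paper_assignment, args=(reviewer_df, "R1"), axis=1)
df["R1"] = result  # the per-row return values become the new "R1" column
```

Inspecting calls afterwards shows that each call received a Series indexed by the five column names, plus "R1" as the third argument.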

duncanb · May 16, 2023, 2:54pm

FWIW, ChatGPT does a pretty good job of answering this question (a good screenful of description).

Rather than copying the answer here, I’ll just say it’s worth giving it this sort of question, because it can answer it pretty well. It did complain a bit (“Without knowing the specific implementation inside the paper_assignment function, it is not possible to provide a detailed description of what the code does”), but you can then ask follow-on questions if you need more clarity on anything.

vovavili · May 16, 2023, 3:56pm

Keep in mind that row-wise function application is considered an antipattern in data analysis libraries like pandas and dplyr. It’s not needed in 99 percent of the cases where it is used, and for the rare exceptions you’re better off using list comprehensions.
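To illustrate, here is a sketch comparing the three approaches. It assumes the assignment is a simple lookup of a reviewer by theme, which is my guess and not something the original post states; the data and the lookup helper are hypothetical.

```python
import pandas as pd

# Made-up data; the "Reviewer" column is an assumption for this example.
df = pd.DataFrame({"Author": ["A", "B", "C"], "Theme": ["ML", "DB", "ML"]})
reviewer_df = pd.DataFrame({"Theme": ["ML", "DB"], "Reviewer": ["Rev1", "Rev2"]})

# 1. Row-wise apply: one Python function call per row.
def lookup(row, reviewer_df):
    return reviewer_df.loc[reviewer_df["Theme"] == row["Theme"], "Reviewer"].iloc[0]

rowwise = df.apply(lookup, args=(reviewer_df,), axis=1)

# 2. Vectorized: build a Theme -> Reviewer Series once, then map the whole column.
theme_to_reviewer = reviewer_df.set_index("Theme")["Reviewer"]
vectorized = df["Theme"].map(theme_to_reviewer)

# 3. List comprehension: plain Python over the column values, no per-row Series.
mapping = theme_to_reviewer.to_dict()
listcomp = [mapping.get(theme) for theme in df["Theme"]]
```

All three give the same reviewers here. Whether the real paper_assignment can be rewritten this way depends on its body, which we haven't seen.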