TypeError: can't pickle _thread.RLock objects in pandas with multiprocessing

I have a DataFrame of 12,000 rows. I want to use pandas with multiprocessing and map a function over the DataFrame in parallel.

import multiprocessing as mp

import numpy as np
import pandas as pd

df = pd.read_csv(input_file, dtype=str, names=columns)
df_split = np.array_split(df, 4)
# pool = mp.Pool(4)
for df_data in df_split:
    param = [df_data, version, logger]
    with mp.Pool(4) as pool:
        out_df_lst = pool.map(func, param)
out_df = pd.concat(out_df_lst)

All of this runs inside a Django REST API. When I make a POST request through Postman, it throws ‘TypeError: can’t pickle _thread.RLock objects’. The program works as intended when I run it without any multiprocessing.

Please help me understand this issue so I can make the program work with multiprocessing.

It looks like you’re passing an object to func that can’t be pickled by the multiprocessing pool. Note that pool.map(func, param) calls func once per element of the iterable, so df_data, version, and logger are each pickled and sent to a worker as a separate task. You didn’t mention what the variables version and logger are, but I suspect the culprit is one of those. Try leaving them out of your list of params to test.
Also, you’re overwriting out_df_lst on each iteration of the loop, which I’m sure isn’t intended.
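A minimal sketch of one way to restructure this: do a single pool.map over all the chunks, and bind the extra picklable argument with functools.partial instead of putting it in the iterable. (func, the "version" column, and the sample DataFrame here are hypothetical stand-ins for your actual worker and data.)

```python
import multiprocessing as mp
from functools import partial

import numpy as np
import pandas as pd

def func(df_chunk, version):
    # hypothetical worker: tag each row of the chunk with the version
    df_chunk = df_chunk.copy()
    df_chunk["version"] = version
    return df_chunk

def process(df, version):
    df_split = np.array_split(df, 4)
    with mp.Pool(4) as pool:
        # one map over all chunks; partial binds the extra picklable argument,
        # so the pool iterates over DataFrame chunks only
        out_df_lst = pool.map(partial(func, version=version), df_split)
    return pd.concat(out_df_lst)

if __name__ == "__main__":
    df = pd.DataFrame({"a": range(12)})
    out_df = process(df, "v1")
```

This also fixes the overwrite: out_df_lst holds the result for every chunk, not just the last one.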

I removed the ‘logger’ parameter. Logger is the logger object I use to log info. That parameter was indeed what caused ‘TypeError: can’t pickle _thread.RLock objects’. Which is strange, because in other applications I have passed a logger as a parameter while using multiprocessing.
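The error comes from the fact that a Logger’s handlers hold thread locks (the _thread.RLock in the traceback), which cannot be pickled. A common workaround is to not pass the logger at all and instead look it up by name inside the worker with logging.getLogger, which returns the same named logger in each process. A sketch, with a hypothetical logger name "my_app":

```python
import logging
import multiprocessing as mp

def func(item):
    # fetch the process-local logger by name instead of
    # pickling a Logger object as an argument
    logger = logging.getLogger("my_app")  # hypothetical name
    logger.info("processing %s", item)
    return item * 2

def run(items):
    with mp.Pool(2) as pool:
        return pool.map(func, items)

if __name__ == "__main__":
    results = run([1, 2, 3])
```

Whether passing a logger happens to work elsewhere can depend on the start method: with fork the child inherits the logger without pickling it, while arguments sent through pool.map are always pickled.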