I have a pandas DataFrame of 12,000 rows. I want to split it into chunks and apply a mapping function to the chunks in parallel with multiprocessing.
import multiprocessing as mp
import numpy as np
import pandas as pd

df = pd.read_csv(input_file, dtype=str, names=columns)
df_split = np.array_split(df, 4)
# pool = mp.Pool(4)
for df_data in df_split:
    param = [df_data, version, logger]
    with mp.Pool(4) as pool:
        out_df_lst = pool.map(func, param)
        out_df = pd.concat(out_df_lst)
This code runs inside a Django REST API. When I send a POST request through Postman, it raises the error "TypeError: can't pickle _thread.RLock objects". The program works as intended when I run it without multiprocessing.
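One way to narrow this down is to try pickling each argument on its own, because pool.map pickles every item it sends to a worker process. The sketch below is only an illustration under stated assumptions: HoldsLock and unpicklable are hypothetical names, and the values are stand-ins for df_data, version and logger. HoldsLock imitates an object that owns a threading.RLock, which logging handlers do. Whether the real logger is the culprit depends on the Python version and on what else is reachable from the arguments.

```python
import pickle
import threading


class HoldsLock:
    """Hypothetical stand-in for an object that owns a lock."""

    def __init__(self):
        self.lock = threading.RLock()


def unpicklable(named_args):
    """Return {name: error message} for every argument that pickle rejects."""
    failures = {}
    for name, value in named_args.items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # usually TypeError or pickle.PicklingError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in values; in the real view these would be df_data, version and logger.
report = unpicklable({
    "df_data": [["a", "b"], ["c", "d"]],
    "version": "1.0",
    "logger": HoldsLock(),
})
for name, message in report.items():
    print(f"{name} cannot be pickled: {message}")
```

Running the same check on the real arguments inside the view should show which one carries the lock. Note also that pool.map(func, param) treats param as the list of tasks, so each of its three elements is pickled and passed to func in a separate call.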
Please help me understand what causes this error and how to make the program work with multiprocessing.