I’m new to Python, and my main goal is to move my calculations from Excel to pandas DataFrames. My concern is that this line of code takes too long to run:
TRX["group_code"] = np.dot(
    (TRX["lookup"].values[:, None] == bcodes["ID_lookup"].values)
    & (TRX["completed_at / operation_completed_at"].values[:, None] >= bcodes["Effective from"].values)
    & (TRX["completed_at / operation_completed_at"].values[:, None] <= bcodes["Effective till"].values),
    bcodes["Code"],
)
TRX is a DataFrame with 298515 rows × 44 columns, and bcodes is a DataFrame with 6960 rows × 41 columns.
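To make clear what the line computes, here is a minimal sketch on toy data. Only the column names come from my frames; every value below is made up, and I use numeric keys and codes purely to keep the example simple (the real columns may hold other types).

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real frames; all values are invented for illustration.
TRX = pd.DataFrame({
    "lookup": [1, 2, 1],
    "completed_at / operation_completed_at": pd.to_datetime(
        ["2021-01-15", "2021-02-10", "2021-03-05"]
    ),
})
bcodes = pd.DataFrame({
    "ID_lookup": [1, 1, 2],
    "Effective from": pd.to_datetime(["2021-01-01", "2021-03-01", "2021-01-01"]),
    "Effective till": pd.to_datetime(["2021-02-28", "2021-12-31", "2021-12-31"]),
    "Code": [10, 20, 30],
})

# Column vector of transaction dates, shape (len(TRX), 1), so that each
# comparison broadcasts against every row of bcodes.
dates = TRX["completed_at / operation_completed_at"].values[:, None]

# Boolean matrix of shape (len(TRX), len(bcodes)):
# mask[i, j] is True when transaction i matches bcodes row j.
mask = (
    (TRX["lookup"].values[:, None] == bcodes["ID_lookup"].values)
    & (dates >= bcodes["Effective from"].values)
    & (dates <= bcodes["Effective till"].values)
)

# The dot product picks (sums) the Code of the matching bcodes row(s).
TRX["group_code"] = np.dot(mask, bcodes["Code"])
print(TRX["group_code"].tolist())
```

On the real data each of these comparisons builds a 298515 × 6960 matrix, i.e. 2,077,664,400 booleans, which is roughly 2 GB per intermediate array at one byte per element. I suspect that is where most of the time goes.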
I timed this line of code: 1min 43s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each).
Even when I keep only one lookup criterion (without the dates), it still takes 1 min 37 s.
I’m pretty sure there is a better/faster way to get this result.
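One direction I could imagine is merging on the lookup key first and only then filtering by the date range, so that the full rows × rows matrix is never built. The following is only a sketch on made-up toy data, not benchmarked on the real frames. The helper names (`date_col`, `row`, `candidates`) are my own, and it is not an exact replacement: unmatched rows come out as NaN, and if several bcodes rows match I simply take the first one, whereas `np.dot` would sum them.

```python
import pandas as pd

# Toy stand-ins for the real frames; all values are invented for illustration.
TRX = pd.DataFrame({
    "lookup": [1, 2, 1],
    "completed_at / operation_completed_at": pd.to_datetime(
        ["2021-01-15", "2021-02-10", "2021-03-05"]
    ),
})
bcodes = pd.DataFrame({
    "ID_lookup": [1, 1, 2],
    "Effective from": pd.to_datetime(["2021-01-01", "2021-03-01", "2021-01-01"]),
    "Effective till": pd.to_datetime(["2021-02-28", "2021-12-31", "2021-12-31"]),
    "Code": [10, 20, 30],
})

date_col = "completed_at / operation_completed_at"

# Carry the original row label along so results can be aligned back onto TRX.
left = TRX[["lookup", date_col]].rename_axis("row").reset_index()

# Pair each transaction only with the bcodes rows that share its key.
candidates = left.merge(
    bcodes[["ID_lookup", "Effective from", "Effective till", "Code"]],
    left_on="lookup",
    right_on="ID_lookup",
    how="inner",
)

# Keep the pairs whose date falls inside the validity window (bounds inclusive).
in_range = candidates[date_col].between(
    candidates["Effective from"], candidates["Effective till"]
)
matched = candidates.loc[in_range]

# One code per transaction: first match wins; unmatched rows become NaN.
TRX["group_code"] = matched.groupby("row")["Code"].first().reindex(TRX.index)
print(TRX["group_code"].tolist())
```

Whether this is actually faster depends on how many bcodes rows share each key, since the merge produces one row per (transaction, candidate) pair.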