Copy a row's value into the previous row

arvin_90 · March 15, 2023, 12:32pm

I have a dataframe with two columns.

import pandas as pd
  
data = {'Date': ['03-10-11', '03-10-11', '0', '31-10-11', '0'],
        'Col1': ["B/F", "Loan", "Loan Disbursement", "Int.:03-10-201", "Int.:03-10-2011 To 31-10-2011"]}

df = pd.DataFrame(data)

I want the output to look like this:

data = {'Date': ['03-10-11', '03-10-11', '0', '31-10-11', '0'],
        'Col1': ["B/F", "Loan Disbursement", "Loan Disbursement", "Int.:03-10-2011 To 31-10-2011", "Int.:03-10-2011 To 31-10-2011"]}

Whenever there is a 0 value in the Date column, the corresponding value in the Col1 column should be copied into the previous row of the Col1 column.

I hope I have explained my problem clearly.

Thanks in advance

MRAB · March 15, 2023, 8:43pm

I managed to do it by creating a mask of the rows whose date is '0', shifting the mask up by one row to make a mask for the previous rows, and then copying cells from the latter rows to the former rows:

mask = df['Date'] == '0'
df.loc[mask.shift(-1, fill_value=False), 'Col1'] = list(df.loc[mask, 'Col1'])
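
For reference, here is a self-contained sketch of the same two steps, run against the sample data from the question. The name prev_mask is introduced here only for illustration. The sketch assumes the first row is never a '0' row, since such a row has no previous row to copy into.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['03-10-11', '03-10-11', '0', '31-10-11', '0'],
    'Col1': ["B/F", "Loan", "Loan Disbursement", "Int.:03-10-201",
             "Int.:03-10-2011 To 31-10-2011"],
})

# True for the rows whose Date is the string '0' (rows 2 and 4 here).
mask = df['Date'] == '0'

# shift(-1) moves each flag one row earlier, so this marks the rows
# directly before a '0' row (rows 1 and 3 here).
prev_mask = mask.shift(-1, fill_value=False)

# Copy Col1 from each '0' row into the row before it, by position.
df.loc[prev_mask, 'Col1'] = list(df.loc[mask, 'Col1'])

print(df)
```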

The list is there to stop it setting cells to NaN. I have no idea why it does that…
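
A likely explanation is index alignment. When the right-hand side of a .loc assignment is a Series, pandas matches values by index label rather than by position. The source rows carry labels 2 and 4 while the target rows carry labels 1 and 3, so nothing matches and the targets become NaN. Converting to a list (or a NumPy array) drops the labels, so the values are assigned in order. A small sketch of the difference, where the names source, aligned and positional are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['03-10-11', '03-10-11', '0', '31-10-11', '0'],
    'Col1': ["B/F", "Loan", "Loan Disbursement", "Int.:03-10-201",
             "Int.:03-10-2011 To 31-10-2011"],
})

mask = df['Date'] == '0'
prev_mask = mask.shift(-1, fill_value=False)

# A Series that still carries the index labels of the '0' rows (2 and 4).
source = df.loc[mask, 'Col1']

# Assigning the Series directly: pandas looks up labels 1 and 3 in
# `source`, finds neither, and writes NaN into the target cells.
aligned = df.copy()
aligned.loc[prev_mask, 'Col1'] = source

# Assigning the bare values: no labels to align on, so the values are
# written in order into the selected rows.
positional = df.copy()
positional.loc[prev_mask, 'Col1'] = source.to_numpy()

print(aligned)
print(positional)
```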

arvin_90 · March 16, 2023, 12:50pm

Thank you so much.
I made a minor change and it works exactly as I want.

mask = df['Date'] == 0
df.loc[mask.shift(-1, fill_value=False), 'Col1'] = list(df.loc[mask, 'Col1'])
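
Note that comparing against the integer 0 only matches if the Date column holds integer zeros. In the sample data posted above the zeros are the string '0', which == 0 would not match. A hedged sketch that handles either case by comparing the string form of each value; the mixed-type sample frame below is an assumption about what the real data might look like, not something taken from the thread:

```python
import pandas as pd

# Assumed variant of the sample data with integer zeros in Date.
df = pd.DataFrame({
    'Date': ['03-10-11', '03-10-11', 0, '31-10-11', 0],
    'Col1': ["B/F", "Loan", "Loan Disbursement", "Int.:03-10-201",
             "Int.:03-10-2011 To 31-10-2011"],
})

# astype(str) turns both 0 and '0' into '0', so one comparison covers both.
mask = df['Date'].astype(str) == '0'
df.loc[mask.shift(-1, fill_value=False), 'Col1'] = list(df.loc[mask, 'Col1'])

print(df)
```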

arvin_90 · March 17, 2023, 11:37am

@ Matthew Barnett - is it possible to check whether the text is already in the previous row, and add only the text that is not already there?