Pandas dataframe insert rows based on grouping,

cheesebird · January 12, 2023, 3:40pm

I have a pandas dataframe something like this( much large in reality and it wouldn’t fit on this editor.

NAME  ID_DES  ID
ZZ11    FISH     1
ZZ22    FISH     1
ZZ33    FISH     1
ZZ44    FISH     1
XX11    CHIPS    2
XX22    CHIPS    2
XX23    CHIPS    2
XX24    CHIPS    2
YY11    SOUP     3
YY12    SOUP     3
YY13    SOUP     3
YY14    SOUP     3

What I am trying to achieve is to insert 3 rows before the start of each group. The groups are determined by ID or ID_DES.

I have success with grouping the df and exporting to excel…
df.groupby(‘ID’)

But not inserting new rows. Ideally i would ike these values to be inserted
“Desc:” with value from ID
“Header1:” with value from ID_DES
“Header2:” with values of both ID and ID_DES.

So the output DF would look like this…

NAME		      ID_DES ID
"Desc: FISH"        FISH    1
"Header1: 1"        FISH	1
"Header2: FISH1 " . FISH	1
ZZ11                FISH    1
ZZ22                FISH    1
ZZ33                FISH    1
ZZ44                FISH    1
"Desc: CHIPS"       CHIPS 	2
"Header1: 2"        CHIPS	2
"Header2: CHIPS 2 " CHIPS   2
XX11                CHIPS   2
XX22                CHIPS   2
XX23                CHIPS   2
XX24                CHIPS   2
"Desc: SOUP"        SOUP 	3
"Header1: 3"        SOUP	3
"Header2: SOUP 3" . SOUP    3
YY11                SOUP    3
YY12                SOUP    3
YY13                SOUP    3
YY14                SOUP    3

rob42 · January 12, 2023, 4:41pm

Maybe you could use the information here:

When I create a DataFrame, I usually construct said from a list or a dictionary. That way, it’s relatively easy to edit the constructor then recreate the df.

cheesebird · January 12, 2023, 4:50pm

Thanks, that tutorial doesn’t deal with groupby insertions. My df comes from an excel and the df is used to split it into multiple excel worksheets

rob42 · January 12, 2023, 5:08pm

You’re welcome; sorry it was of little help.

Just a thought (and maybe you’ve already considered this), but it depends on how you are reading the data in: could you not insert the on-the-fly?
Read the excel headers.
Insert the new headers
Read the data, row by row, until you hit a new ID
Insert the new headers.
Do the above until EOF

cheesebird · January 13, 2023, 1:21pm

No worries , i found a simple way to do it without using python. In the output excel simply select all the worksheets (in my case 100’s). Insert the 3 lines empty lines and then use =CONCATENATE (“Desc”: B5), =CONCATENATE (“Header1:”: C5) etc etc. Much less frustrating than battling with pandas