I have the following DataFrame and want to reshape it into the layout shown under "Desired output":

import pandas as pd
df = pd.DataFrame({'ID':[111,111,111,222,222,333],
                   'class':['merc','humvee','bmw','vw','bmw','merc'],
                   'imp':[1,2,3,1,2,1]})
print(df)
    ID   class  imp
0  111    merc    1
1  111  humvee    2
2  111     bmw    3
3  222      vw    1
4  222     bmw    2
5  333    merc    1

Desired output:

    ID       0        1       2
0  111    merc   humvee     bmw
1  111       1        2       3
2  222      vw      bmw
3  222       1        2
4  333    merc      
5  333       1

I want to transpose the DataFrame group by group (grouped by a particular column, ID in this case) while maintaining the original row order within each group.

My attempt: I tried using .set_index() and .unstack(), but it did not work.

  • Do the column names in the output have any significance, or can they be anything? Commented Feb 1, 2021 at 12:00
  • They can be anything (0, 1, 2 or col1, col2, col3), as long as a particular order is maintained. Commented Feb 1, 2021 at 12:03

2 Answers

Use GroupBy.cumcount to build a per-group counter, then reshape with DataFrame.stack and Series.unstack:

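# Number the rows within each ID (0, 1, 2, ...) and index by (ID, counter)
df1 = (df.set_index(['ID', df.groupby('ID').cumcount()])
         .stack()                          # move 'class'/'imp' into the index
         .unstack(1, fill_value='')        # spread the counter level across columns
         .reset_index(level=1, drop=True)  # drop the 'class'/'imp' level
         .reset_index())                   # turn ID back into a column
print(df1)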
    ID     0       1    2
0  111  merc  humvee  bmw
1  111     1       2    3
2  222    vw     bmw     
3  222     1       2     
4  333  merc             
5  333     1             
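
To see what each step of the chain contributes, it can be split into intermediate variables. This is only a sketch that re-runs the same calls on the question's data; the names counter, indexed, long_form and wide are made up here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'ID': [111, 111, 111, 222, 222, 333],
                   'class': ['merc', 'humvee', 'bmw', 'vw', 'bmw', 'merc'],
                   'imp': [1, 2, 3, 1, 2, 1]})

# Position of each row within its ID group: 0, 1, 2, 0, 1, 0
counter = df.groupby('ID').cumcount()

# Index by (ID, counter); the remaining columns are 'class' and 'imp'
indexed = df.set_index(['ID', counter])

# One value per (ID, counter, column name)
long_form = indexed.stack()

# Spread the counter level (level 1) across columns; gaps become ''
wide = long_form.unstack(1, fill_value='')

# Drop the column-name level and turn ID back into a regular column
df1 = wide.reset_index(level=1, drop=True).reset_index()
print(df1)
```

Printing indexed, long_form and wide one at a time is probably the quickest way to see where each row of the final frame comes from.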

Comments

Thanks Josef for the answer. Just a small question before I accept it: will the order of the columns 'class' and 'imp' be maintained in the rows? I mean, for every group, will the value appearing first in the DataFrame appear first in the rows as well? Like 0, 1, 2 and never 2, 0, 1?
@cph_sto - No, it sorts alphanumerically. So to be 100% sure of the correct ordering, the columns need to be renamed; give me some time for a solution.
Oh, I see. Actually, if merc appears at the top followed by humvee and bmw, then the same order must be followed. Thanks for your help.
@cph_sto - Not sure if I understand. I just tested with pd.DataFrame({'ID':[111,111,111,222,222,333], 'z':['merc','humvee','bmw','vw','bmw','merc'], 'a':[1,2,3,1,2,1]}) and pd.DataFrame({'ID':[111,111,111,222,222,333], 'a':['merc','humvee','bmw','vw','bmw','merc'], 'z':[1,2,3,1,2,1]}) and got the same output, so in recent pandas versions (tested in pandas 1.1.3) it works the same way and gives the correct result.
What I meant is this: if, in the initial df for ID 111, rows 0, 1, 2 are merc, humvee and bmw respectively, then after transposing I want to be sure that columns 0, 1, 2 are merc, humvee and then bmw, and not merc, bmw and then humvee. The order from the initial DataFrame must be maintained when transposed.
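
If the ordering discussed in these comments needs to hold by construction, rather than depend on how unstack arranges its result, one option is to transpose each group separately and stack the pieces. This is a sketch of an alternative, not the method from either answer; the names pieces, block and out are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'ID': [111, 111, 111, 222, 222, 333],
                   'class': ['merc', 'humvee', 'bmw', 'vw', 'bmw', 'merc'],
                   'imp': [1, 2, 3, 1, 2, 1]})

pieces = []
# sort=False keeps the groups in order of first appearance
for key, group in df.groupby('ID', sort=False):
    # Rows become columns; the row order inside the group is kept as-is
    block = group.drop(columns='ID').T.reset_index(drop=True)
    block.columns = range(block.shape[1])  # relabel columns 0, 1, 2, ...
    block.insert(0, 'ID', key)             # repeat the ID on every row
    pieces.append(block)

# Shorter groups get NaN in the missing columns; replace those with ''
out = pd.concat(pieces, ignore_index=True).fillna('')
print(out)
```

Because each group is transposed on its own, column 0 always holds the group's first row, column 1 its second, and so on. The explicit loop will likely be slower than the stack/unstack chain on frames with many groups.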

Another method is to use groupby and concat. Although this is not fully dynamic, it works fine if you only have two columns to work with, namely class and imp:

s = df.set_index([df['ID'],df.groupby('ID').cumcount()]).unstack(1)

df1 = pd.concat([s['class'],s['imp']],axis=0).sort_index().fillna('')

print(df1)

idx     0       1    2
ID                    
111  merc  humvee  bmw
111     1       2    3
222    vw     bmw     
222     1       2     
333  merc             
333     1             
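
The same idea can be made less hard-coded by looping over a list of value columns. The sketch below is a variation under a few assumptions of mine: value_cols is a hypothetical name, the value columns are cast to object first so the integers are not upcast to float when unstack introduces NaN, and sort_index is called with kind='stable' so that rows sharing an ID keep their concat order (class before imp) regardless of frame size.

```python
import pandas as pd

df = pd.DataFrame({'ID': [111, 111, 111, 222, 222, 333],
                   'class': ['merc', 'humvee', 'bmw', 'vw', 'bmw', 'merc'],
                   'imp': [1, 2, 3, 1, 2, 1]})

# Columns to spread out, in the order their rows should appear per ID
value_cols = ['class', 'imp']

# Cast to object so integer values survive the NaN padding unchanged
obj = df.astype({c: object for c in value_cols})

# One block of columns 0, 1, 2 per value column, indexed by ID
s = obj.set_index(['ID', obj.groupby('ID').cumcount()]).unstack(1)

out = (pd.concat([s[c] for c in value_cols])
         .sort_index(kind='stable')  # group rows by ID, keep concat order within an ID
         .fillna('')
         .reset_index())
print(out)
```

Adding a third value column would then only require extending value_cols.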
