Pivot Duplicates Rows Into New Columns Pandas
I have a data frame like this and I'm trying to reshape it using pivot from pandas so that I can keep some values from the original rows while making the duplicates r
Solution 1:
Use cumcount to number the rows within each group, build a MultiIndex with set_index, reshape with unstack, and finally flatten the column labels:
g = df.groupby(["ID","Agent", "OV"]).cumcount().add(1)
df = df.set_index(["ID","Agent","OV", g]).unstack(fill_value=0).sort_index(axis=1, level=1)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
df = df.reset_index()
print (df)
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100     0       0     0     0       0     0
1   2   26.5   8.0    M2      50    95    M1       6     5     0       0     0
2   3    4.5   6.0    M3       4    40    M4       6    60     0       0     0
3   4    1.2   0.8    M1       8   100     0       0     0     0       0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
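For anyone who wants to run the steps end to end, here is a minimal sketch. The input frame is not shown in this post, so the one below is reconstructed from the printed output above and should be treated as an assumption; the name `wide` is also only illustrative. The order of the flattened columns may differ between pandas versions, so the sketch does not rely on it.

```python
import pandas as pd

# Input reconstructed from the printed output above (an assumption;
# the frame from the original question is not shown here).
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10.0, 26.5, 26.5, 4.5, 4.5, 1.2, 2.0, 2.0, 2.0],
    "OV":    [26.0, 8.0, 8.0, 6.0, 6.0, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

# Number each row within its (ID, Agent, OV) group: 1, 2, 3, ...
g = df.groupby(["ID", "Agent", "OV"]).cumcount().add(1)
print(g.tolist())  # [1, 1, 2, 1, 2, 1, 1, 2, 3]

# The counter becomes the last index level, which unstack() moves to columns.
wide = (
    df.set_index(["ID", "Agent", "OV", g])
      .unstack(fill_value=0)
      .sort_index(axis=1, level=1)
)

# Flatten ("Zone", 1) -> "Zone1" and restore the key columns.
wide.columns = ["{}{}".format(a, b) for a, b in wide.columns]
wide = wide.reset_index()
print(wide)
```

The counter is what makes the index unique: without it, the duplicated (ID, Agent, OV) keys could not be unstacked.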
If you want to replace missing values with 0 only in the numeric columns:
g = df.groupby(["ID","Agent","OV"]).cumcount().add(1)
df = df.set_index(["ID","Agent","OV", g]).unstack().sort_index(axis=1, level=1)
idx = pd.IndexSlice
df.loc[:, idx[['Value','PTC']]] = df.loc[:, idx[['Value','PTC']]].fillna(0).astype(int)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
df = df.fillna('').reset_index()
print (df)
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100             0     0             0     0
1   2   26.5   8.0    M2      50    95    M1       6     5             0     0
2   3    4.5   6.0    M3       4    40    M4       6    60             0     0
3   4    1.2   0.8    M1       8   100             0     0             0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
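A variation on the same idea, sketched under assumptions: instead of naming the numeric columns by hand, they can be picked by dtype after flattening, which also sidesteps any question of whether an assignment through .loc changes the column dtype. The small frame below is a subset of the rows implied by the output above, and `wide`, `num_cols` and `text_cols` are illustrative names only.

```python
import pandas as pd

# Small illustrative frame (rows for ID 1 and ID 5, as implied by the output above).
df = pd.DataFrame({
    "ID":    [1, 5, 5, 5],
    "Agent": [10.0, 2.0, 2.0, 2.0],
    "OV":    [26.0, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M1", "M2", "M4"],
    "Value": [10, 6, 41, 2],
    "PTC":   [100, 10, 86, 4],
})

g = df.groupby(["ID", "Agent", "OV"]).cumcount().add(1)
wide = df.set_index(["ID", "Agent", "OV", g]).unstack()
wide.columns = [f"{a}{b}" for a, b in wide.columns]

# Numeric columns (Value*, PTC*): missing -> 0, then back to integers.
num_cols = wide.select_dtypes(include="number").columns
wide[num_cols] = wide[num_cols].fillna(0).astype(int)

# Remaining text columns (Zone*): missing -> empty string.
text_cols = wide.columns.difference(num_cols)
wide[text_cols] = wide[text_cols].fillna("")

wide = wide.reset_index()
print(wide)
```

Assigning whole columns with `wide[num_cols] = ...` replaces them, so the integer dtype produced by `astype(int)` is kept.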
Solution 2:
You can use cumcount to create a helper key, then unstack on that key and flatten the resulting MultiIndex columns. (PS: you can add fillna(0) at the end. I did not add it because I do not think 0 is a correct value for Zone.)
df['New']=df.groupby(['ID','Agent','OV']).cumcount()+1
new_df=df.set_index(['ID','Agent','OV','New']).unstack('New').sort_index(axis=1 , level=1)
new_df.columns=new_df.columns.map('{0[0]}{0[1]}'.format)
new_df
Out[40]:
              Zone1  Value1   PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
ID Agent OV
1  10.0  26.0    M1    10.0  100.0  None     NaN   NaN  None     NaN   NaN
2  26.5  8.0     M2    50.0   95.0    M1     6.0   5.0  None     NaN   NaN
3  4.5   6.0     M3     4.0   40.0    M4     6.0  60.0  None     NaN   NaN
4  1.2   0.8     M1     8.0  100.0  None     NaN   NaN  None     NaN   NaN
5  2.0   0.4     M1     6.0   10.0    M2    41.0  86.0    M4     2.0   4.0
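Since the question asks about pivot specifically, the same helper key can also be fed to DataFrame.pivot. This is a sketch, not part of either answer: as far as I know, passing a list as `index` needs pandas 1.1 or later, the rows below are an illustrative subset consistent with the output above, and `keys`, `n_slots` and `ordered` are made-up names. The column order is set explicitly rather than relying on how sort_index orders the levels.

```python
import pandas as pd

# Illustrative rows (a subset consistent with the output above).
df = pd.DataFrame({
    "ID":    [2, 2, 4, 5, 5, 5],
    "Agent": [26.5, 26.5, 1.2, 2.0, 2.0, 2.0],
    "OV":    [8.0, 8.0, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M2", "M1", "M1", "M1", "M2", "M4"],
    "Value": [50, 6, 8, 6, 41, 2],
    "PTC":   [95, 5, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]
df["New"] = df.groupby(keys).cumcount() + 1

# With `values` omitted, pivot spreads every remaining column over "New".
new_df = df.pivot(index=keys, columns="New")
new_df.columns = [f"{name}{n}" for name, n in new_df.columns]

# Arrange the columns as Zone/Value/PTC for slot 1, then slot 2, and so on.
n_slots = int(df["New"].max())
ordered = [f"{c}{i}" for i in range(1, n_slots + 1) for c in ("Zone", "Value", "PTC")]
new_df = new_df[ordered]
print(new_df)
```

Missing slots stay as NaN here, in line with the PS above about not filling Zone with 0.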