
Pandas: Concatenate Dataframe And Keep Duplicate Indices

I have two dataframes that I would like to concatenate column-wise (axis=1) with an inner join. One of the dataframes has some duplicate indices, but the rows are not duplicates, a

Solution 1:

You can perform a merge and set the parameters so that it joins on the index of both the left-hand and right-hand frames:

In[4]:    
df1.merge(df2, left_index=True, right_index=True)
Out[4]:
   b  c
a
1  2  5
1  3  5
2  4  6

[3 rows x 2 columns]

Concat should have worked too; it worked for me:

In[5]:

pd.concat([df1,df2], join='inner', axis=1)
Out[5]:
   b  c
a
1  2  5
1  3  5
2  4  6

[3 rows x 2 columns]
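
The frames behind the output above are not shown in the answer, so here is a minimal sketch of the index-based merge with assumed data: `df1` has a duplicated index value (1 appears twice) and `df2` has unique index values. The column names `b` and `c` and the index name `a` are guesses read off the printed output, not the OP's actual data.

```python
import pandas as pd

# Assumed data: df1 has a duplicate index label (1), df2 does not.
df1 = pd.DataFrame({"b": [2, 3, 4]}, index=pd.Index([1, 1, 2], name="a"))
df2 = pd.DataFrame({"c": [5, 6]}, index=pd.Index([1, 2], name="a"))

# Inner join on the index of both frames (how="inner" is merge's default).
# Each df1 row is paired with the df2 row that shares its index label,
# so the duplicated label 1 is kept twice in the result.
merged = df1.merge(df2, left_index=True, right_index=True)
print(merged)
```

Both rows labelled 1 in `df1` pick up the same `c` value from `df2`, which is the "keep duplicate indices" behaviour the question asks for.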

Solution 2:

Coming back to this because I was looking for how to merge on columns with different names and keep duplicates:

df1 = pd.DataFrame([{'a':1,'b':2},{'a':1,'b':3},{'a':2,'b':4}],
                   columns = ['a','b'])
df1
   a  b
0  1  2
1  1  3
2  2  4

df2 = pd.DataFrame([{'c':1,'d':5},{'c':2,'d':6}],
                   columns = ['c','d'])
df2
   c  d
0  1  5
1  2  6

And found that pd.merge(df1, df2.set_index('c'), left_on='a', right_index=True) accomplished this:

df3
   a  b  d
0  1  2  5
1  1  3  5
2  2  4  6

If the key column has the same name in both frames (as in the OP's example), you could also use df2.set_index('a') with left_on='a'.
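
As a rough sketch of that same-name case, the snippet below renames the key column of `df2` to `a` (an assumption made for illustration; the frames otherwise mirror the ones above) and compares the `set_index` approach with a plain `on='a'` merge, which is another way to get the same rows when the names match.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]})
# Assumed variant of df2 whose key column is also called 'a'.
df2 = pd.DataFrame({"a": [1, 2], "d": [5, 6]})

# Approach from the answer: move the key into df2's index, then match
# df1's column 'a' against that index.
via_index = pd.merge(df1, df2.set_index("a"), left_on="a", right_index=True)

# Alternative when the names match: merge directly on the shared column.
via_on = pd.merge(df1, df2, on="a")

print(via_index)
print(via_on)
```

Either way, the two `df1` rows with `a == 1` are both kept and each receives `d == 5`.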
