Pandas: Concatenate Dataframe And Keep Duplicate Indices
I have two dataframes that I would like to concatenate column-wise (axis=1) with an inner join. One of the dataframes has some duplicate indices, but the rows are not duplicates, a
Solution 1:
You can perform a merge and set the parameters so that the index is used as the join key on both the left-hand and right-hand sides:
In[4]:
df1.merge(df2, left_index=True, right_index=True)
Out[4]:
   b  c
a
1  2  5
1  3  5
2  4  6

[3 rows x 2 columns]
concat should also have worked; it did for me:
In[5]:
pd.concat([df1,df2], join='inner', axis=1)
Out[5]:
   b  c
a
1  2  5
1  3  5
2  4  6

[3 rows x 2 columns]
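For anyone who wants to reproduce this, here is a minimal sketch. The question's original frames are not shown, so df1 and df2 below are hypothetical, reconstructed only from the output above (df1 has a duplicated index label, df2 does not). It also shows DataFrame.join as an equivalent index-on-index inner join. The concat call is left out of the sketch because its behaviour with duplicate index labels may differ between pandas versions.

```python
import pandas as pd

# Hypothetical input frames, reconstructed from the output shown above.
# df1 has the index label 1 twice; the rows themselves differ.
df1 = pd.DataFrame({"b": [2, 3, 4]}, index=pd.Index([1, 1, 2], name="a"))
df2 = pd.DataFrame({"c": [5, 6]}, index=pd.Index([1, 2], name="a"))

# Index-on-index merge; how="inner" is the default for merge.
merged = df1.merge(df2, left_index=True, right_index=True)

# DataFrame.join also joins on the index; it defaults to a left join,
# so how="inner" is passed explicitly.
joined = df1.join(df2, how="inner")

print(merged)
print(joined)
```

Both results keep the two rows with index label 1, each paired with the single matching row of df2.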
Solution 2:
Coming back to this because I was looking for how to merge on columns with different names while keeping duplicates:
df1 = pd.DataFrame([{'a':1,'b':2},{'a':1,'b':3},{'a':2,'b':4}],
columns = ['a','b'])
df1
a b
0 1 2
1 1 3
2 2 4
df2 = pd.DataFrame([{'c':1,'d':5},{'c':2,'d':6}],
columns = ['c','d'])
df2
c d
0 1 5
1 2 6
I found that this accomplished it:
df3 = pd.merge(df1, df2.set_index('c'), left_on='a', right_index=True)
df3
   a  b  d
0  1  2  5
1  1  3  5
2  2  4  6
If the key column has the same name in both frames (as in the OP's example), you could instead use df2.set_index('a') together with left_on='a'.
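A rough sketch of the variants side by side, using the same df1 and df2 as in this answer. The frame df2_same is a hypothetical renamed copy, introduced only to illustrate the same-name case, and the left_on/right_on form is an alternative of mine that avoids touching the index; it is not something the answer itself relies on.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 1, 2], "b": [2, 3, 4]})
df2 = pd.DataFrame({"c": [1, 2], "d": [5, 6]})

# Different key names: match df1['a'] against df2's index built from 'c'.
df3 = pd.merge(df1, df2.set_index("c"), left_on="a", right_index=True)

# Same key name (hypothetical frame where the key is also called 'a').
df2_same = df2.rename(columns={"c": "a"})
df4 = pd.merge(df1, df2_same.set_index("a"), left_on="a", right_index=True)

# Alternative without set_index: name both key columns, then drop the
# redundant right-hand key from the result.
df5 = pd.merge(df1, df2, left_on="a", right_on="c").drop(columns="c")

print(df3)
print(df4)
print(df5)
```

In all three cases the duplicated key value 1 in df1 produces two output rows, each carrying d = 5 from df2.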