Merging Data Frames In Pandas
Solution 1:
You have a few issues going on. First, your merge statements are not constructed correctly. You shouldn't use both `left_on` and `left_index`, or both `right_on` and `right_index`, at the same time. Use exactly one left option and one right option.
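As a rough sketch of what "one left option and one right option" looks like in practice, here are the four valid pairings. The frames `a` and `b` and their columns are made-up toy data, not the frames from the question.

```python
import pandas as pd

# Hypothetical toy frames, used only to illustrate the argument pairings.
a = pd.DataFrame({'k': [0, 1, 2], 'x': ['a', 'b', 'c']})
b = pd.DataFrame({'k': [0, 2, 3], 'y': [10.0, 20.0, 30.0]})

# Column on the left, column on the right.
cols = pd.merge(a, b, left_on='k', right_on='k')

# Index on the left, index on the right.
idx = pd.merge(a.set_index('k'), b.set_index('k'),
               left_index=True, right_index=True)

# Index on the left, column on the right.
mixed = pd.merge(a.set_index('k'), b, left_index=True, right_on='k')

# Column on the left, index on the right.
mixed2 = pd.merge(a, b.set_index('k'), left_on='k', right_index=True)

# Each call is an inner join (the default) and keeps the keys 0 and 2.
for name, frame in [('cols', cols), ('idx', idx),
                    ('mixed', mixed), ('mixed2', mixed2)]:
    print(name, len(frame))
```

Each call names exactly one key source per side, so there is no question of which argument wins.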
The error in your second statement occurs because the index levels do not match. In your left merge, the left index is a single level, and while you specify both `right_index=True` and `right_on='event1'`, the `right_on` argument takes precedence. Since the left index and `event1` are both single-level integers, there is no problem. I should point out that the merge, if constructed correctly (`pd.merge(left, right, left_index=True, right_on='event1', how='left')`), does not produce an empty DataFrame. See the code below.
In your right merge, you specify the right index with `right_index=True`, and `left_on` takes precedence over `left_index=True`. The issue here is that the right index has two levels, whereas your `'key1'` field is a single-level string.
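If the goal is to match rows against the two-level right index, one possible fix is to pass two left columns so the number of keys equals the number of index levels. This is a sketch that assumes `key1` and `key2` are meant to line up with the first and second index levels; the frames are the same ones used in this answer.

```python
import numpy as np
import pandas as pd

right = pd.DataFrame(
    data=np.arange(12).reshape((6, 2)),
    index=[['Nevada', 'Nevada', 'Ohio', 'Ohio', 'Ohio', 'Ohio'],
           [2001, 2000, 2000, 2000, 2001, 2002]],
    columns=['event1', 'event2'],
)
left = pd.DataFrame({
    'key1': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
    'key2': [2000, 2001, 2002, 2001, 2002],
    'data': np.arange(5.),
})

# Two left columns against the two-level right index (assumed intent).
merged = left.merge(right, left_on=['key1', 'key2'],
                    right_index=True, how='left')

# ('Ohio', 2000) appears twice in the right index, so that left row is
# duplicated; ('Nevada', 2002) has no match and gets NaN in the event columns.
print(merged)
```

Because `how='left'` keeps every left row, the result has six rows: five left rows plus one extra for the duplicated `('Ohio', 2000)` key.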
    In [1]: import pandas as pd

    In [2]: import numpy as np

    In [3]: right = pd.DataFrame(data=np.arange(12).reshape((6, 2)),
       ...:                      index=[['Nevada', 'Nevada', 'Ohio', 'Ohio', 'Ohio', 'Ohio'],
       ...:                             [2001, 2000, 2000, 2000, 2001, 2002]],
       ...:                      columns=['event1', 'event2'])

    In [4]: left = pd.DataFrame(data={'key1': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
       ...:                           'key2': [2000, 2001, 2002, 2001, 2002],
       ...:                           'data': np.arange(5.)})

    In [5]: left
    Out[5]:
       data    key1  key2
    0     0    Ohio  2000
    1     1    Ohio  2001
    2     2    Ohio  2002
    3     3  Nevada  2001
    4     4  Nevada  2002

    In [6]: right
    Out[6]:
                 event1  event2
    Nevada 2001       0       1
           2000       2       3
    Ohio   2000       4       5
           2000       6       7
           2001       8       9
           2002      10      11

    In [7]: left_merge = left.merge(right, left_index=True, right_on='event1', how='left')

    In [8]: left_merge
    Out[8]:
                 data    key1  key2  event1  event2
    Nevada 2001     0    Ohio  2000       0       1
    Ohio   2002     1    Ohio  2001       1     NaN
    Nevada 2000     2    Ohio  2002       2       3
    Ohio   2002     3  Nevada  2001       3     NaN
           2000     4  Nevada  2002       4       5
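To confirm that the left merge is not empty and to see which rows actually matched, one option is the `indicator=True` argument of `merge`. The sketch below flattens the right index into columns first so the result keeps ordinary row labels; the level names `state` and `year` are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

right = pd.DataFrame(
    data=np.arange(12).reshape((6, 2)),
    index=[['Nevada', 'Nevada', 'Ohio', 'Ohio', 'Ohio', 'Ohio'],
           [2001, 2000, 2000, 2000, 2001, 2002]],
    columns=['event1', 'event2'],
)
left = pd.DataFrame({
    'key1': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
    'key2': [2000, 2001, 2002, 2001, 2002],
    'data': np.arange(5.),
})

# Move both index levels into columns; 'state' and 'year' are hypothetical names.
right_flat = right.rename_axis(['state', 'year']).reset_index()

# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for left rows with no partner in the right frame.
checked = left.merge(right_flat, left_index=True, right_on='event1',
                     how='left', indicator=True)

print(checked[['key1', 'key2', 'event1', 'event2', '_merge']])
print(checked['_merge'].value_counts())
```

Left index values 0, 2 and 4 appear in `event1`, so those three rows are marked `both`, while index values 1 and 3 are marked `left_only`.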