
Iterating Through Pandas Groupby And Merging DataFrames

This seems like it should be straightforward, but it is stumping me. I really love being able to iterate through the groups of a groupby operation, and I am getting the result I want from

Solution 1:

Why not just:

pd.concat(dfs, axis=1, join='outer')
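For this to line up rows by key, each frame in dfs presumably needs the key as its index, since pd.concat with axis=1 aligns on the index rather than on a column. A minimal sketch with two hypothetical per-date count Series (the keys and dates are made up for illustration):

```python
import pandas as pd

# Hypothetical per-date counts, each indexed by key
a = pd.Series({'x': 2, 'y': 1}, name='2015-01-01')
b = pd.Series({'y': 3, 'z': 1}, name='2015-01-02')

# axis=1 aligns on the index; join='outer' keeps the union of keys,
# and keys missing on a given date become NaN until filled
wide = pd.concat([a, b], axis=1, join='outer').fillna(0)
print(wide)
```

If the frames keep a default integer index and carry the key as an ordinary column, the same call just places them side by side without matching keys.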

Solution 2:

Because this had to work on a very large dataset, I went with the following implementation. It is not very elegant, but it performs well on large datasets:

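from itertools import chain
import pandas as pd
dfs = []  # one (key, count) frame per date
for name, group in df.groupby('date', sort=False):
    # Flatten this date's lists of values, then count occurrences of each key
    counts = (pd.DataFrame(list(chain.from_iterable(group['values'])), columns=['key'])
                .groupby('key').size())
    dfs.append(counts.reset_index(name=name.strftime('%Y-%m-%d')))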

df2 = pd.concat(dfs, axis=1, join='outer')

df3 = pd.merge(pd.merge(pd.merge(pd.merge(pd.merge(pd.merge(pd.merge(pd.merge(df2.iloc[:, :2], 
        pd.DataFrame(list(set(chain.from_iterable(df['values']))), columns=['key']), how='right'),
            df2.iloc[:, 2:4], how='left'),
            df2.iloc[:, 4:6], how='left'),
            df2.iloc[:, 6:8], how='left'),
            df2.iloc[:, 8:10], how='left'),
            df2.iloc[:, 10:12], how='left'),
            df2.iloc[:, 12:14], how='left'),
            df2.iloc[:, 14:16], how='left').fillna(0).set_index('key').sort_index(axis=1)
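The chain of merges above is hard-coded for eight dates. One way to generalise it, sketched here under assumptions rather than taken from the original answer, is to fold the per-date frames with functools.reduce, starting from a frame of every key. The sample df below is hypothetical and only mimics the shape implied by the code (a 'date' column and a 'values' column holding lists):

```python
from functools import reduce
from itertools import chain

import pandas as pd

# Hypothetical input: one list of keys per row
df = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-02']),
    'values': [['a', 'b'], ['a'], ['b', 'c']],
})

# Per-date (key, count) frames, built the same way as in the loop above
dfs = []
for name, group in df.groupby('date', sort=False):
    keys = pd.DataFrame(list(chain.from_iterable(group['values'])), columns=['key'])
    dfs.append(keys.groupby('key').size().reset_index(name=name.strftime('%Y-%m-%d')))

# Start from every key seen anywhere, then left-merge each date's counts in turn
all_keys = pd.DataFrame(sorted(set(chain.from_iterable(df['values']))), columns=['key'])
df3 = (reduce(lambda left, right: pd.merge(left, right, on='key', how='left'), dfs, all_keys)
       .fillna(0).set_index('key').sort_index(axis=1))
print(df3)
```

This merges the frames in dfs directly, so the intermediate pd.concat and the iloc column slices are not needed, and the number of dates no longer has to be known in advance.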
