Create Dataframes From Unique Value Pairs By Filtering Across Multiple Columns
I want to filter values across multiple columns creating dataframes for the unique value combinations. Any help would be appreciated. Here is my code that is failing (given datafra
Solution 1:
Use pandas groupby functionality to extract the unique indices and the corresponding rows of your dataframe.
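import numpy as np
import pandas as pd
from collections import defaultdict

# Sample data: two key columns (col1, col2) and two value columns
df = pd.DataFrame({'col1': ['A']*4 + ['B']*4,
                   'col2': [0,1]*4,
                   'col3': np.arange(8),
                   'col4': np.arange(10, 18)})

# Nested dict: dd[col1 value][col2 value] -> sub-dataframe for that pair
dd = defaultdict(dict)
grouped = df.groupby(['col1', 'col2'])
for (c1, c2), g in grouped:
    dd[c1][c2] = g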
This is the generated df:
  col1  col2  col3  col4
0    A     0     0    10
1    A     1     1    11
2    A     0     2    12
3    A     1     3    13
4    B     0     4    14
5    B     1     5    15
6    B     0     6    16
7    B     1     7    17
And this is the extracted dd (well, dict(dd) really):
{'B': {0:   col1  col2  col3  col4
          4    B     0     4    14
          6    B     0     6    16,
       1:   col1  col2  col3  col4
          5    B     1     5    15
          7    B     1     7    17},
 'A': {0:   col1  col2  col3  col4
          0    A     0     0    10
          2    A     0     2    12,
       1:   col1  col2  col3  col4
          1    A     1     1    11
          3    A     1     3    13}}
(I don't know what your use case for this is, but you may be better off not converting the groupby object to a dictionary anyway.)
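As a rough illustration of working with the groupby object directly, here is a minimal sketch that reuses the same sample df. The names a0 and frames are made up for this example and are not part of the original answer. It uses get_group to fetch one sub-dataframe on demand, and, if a dictionary really is needed, a flat dict keyed by the (col1, col2) tuple instead of a nested one.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'col1': ['A'] * 4 + ['B'] * 4,
                   'col2': [0, 1] * 4,
                   'col3': np.arange(8),
                   'col4': np.arange(10, 18)})

grouped = df.groupby(['col1', 'col2'])

# Fetch a single sub-dataframe by its (col1, col2) key, no dict required
a0 = grouped.get_group(('A', 0))

# If a mapping is still wanted, a flat dict keyed by the tuple is simpler
# than a nested defaultdict (hypothetical alternative, not the author's code)
frames = {key: g for key, g in grouped}
```

With this layout, frames[('B', 1)] plays the same role as dd['B'][1] in the nested version.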