Get Unique Values And Their Occurrence Out Of One Dataframe Into A New Dataframe Using Pandas DataFrame
I want to turn my dataframe with non-distinct values underneath each column header into a dataframe with distinct values underneath each column header with next to it their occurre
Solution 1:
The difficult part is keeping the values of the columns aligned in each row. To do this, construct a new dataframe from each column's unique values, then pd.concat it with the result of mapping value_counts onto each column of this new dataframe.
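import pandas as pd

# One row per original column holding its distinct values, transposed back into columns
new_df = (pd.DataFrame([df[c].unique() for c in df], index=df.columns).T
            .dropna(how='all'))
# Look up each distinct value's count and append it as a <column>_Count column
df_final = pd.concat([new_df, *[new_df[c].map(df[c].value_counts()).rename(f'{c}_Count')
                                for c in df]], axis=1).reset_index(drop=True)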
Out[1580]:
     A      B    C   D  A_Count  B_Count  C_Count  D_Count
0    0    CEN   T2  56      2.0      4.0      4.0        1
1    2  DECEN   T1  45      1.0      1.0      3.0        1
2    3  ONBEK  NaN  84      2.0      1.0      NaN        1
3  NaN    NaN  NaN  59      NaN      NaN      NaN        1
4  NaN    NaN  NaN  87      NaN      NaN      NaN        1
5  NaN    NaN  NaN  98      NaN      NaN      NaN        1
6  NaN    NaN  NaN  23      NaN      NaN      NaN        1
7  NaN    NaN  NaN  65      NaN      NaN      NaN        1
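To try the approach above end to end, here is a minimal sketch on a small hypothetical frame. The sample values are made up for illustration and are not the asker's data, and the intermediate name counts is introduced here only for readability.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({
    "B": ["CEN", "CEN", "DECEN", "ONBEK", "CEN"],
    "C": ["T2", "T1", "T2", "T2", "T1"],
})

# Distinct values of each column; shorter columns are padded with NaN
new_df = (pd.DataFrame([df[c].unique() for c in df], index=df.columns).T
            .dropna(how="all"))

# Map every distinct value to its number of occurrences in the original column
counts = [new_df[c].map(df[c].value_counts()).rename(f"{c}_Count") for c in df]
df_final = pd.concat([new_df, *counts], axis=1).reset_index(drop=True)

print(df_final)
```

Because the counts are looked up with map rather than taken from the order of value_counts, each count stays on the same row as the value it belongs to.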
If you only need to keep each column aligned with its own count (A with A_Count, B with B_Count, ...), simply use value_counts with reset_index, plus a few commands to rename the axis and reorder the columns:
# Target column order: all value columns first, then all count columns
new_cols = df.columns.tolist() + (df.columns + '_Count').tolist()
new_df = pd.concat([df[col].value_counts(sort=False).rename_axis(col).reset_index(name=f'{col}_Count')
                    for col in df], axis=1).reindex(new_cols, axis=1)
Out[1501]:
     A      B    C     D  A_Count  B_Count  C_Count  D_Count
0  0.0  ONBEK   T2  56.0      2.0      1.0      4.0        1
1  2.0    CEN   T1  45.0      1.0      4.0      3.0        1
2  3.0  DECEN  NaN  84.0      2.0      1.0      NaN        1
3  NaN    NaN  NaN  59.0      NaN      NaN      NaN        1
4  NaN    NaN  NaN  87.0      NaN      NaN      NaN        1
5  NaN    NaN  NaN  98.0      NaN      NaN      NaN        1
6  NaN    NaN  NaN  23.0      NaN      NaN      NaN        1
7  NaN    NaN  NaN  65.0      NaN      NaN      NaN        1
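The value_counts variant can be sketched the same way on a small hypothetical frame. The data is again made up for illustration, and pairs and out are names chosen here, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration
df = pd.DataFrame({
    "B": ["CEN", "CEN", "DECEN", "ONBEK", "CEN"],
    "C": ["T2", "T1", "T2", "T2", "T1"],
})

# One two-column frame (value, count) per original column
pairs = [
    df[col].value_counts(sort=False).rename_axis(col).reset_index(name=f"{col}_Count")
    for col in df
]

# Put all value columns first, then all count columns
new_cols = df.columns.tolist() + (df.columns + "_Count").tolist()
out = pd.concat(pairs, axis=1).reindex(new_cols, axis=1)

print(out)
```

Each value sits on the same row as its own count, but the row order of the values depends on value_counts and may differ between pandas versions, so rows should not be read across unrelated columns.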