Dividing Each Column By Every Other Column And Creating A New Dataframe From The Results
In a pandas df (data from a csv file) I am trying to add new columns (ratios) by dividing each column by every other column. So far I am stuck in the process of dividing all colu
Solution 1:
You're on the right track, but a join is not the right operation. You should be able to do this using pd.concat.
pd.concat([df.div(df[col], axis=0) for col in df.columns], axis=1)  # every column divided by each column in turn (itself included)
If you want to avoid dividing a column by itself, you could use df.columns.difference to drop the divisor before dividing:
pd.concat([df[df.columns.difference([col])].div(df[col], axis=0)
           for col in df.columns], axis=1)
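As a quick sanity check, here is a minimal sketch on a small hand-made frame, so the ratios can be verified by eye. The values are made up for illustration and are not taken from the question. Note that df.columns.difference returns the remaining labels in sorted order.

```python
import pandas as pd

# Toy data (hypothetical), chosen so the ratios are round numbers.
df = pd.DataFrame({"A": [1.0, 2.0], "B": [2.0, 8.0], "C": [4.0, 4.0]})

# For each divisor column, divide the *other* columns by it, row by row.
blocks = [
    df[df.columns.difference([col])].div(df[col], axis=0)
    for col in df.columns
]
ratios = pd.concat(blocks, axis=1)

print(ratios.shape)          # 3 columns * 2 partners each = 6 ratio columns
print(list(ratios.columns))  # labels repeat because no suffix was added
```

The first block holds B/A and C/A, the second A/B and C/B, and the third A/C and B/C.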
You can also use df.add_suffix('_new_ratio') to add a suffix to your column names.
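Because add_suffix is applied to each block inside the comprehension, the suffix can also carry the name of the divisor, which tells you what each ratio was divided by. A rough sketch, where the '_over_' naming pattern and the toy data are my own assumptions:

```python
import pandas as pd

# Toy data (hypothetical), not the question's CSV.
df = pd.DataFrame({"A": [1.0, 2.0], "B": [2.0, 8.0], "C": [4.0, 4.0]})

ratios = pd.concat(
    [
        df[df.columns.difference([col])]
        .div(df[col], axis=0)
        .add_suffix(f"_over_{col}")  # e.g. B divided by A becomes 'B_over_A'
        for col in df.columns
    ],
    axis=1,
)

print(list(ratios.columns))
```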
MCVE:
import pandas as pd
import numpy as np
np.random.seed([3, 14])
df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))
df
          A         B         C
0 -0.602923 -0.402655  0.302329
1 -0.524349  0.543843  0.013135
2 -0.326498  1.385076 -0.132454
3 -0.407863  1.302895 -0.604236
4 -0.243362 -0.211261 -2.056621
5  0.517868 -0.040749 -1.051875
6  0.607092 -2.230437 -0.610389
7  0.223345  0.841994 -1.564391
8  0.031653  0.655489 -0.288834
9 -0.467438  0.119117  1.519430
df_new = pd.concat([df[df.columns.difference([col])].div(df[col], axis=0)
                    .add_suffix('_n_r') for col in df.columns], axis=1)
df_new
       B_n_r      C_n_r      A_n_r      C_n_r      A_n_r      B_n_r
0   0.667838  -0.501438   1.497369  -0.750838  -1.994263  -1.331845
1  -1.037176  -0.025050  -0.964156   0.024152 -39.919620  41.403685
2  -4.242213   0.405682  -0.235726  -0.095630   2.464987 -10.457000
3  -3.194442   1.481468  -0.313044  -0.463764   0.675006  -2.156269
4   0.868095   8.450867   1.151948   9.734958   0.118331   0.102723
5  -0.078686  -2.031166 -12.708707  25.813488  -0.492328   0.038739
6  -3.673971  -1.005432  -0.272185   0.273663  -0.994598   3.654123
7   3.769924  -7.004363   0.265257  -1.857959  -0.142768  -0.538225
8  20.708576  -9.125012   0.048289  -0.440639  -0.109589  -2.269430
9  -0.254830  -3.250547  -3.924192  12.755771  -0.307641   0.078396
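Note that df_new carries each label twice (the two A_n_r columns are A/B and A/C, for instance), so df_new['A_n_r'] selects both. If unique names matter, one alternative, which is a different technique from the concat approach above, is to walk over every ordered pair with itertools.permutations and name each column after its numerator and denominator. A sketch under those assumptions, with a made-up 'num/den' naming scheme and toy data:

```python
from itertools import permutations

import pandas as pd

# Toy data (hypothetical), not the frame from the MCVE.
df = pd.DataFrame({"A": [1.0, 2.0], "B": [2.0, 8.0], "C": [4.0, 4.0]})

# permutations(..., 2) yields every ordered pair of distinct columns,
# so a column is never divided by itself.
pair_ratios = pd.DataFrame(
    {f"{num}/{den}": df[num] / df[den] for num, den in permutations(df.columns, 2)}
)

print(list(pair_ratios.columns))
```

This computes one Series per pair instead of one block per divisor. That may be slower on very wide frames, but every ratio can be looked up by a unique name.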