Skip to content Skip to sidebar Skip to footer

Pandas New Column Calculation Based On Existing Columns Values

Please see above my source table: I am interesting to calculate new column 'Group' based on list of interest lots: Value of column 'Group' based on condition if lots in source ta

Solution 1:

Because this question is tagged Pandas, I assume we are talking dataframes and series instead of plain lists. You can use loc to locate the rows and columns that match you criteria (e.g. whether each element in the LOT column isin the series of lots of interest).

df = pd.DataFrame({'LOT': ['A1111', 'A2222', 'A3333', 'B1111', 'B2222', 'B3333'], 
                   'LOT_VIRTUAL_LINE': ['AAA'] * 3 + ['BBB'] * 3})
s = pd.Series(['A1111', 'B2222'], name='Lots Of Interest')
# or... df2 = pd.read_csv('file_path/file_name.csv')# Value of 'GROUP' defaults to 'LOT_VIRTUAL_LINE'.df['GROUP'] = df.LOT_VIRTUAL_LINE

# But gets overwritten by 'LOT' if it is in the 'Lots of Interest' series.
mask = df.LOT.isin(s)
# or... mask = df.LOT.isin(df2['Lots of Interest'])  # Whatever the column name is.
df.loc[mask, 'GROUP'] = df.loc[mask, 'LOT']

# Confirm results.
>>> df
     LOT LOT_VIRTUAL_LINE  GROUP
0  A1111              AAA  A1111
1  A2222              AAA    AAA
2  A3333              AAA    AAA
3  B1111              BBB    BBB
4  B2222              BBB  B2222
5  B3333              BBB    BBB

Solution 2:

Assuming we have a list named lots_of_interest, for instance as the result of read_csv(path).loc[:, 'lots_of_interest'].tolist():

df['Group'] = df.apply(lambda x: x['LOT'] if x['LOT'].isin(lots_of_interest) else x['LOT_VIRTUAL_LINE'], axis=1)

Post a Comment for "Pandas New Column Calculation Based On Existing Columns Values"