
Replace Values In Pandas Column When N Number Of Nans Exist In Another Column

I have the following pandas dataframe:

2018-05-25 0.000381 0.264318 land 2018-05-25
2018-05-26 0.000000 0.264447 land 2018-05-26
2018-05-27 0.000000 0.264791 Na

Solution 1:

Here's an approach that keeps a running count of consecutive nulls in column 3 and sets column 2 to NaN once that count reaches n:

import numpy as np

n = 3
# Mask of rows where column 3 is null
x = df[3].isnull()
# Running count of consecutive nulls; it restarts at zero after every non-null row
se = x.cumsum() - x.cumsum().where(~x).ffill().fillna(0)
df.loc[se >= n, 2] = np.nan

# Result:
             0         1         2     3           4
0   2018-05-25  0.000381  0.264318  land  2018-05-25
1   2018-05-26  0.000000  0.264447  land  2018-05-26
2   2018-05-27  0.000000  0.264791   NaN         NaT
3   2018-05-28  0.000000  0.265253   NaN         NaT
4   2018-05-29  0.000000       NaN   NaN         NaT
5   2018-05-30  0.000000  0.266066  land  2018-05-30
6   2018-05-31  0.000000  0.266150   NaN         NaT
7   2018-06-01  0.000000  0.265816   NaN         NaT
8   2018-06-02  0.000000  0.264892  land  2018-06-02
9   2018-06-03  0.000000  0.263191   NaN         NaT
10  2018-06-04  0.000000  0.260508  land  2018-06-04
11  2018-06-05  0.000000  0.256619   NaN         NaT
12  2018-06-06  0.000000  0.251286   NaN         NaT
13  2018-06-07  0.000000       NaN   NaN         NaT
14  2018-06-08  0.000000       NaN   NaN         NaT
15  2018-06-09  0.000000  0.223932  land  2018-06-09
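To see why the counter `se` restarts after each non-null row, it can help to print the intermediate values on a tiny series. This is only a sketch: the toy series `s` and the names `total`, `baseline`, and `run_length` are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column standing in for column 3 of the question's frame.
s = pd.Series(["land", np.nan, np.nan, np.nan, "land", np.nan])

x = s.isnull()        # True where the value is missing
total = x.cumsum()    # running count of every NaN seen so far

# Value of the running count at the most recent non-null row
# (0 if no non-null row has been seen yet).
baseline = total.where(~x).ffill().fillna(0)

# Subtracting the baseline leaves only the NaNs in the current run.
run_length = (total - baseline).astype(int)

print(run_length.tolist())
```

Rows where `run_length >= n` are the ones the `df.loc[...]` assignment would overwrite.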

Solution 2:

Edit: a more versatile approach that works for any threshold of consecutive NaNs:

threshold = 3
mask = df.d.notna()
df.loc[(~mask).groupby(mask.cumsum()).transform('cumsum') >= threshold, 'c'] = np.nan
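The grouping trick can be unpacked step by step. The sketch below wraps the same idea in a helper; the function name `blank_after_nan_run`, its parameters, and the `demo` frame are assumptions made for illustration, not something from the original answer.

```python
import numpy as np
import pandas as pd

def blank_after_nan_run(df, watch, target, threshold):
    """Return a copy of df with `target` set to NaN on rows where `watch`
    has been NaN for at least `threshold` consecutive rows."""
    out = df.copy()
    present = out[watch].notna()
    # Every non-null row bumps the id, so a non-null row and the NaNs
    # that directly follow it share one group.
    group_id = present.cumsum()
    # Count NaNs inside each group; the count restarts at every non-null row.
    run_length = (~present).astype(int).groupby(group_id).cumsum()
    out.loc[run_length >= threshold, target] = np.nan
    return out

# Hypothetical data using the column names c and d.
demo = pd.DataFrame({
    "c": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "d": ["land", np.nan, np.nan, np.nan, "land", np.nan],
})
result = blank_after_nan_run(demo, watch="d", target="c", threshold=3)
print(result)
```

Working on a copy keeps the input frame unchanged, which makes it easier to try several thresholds against the same data.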

You can simply check whether the current row and the two rows before it are all null (I named your columns a-e):

df.loc[df.d.isnull() & df.d.shift().isnull() & df.d.shift(2).isnull(), 'c'] = np.nan

# Result:
             a         b         c     d           e
0   2018-05-25  0.000381  0.264318  land  2018-05-25
1   2018-05-26  0.000000  0.264447  land  2018-05-26
2   2018-05-27  0.000000  0.264791   NaN         NaT
3   2018-05-28  0.000000  0.265253   NaN         NaT
4   2018-05-29  0.000000       NaN   NaN         NaT
5   2018-05-30  0.000000  0.266066  land  2018-05-30
6   2018-05-31  0.000000  0.266150   NaN         NaT
7   2018-06-01  0.000000  0.265816   NaN         NaT
8   2018-06-02  0.000000  0.264892  land  2018-06-02
9   2018-06-03  0.000000  0.263191   NaN         NaT
10  2018-06-04  0.000000  0.260508  land  2018-06-04
11  2018-06-05  0.000000  0.256619   NaN         NaT
12  2018-06-06  0.000000  0.251286   NaN         NaT
13  2018-06-07  0.000000       NaN   NaN         NaT
14  2018-06-08  0.000000       NaN   NaN         NaT
15  2018-06-09  0.000000  0.223932  land  2018-06-09
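The shift-based check can be extended to any run length by combining n shifted masks. The sketch below is one possible way to do that; `nan_run_mask` and the sample series are hypothetical names for illustration. One difference from the one-liner above: it shifts the boolean mask with `fill_value=False` rather than shifting the column itself, so the positions before the first row are not counted as nulls. With `df.d.shift().isnull()`, a NaN in the very first row would be flagged immediately, since the shifted-in values are themselves null.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

def nan_run_mask(col, n):
    """True where `col` is null on this row and on the n - 1 rows before it."""
    is_null = col.isnull()
    # shift(k) looks k rows back; rows before the start count as not null.
    shifted = [is_null.shift(k, fill_value=False) for k in range(n)]
    return reduce(operator.and_, shifted)

# Hypothetical sample standing in for column d.
d = pd.Series(["land", np.nan, np.nan, np.nan, "land", np.nan])
mask = nan_run_mask(d, 3)
print(mask.tolist())
```

The resulting mask can then be passed to `df.loc[mask, 'c'] = np.nan` in the same way as the hand-written three-term condition.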
