
Python Pandas: Dataframe Filter Negative Values

I was wondering how I can remove every index (row) that contains a negative value in its columns. I am using Pandas DataFrames. Documentation: Pandas DataFrame. Format: Myid - value

Solution 1:

You can use all to check whether every value in a row or column is True. Passing 1 (that is, axis=1) checks across the columns of each row:

In [11]: df = pd.DataFrame(np.random.randn(10, 3))

In [12]: df
Out[12]:
          0         1         2
0 -1.003735  0.792479  0.787538
1 -2.056750 -1.508980  0.676378
2  1.355528  0.307063  0.369505
3  1.201093  0.994041 -1.169323
4 -0.305359  0.044360 -0.085346
5 -0.684149 -0.482129 -0.598155
6  1.795011  1.231198 -0.465683
7 -0.632216 -0.075575  0.812735
8 -0.479523 -1.900072 -0.966430
9 -1.441645 -1.189408  1.338681

In [13]: (df > 0).all(1)
Out[13]:
0    False
1    False
2     True
3    False
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool

In [14]: df[(df > 0).all(1)]
Out[14]:
          0         1         2
2  1.355528  0.307063  0.369505

If you only want to check a subset of the columns, e.g. [0, 1]:

In [15]: df[(df[[0, 1]] > 0).all(1)]
Out[15]:
          0         1         2
2  1.355528  0.307063  0.369505
3  1.201093  0.994041 -1.169323
6  1.795011  1.231198 -0.465683
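The session above uses random data, so its numbers cannot be reproduced exactly. Here is a minimal sketch of the same idea on a small hand-written frame (the values are made up for illustration). It also shows `>= 0`, which may be closer to the question as asked: `> 0` drops rows containing zeros as well as negatives, while `>= 0` drops only rows with a negative value.

```python
import pandas as pd

# Small hand-written frame so the result is reproducible (hypothetical data).
df = pd.DataFrame({
    0: [-1.0, 1.4, 1.2, 0.0],
    1: [0.8, 0.3, 1.0, 2.0],
    2: [0.8, 0.4, -1.2, 3.0],
})

# Strictly positive in every column: drops rows containing negatives or zeros.
strictly_positive = df[(df > 0).all(axis=1)]

# Non-negative in every column: drops only rows that contain a negative value.
non_negative = df[(df >= 0).all(axis=1)]

# Restrict the check to columns 0 and 1, but keep all columns in the result.
subset_checked = df[(df[[0, 1]] > 0).all(axis=1)]

print(strictly_positive.index.tolist())
print(non_negative.index.tolist())
print(subset_checked.index.tolist())
```

Which comparison to use depends on whether zeros should count as acceptable values in your data.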

Solution 2:

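You could loop over the column names and filter on one column at a time. The [1:] slice skips the first column, presumably the id:

# .ix was removed in pandas 1.0; .loc accepts the same boolean mask.
for col in data.columns.tolist()[1:]:
    data = data.loc[data[col] > 0]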
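As a rough, self-contained sketch of the loop approach, assuming a frame whose first column is an id followed by value columns (the column names `value_a` and `value_b` and the data are hypothetical):

```python
import pandas as pd

# Hypothetical frame: an id column followed by value columns.
data = pd.DataFrame({
    "Myid": [101, 102, 103, 104],
    "value_a": [1.5, -0.2, 3.0, 2.2],
    "value_b": [0.7, 0.9, -4.1, 1.1],
})

# Filter one value column at a time, skipping the id column at position 0.
filtered = data
for col in data.columns[1:]:
    filtered = filtered.loc[filtered[col] > 0]

# The same rows selected in a single vectorized step, for comparison.
vectorized = data[(data[data.columns[1:]] > 0).all(axis=1)]

print(filtered["Myid"].tolist())
```

Both forms keep the same rows here. The loop builds a new, smaller frame on each pass, whereas the single mask does one comparison over all value columns.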

Solution 3:

To combine conditions with a logical AND on a DataFrame, use a single & character and wrap each condition in parentheses. The parentheses are required because & binds more tightly than comparison operators such as >.

For example:

data = data[(data['col1']>0) & (data['valuecol2']>0) & (data['valuecol3']>0)]
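To see why `&` is needed instead of Python's `and`, here is a small sketch using the column names from the line above with made-up values. The `and` keyword asks for the truth value of a whole Series, which pandas rejects with a ValueError.

```python
import pandas as pd

# Column names follow the example above; the values are hypothetical.
data = pd.DataFrame({
    "col1": [1, -2, 3],
    "valuecol2": [4, 5, -6],
    "valuecol3": [7, 8, 9],
})

# Element-wise AND of three boolean Series, each condition parenthesized.
mask = (data["col1"] > 0) & (data["valuecol2"] > 0) & (data["valuecol3"] > 0)
result = data[mask]

# Python's `and` needs a single True/False from each Series, so it fails.
try:
    (data["col1"] > 0) and (data["valuecol2"] > 0)
    and_raised = False
except ValueError:
    and_raised = True

print(result.index.tolist(), and_raised)
```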

Solution 4:

If you want to check the values of an adjacent group of columns by position, for example the third through the tenth (positions 2 to 9, counting from 0):

df[(df.iloc[:, 2:10] > 0).all(1)]

You can also use a range:

df[(df.iloc[:, range(1, 10, 3)] > 0).all(1)]

and your own list of column positions:

mylist = [1, 2, 4, 8]
df[(df.iloc[:, mylist] > 0).all(1)]
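The three positional variants can be tried on a small frame. This is only a sketch, assuming selection by integer position with `iloc`; the frame, its column names, and the position list `[1, 2, 8]` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x10 frame of ones with one negative planted at row 1, position 4.
df = pd.DataFrame(np.ones((3, 10)), columns=[f"c{i}" for i in range(10)])
df.iloc[1, 4] = -1.0

# Contiguous block: column positions 2 through 9 (the slice end is exclusive).
by_slice = df[(df.iloc[:, 2:10] > 0).all(axis=1)]

# Every third column starting at position 1: positions 1, 4 and 7.
by_range = df[(df.iloc[:, list(range(1, 10, 3))] > 0).all(axis=1)]

# Hand-picked positions; position 4 is not among them, so row 1 is kept.
by_list = df[(df.iloc[:, [1, 2, 8]] > 0).all(axis=1)]

print(by_slice.index.tolist(), by_range.index.tolist(), by_list.index.tolist())
```

Only the columns named in the selection take part in the check. The returned frame still contains every column.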
