Scan Subset Of Pd Dataframe To Obtain Indices Matching Certain Values
Solution 1:
Sample:
import pandas as pd

data = pd.DataFrame({
    'A':list('abcdef'),
    'B':[4,5,4,5,5,4],
    'D':[1,0,1,0,1,0],
    'E':[1,0,0,1,2,4],
})
print (data)
   A  B  D  E
0  a  4  1  1
1  b  5  0  0
2  c  4  1  0
3  d  5  0  1
4  e  5  1  2
5  f  4  0  4
If you need only the rows whose values are all 0 or 1, use DataFrame.isin with DataFrame.all to test whether all values in each row are True:
subset = data.iloc[:,2:]
data3 = data[subset.isin([0,1]).all(axis=1)]
print (data3)
   A  B  D  E
0  a  4  1  1
1  b  5  0  0
2  c  4  1  0
3  d  5  0  1
Details:
print (subset.isin([0,1]))
      D      E
0  True   True
1  True   True
2  True   True
3  True   True
4  True  False
5  True  False
print (subset.isin([0,1]).all(axis=1))
0     True
1     True
2     True
3     True
4    False
5    False
dtype: bool
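If the row labels themselves are needed, as the question title suggests, the same mask can be inverted to list the offending rows. This is only a minimal sketch on the sample frame above; the names mask, bad_index and cleaned are illustrative and not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'D': [1, 0, 1, 0, 1, 0],
    'E': [1, 0, 0, 1, 2, 4],
})
subset = data.iloc[:, 2:]

# True for rows where every value in the subset is 0 or 1
mask = subset.isin([0, 1]).all(axis=1)

# Labels of the rows that contain at least one other value
bad_index = data.index[~mask]
print(bad_index.tolist())  # [4, 5]

# Dropping those labels gives the same frame as data[mask]
cleaned = data.drop(bad_index)
print(cleaned.equals(data[mask]))  # True
```

Keeping the labels around is mainly useful when the same rows have to be removed from other objects that share the index.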
Solution 2:
Your subset is a pd.DataFrame, not a pd.Series. The conditional test you use to build index would work if subset were a Series (i.e. if you were only checking the condition on a single column, not multiple columns).
Having subset as a DataFrame is fine, but it changes how the conditional slice works. My testing shows that your index variable still contains every row: the 0s and 1s come back as NaN rather than being left out, as they would be in a slice of a Series. Adding dropna(how='all') as below should fix your code. It discards the rows that are entirely NaN, leaving the rows with at least one value other than 0 or 1; a plain dropna() would instead keep only the rows in which every column holds such a value.
# find indices:
index = subset[(subset != 0) & (subset != 1)].dropna(how='all').index
# remove rows from orig data set:
data = data.drop(index)
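The difference between masking a Series and masking a DataFrame can be checked directly. The sketch below assumes a two-column subset with the same values as the D and E columns of the sample in Solution 1, since the asker's real data is not shown; it also shows how the how argument of dropna changes which rows survive.

```python
import pandas as pd

# Assumed stand-in for the asker's subset (columns D and E of the sample)
subset = pd.DataFrame({'D': [1, 0, 1, 0, 1, 0],
                       'E': [1, 0, 0, 1, 2, 4]})

# Series: a boolean mask filters, so non-matching rows disappear
col = subset['E']
series_result = col[(col != 0) & (col != 1)]
print(series_result.index.tolist())  # [4, 5]

# DataFrame: a boolean mask of the same shape keeps every row
# and replaces the non-matching cells with NaN
frame_result = subset[(subset != 0) & (subset != 1)]
print(frame_result.shape)  # (6, 2)

# Rows with at least one surviving (non-NaN) value
any_bad = frame_result.dropna(how='all').index
print(any_bad.tolist())  # [4, 5]

# Rows where every value survived; there are none in this sample
all_bad = frame_result.dropna().index
print(all_bad.tolist())  # []
```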
Solution 3:
From your code I made an educated guess that you want to compare more than one column.
This should do the trick:
import numpy as np

# Flag the elements that are 0 or 1
val = np.isin(subset, np.array([0, 1]))
# Generate index: True only for rows where every element is flagged
index = np.prod(val, axis=1) > 0
# Select only the desired rows
data = data[index]
Example
# Data
   a  b  c
0  1  1  1
1  1  2  2
...
# Removing rows that have elements other than 1 or 2
   a  b  c
0  1  1  1
1  1  2  2
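For a version that runs end to end, here is a sketch of the same NumPy idea applied to the sample frame from Solution 1, which is an assumption since the asker's data is not available. It converts the subset with to_numpy() first and uses all(axis=1), which for a boolean array gives the same row mask as the product test.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'D': [1, 0, 1, 0, 1, 0],
    'E': [1, 0, 0, 1, 2, 4],
})
subset = data.iloc[:, 2:]

# Boolean array with the same shape as subset: True where the value is 0 or 1
val = np.isin(subset.to_numpy(), [0, 1])

# Row mask: True only if every element in the row is 0 or 1
keep = val.all(axis=1)

# The product-based test builds the same mask
same_mask = np.array_equal(keep, np.prod(val, axis=1) > 0)
print(same_mask)  # True

# A boolean array of matching length selects rows, not columns
result = data[keep]
print(result.index.tolist())  # [0, 1, 2, 3]
```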
Solution 4:
Without your data from DataSet.csv, I tried to make a guess.
subset[ (subset!= 0) & (subset!= 1)] basically returns the subset dataframe with the values where (subset!= 0) & (subset!= 1) is False turned to NaN, while those where it is True keep the same values. I.e. this is equivalent to a map. It is not a filter.
Therefore, subset[ (subset!= 0) & (subset!= 1)].index is the whole index of your data dataframe.
You drop it, so it returns an empty dataframe.
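This behaviour is easy to confirm. The sketch below again assumes the sample frame from Solution 1 in place of DataSet.csv, and the names cond and masked are illustrative. It checks that the masked frame matches DataFrame.where, that its index is unchanged, and that dropping that index removes every row.

```python
import pandas as pd

data = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'D': [1, 0, 1, 0, 1, 0],
    'E': [1, 0, 0, 1, 2, 4],
})
subset = data.iloc[:, 2:]
cond = (subset != 0) & (subset != 1)

masked = subset[cond]

# Same result as DataFrame.where: cells failing the condition become NaN
print(masked.equals(subset.where(cond)))  # True

# Every row label is still present
print(masked.index.equals(data.index))  # True

# So dropping that index removes every row
emptied = data.drop(masked.index)
print(emptied.empty)  # True
```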