Select Rows From A DataFrame Based On String Values In A Column In Pandas
Solution 1:
You can write a function to be applied to each value in the States/cities column. Have the function return either True or False, and the result of applying the function can act as a Boolean filter on your DataFrame.
This is a common pattern when working with pandas. In your particular case, you could check whether each value in States/cities consists only of uppercase letters.
For example:
def is_state_abbrev(string):
    return string.isupper()
mask = d['States/cities'].apply(is_state_abbrev)
filtered_df = d[mask]
Here mask will be a pandas Series of True and False values.
You can also achieve the same result by using a lambda expression, as in:
filtered_df = d[d['States/cities'].apply(lambda x: x.isupper())]
This does essentially the same thing.
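One caveat with the apply approach: if the column contains missing values, calling isupper() on them raises an AttributeError, because a missing value is not a string. The following is a minimal sketch of a guarded version. The frame df and its contents are made up for illustration, and the isinstance check is an addition that is not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; the None entry stands in for a missing value.
df = pd.DataFrame({
    "States/cities": ["FL", "Orlando", None, "CA", "San Francisco"],
    "B": [3, 1, 1, 8, 5],
})

def is_state_abbrev(value):
    """Return True only for strings whose cased characters are all uppercase."""
    return isinstance(value, str) and value.isupper()

mask = df["States/cities"].apply(is_state_abbrev)  # Boolean Series
filtered_df = df[mask]
print(filtered_df)
```

Rows with a missing value simply evaluate to False and are left out of the result.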
Solution 2:
Consider pandas.Series.str.match, passing a regex that accepts only the characters [A-Z]:
states[states['States/cities'].str.match(r'^[A-Z]+$')]
#   States/cities  B  C  D
# 0            FL  3  5  6
# 4            CA  8  3  2
# 7            WA  4  2  1
Data
from io import StringIO
import pandas as pd
txt = '''"States/cities" B C D
0 FL 3 5 6
1 Orlando 1 2 3
2 Miami 1 1 3
3 Jacksonville 1 2 0
4 CA 8 3 2
5 "San diego" 3 1 0
6 "San Francisco" 5 2 2
7 WA 4 2 1
8 Seattle 3 1 0
9 Tacoma 1 1 1'''
states = pd.read_table(StringIO(txt), sep=r"\s+")
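Returning to the regex in this solution: if the goal is specifically two-letter abbreviations, a stricter pattern may be safer. The sketch below assumes pandas 1.1 or later, where Series.str.fullmatch is available. The frame name sample and its values (including the made-up entry "USA") are illustrative only.

```python
import pandas as pd

# Hypothetical data; "USA" is all uppercase but not a two-letter abbreviation.
sample = pd.DataFrame({
    "States/cities": ["FL", "Orlando", "CA", "San diego", "USA", "WA"],
    "B": [3, 1, 8, 3, 3, 4],
})

# fullmatch requires the whole string to match, so no ^ or $ anchors are needed.
two_letter = sample["States/cities"].str.fullmatch(r"[A-Z]{2}")
abbrevs = sample[two_letter]
print(abbrevs)
```

Using fullmatch avoids the easy mistake of forgetting an anchor and accidentally matching only part of the string.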
Solution 3:
You can get the rows with all uppercase values in the column States/cities like this:
df.loc[df['States/cities'].str.isupper()]
  States/cities  B  C  D
0            FL  3  5  6
4            CA  8  3  2
7            WA  4  2  1
Just to be safe, you can add a condition so that it only returns the rows where 'States/cities' is uppercase and exactly 2 characters long (in case you had a value such as SEATTLE):
df.loc[df['States/cities'].str.isupper() & (df['States/cities'].str.len() == 2)]
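To see why the extra length condition matters, here is a small sketch comparing the two filters. The frame demo and the all-caps value "SEATTLE" are made up for illustration, and str.len() is used for the length check.

```python
import pandas as pd

# Hypothetical data with an all-caps city name mixed in.
demo = pd.DataFrame({
    "States/cities": ["FL", "SEATTLE", "Miami", "WA"],
    "B": [3, 3, 1, 4],
})

col = demo["States/cities"]

# Uppercase check alone lets the all-caps city through.
upper_only = demo.loc[col.str.isupper()]

# Adding the length check keeps only the two-letter values.
upper_and_short = demo.loc[col.str.isupper() & (col.str.len() == 2)]

print(upper_only["States/cities"].tolist())
print(upper_and_short["States/cities"].tolist())
```

The parentheses around the length comparison are needed because & binds more tightly than == in Python.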
Solution 4:
You can use str.contains to filter out any row that contains lowercase letters:
df[~df['States/cities'].str.contains('[a-z]')]
  States/cities  B  C  D
0            FL  3  5  6
4            CA  8  3  2
7            WA  4  2  1
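If the column may contain missing values, the negated str.contains filter needs a decision about how to treat them. The sketch below is one possible way to handle this, using the na parameter of str.contains. The frame cities and its values are hypothetical.

```python
import pandas as pd

# Hypothetical data; the None entry stands in for a missing value.
cities = pd.DataFrame({
    "States/cities": ["FL", "Orlando", None, "CA"],
    "B": [3, 1, 1, 8],
})

# na=True treats missing values as "contains lowercase", so ~ drops them.
has_lower = cities["States/cities"].str.contains(r"[a-z]", regex=True, na=True)
no_lower = cities[~has_lower]
print(no_lower)
```

Note that this filter keeps any value without a lowercase letter, so strings made only of digits or punctuation would also pass.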
Solution 5:
Assuming the order is always a state followed by the cities in that state, we can use where and dropna:
df['States/cities'] = df['States/cities'].where(df['States/cities'].isin(['FL', 'CA', 'WA']))
df = df.dropna()
df
  States/cities  B  C  D
0            FL  3  5  6
4            CA  8  3  2
7            WA  4  2  1
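The where/dropna approach overwrites the column in place, and dropna() with no arguments also drops rows that have missing values in any other column. As an alternative sketch (a different technique, not the one used above), the isin mask can be used directly as a Boolean filter. The list known_states and the sample frame are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "States/cities": ["FL", "Orlando", "CA", "San diego", "WA"],
    "B": [3, 1, 8, 3, 4],
})

known_states = ["FL", "CA", "WA"]  # assumed list of abbreviations

# Direct Boolean filter: no NaN masking or dropna needed, and df is unchanged.
states_only = df[df["States/cities"].isin(known_states)]
print(states_only)
```

This leaves the original frame intact, which can help if the city rows are still needed later.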
Or filter on the string length with str.len:
df[df['States/cities'].str.len()==2]
  States/cities  B  C  D
0            FL  3  5  6
4            CA  8  3  2
7            WA  4  2  1