Skip to content Skip to sidebar Skip to footer

Selecting Multiple (neighboring) Rows Conditionally

I'd like to return the rows which qualify to a certain condition. I can do this for a single row, but I need this for multiple rows combined. For example 'light green' qualifies to

Solution 1:

Here's a try. You would maybe want to use rolling or expanding (for speed and elegance) instead of explicitly looping with range, but I did it that way so as to be able to print out the rows being used to calculate each boolean.

df = df[['X','Y','Z']]    # remove the "total" column in order
                          # to make the syntax a little cleaner

df = df.head(4)           # keep the example more manageable

for i in range(len(df)):
    for k in range( i+1, len(df)+1 ):
        df_sum = df[i:k].sum()
        print( "rows", i, "to", k, (df_sum>0).all() & (df_sum.sum()>10) )

rows 0 to 1 True
rows 0 to 2 True
rows 0 to 3 True
rows 0 to 4 True
rows 1 to 2 False
rows 1 to 3 True
rows 1 to 4 True
rows 2 to 3 True
rows 2 to 4 True
rows 3 to 4 True

Solution 2:

I am not too sure if I understood your question correctly, but if you are looking to put multiple conditions within a dataframe, you can consider this approach:

new_df = df[(df["X"] > 0) & (df["Y"] < 0)]

The & condition is for AND, while replacing that with | is for OR condition. Do remember to put the different conditions in ().

Lastly, if you want to remove duplicates, you can use this

new_df.drop_duplicates()

You can find more information about this function at here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.drop_duplicates.html

Hope my answer is useful to you.


Post a Comment for "Selecting Multiple (neighboring) Rows Conditionally"