
Iterating Across Multiple Columns In Pandas Df And Slicing Dynamically

TLDR: How to iterate across all options of multiple columns in a pandas dataframe without specifying the columns or their values explicitly? Long Version: I have a pandas dataframe

Solution 1:

You can use itertools.product to generate all possible dosage combinations, and DataFrame.query to do the selection:

from itertools import product

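for dosage_comb in product(*dict_of_dose_ranges.values()):
    # Pair each column name with the value chosen for it in this combination
    dosage_items = zip(dict_of_dose_ranges.keys(), dosage_comb)
    # Build a query such as "col_a == 1 & col_b == 5"; this assumes numeric
    # values and column names that are valid Python identifiers
    query_str = ' & '.join('{} == {}'.format(*x) for x in dosage_items)
    sub_df = dosage_df.query(query_str)

    # Do Stuff...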
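
To make the loop above concrete, here is a minimal self-contained sketch. The column names (dose_a, dose_b, response), the sample values, and the rows_per_comb dictionary are made up for illustration and are not from the original question; the "Do Stuff" step is assumed to be a simple row count.

```python
from itertools import product

import pandas as pd

# Hypothetical data: two dose columns and one measurement column.
dosage_df = pd.DataFrame({
    "dose_a": [1, 1, 2, 2, 1],
    "dose_b": [5, 6, 5, 6, 5],
    "response": [0.1, 0.2, 0.3, 0.4, 0.5],
})
dict_of_dose_ranges = {"dose_a": [1, 2], "dose_b": [5, 6]}

# Count the matching rows for every combination of doses.
rows_per_comb = {}
for dosage_comb in product(*dict_of_dose_ranges.values()):
    dosage_items = zip(dict_of_dose_ranges.keys(), dosage_comb)
    query_str = " & ".join("{} == {}".format(*x) for x in dosage_items)
    sub_df = dosage_df.query(query_str)
    rows_per_comb[dosage_comb] = len(sub_df)

print(rows_per_comb)
```

Because the columns and values come from the dictionary, adding a column or a dose level only requires changing dict_of_dose_ranges, not the loop.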

Solution 2:

What about using the underlying numpy array and some boolean logic to build an array containing only the rows you want?

import numpy as np
import pandas as pd

dosage_df = pd.DataFrame((np.random.rand(40000, 10) * 100).astype(int))
dict_of_dose_ranges = {3: [10, 11, 12, 13, 15, 20], 4: [20, 22, 23, 24]}

# combined_doses is a bool array that selects all the rows matching
# the wanted combinations of doses
combined_doses = np.ones(dosage_df.shape[0], dtype=bool)
for item in dict_of_dose_ranges.items():
    # item[0] is the kind of dose
    # item[1] are the values of that kind of dose
    next_dose = np.zeros(dosage_df.shape[0], dtype=bool)

    # we then iterate over the wanted values
    for value in item[1]:
        # we select and "logical or" all rows matching the values
        next_dose |= (dosage_df[item[0]] == value).to_numpy()
    # we "logical and" all the kinds of dose
    combined_doses &= next_dose

print(dosage_df[combined_doses])
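
The same mask can probably be built more compactly with Series.isin, which is a swapped-in alternative to the explicit inner loop rather than part of the original answer. The sketch below assumes the same integer column labels and dose dictionary as above; the seeded random generator and the names masks and selected are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical random data with integer column labels 0..9.
rng = np.random.default_rng(0)
dosage_df = pd.DataFrame(rng.integers(0, 100, size=(40000, 10)))
dict_of_dose_ranges = {3: [10, 11, 12, 13, 15, 20], 4: [20, 22, 23, 24]}

# One boolean Series per column: True where the value is among the wanted ones.
masks = [dosage_df[col].isin(vals) for col, vals in dict_of_dose_ranges.items()]

# A row is kept only if it passes the test for every column.
combined_doses = np.logical_and.reduce(masks)
selected = dosage_df[combined_doses]

print(selected)
```

Here isin plays the role of the "logical or" over values, and the reduction plays the role of the "logical and" over the kinds of dose.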
