Pandas Conditional Groupby Count
Given this data frame:
import pandas as pd
df = pd.DataFrame(
    {'A' : ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
     'D' : [2, 4, 4, 2, 5, 4, 3, 2]})
Solution 1:
Does this warning matter in this case?
I see that warning for a lot of things, and it's never once made a difference to me. I just ignore it.
Also, how does pandas know to match the rows up correctly if it's taking them from another dataframe?
pandas is using the index of the DataFrame. Here's your example, rewritten slightly for clarity:
df2 = df.query('A=="foo" and D==2')
df2['Dcount'] = len(df2)
The resulting DataFrame is:
     A  D  Dcount
0  foo  2       2
3  foo  2       2
Notice the 0 and 3 in the index? That's what pandas uses to line everything up. So I could just use the above with
df['Dcount'] = df2['Dcount']
and I will get the same result as you. The right-hand side of that assignment is a Series, so it carries its own index.
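To make the index alignment concrete, here is a minimal sketch that rebuilds the frame from the question and runs the assignment end to end. The `.copy()` call is my addition, not part of the original snippet; it is only there so that `df2` is an independent frame. Rows whose index labels are missing from `df2` should come out as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'D': [2, 4, 4, 2, 5, 4, 3, 2],
})

# .copy() is an addition to the original snippet: it makes df2 an
# independent frame rather than a possible view of df.
df2 = df.query('A == "foo" and D == 2').copy()
df2['Dcount'] = len(df2)

# Assigning a Series aligns on index labels: rows 0 and 3 receive the
# count, and every other row has no matching label, so it becomes NaN.
df['Dcount'] = df2['Dcount']
print(df)
```

Because of the NaN fill, the new column ends up as floats rather than integers.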
On the other hand, I would get an error if I had tried to assign an array:
df['Dcount'] = df2['Dcount'].values # length error
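The failure can be reproduced with a short sketch like the one below. It assumes the same frame as in the question; the `raised` flag and the `try`/`except` wrapper are only illustrative scaffolding. A bare NumPy array has no index, so pandas can only match it by position, and two values cannot fill eight rows.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'D': [2, 4, 4, 2, 5, 4, 3, 2],
})

# .copy() is an addition to the original snippet so df2 stands alone.
df2 = df.query('A == "foo" and D == 2').copy()
df2['Dcount'] = len(df2)

raised = False
try:
    # .values strips the index, leaving a plain array of length 2,
    # which cannot be matched positionally against the 8 rows of df.
    df['Dcount'] = df2['Dcount'].values
except ValueError as exc:
    raised = True
    print('Assignment failed:', exc)
```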