How To Group And Aggregate Data Starting From Constant And Ending On Changing Date?
I need to aggregate data between constant date, like first day of year, and all the other dates through the year. There are two variants of this problem: easier - sum: created_at
Solution 1:
Try with groupby
:
Cumulative sum:
df["created_at"] = pd.to_datetime(df["created_at"], format="%d-%m-%Y")
df["Month to date sum"] = df.groupby(df["created_at"].dt.month)["value"].transform('cumsum')
df["Year to date sum"] = df.groupby(df["created_at"].dt.year)["value"].transform('cumsum')
>>> df
created_at value Month to date sum Year to date sum
0 2012-01-01 5 5 5
1 2012-01-02 6 11 11
2 2012-01-05 1 12 12
3 2012-02-01 3 3 15
4 2012-02-02 2 5 17
5 2012-02-05 1 6 18
Cumulative unique count:
df2["created_at"] = pd.to_datetime(df2["created_at"], format="%d-%m-%Y")
df2["Month to date unique"] = df2.groupby(df2["created_at"].dt.month)["value"].apply(lambda x: (~x.duplicated()).cumsum())
df2["Year to date unique"] = df2.groupby(df2["created_at"].dt.year)["value"].apply(lambda x: (~x.duplicated()).cumsum())
>>> df2
created_at value Month to date unique Year to date unique
0 2012-01-01 a 1 1
1 2012-01-02 b 2 2
2 2012-01-05 c 3 3
3 2012-02-01 a 1 3
4 2012-02-02 a 1 3
5 2012-02-05 d 2 4
Post a Comment for "How To Group And Aggregate Data Starting From Constant And Ending On Changing Date?"