
How To Group And Aggregate Data Starting From Constant And Ending On Changing Date?

I need to aggregate data between a constant date, such as the first day of the year, and each of the other dates through the year. There are two variants of this problem: easier - sum: created_at

Solution 1:

Try with groupby:

Cumulative sum:
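import pandas as pd

# Parse day-month-year strings; the rows are assumed to be sorted by date.
df["created_at"] = pd.to_datetime(df["created_at"], format="%d-%m-%Y")

# Group by year-month period (not month alone) so the same month of different years is not merged.
df["Month to date sum"] = df.groupby(df["created_at"].dt.to_period("M"))["value"].cumsum()
df["Year to date sum"] = df.groupby(df["created_at"].dt.year)["value"].cumsum()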

>>> df
  created_at  value  Month to date sum  Year to date sum
0 2012-01-01      5                  5                 5
1 2012-01-02      6                 11                11
2 2012-01-05      1                 12                12
3 2012-02-01      3                  3                15
4 2012-02-02      2                  5                17
5 2012-02-05      1                  6                18
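
For anyone who wants to reproduce the table above, here is a minimal self-contained sketch. The input strings are an assumption, inferred from the printed output and the "%d-%m-%Y" format; the original question's data is not shown here. It also sorts by date first, since a cumulative sum depends on row order, and keys the monthly groups on a year-month period so that multi-year data would not mix the same month from different years.

```python
import pandas as pd

# Assumed input: day-month-year strings inferred from the printed output.
df = pd.DataFrame({
    "created_at": ["01-01-2012", "02-01-2012", "05-01-2012",
                   "01-02-2012", "02-02-2012", "05-02-2012"],
    "value": [5, 6, 1, 3, 2, 1],
})
df["created_at"] = pd.to_datetime(df["created_at"], format="%d-%m-%Y")

# Cumulative sums depend on row order, so sort chronologically first.
df = df.sort_values("created_at").reset_index(drop=True)

dates = df["created_at"]
# Key on the year-month period so January 2012 and January 2013 stay separate.
df["Month to date sum"] = df.groupby(dates.dt.to_period("M"))["value"].cumsum()
# The year-to-date total restarts on every January 1st.
df["Year to date sum"] = df.groupby(dates.dt.year)["value"].cumsum()

print(df)
```

With single-year data like this, grouping by `dt.month` and by the year-month period give the same result; the difference only shows up once the frame spans more than one year.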

Cumulative unique count:
df2["created_at"] = pd.to_datetime(df2["created_at"], format="%d-%m-%Y")

# transform keeps the original index, so the result aligns with df2 on assignment.
df2["Month to date unique"] = df2.groupby(df2["created_at"].dt.to_period("M"))["value"].transform(lambda x: (~x.duplicated()).cumsum())
df2["Year to date unique"] = df2.groupby(df2["created_at"].dt.year)["value"].transform(lambda x: (~x.duplicated()).cumsum())

>>> df2
  created_at value  Month to date unique  Year to date unique
0 2012-01-01     a                     1                    1
1 2012-01-02     b                     2                    2
2 2012-01-05     c                     3                    3
3 2012-02-01     a                     1                    3
4 2012-02-02     a                     1                    3
5 2012-02-05     d                     2                    4
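
The unique-count variant can be sketched the same way. Again the input rows are an assumption reconstructed from the printed output, and `running_unique` is a hypothetical helper name introduced only for this illustration. The idea is that `duplicated()` is False at the first occurrence of each value, so the cumulative sum of its negation gives the number of distinct values seen so far within each group.

```python
import pandas as pd

# Assumed input, reconstructed from the printed output.
df2 = pd.DataFrame({
    "created_at": pd.to_datetime(
        ["01-01-2012", "02-01-2012", "05-01-2012",
         "01-02-2012", "02-02-2012", "05-02-2012"],
        format="%d-%m-%Y",
    ),
    "value": ["a", "b", "c", "a", "a", "d"],
})


def running_unique(values: pd.Series) -> pd.Series:
    """Count the distinct values seen so far within one group."""
    # First occurrences are not duplicates, so each contributes 1 to the sum.
    return (~values.duplicated()).cumsum()


dates = df2["created_at"]
# transform returns a result indexed like df2, so plain assignment aligns.
df2["Month to date unique"] = (
    df2.groupby(dates.dt.to_period("M"))["value"].transform(running_unique)
)
df2["Year to date unique"] = (
    df2.groupby(dates.dt.year)["value"].transform(running_unique)
)

print(df2)
```

Using `transform` rather than `apply` here is a design choice: in recent pandas versions `apply` may prepend the group key to the result index, which can break the column assignment, while `transform` always returns a result aligned with the original rows.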
