
Plot Year Over Year On 12 Month Axis

I want to plot 6 years of 12-month period data on one 12-month axis from Dec - Jan.

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
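The question's data setup is cut off here. A minimal sketch of comparable sample data, assuming 72 random monthly values covering 2000 through 2005. The seed, the month-start dates, and the values themselves are assumptions, not the OP's data. The Series is named df only because that is how the answers refer to it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 72 monthly values, 2000-2005.
rng = np.random.default_rng(0)  # fixed seed so repeated runs give the same values
idx = pd.date_range("2000-01-01", periods=72, freq="MS")  # month-start dates (assumption)
df = pd.Series(rng.standard_normal(72), index=idx)

print(df.head())
print(len(df))
```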

Solution 1:

There's probably a better way than this:

In [44]: vals = df.groupby(lambda x: (x.year, x.month)).sum()

In [45]: vals
Out[45]: 
(2000, 1)    -0.235044
(2000, 2)    -1.196815
(2000, 3)    -0.370850
(2000, 4)     0.719915
(2000, 5)    -1.228286
(2000, 6)    -0.192108
(2000, 7)    -0.337032
(2000, 8)    -0.174219
(2000, 9)     0.605742
(2000, 10)    1.061558
(2000, 11)   -0.683674
(2000, 12)   -0.813779
(2001, 1)     2.103178
(2001, 2)    -1.099845
(2001, 3)     0.366811
...
(2004, 10)   -0.905740
(2004, 11)   -0.143628
(2004, 12)    2.166758
(2005, 1)     0.944993
(2005, 2)    -0.741785
(2005, 3)     1.531754
(2005, 4)    -1.106024
(2005, 5)    -1.925078
(2005, 6)     0.400930
(2005, 7)     0.321962
(2005, 8)    -0.851656
(2005, 9)     0.371305
(2005, 10)   -0.868836
(2005, 11)   -0.932977
(2005, 12)   -0.530207
Length: 72, dtype: float64

Now change the index on vals to a MultiIndex:

In [46]: vals.index = pd.MultiIndex.from_tuples(vals.index)

In [47]: vals.head()
Out[47]: 
2000  1   -0.235044
      2   -1.196815
      3   -0.370850
      4    0.719915
      5   -1.228286
dtype: float64

Then unstack and plot:

In [48]: vals.unstack(0).plot()
Out[48]: <matplotlib.axes.AxesSubplot at 0x1171a2dd0>

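A possible shortcut, not part of the original answer: grouping on the index's year and month attributes should give a (year, month) MultiIndex directly, so the from_tuples step can be skipped. A minimal sketch, assuming a monthly Series with a DatetimeIndex; the sample values and the names s and wide are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's Series.
idx = pd.date_range("2000-01-01", periods=72, freq="MS")
s = pd.Series(np.arange(72, dtype=float), index=idx)

# Group on year and month taken from the index; the result carries a
# two-level (year, month) MultiIndex, so no from_tuples step is needed.
vals = s.groupby([s.index.year, s.index.month]).sum()
vals.index.names = ["year", "month"]

# Move the year level into the columns: rows are months, columns are years.
wide = vals.unstack("year")
print(wide.shape)
```

Calling wide.plot() on the result then draws one line per year over the 12 months, the same as the unstack(0).plot() call in the answer.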

Solution 2:

  1. I think it is clearer, and easier to transform, if the data is a pandas.DataFrame, not a pandas.Series.
    • The sample data in the OP is a pandas.Series, but most people looking to solve this problem will start with a pandas.DataFrame, so we begin by using .to_frame().
  2. Extract the month and year components of the datetime index.
    • This index is already a datetime dtype; if yours is not, use pd.to_datetime() to convert the date index / column.
    • If the dates are in a column rather than the index, use the .dt accessor to get month and year (e.g. df[col].dt.year instead of df.index.year).
  3. Use pandas.pivot_table to transform the dataframe from long to wide format and aggregate the data (e.g. 'sum', 'mean', etc.).
    • This puts the dataframe into the correct shape to plot directly, without unstacking or further manipulation.
    • The index is always plotted on the x-axis, and each column is plotted as a separate line.
    • If there is no repeated data for a given 'month' and 'year' pair, so no aggregation is required, use pandas.DataFrame.pivot instead.
  4. Plot the pivoted dataframe with pandas.DataFrame.plot.
  • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3
import pandas as pd

# for this OP convert the Series to a DataFrame
df = df.to_frame()

# extract month and year from the index and create columns
df['month'] = df.index.month
df['year'] = df.index.year

# display(df.head(3))
                   0  month  year
2000-01-31  0.167921      1  2000
2000-02-29  0.523505      2  2000
2000-03-31  0.817376      3  2000

# transform the dataframe to a wide format
dfp = pd.pivot_table(data=df, index='month', columns='year', values=0, aggfunc='sum')

# display(dfp.head(3))
year       2000      2001      2002      2003      2004      2005
month
1      0.167921  0.637999 -0.174122  0.620622 -0.854315 -1.523579
2      0.523505 -0.344658 -0.280819  0.845543  0.782439 -0.593732
3      0.817376 -0.004282 -0.907424  0.352655  1.258275 -0.624112

# plot
ax = dfp.plot(ylabel='Aggregated Sum', figsize=(6, 4))
ax.set_xticks(dfp.index)  # so every month number is displayed
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')

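Step 2 mentions the case where the dates sit in a column instead of the index, and step 3 mentions pandas.DataFrame.pivot for data with no repeats. A minimal sketch of both together, assuming hypothetical column names 'date' and 'value' and a few made-up rows:

```python
import pandas as pd

# Hypothetical long-format data: dates stored as strings in a column.
df = pd.DataFrame({
    "date": ["2000-01-31", "2000-02-29", "2001-01-31", "2001-02-28"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Convert the strings to datetimes, then use the .dt accessor on the column.
df["date"] = pd.to_datetime(df["date"])
df["month"] = df["date"].dt.month
df["year"] = df["date"].dt.year

# There is exactly one row per (month, year) pair, so no aggregation is
# needed and DataFrame.pivot is enough.
dfp = df.pivot(index="month", columns="year", values="value")
print(dfp)
```

If a (month, year) pair appeared more than once, pivot would raise an error, and pandas.pivot_table with an aggfunc would be the one to use.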

  • To get month names on the axis, create the 'month' column with:
    • df['month'] = df.index.strftime('%b'), which gets the month abbreviation
from calendar import month_abbr  # month name abbreviations in calendar order; index 0 is an empty string

# extract the month abbreviation
df['month'] = df.index.strftime('%b')
df['year'] = df.index.year

# transform
dfp = pd.pivot_table(data=df, index='month', columns='year', values=0, aggfunc='sum')

# reorder the dfp index so the x-axis is in calendar order
dfp = dfp.loc[month_abbr[1:]]

# display(dfp.head(3))
year       2000      2001      2002      2003      2004      2005
month
Jan    0.167921  0.637999 -0.174122  0.620622 -0.854315 -1.523579
Feb    0.523505 -0.344658 -0.280819  0.845543  0.782439 -0.593732
Mar    0.817376 -0.004282 -0.907424  0.352655  1.258275 -0.624112

# plot
ax = dfp.plot(ylabel='Aggregated Sum', figsize=(6, 4))
ax.set_xticks(range(12))  # set ticks for all months
ax.set_xticklabels(dfp.index)  # label all the ticks
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')

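Another option, offered as an alternative rather than the answer's method: pivot on the month number, which already sorts in calendar order, and swap in the abbreviations afterwards. That avoids re-sorting the index. A minimal sketch, assuming made-up sample values and a hypothetical 'value' column:

```python
from calendar import month_abbr

import numpy as np
import pandas as pd

# Hypothetical sample data: one value per month for 2000-2005.
idx = pd.date_range("2000-01-01", periods=72, freq="MS")
df = pd.DataFrame({"value": np.arange(72, dtype=float)}, index=idx)
df["month"] = df.index.month
df["year"] = df.index.year

# Pivot on the numeric month, so the rows come out ordered 1..12.
dfp = pd.pivot_table(data=df, index="month", columns="year", values="value", aggfunc="sum")

# Replace month numbers with abbreviations; the row order is unchanged.
dfp.index = pd.Index([month_abbr[int(m)] for m in dfp.index], name="month")
print(dfp.head(3))
```

Both strftime('%b') and calendar.month_abbr follow the current locale, so the labels may differ on a non-English system.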

  • Because the data is aggregated by month, it is discrete, so it really should be plotted as a bar plot.
ax = dfp.plot(kind='bar', ylabel='Aggregated Sum', figsize=(12, 4), rot=0)
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')

