The Histograms' Color And Its Labels Are Inconsistent
I'm trying to analyze the wine-quality dataset. There are two datasets: the red wine dataset and the white wine dataset. I combined them to form wine_df, and I want to plot it.
Solution 1:
The colors are a level of your index, so use that to specify colors. Change your line of code to:
counts.plot(kind='bar', title='Counts by Wine Color and quality',
color=counts.index.get_level_values(1), alpha=.7)
This works only because the values in your index, 'red' and 'white', happen to be names that matplotlib can interpret as colors. In general, you could map the unique values to recognizable colors, for instance:
color = counts.index.get_level_values(1).map({'red': 'green', 'white': 'black'})
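To make the mapping idea concrete, here is a minimal, self-contained sketch. The toy wine_df, the palette dictionary, and the chosen color names are made up for illustration; only the column names (quality, color, pH) follow the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data standing in for the combined wine_df
wine_df = pd.DataFrame({
    "quality": [5, 5, 6, 6, 6, 7],
    "color": ["red", "white", "red", "white", "white", "white"],
    "pH": [3.5, 3.2, 3.4, 3.1, 3.3, 3.0],
})

# One count per (quality, color) pair that actually occurs in the data
counts = wine_df.groupby(["quality", "color"])["pH"].count()

# Look up each bar's wine color (the "color" index level) in an explicit palette,
# so the result does not depend on the labels being valid color names
palette = {"red": "firebrick", "white": "gold"}  # assumed palette
bar_colors = [palette[c] for c in counts.index.get_level_values("color")]

ax = counts.plot(kind="bar", color=bar_colors, alpha=0.7,
                 title="Counts by Wine Color and Quality")
```

Because the color list is built from the index itself, each bar gets the color of its own label even when some (quality, color) pairs are missing.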
pandas appears to be doing something with the plotting order, but you can always fall back to matplotlib to cycle the colors more reliably. The trick here is to convert color to a categorical variable so that every (quality, color) combination is always represented after the groupby, which lets you specify only the list ['red', 'white']:
import matplotlib.pyplot as plt
import numpy as np
wine_df['color'] = wine_df['color'].astype('category')
# observed=False keeps every (quality, color) pair, including empty ones
counts = wine_df.groupby(['quality', 'color'], observed=False).count()['pH'].fillna(0)
ind = np.arange(len(counts))
plt.bar(ind, height=counts.values, color=['red', 'white'])
_ = plt.xticks(ind, counts.index.values, rotation=90)
plt.ylim(0, 150)  # So we can see (9, white)
plt.show()
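The reason the categorical conversion matters can be checked on a small example. This is a sketch on made-up data (the toy wine_df and the names plain and full are assumptions, not from the question): without the categorical, missing pairs are simply absent, so a two-color cycle drifts out of step with the labels; with it, every quality has both a red and a white entry.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data: no white wine at quality 6, no red wine at quality 9
wine_df = pd.DataFrame({
    "quality": [5, 5, 6, 9],
    "color": ["red", "white", "red", "white"],
    "pH": [3.5, 3.2, 3.4, 3.1],
})

# Plain string column: only the pairs that occur appear in the result
plain = wine_df.groupby(["quality", "color"])["pH"].count()

# Categorical column with observed=False: unobserved pairs appear with count 0
wine_df["color"] = wine_df["color"].astype("category")
full = wine_df.groupby(["quality", "color"], observed=False)["pH"].count()

# The index now strictly alternates red, white, so a two-color list lines up
ind = np.arange(len(full))
fig, ax = plt.subplots()
ax.bar(ind, full.values, color=["red", "white"], edgecolor="black")
ax.set_xticks(ind)
ax.set_xticklabels([str(label) for label in full.index], rotation=90)
```

The black edge color is only there so the white bars remain visible on a white background.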