Plotting a stacked bar chart with a total line and dates on the x-axis labels
I am using the pandas plot method to generate a stacked bar chart, which behaves differently from matplotlib's own bar plot, but the dates always come out in a bad format and I could not change it. I would also like to add a "total" line to the chart, but when I try to add it, the previously drawn bars are erased. I want to make a chart like the one below (generated by Excel). The black line is the sum of the bars.
I've looked at some solutions online, but they only look good when there are few bars, so that there is some space between the labels.
Here is the best I could do; the code I used is below.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as plticker
# DATA (not the full series from the chart)
dates = ['2016-10-31', '2016-11-30', '2016-12-31', '2017-01-31', '2017-02-28', '2017-03-31',
'2017-04-30', '2017-05-31', '2017-06-30', '2017-07-31', '2017-08-31', '2017-09-30',
'2017-10-31', '2017-11-30', '2017-12-31', '2018-01-31', '2018-02-28', '2018-03-31',
'2018-04-30', '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31', '2018-09-30',
'2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31', '2019-02-28', '2019-03-31']
variables = {'quantum ex sa': [6.878011, 6.557054, 3.229360, 3.739318, 1.006442, -0.117945,
-1.854614, -2.882032, -1.305225, 0.280100, 0.524068, 1.847649,
5.315940, 4.746596, 6.650303, 6.809901, 8.135243, 8.127328,
9.202209, 8.146417, 6.600906, 6.231881, 5.265775, 3.971435,
2.896829, 4.307549, 4.695687, 4.696656, 3.747793, 3.366878],
'price ex sa': [-11.618681, -9.062433, -6.228452, -2.944336, 0.513788, 4.068517,
6.973203, 8.667524, 10.091766, 10.927501, 11.124805, 11.368854,
11.582204, 10.818471, 10.132152, 8.638781, 6.984159, 5.161404,
3.944813, 3.723371, 3.808564, 4.576303, 5.170760, 5.237303,
5.121998, 5.502981, 5.159970, 4.772495, 4.140812, 3.568077]}
df = pd.DataFrame(index=pd.to_datetime(dates), data=variables)
# PLOTTING
ax = df.plot(kind='bar', stacked=True, width=1)
# df['Total'] = df.sum(axis=1)
# df['Total'].plot(ax=ax)
ax.axhline(0, linewidth=1)
ax.yaxis.set_major_formatter(plticker.PercentFormatter())
plt.tight_layout()
plt.show()
Edit
This is what worked best for me. It works better than the pandas df.plot(kind='bar', stacked=True) call because it allows better formatting of the date labels on the x-axis and supports any number of series for the bars.
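fig, ax = plt.subplots()
for count, col in enumerate(df.columns):
    # Sum of the series that have already been drawn
    old = df.iloc[:, :count].sum(axis=1)
    # Stack on the earlier bars only where the signs agree; otherwise start from zero
    bottom_series = ((old >= 0) == (df[col] >= 0)) * old
    ax.bar(df.index, df[col], label=col, bottom=bottom_series, width=31)
# Compute the total only after all bars are drawn, so it is not plotted as a bar
df['Total'] = df.sum(axis=1)
ax.plot(df.index, df['Total'], color='black', label='Total')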
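One caveat about the loop above: `old` is the sum of all earlier columns regardless of sign, so with three or more series of mixed sign a bar can end up overlapping an earlier one. The following is only a sketch of one possible alternative, not part of the original post. It tracks the positive and negative stacks separately; the helper name `stacked_bottoms` and the `demo` frame are made up for illustration.

```python
import pandas as pd


def stacked_bottoms(df):
    """Return bar bottoms so that positive values stack upward from zero
    and negative values stack downward from zero (hypothetical helper)."""
    pos = df.clip(lower=0)  # keep positive values, zero elsewhere
    neg = df.clip(upper=0)  # keep negative values, zero elsewhere
    # Running total of the *earlier* columns with the same sign
    pos_bottom = pos.cumsum(axis=1) - pos
    neg_bottom = neg.cumsum(axis=1) - neg
    # Non-negative values sit on the positive stack, the rest on the negative one
    return pos_bottom.where(df >= 0, neg_bottom)


# Tiny made-up frame with mixed signs in a single row
demo = pd.DataFrame({'a': [1], 'b': [-2], 'c': [3], 'd': [-4]})
bottoms = stacked_bottoms(demo)
```

In the plotting loop, `bottom=bottoms[col]` would then take the place of `bottom_series`, assuming `bottoms` was computed before the 'Total' column is added.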
This should be what you want:
fig, ax = plt.subplots(1,1, figsize=(16,9))
# PLOTTING
ax.bar(df.index, df['price ex sa'], bottom=df['quantum ex sa'],width=31, label='price ex sa')
ax.bar(df.index, df['quantum ex sa'], width=31, label='quantum ex sa')
total = df.sum(axis=1)
ax.plot(total.index, total, color='r', linewidth=3, label='total')
ax.legend()
plt.show()
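Because the code above passes the datetime index straight to ax.bar, the x-axis is a real date axis, so matplotlib's date locators and formatters can control the labels. A minimal sketch follows, assuming a small made-up `sample` frame in place of the real data; the three-month tick interval and the '%Y-%m' format are arbitrary choices.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up month-end data standing in for df
sample = pd.DataFrame(
    {'a': [1.0, 2.0, -1.0, 3.0, 2.5, 1.5]},
    index=pd.to_datetime(['2018-01-31', '2018-02-28', '2018-03-31',
                          '2018-04-30', '2018-05-31', '2018-06-30']),
)

fig, ax = plt.subplots()
ax.bar(sample.index, sample['a'], width=25, label='a')

# One tick every third month, labelled as year-month
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
ax.legend()
```

This only applies when the bars are drawn with ax.bar on dates; it does not carry over to df.plot(kind='bar'), which places bars at integer positions.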
There seems to be a bug (or feature) when plotting with a datetime index. I tried converting the index to strings and it works:
df.index=df.index.strftime('%Y-%m')
ax = df.plot(kind='bar', stacked=True, width=1)
df['Total'] = df.sum(axis=1)
df['Total'].plot(ax=ax, label='total')
ax.legend()
Edit 2: I think I know what's going on. The problem is that
ax = df.plot(kind='bar', stacked=True)
sets the x-axis of ax to range(len(df)), with the ticks labeled by the corresponding values from df.index rather than positioned at df.index itself. That is why a second series plotted on the same ax against df.index does not show up properly (its x-axis scale is different). So I tried:
# PLOTTING
ax = df.plot(kind='bar', stacked=True, width=1, figsize=(10, 6))
# The bars sit at positions 0..len(df)-1, so draw the total line at the same positions
ax.plot(range(len(df)), df.sum(axis=1), label='Total')
ax.legend()
plt.show()
It works as expected.
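Since the pandas bar plot places the bars at integer positions, the crowded date labels from the question can be thinned and reformatted by setting the ticks at those positions by hand. This is a sketch under assumptions, not the original poster's code: the `sample` frame is made up, and the `step` of 3 and the '%Y-%m' format are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up month-end data standing in for df
sample = pd.DataFrame(
    {'a': [1.0, 2.0, -1.0, 3.0, 2.5, 1.5],
     'b': [0.5, -1.0, 2.0, 1.0, -0.5, 2.0]},
    index=pd.to_datetime(['2018-01-31', '2018-02-28', '2018-03-31',
                          '2018-04-30', '2018-05-31', '2018-06-30']),
)

ax = sample.plot(kind='bar', stacked=True, width=1)
# Total line drawn at the same integer positions as the bars
ax.plot(range(len(sample)), sample.sum(axis=1), color='black', label='Total')

# Keep every third label and format it as year-month
step = 3
positions = list(range(0, len(sample), step))
ax.set_xticks(positions)
ax.set_xticklabels([sample.index[i].strftime('%Y-%m') for i in positions],
                   rotation=45, ha='right')
ax.legend()
```

A larger `step` leaves more room between labels when the chart has many bars.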