Plotting a histogram on a log scale with Matplotlib
I have a pandas DataFrame with a Series that contains the following values:
x = pd.Series([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])
I was instructed to plot two histograms in a Jupyter notebook with Python 3.6. No sweat, right?
x.plot.hist(bins=8)
plt.show()
I chose 8 bins because that looked best to me. I have also been instructed to plot another histogram with the log of x.
x.plot.hist(bins=8)
plt.xscale('log')
plt.show()
This histogram looks TERRIBLE. Am I doing something wrong? I've tried fiddling with the plot, but everything I've tried just seems to make the histogram look even worse. Example:
x.plot(kind='hist', logx=True)
I was not given any instructions other than to plot the log of x as a histogram.
I really appreciate any help!!!
For the record, I have imported pandas, numpy, and matplotlib and specified that the plot should be inline.
Specifying bins=8 in the hist call means that the range between the minimum and maximum value is divided equally into 8 bins. Bins that are equal in width on a linear scale are distorted on a log scale: the leftmost bin is stretched across most of the axis while the remaining bins are squeezed together on the right.
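To make the distortion concrete, here is a small sketch that computes the eight equal-width bin edges for the data in the question and compares their widths on the linear axis with their widths after taking log10. It uses np.histogram_bin_edges, which is an assumption about the NumPy version available (1.15 or later); the variable names are only illustrative.

```python
import numpy as np

x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7,
     19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]

# Eight equal-width bins between min(x) and max(x), as bins=8 would produce.
edges = np.histogram_bin_edges(x, bins=8)

# Width of each bin on the linear axis: all the same.
linear_widths = np.diff(edges)

# Width of each bin as drawn on a log10 axis: shrinking from left to right.
log_widths = np.diff(np.log10(edges))

print("edges:        ", edges)
print("linear widths:", linear_widths)
print("log10 widths: ", log_widths)
```

With this data the first bin alone takes up more than half of the log axis, which is why the log-scaled plot looks so lopsided.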
What you could do is specify the bins of the histogram so that they are unequal in width on the linear scale, in a way that makes them look equal on a logarithmic scale.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7,
19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]
x = pd.Series(x)
# histogram on linear scale
plt.subplot(211)
hist, bins, _ = plt.hist(x, bins=8)
# histogram on log scale.
# Use non-equal bin sizes, such that they look equal on log scale.
logbins = np.logspace(np.log10(bins[0]), np.log10(bins[-1]), len(bins))
plt.subplot(212)
plt.hist(x, bins=logbins)
plt.xscale('log')
plt.show()
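If "plot the log of x" is instead meant literally, another reading is to transform the data first and then draw an ordinary equal-width histogram of the transformed values. The sketch below assumes that interpretation and assumes all values are strictly positive. plot_log10_hist is a hypothetical helper name, not something from pandas or Matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt

x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7,
     19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]


def plot_log10_hist(values, bins=8):
    """Hypothetical helper: equal-width histogram of log10(values)."""
    # log10 is only defined for positive values; this sketch assumes values > 0.
    logged = np.log10(np.asarray(values, dtype=float))
    fig, ax = plt.subplots()
    counts, edges, _ = ax.hist(logged, bins=bins)
    ax.set_xlabel("log10(x)")
    ax.set_ylabel("count")
    return counts, edges


counts, edges = plot_log10_hist(x)
```

In a notebook with inline plotting the figure should render on its own; elsewhere, call plt.show() afterwards. The bins here are equal in log space, so the bars should have the same shape as in the log-spaced version above. The difference is the axis, which is labelled in log10 units rather than in the original units of x.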