I have a pandas DataFrame that has the following values in a Series
x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7, 19, 81,
Plotting "another histogram with the log of x" is not the same as plotting x on the logarithmic scale. Plotting the logarithm of x would be:
np.log(x).plot.hist(bins=8)
plt.show()
The difference is that the values of x themselves were transformed: we are looking at their logarithm.
This is different from plotting on the logarithmic scale, where we keep x the same but change the way the horizontal axis is marked up (which squeezes the bars to the right, and stretches those to the left).
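To make the distinction concrete without drawing anything, here is a rough sketch that bins the same data both ways. It uses the full list of values from the question, base-10 logs instead of np.log, and np.geomspace for the log-spaced edges; those choices are mine, not part of the answer above.

```python
import numpy as np

x = np.array([2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27,
              22, 1, 12, 7, 19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1])

# Approach 1: transform the data, then bin the transformed values linearly.
counts_log, edges_log = np.histogram(np.log10(x), bins=8)

# Approach 2: keep the data as-is, but use bin edges that are evenly spaced
# on a log scale (9 edges for 8 bins).
edges = np.geomspace(x.min(), x.max(), 9)
counts_raw, _ = np.histogram(x, bins=edges)

print(edges_log)        # edges in log10 units
print(np.log10(edges))  # the same positions, reached from the raw-value edges
print(counts_log)
print(counts_raw)
```

The bar heights come out the same either way. What differs is the axis: the first approach is labelled in log units, while the second keeps the original values of x and only becomes evenly spaced once the axis is switched to a log scale.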
Here is one more solution without using a subplot or plotting two things in the same image.
import numpy as np
import matplotlib.pyplot as plt
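def plot_loghist(x, bins):
    # Compute linear bin edges first to get the data range and the edge count.
    hist, bins = np.histogram(x, bins=bins)
    # Respace the same number of edges evenly on a log scale.
    logbins = np.logspace(np.log10(bins[0]), np.log10(bins[-1]), len(bins))
    plt.hist(x, bins=logbins)
    plt.xscale('log')

plot_loghist(np.random.rand(200), 10)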
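One caveat: log-spaced edges only make sense for strictly positive data, since np.log10 of zero or a negative number is not finite. A minimal sketch of a guard, assuming you are happy to raise an error in that case; log_bin_edges is a hypothetical helper name, and np.geomspace is used here in place of np.logspace so the edges can be built straight from the minimum and maximum:

```python
import numpy as np

def log_bin_edges(x, n_bins):
    """Return n_bins + 1 edges evenly spaced on a log scale (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.size == 0 or np.any(x <= 0):
        raise ValueError("log-spaced bins need a non-empty array of positive values")
    # geomspace takes the endpoints directly rather than their logarithms.
    return np.geomspace(x.min(), x.max(), n_bins + 1)

edges = log_bin_edges([1, 10, 100, 1000], 3)
print(edges)
```

The result can be passed as bins=edges to plt.hist, followed by plt.xscale('log'), the same way logbins is used above. The sketch does not handle the case where all values are identical.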
Specifying bins=8 in the hist call means that the range between the minimum and maximum value is divided equally into 8 bins. What is equal on a linear scale is distorted on a log scale.
What you could do is specify the bins of the histogram such that they are unequal in width in a way that would make them look equal on a logarithmic scale.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
x = [2, 1, 76, 140, 286, 267, 60, 271, 5, 13, 9, 76, 77, 6, 2, 27, 22, 1, 12, 7,
     19, 81, 11, 173, 13, 7, 16, 19, 23, 197, 167, 1]
x = pd.Series(x)
# histogram on linear scale
plt.subplot(211)
hist, bins, _ = plt.hist(x, bins=8)
# histogram on log scale.
# Use non-equal bin sizes, such that they look equal on log scale.
logbins = np.logspace(np.log10(bins[0]), np.log10(bins[-1]), len(bins))
plt.subplot(212)
plt.hist(x, bins=logbins)
plt.xscale('log')
plt.show()
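If you want to check the "equal on a linear scale is distorted on a log scale" point numerically, a small sketch along these lines should do it. The range 1 to 286 matches the minimum and maximum of x above; the variable names are just for illustration.

```python
import numpy as np

lo, hi, n_bins = 1.0, 286.0, 8

# Edges as hist(bins=8) would place them: equal width in the data units.
linear_edges = np.linspace(lo, hi, n_bins + 1)
# Edges spaced evenly in log10, as in the logbins line above.
log_edges = np.logspace(np.log10(lo), np.log10(hi), n_bins + 1)

# Width of each bin as it would appear on a log-scaled axis.
linear_widths_on_log_axis = np.diff(np.log10(linear_edges))
log_widths_on_log_axis = np.diff(np.log10(log_edges))

print(linear_widths_on_log_axis)  # shrinks from left to right
print(log_widths_on_log_axis)     # constant
```

On a log axis the first linear bin is far wider than the last one, which is the stretching and squeezing you see in the plot, while the log-spaced bins all have the same apparent width.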