Histogram manipulation to remove unwanted data

Submitted by 喜你入骨 on 2019-12-11 00:14:11

Question


How do I remove data from a histogram in Python when its bin falls below a certain frequency count?

Say I have 10 bins, the first bin has a count of 4, the second has 2, the third has 1, fourth has 5, etc... Now I want to get rid of the data that has a count of 2 or less. So the second bin would go to zero, as would the third.

Example:

import numpy as np
import matplotlib.pyplot as plt

gaussian_numbers = np.random.randn(1000)
plt.hist(gaussian_numbers, bins=12)
plt.title("Gaussian Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")

fig = plt.gcf()

This gives a histogram plot (image not included).

I want to get rid of the bins whose frequency is below some value X (for example, X = 100).

Desired result: the same histogram with those bins emptied (image not included).

Thank you.


Answer 1:


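Use np.histogram to create the histogram.

Then use np.where. Given a condition, it returns the indices at which the condition holds, and you can use those indices to index your histogram. (The boolean array produced by the condition itself, hist <= freq, can also be used directly as an index.)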
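As a small illustration of that indexing step, here is a minimal sketch on a hand-written array of bin counts. The counts and the threshold of 2 are made up to resemble the numbers in the question; they are not taken from real data. It shows that indexing with the result of np.where and indexing with the boolean condition itself zero out the same bins.

```python
import numpy as np

# Hypothetical bin counts, loosely following the numbers in the question
hist = np.array([4, 2, 1, 5, 3, 0, 7, 2, 6, 1])
freq = 2  # threshold: bins with a count of 2 or less are zeroed

idx = np.where(hist <= freq)  # tuple holding one array of matching indices
mask = hist <= freq           # boolean array with the same length as hist

by_index = hist.copy()
by_index[idx] = 0             # index with the positions from np.where

by_mask = hist.copy()
by_mask[mask] = 0             # index with the boolean mask directly

print(idx[0])                              # positions of the low-count bins
print(by_index)                            # counts after thresholding
print(np.array_equal(by_index, by_mask))   # both approaches agree
```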

import numpy as np
import matplotlib.pyplot as plt

gaussian_numbers = np.random.randn(1000)

# Get histogram
hist, bins = np.histogram(gaussian_numbers, bins=12)

# Threshold frequency
freq = 100

# Zero out low values
hist[np.where(hist <= freq)] = 0

# Plot
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.title("Gaussian Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")

(The plotting part is inspired by another example.)
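The answer above zeroes the counts but leaves the underlying samples untouched. If the goal is to drop the samples themselves, one possible approach (not part of the original answer) is to map each sample to its bin with np.digitize and keep only the samples whose bin count exceeds the threshold. The seeded generator and the variable names below are assumptions made for illustration.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (an assumption, not in the original)
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

counts, edges = np.histogram(samples, bins=12)
freq = 100  # threshold frequency, as in the answer

# Map each sample to a bin index in 0..11. Passing only the interior edges
# keeps the maximum value in the last bin, matching np.histogram, whose
# final bin includes its right edge.
bin_idx = np.digitize(samples, edges[1:-1])

# Keep the samples that fall into bins with more than `freq` entries
keep = counts[bin_idx] > freq
filtered = samples[keep]

# Re-binning the filtered samples on the same edges empties the low-count bins
new_counts, _ = np.histogram(filtered, bins=edges)
print(counts)
print(new_counts)
```

The filtered array can then be passed to plt.hist with bins=edges to draw the thresholded histogram without building the bars by hand.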



Source: https://stackoverflow.com/questions/36341774/histogram-manipulation-to-remove-unwanted-data
