How to make a histogram from this nc file?

Posted by 核能气质少年 on 2020-01-06 12:47:52

Question


I'm a research assistant and I've recently started to learn python to interpret model output in netCDF file format. Let me give a quick background on my question:

I have already searched through a certain grid area of a netCDF file using the netCDF4 module and stored an array of times, which I then converted to a list of dates using netCDF4's num2date feature. I have shown my code below. Please note that restrictedrange is a subset of a variable from an nc file and rmduplicates() is not shown.

import netCDF4 as nc
import numpy as np
import matplotlib.pyplot as pyp
import matplotlib as mpl
import datetime as dtm
flor = nc.Dataset('FLOR.slp_subset1.nc','r')    

times = []
timecounter = .25
for i in restrictedrange:
    for j in np.nditer(i):
        if j <= 975:
            times.append(timecounter)
    timecounter += .25
uniquetimes = rmduplicates(times)
dates = nc.num2date(uniquetimes,'days since 0001-01-01 00:00:00','julian')

stacked_dates = []
for date in dates:
    stacked_dates.append(date.replace(year=1))
stacked_dates = mpl.dates.date2num(stacked_dates)

fig = pyp.figure()
ax = pyp.subplot(111)
ax.xaxis.set_major_locator(mpl.dates.MonthLocator())
format = mpl.dates.DateFormatter('%m/%d')
ax.xaxis.set_major_formatter(format)

ax.hist(stacked_dates)

pyp.xticks(rotation='vertical')

pyp.show()

Now I have a list of dates in the format "(y)yy-mm-dd hh:mm:ss". I would like to take those dates and make a histogram by month (possibly using matplotlib, or whatever is best for this). So the bars are frequencies and the bins are months. Also, if it wasn't clear from my format, some years have three digits and some only two, but none have just one.

Again, I'm quite new to python so I appreciate any help and I apologize if this question is poorly formatted, as I have never used this website.

Thanks!


Answer 1:


I don't know what your data looks like, but here's a mock example of how to make a histogram with months/days on the x axis.

I can only assume that you start with a list of datetime objects, but I can't figure out what nc is (is that the matplotlib.dates module?) or exactly what kind of times are in uniquetimes. So, in general, this is the approach.

You will need these modules:

import matplotlib as mpl
import matplotlib.pyplot as plt
import datetime

These are the mock dates I've used for this example. There are only 11 of them, one per month from January to November, so nearly every bin will end up with a count of 1.

dates = []
for i in range(1, 12):
    dates.append(datetime.datetime(i*5+1960, i, i, i, i, i))

[datetime.datetime(1965, 1, 1, 1, 1, 1), datetime.datetime(1970, 2, 2, 2, 2, 2), datetime.datetime(1975, 3, 3, 3, 3, 3), datetime.datetime(1980, 4, 4, 4, 4, 4), datetime.datetime(1985, 5, 5, 5, 5, 5), datetime.datetime(1990, 6, 6, 6, 6, 6), datetime.datetime(1995, 7, 7, 7, 7, 7), datetime.datetime(2000, 8, 8, 8, 8, 8), datetime.datetime(2005, 9, 9, 9, 9, 9), datetime.datetime(2010, 10, 10, 10, 10, 10), datetime.datetime(2015, 11, 11, 11, 11, 11)]

If, as in the example above, you're dealing with different years, you'll have to "stack" them yourself. Otherwise the date2num function I'll use later will produce wildly different numbers. To "stack" them means to convert them as if they all happened in the same year.

stacked_dates = []
for date in dates:
    stacked_dates.append( date.replace(year=2000)  )

>>> stacked_dates
[datetime.datetime(2000, 1, 1, 1, 1, 1), datetime.datetime(2000, 2, 2, 2, 2, 2), datetime.datetime(2000, 3, 3, 3, 3, 3), datetime.datetime(2000, 4, 4, 4, 4, 4), datetime.datetime(2000, 5, 5, 5, 5, 5), datetime.datetime(2000, 6, 6, 6, 6, 6), datetime.datetime(2000, 7, 7, 7, 7, 7), datetime.datetime(2000, 8, 8, 8, 8, 8), datetime.datetime(2000, 9, 9, 9, 9, 9), datetime.datetime(2000, 10, 10, 10, 10, 10), datetime.datetime(2000, 11, 11, 11, 11, 11)]
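
A note on the choice of year when stacking: datetime's replace() raises ValueError if the result is not a valid date, so a 29 February date can only be moved into a leap year. The sketch below uses only the standard library; the helper name stack_dates and its reference_year parameter are my own, chosen for illustration.

```python
import datetime

def stack_dates(dates, reference_year=2000):
    """Move every date into the same year, keeping month, day and time.

    reference_year defaults to 2000, a leap year, so that 29 February
    dates survive the replacement. (Illustrative helper, not a library call.)
    """
    return [d.replace(year=reference_year) for d in dates]

sample = [
    datetime.datetime(1996, 2, 29, 6, 0, 0),      # a leap day
    datetime.datetime(2015, 11, 11, 11, 11, 11),
]

stacked = stack_dates(sample)  # works: 2000 has a 29 February

try:
    stack_dates(sample, reference_year=2001)  # 2001 has no 29 February
    leap_day_failed = False
except ValueError:
    leap_day_failed = True
```

If your data can contain leap days, picking a leap year as the common year avoids this error.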

Ok. Now we can use the date2num function to get something mpl actually understands. (Btw, if you just want to plot this data, you can use the plt.plot_date function, which understands datetime objects.)

stacked_dates = mpl.dates.date2num(stacked_dates)

>>> stacked_dates
array([ 730120.04237269,  730152.08474537,  730182.12711806,
        730214.16949074,  730245.21186343,  730277.25423611,
        730308.2966088 ,  730340.33898148,  730372.38135417,
        730403.42372685,  730435.46609954])
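
For intuition, the numbers above are days counted from year 1, with the time of day as the fractional part. The sketch below reproduces the first two values with the standard library only; days_since_year_one is a hypothetical helper, not a matplotlib function. As far as I know, matplotlib 3.3 and later use 1970-01-01 as the default epoch, so date2num on a recent install returns much smaller numbers (around 10957 for early 2000). The plotting code works the same either way.

```python
import datetime

def days_since_year_one(d):
    """Proleptic Gregorian ordinal of the date plus the fraction of
    the day that has elapsed (hypothetical helper for illustration)."""
    seconds_into_day = d.hour * 3600 + d.minute * 60 + d.second
    return d.toordinal() + seconds_into_day / 86400.0

first = days_since_year_one(datetime.datetime(2000, 1, 1, 1, 1, 1))
second = days_since_year_one(datetime.datetime(2000, 2, 2, 2, 2, 2))
```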

Ok, now for the plotting itself. mpl can understand these numbers, but it will not automatically assume they are dates. It will treat them as normal numbers. That's why we've got to tell the x axis that they're actually dates. Do that with set_major_formatter and set_major_locator.

fig = plt.figure()
ax = plt.subplot(111)
ax.xaxis.set_major_locator(mpl.dates.MonthLocator())
format = mpl.dates.DateFormatter('%m/%d') #explore other options of display
ax.xaxis.set_major_formatter(format)

ax.hist(stacked_dates) #plot the damned thing

plt.xticks(rotation='vertical') #avoid overlapping numbers
                           #make sure you do this AFTER .hist function

plt.show()

This code produces the following graph:

Do note that there's a chance you won't be able to see the dates on your graph because they'll run off screen (labels like these can be long and don't fit below the plot). In that case press the "configure subplots" button and adjust the value for "bottom". In the script you can do that with plt.subplots_adjust(bottom=.3) or some other value.

You should also take care to specify 12 bins, as in ax.hist(stacked_dates, bins=12), because the default is 10, which will look funky like my graph.
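
Note that bins=12 splits the range between the earliest and latest date into 12 equal-width bins, which do not line up exactly with calendar months. One way to get true monthly bins is to pass explicit bin edges instead of a count. The following is only a sketch of that idea: month_edges and plot_month_histogram are names I made up, and the plotting function imports matplotlib lazily so the edge helper can be used on its own.

```python
import datetime

def month_edges(year=2000):
    """Return 13 datetimes: the first day of each month of `year`,
    plus 1 January of the following year as the closing edge."""
    edges = [datetime.datetime(year, month, 1) for month in range(1, 13)]
    edges.append(datetime.datetime(year + 1, 1, 1))
    return edges

def plot_month_histogram(stacked_datetimes, year=2000):
    """Histogram of datetime objects (all already moved into `year`)
    with one bin per calendar month."""
    import matplotlib.dates as mdates
    import matplotlib.pyplot as plt

    values = mdates.date2num(stacked_datetimes)   # data as date numbers
    bins = mdates.date2num(month_edges(year))     # 13 edges -> 12 bins

    fig, ax = plt.subplots()
    ax.hist(values, bins=bins)
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%m/%d'))
    plt.xticks(rotation='vertical')
    plt.subplots_adjust(bottom=.3)
    return fig, ax
```

You would call plot_month_histogram with the list of stacked datetime objects (before converting them with date2num) and then show the figure as usual.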

There's also a simpler, albeit less customizable, possibility: use a bar plot instead of a histogram. Read about it HERE. But it really depends on what kind of information you have. If it's a lot of dates, it's probably easier to let the hist function calculate the bin heights than to do it yourself. If it's some other kind of data, it's worthwhile to consider using a bar plot.
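
For the bar plot route, the counting can be done by hand with collections.Counter. This is a minimal sketch, assuming each item exposes a .month attribute the way datetime.datetime does; monthly_counts and plot_monthly_bars are hypothetical names, and no stacking or date2num conversion is needed because only the month is read.

```python
import datetime
from collections import Counter

def monthly_counts(dates):
    """Return a list of 12 counts, January first.

    Only the .month attribute of each item is used.
    """
    counts = Counter(d.month for d in dates)
    return [counts[month] for month in range(1, 13)]  # missing months count as 0

def plot_monthly_bars(counts):
    """Draw one bar per month (matplotlib is imported lazily)."""
    import matplotlib.pyplot as plt

    months = list(range(1, 13))
    fig, ax = plt.subplots()
    ax.bar(months, counts)
    ax.set_xticks(months)
    ax.set_xlabel('month')
    ax.set_ylabel('frequency')
    return fig, ax

# Same mock dates as in the example above: one per month, January to November.
mock_dates = [datetime.datetime(i * 5 + 1960, i, i, i, i, i) for i in range(1, 12)]
counts = monthly_counts(mock_dates)
```

Passing counts to plot_monthly_bars would then draw the chart. The trade-off is that you compute the bar heights yourself instead of letting hist do it.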

The complete script would be something like:

import matplotlib as mpl
import matplotlib.pyplot as plt
import datetime

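# mock dates; substitute your own list of datetime objects
dates = []
for i in range(1, 12):
    dates.append(datetime.datetime(i*5+1960, i, i, i, i, i))

# move all dates into the same year
stacked_dates = []
for date in dates:
    stacked_dates.append(date.replace(year=2000))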

stacked_dates = mpl.dates.date2num(stacked_dates)

fig = plt.figure()
ax = plt.subplot(111)
ax.xaxis.set_major_locator(mpl.dates.MonthLocator())
format = mpl.dates.DateFormatter('%m/%d')
ax.xaxis.set_major_formatter(format)

ax.hist(stacked_dates)

plt.xticks(rotation='vertical')  
plt.show()


Source: https://stackoverflow.com/questions/28618831/how-to-make-a-histogram-from-this-nc-file
