Question
I'm currently trying to run my climate model with new meteorology data, which is provided in netCDF format instead of the old Cray format. The model compiles without problems. However, the simulation runs fine for the first day and then stops on the second day, always at the same time step, no matter which start date I use. The error is:
forrtl: severe (408): fort: (2): Subscript #1 of the array TIMEVALS has value 141 which is greater than the upper bound of 140.
I did some research and went through my source code to see which array TIMEVALS refers to. It holds the time dimension of the new meteorology data, and that time array has 140 elements. Each element is a specific date and time of the meteorological data that the model is supposed to use for the simulation. So I started to believe that the problem is in my code, but my colleague has been able to run the model with no issues, which seemed strange to me. He compiled the model with some different settings in the Makefile; I don't know whether this matters, as I'm still not very familiar with Fortran. Here is the part of the code that uses the TIMEVALS array:
    CASE(2) ! nudging data is in netcdf-format
      cfile = str_filter(ndg_file_nc,yr,mo,dy,hr,mi,se,ndgblock)
      CALL message(' Adjust date using file: ',TRIM(cfile))
      IF (p_parallel_io) THEN
        INQUIRE(file=cfile,exist=found)
        IF (.NOT.found) &
          CALL finish('NudgingInit','Nudging data file not found.')
        ndgfile%format = NETCDF
        CALL IO_open (cfile, ndgfile, IO_READ)
        CALL IO_INQ_DIMID(ndgfile%file_id, 'time', ndimid)
        CALL IO_INQ_DIMLEN(ndgfile%file_id, ndimid, nts)
        CALL IO_INQ_VARID(ndgfile%file_id, 'time', nvarid)
        ALLOCATE (timevals(nts))
        CALL IO_GET_VAR_DOUBLE (ndgfile%file_id, nvarid, timevals)
        ihead_nc(1) = FLOOR(timevals(1)) ! ihead_nc(1) is YYYYMMDD
        ihead_nc(2) = INT((timevals(1)-ihead_nc(1))*24._dp) ! ihead_nc(2) is HH
        DEALLOCATE (timevals)
      ENDIF
      IF (p_parallel) CALL p_bcast(ihead_nc, p_io)
      CALL inp_convert_date(ihead_nc(1),ihead_nc(2)*10000, ndg_date0)
      IF (p_parallel_io) THEN
        ! skip first record and read second header
        ALLOCATE (timevals(nts))
        CALL IO_GET_VAR_DOUBLE (ndgfile%file_id, nvarid, timevals)
        ihead_nc(1) = FLOOR(timevals(2)) ! ihead_nc(1) is YYYYMMDD
        ihead_nc(2) = INT((timevals(2)-ihead_nc(1))*24._dp) ! ihead_nc(2) is HH
        DEALLOCATE (timevals)
        CALL IO_close(ndgfile)
      ENDIF
      IF (p_parallel) CALL p_bcast(ihead_nc, p_io)
      CALL inp_convert_date(ihead_nc(1),ihead_nc(2)*10000, ndg_date1)
ndg_file and ndg_date refer to the nudging (meteorological) data.
Do you guys have any idea of what might cause this error?
Answer 1:
I've got some time now to elaborate on my earlier comment. (Note that I use italics to denote terms you might care to read about.)
The error you report is a run-time error, not one that the compiler is able to see at compile-time. If you don't understand the difference between run-time (i.e. when the code executes) and compile-time (i.e. when the compiler turns your sources into executable code), do some research. Furthermore, it's evident that you (or someone) have instructed the compiler to create a version of the code which checks that array element accesses are within array bounds. This is a very important safety feature when testing new software, but it imposes a performance penalty when the code executes, so many codes are, once they've passed their tests, compiled without this checking.
I don't know what compiler you're using but look at its documentation to find an option that turns on array bounds checking at run-time.
The error message is quite explicit -- at some point in your code it has tried to access element 141 of an array with only 140 elements. We can't tell you how this has happened, probably not even if we saw your entire code. This kind of thing often happens when data is loaded that doesn't conform to the programmer's expectations. It also often happens when programmers make off-by-one errors in writing loops. We might spot that from looking at your whole code, but you're in a much better position to do that than we are.
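As an illustration of the off-by-one case, here is a minimal sketch. It is not taken from your model: the program name, the loop, and the made-up 6-hourly time values are all assumptions, and only the array name and its length of 140 come from the error message. It shows a loop that looks one record ahead and therefore has to stop one element before the end of the array.

```fortran
program off_by_one_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nts = 140          ! assumed: length reported in the error message
  real(dp), allocatable :: timevals(:)
  real(dp) :: dt
  integer :: i

  allocate (timevals(nts))

  ! Made-up data for illustration: times in days at 6-hour spacing.
  do i = 1, nts
    timevals(i) = 0.25_dp * real(i - 1, dp)
  end do

  ! The loop body reads timevals(i + 1). Written as "do i = 1, nts" it would
  ! touch element nts + 1 = 141 on its last pass, which is exactly the kind
  ! of access the run-time check reports. Stopping at size - 1 avoids that.
  dt = 0.0_dp
  do i = 1, size(timevals) - 1
    dt = timevals(i + 1) - timevals(i)
  end do

  print '(a,f6.3,a)', 'last interval: ', dt, ' days'

  deallocate (timevals)
end program off_by_one_demo
```

If you change the loop bound to nts and compile with bounds checking (`-check bounds` with Intel Fortran, `-fcheck=bounds` with gfortran), you should see a message of the same kind as the one you reported.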
You write:
"but my colleague has been able to run the model with no issues, which was strange to me. He compiled the model with some different settings in the Makefile, I don't know if this matters,"
Well, yes, this matters, it matters a lot. If you write code that accesses element 141 of an array with 140 elements, Fortran compiled without bounds checking will, like many other compiled languages, happily access the next location in memory after element 140. In general you haven't a clue what data the program is interfering with. If you are lucky, the next location in memory is outside the address space the operating system has allocated to the program, and the operating system stops the program immediately and reports a segmentation fault.
If you're unlucky the program carries on blithely reading from, and writing to, element 141, whatever the heck it is.
I speculate that your colleague has not implemented array-bounds checking for his version of the code. It's up to you whether or not you tell him his code's (very probably) broken.
So what do you do about it? You debug the program. You can do this in a variety of ways, the easiest of which is (in my opinion) to insert some write statements to print out variable values at critical points in the code, to test your assumptions about what values they might, can, or actually do take. More difficult, but worth the initial effort in terms of future problem-solving, would be to run the code under the control of a debugger. There are several good debuggers available for Fortran programs on all major platforms.
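The write-statement approach might look something like the sketch below. Everything except the array name and its length is hypothetical: istep and irec stand in for whatever time-step counter and record index your model really uses, and the loop range is chosen only so that the index crosses the upper bound.

```fortran
program write_debug_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), allocatable :: timevals(:)
  integer :: nts, istep, irec

  nts = 140                        ! assumed: length reported in the error message
  allocate (timevals(nts))
  timevals = 0.0_dp

  ! Hypothetical time loop running past the end of the data on purpose.
  do istep = 137, 142
    irec = istep                   ! stand-in for however the model derives the index

    ! Print the index and the array size just before the access in question.
    write (*, '(a,i0,a,i0,a,i0)') 'step=', istep, ' irec=', irec, &
                                  ' size(timevals)=', size(timevals)

    ! Test the assumption explicitly instead of touching the array.
    if (irec < lbound(timevals, 1) .or. irec > ubound(timevals, 1)) then
      write (*, '(a)') '  index is outside the array bounds; stopping here'
      exit
    end if
  end do

  deallocate (timevals)
end program write_debug_demo
```

In the real code, the write statement would go immediately before the line that the run-time error points to, so the last line printed before the crash tells you which index value was out of range and at which step.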
Source: https://stackoverflow.com/questions/30627954/time-array-out-of-bounds-in-modelling