Counting values by day/hour with timeseries in MATLAB

死守一世寂寞 2020-12-12 00:21

So, I'm beginning to use timeseries in MATLAB and I'm kinda stuck.

I have a list of timestamps of events which I imported into MATLAB. It's now a 3000x25 array wh

3 Answers
  • 2020-12-12 01:18

    As others have suggested, you should convert the string dates to serial date numbers. This makes it easy to work with the numeric data.

    An efficient way to count the number of events per interval (days, hours, minutes, etc.) is to use functions like HISTC and ACCUMARRAY. The process involves converting the serial dates into the units and format those functions require: ACCUMARRAY needs positive integer indices, whereas HISTC needs the bin edges that specify the ranges.

    Here is a vectorized solution (no loops) that uses ACCUMARRAY to count the number of events. This function is very efficient, even on large inputs. At the beginning I generate sample data of 5000 timestamps unevenly spaced over a period of 4 days. You will obviously want to replace it with your own:

    %# generate some random timestamps between two points (unevenly spaced)
    %# 5000 timestamps over a period of 4 days
    dStart = datenum('2000-01-01');     % inclusive
    dEnd = datenum('2000-01-05');       % exclusive
    t = sort(dStart + (dEnd-dStart).*rand(5000,1));
    %#disp( datestr(t) )
    
    %# shift values, by using dStart as reference point
    dRange = (dEnd-dStart);
    tt = t - dStart;
    
    %# number of events by day/hour/minute
    numEventsDays = accumarray(fix(tt)+1, 1, [dRange*1 1]);
    numEventsHours = accumarray(fix(tt*24)+1, 1, [dRange*24 1]);
    numEventsMinutes = accumarray(fix(tt*24*60)+1, 1, [dRange*24*60 1]);
    
    %# corresponding date/time label for the start of each interval
    %# (named *Labels so they do not shadow the built-in days/hours/minutes functions)
    dayLabels = cellstr(datestr(dStart:1:dEnd-1));
    hourLabels = cellstr(datestr(dStart:1/24:dEnd-1/24));
    minuteLabels = cellstr(datestr(dStart:1/24/60:dEnd-1/24/60));
    
    %# display results
    [dayLabels num2cell(numEventsDays)]
    [hourLabels num2cell(numEventsHours)]
    [minuteLabels num2cell(numEventsMinutes)]
    

    Here is the output for the number of events per day:

    '01-Jan-2000'    [1271]
    '02-Jan-2000'    [1258]
    '03-Jan-2000'    [1243]
    '04-Jan-2000'    [1228]
    

    And an extract of the number of events per hour:

    '02-Jan-2000 09:00:00'    [50]
    '02-Jan-2000 10:00:00'    [54]
    '02-Jan-2000 11:00:00'    [53]
    '02-Jan-2000 12:00:00'    [74]
    '02-Jan-2000 13:00:00'    [49]
    '02-Jan-2000 14:00:00'    [59]
    

    Similarly for minutes:

    '03-Jan-2000 08:54:00'    [1]
    '03-Jan-2000 08:55:00'    [1]
    '03-Jan-2000 08:56:00'    [1]
    '03-Jan-2000 08:57:00'    [0]
    '03-Jan-2000 08:58:00'    [0]
    '03-Jan-2000 08:59:00'    [0]
    '03-Jan-2000 09:00:00'    [1]
    '03-Jan-2000 09:01:00'    [2]
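
    The bin-edge approach mentioned for HISTC can be sketched in the same way. The snippet below is only an illustration, not part of the solution above: it assumes MATLAB R2014b or later, where histcounts is available as the replacement for HISTC, and it regenerates hypothetical random sample data.

    ```matlab
    % Hypothetical sample data: 5000 serial date numbers over 4 days
    dStart = datenum('2000-01-01');   % inclusive
    dEnd   = datenum('2000-01-05');   % exclusive
    t = dStart + (dEnd - dStart) .* rand(5000, 1);

    % Bin edges: one edge per day/hour boundary, including the final edge
    dayEdges  = dStart:1:dEnd;
    hourEdges = linspace(dStart, dEnd, (dEnd - dStart)*24 + 1);

    % histcounts returns numel(edges)-1 counts, one per interval
    countsPerDay  = histcounts(t, dayEdges);
    countsPerHour = histcounts(t, hourEdges);

    disp(countsPerDay)
    ```

    Unlike the ACCUMARRAY version, no shifting or scaling of the timestamps is needed here; the edges carry that information.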
    
  • 2020-12-12 01:21

    You can convert those timestamps to a number with datenum:

    A serial date number represents the whole and fractional number of days from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns the number 1. (The year 0000 is merely a reference point and is not intended to be interpreted as a real year in time.)

    This way, it's easier to check where a period starts and ends. E.g., the week you're looking for starts at x and ends at x+6.999...; all you have to do to find events in that period is check whether the datenum value is at least x and less than x+7:

    week_x_events = find(dn_timestamp>=x & dn_timestamp<x+7)
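
    A small self-contained sketch of this kind of range check, assuming a week means a 7-day window [x, x+7). The data and the value of x are hypothetical, and a logical mask is used in place of find:

    ```matlab
    % Hypothetical data: one event every 6 hours for 10 days
    dn_timestamp = datenum('2000-01-01') + (0:0.25:9.75)';
    x = datenum('2000-01-02');          % assumed start of the week of interest

    % Logical mask: true for events inside the half-open window [x, x+7)
    inWeek = dn_timestamp >= x & dn_timestamp < x + 7;

    week_x_events = dn_timestamp(inWeek);   % the matching serial dates
    numInWeek = nnz(inWeek);                % how many events fall in the week
    ```

    With four events per day, numInWeek comes out as 28 for this sample.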
    

    The difficulty is in converting your timestamps to a format that datenum accepts, which is doable using regexp. Good luck!

  • 2020-12-12 01:26

    The +00:00 suffix is most likely a UTC offset (time zone). Either way, you can simply convert your string timestamps into numerical format:

    >> t = datenum('2000-01-01T00:01:04+00:00', 'yyyy-mm-ddTHH:MM:SS')
    
    t =
    
      7.3049e+005
    
    >> datestr(t)
    
    ans =
    
    01-Jan-2000 00:01:04
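
    For a whole list of timestamps, the conversion can be applied to a cell array and followed by a count per hour. This is a minimal sketch: the sample strings are hypothetical, and stripping the offset with regexprep is an assumption that every timestamp ends in a +HH:MM or -HH:MM suffix that can be ignored.

    ```matlab
    % Hypothetical timestamps in the format shown above
    stamps = {'2000-01-01T00:01:04+00:00'
              '2000-01-01T00:59:59+00:00'
              '2000-01-01T02:30:00+00:00'
              '2000-01-02T02:15:10+00:00'};

    % Drop the trailing UTC offset so only 'yyyy-mm-ddTHH:MM:SS' remains
    clean = regexprep(stamps, '[+-]\d{2}:\d{2}$', '');

    % One serial date number per timestamp
    t = datenum(clean, 'yyyy-mm-ddTHH:MM:SS');

    % Hour index counted from midnight of the first day (1-based)
    hourIdx = floor((t - floor(min(t))) * 24) + 1;

    % Number of events in each hour; hours without events get 0
    countsPerHour = accumarray(hourIdx, 1);
    ```

    Here countsPerHour(1) covers 00:00 to 01:00 on the first day, countsPerHour(2) the following hour, and so on.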
    