Efficient way to sum measurements / time series by given interval in PHP

Posted by 烈酒焚心 on 2019-12-08 05:50:44

Question


I have a series of measurements (a time series) recorded at a fixed interval of 15 minutes. For a given period (e.g. one day, the current week, a month, a year, (...)), I need to sum the values by hour, day, month, (...).

For example: sum all values of the last month, grouped by day.

My approach is to first generate a temporary array with one entry per interval in the period. Here it is in PHP (PHP is not a requirement; I would prefer Python or JavaScript if either offers a faster method):

$this->tempArray = array(
'2014-10-01T00:00:00+0100' => array(),
'2014-10-02T00:00:00+0100' => array(),
'2014-10-03T00:00:00+0100' => array(),
'2014-10-04T00:00:00+0100' => array(),
(...)
'2014-10-31T00:00:00+0100' => array()
);
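For comparison, the same first step can be sketched in Python. This is only an illustration: the helper name `day_buckets` is hypothetical, and the fixed +01:00 offset is an assumption read off the keys above (it ignores daylight saving changes).

```python
from datetime import datetime, timedelta, timezone

def day_buckets(start, days):
    """Return a dict with one empty list per day, keyed like the PHP array."""
    return {
        (start + timedelta(days=i)).strftime("%Y-%m-%dT%H:%M:%S%z"): []
        for i in range(days)
    }

# Fixed UTC+1 offset, assumed from the "+0100" suffix of the keys
tz = timezone(timedelta(hours=1))
buckets = day_buckets(datetime(2014, 10, 1, tzinfo=tz), 31)
```

Python dicts keep insertion order, so iterating over `buckets` yields the days chronologically.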

In the second step, I loop through each date/value pair (in this example 4 × 24 × 31 = 2,976 pairs, 96 per day) and assign it to the temporary array. For each date, I overwrite parts of the datetime, in this example the hour and the minutes, so that it matches a key in the temp array.

$insert = array(
    'datetime' => $datetime,
    'value' => $value
);

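if ($interval == "d") {
    // Zero the whole time part so the key matches the day keys in the temp array
    // ($datetime is assumed to be a Unix timestamp, which is what date() expects)
    $this->tempArray[date('Y-m-d\T00:00:00O', $datetime)][] = $insert;
}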

In the last step, I loop through the temp array and sum each sub-array. The result is an array of 31 new date/value pairs, one per day. This works fine. However, is there a faster or more efficient way? This approach takes nearly 0.5 seconds for one month. (If anyone is interested in the source code, I will add a gist.) The data is stored in a MySQL database with 15 million entries.
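As a point of comparison, steps two and three can be folded into a single pass that adds each value to a running total per day instead of collecting arrays first. A minimal sketch in Python, assuming the readings arrive as (Unix timestamp, value) pairs; `sum_by_day` and the sample data are hypothetical:

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

def sum_by_day(readings, tz):
    """Sum (unix_timestamp, value) pairs per calendar day in the given timezone."""
    totals = defaultdict(float)
    for ts, value in readings:
        # Truncate the timestamp to its calendar day and use that as the key
        day = datetime.fromtimestamp(ts, tz).strftime("%Y-%m-%d")
        totals[day] += value
    return dict(totals)

tz = timezone(timedelta(hours=1))  # assumed fixed UTC+1 offset
start = int(datetime(2014, 10, 1, tzinfo=tz).timestamp())
# Two days of 15-minute readings (96 per day), each with value 1.0
readings = [(start + i * 900, 1.0) for i in range(2 * 96)]
totals = sum_by_day(readings, tz)
```

Each reading is touched once and no intermediate arrays are kept, so memory use grows with the number of days rather than the number of readings.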

// Edit: I think the best way is to do the grouping in MySQL.

My current SQL query to fetch the data for one year:

SELECT
    FROM_UNIXTIME(PointOfTime) AS `date`,
    value
FROM data
WHERE EnergyMeterId = '0ca64479-bddf-4b91-9e35-bf81f4bfa84c'
    AND PointOfTime >= UNIX_TIMESTAMP('2013-01-01T00:00:00')
    AND PointOfTime <= UNIX_TIMESTAMP('2013-12-31T23:45:00')
ORDER BY `date` ASC;

Answer 1:


If the data lives in MySQL, then that is where I would implement the solution. MySQL's date/time functions make this kind of aggregation straightforward. Let's take a simple example, assuming a table structure like this:

id:  autoincrement primary key
your_datetime: datetime or timestamp field
the_data: the data items you are trying to summarize

A query to summarize by day (most recent first) would look like this:

SELECT
    DATE(your_datetime) as `day`,
    SUM(the_data) as `data_sum`
FROM table
GROUP BY `day`
ORDER BY `day` DESC
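To try this grouping without a MySQL server, the same query shape can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for MySQL here, the table name `readings` and the sample rows are made up, and SQLite's `date()` plays the role of MySQL's `DATE()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "id INTEGER PRIMARY KEY, your_datetime TEXT, the_data REAL)"
)
conn.executemany(
    "INSERT INTO readings (your_datetime, the_data) VALUES (?, ?)",
    [
        ("2014-08-01 00:00:00", 1.5),
        ("2014-08-01 00:15:00", 2.5),
        ("2014-08-02 00:00:00", 4.0),
    ],
)
# Group by calendar day and sum the values, most recent day first
rows = conn.execute(
    """
    SELECT date(your_datetime) AS day, SUM(the_data) AS data_sum
    FROM readings
    GROUP BY day
    ORDER BY day DESC
    """
).fetchall()
conn.close()
```

The database returns one row per day, so the application only has to handle the aggregated result rather than every 15-minute reading.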

If you want to limit it to some period of time (the last 7 days, for example), you can simply add a WHERE condition:

SELECT
    DATE(your_datetime) as `day`,
    SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY `day`
ORDER BY `day` DESC

Here is another example, where you specify a range of datetimes:

SELECT
    DATE(your_datetime) as `day`,
    SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`
ORDER BY `day` DESC

Sum by hour:

SELECT
    DATE(your_datetime) as `day`,
    HOUR(your_datetime) as `hour`,
    SUM(the_data) as `data_sum`
FROM table
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`, `hour`
ORDER BY `day` DESC, `hour` DESC

Sum by month:

SELECT
    YEAR(your_datetime) as `year`,
    MONTH(your_datetime) as `month`,
    SUM(the_data) as `data_sum`
FROM table
GROUP BY `year`, `month`
ORDER BY `year` DESC, `month` DESC
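The hour, day, and month queries differ only in the expressions they group by, so application code can pick those expressions from an interval code. A rough sketch in Python of how such a query string might be assembled; the interval codes "h", "d", "m", the function name `build_sum_query`, and the table name `your_table` are all assumptions made for illustration:

```python
# Grouping expressions per interval code (the codes are hypothetical)
GROUP_EXPRESSIONS = {
    "h": ["DATE(your_datetime)", "HOUR(your_datetime)"],
    "d": ["DATE(your_datetime)"],
    "m": ["YEAR(your_datetime)", "MONTH(your_datetime)"],
}

def build_sum_query(interval, table="your_table"):
    """Return a MySQL query string that sums the_data per interval.

    The two %s placeholders are meant to be bound to the start and end
    of the period by the database driver, not filled in by hand.
    The table name is interpolated directly, so it must come from
    trusted code, never from user input.
    """
    columns = ", ".join(GROUP_EXPRESSIONS[interval])  # KeyError if unknown
    return (
        f"SELECT {columns}, SUM(the_data) AS data_sum "
        f"FROM {table} "
        "WHERE your_datetime BETWEEN %s AND %s "
        f"GROUP BY {columns} "
        f"ORDER BY {columns}"
    )

query = build_sum_query("m")
```

Only the fixed expressions from the lookup table end up in the SQL text; the period boundaries stay as bound parameters.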

Here is a reference to the MySQL Date/Time functions:

http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_date-sub



Source: https://stackoverflow.com/questions/26240718/efficient-way-to-sum-measurements-time-series-by-given-interval-in-php
