I have a series of measurement data (a time series) recorded at a fixed interval of 15 minutes. For a given period (e.g. one day, the current week, month, year, ...), I need to sum the values by hour, day, month, and so on.
For example: sum all values of the last month, grouped by day.
My approach is to first generate a temporary array with one key per interval in the period. Here it is in PHP (PHP is not a requirement; I would prefer Python or JavaScript if either provides a faster method):
$this->tempArray = array(
'2014-10-01T00:00:00+0100' => array(),
'2014-10-02T00:00:00+0100' => array(),
'2014-10-03T00:00:00+0100' => array(),
'2014-10-04T00:00:00+0100' => array(),
// (...)
'2014-10-31T00:00:00+0100' => array()
);
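Since the question mentions Python as an alternative, the same first step could be sketched there as follows. The function name `day_keys` is hypothetical, and the fixed +01:00 offset is an assumption copied from the example keys; a real deployment with daylight saving time would need a proper time zone.

```python
from datetime import datetime, timedelta, timezone

def day_keys(start, days):
    """Return one key per day, in the same layout as the PHP array keys."""
    return [
        (start + timedelta(days=i)).strftime("%Y-%m-%dT%H:%M:%S%z")
        for i in range(days)
    ]

# Assumption: fixed +01:00 offset, taken from the example keys above
tz = timezone(timedelta(hours=1))
keys = day_keys(datetime(2014, 10, 1, tzinfo=tz), 31)

# One empty list per day, mirroring $this->tempArray
buckets = {key: [] for key in keys}
```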
In the second step, I loop through each date/value pair (in this example 4*24*31 = 2976 pairs, 96 per day) and assign it to the temporary array. For each date, I overwrite parts of the datetime, in this example the hour and the minutes, so that it matches one of the keys in the temporary array.
$insert = array(
'datetime' => $datetime,
'value' => $value
);
if ($interval == "d") {
// Zero out the whole time part (including seconds) so the key matches the day bucket
$this->tempArray[date('Y-m-d\T00:00:00O', $datetime)][] = $insert;
}
In the last step, I loop through the temporary array and sum up each sub-array. As a result, I get an array with 31 new date/value pairs, one per day. This works fine. However, is there a faster or more efficient way? With this approach it takes nearly 0.5 seconds for one month. (If someone is interested in the source code, I will add a gist.) The data is stored in a MySQL database with 15 million rows.
// Edit: I think the best way is to do the grouping in MySQL.
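If the aggregation stays in application code, the three steps can be collapsed into a single pass that adds each value to a running total instead of collecting the pairs first. The following Python sketch illustrates that idea; it is not a measured improvement. The names `BUCKET_FORMATS` and `sum_by_interval`, the interval codes, and the assumption that each row is a `(unix_timestamp, value)` pair are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# strftime patterns that truncate a timestamp to the start of its bucket
# (hypothetical interval codes: h = hour, d = day, m = month)
BUCKET_FORMATS = {
    "h": "%Y-%m-%dT%H:00:00%z",
    "d": "%Y-%m-%dT00:00:00%z",
    "m": "%Y-%m-01T00:00:00%z",
}

def sum_by_interval(rows, interval, tz):
    """Sum (unix_timestamp, value) pairs per bucket in one pass."""
    fmt = BUCKET_FORMATS[interval]
    totals = defaultdict(float)
    for ts, value in rows:
        key = datetime.fromtimestamp(ts, tz).strftime(fmt)
        totals[key] += value  # running total, no intermediate lists
    return dict(totals)

# Sample data (made up): 100 readings of 1.0, 15 minutes (900 s) apart
tz = timezone(timedelta(hours=1))  # assumption: fixed +01:00 offset
start = int(datetime(2014, 10, 1, tzinfo=tz).timestamp())
rows = [(start + i * 900, 1.0) for i in range(100)]
daily = sum_by_interval(rows, "d", tz)
```

With this sample data, the first 96 readings fall on 1 October and the remaining 4 on 2 October, so `daily` holds two keys. Buckets without any reading do not appear in the result at all.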
My current SQL query to fetch data from one year:
SELECT
FROM_UNIXTIME(PointOfTime) as `date`,
value
FROM data
WHERE EnergyMeterId="0ca64479-bddf-4b91-9e35-bf81f4bfa84c"
and PointOfTime >= unix_timestamp('2013-01-01T00:00:00')
and PointOfTime <= unix_timestamp('2013-12-31T23:45:00')
order by `date` asc;
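For this schema, where `PointOfTime` is a Unix timestamp, the grouping could be pushed into MySQL by truncating `FROM_UNIXTIME(PointOfTime)` with `DATE_FORMAT`. The Python sketch below only builds the statement and its parameters. The names `MYSQL_FORMATS` and `grouped_query` and the interval codes are hypothetical; the `%s` placeholders assume a driver with format-style parameters (such as PyMySQL or mysql-connector-python), and the exclusive upper bound is a choice made here rather than something taken from the original query.

```python
# DATE_FORMAT patterns per interval code (hypothetical codes: h, d, m)
MYSQL_FORMATS = {
    "h": "%Y-%m-%d %H:00:00",
    "d": "%Y-%m-%d",
    "m": "%Y-%m",
}

# The format pattern is passed as a parameter too, so the statement itself
# contains no literal percent signs that the driver would need escaped.
QUERY = (
    "SELECT DATE_FORMAT(FROM_UNIXTIME(PointOfTime), %s) AS `bucket`, "
    "SUM(value) AS `total` "
    "FROM data "
    "WHERE EnergyMeterId = %s "
    "AND PointOfTime >= UNIX_TIMESTAMP(%s) "
    "AND PointOfTime < UNIX_TIMESTAMP(%s) "
    "GROUP BY `bucket` ORDER BY `bucket`"
)

def grouped_query(interval, meter_id, start, end):
    """Return (sql, params) for a sum grouped by hour, day, or month."""
    return QUERY, (MYSQL_FORMATS[interval], meter_id, start, end)

sql, params = grouped_query(
    "d",
    "0ca64479-bddf-4b91-9e35-bf81f4bfa84c",
    "2013-01-01 00:00:00",
    "2014-01-01 00:00:00",  # exclusive upper bound
)
# With a connected cursor this would run as: cursor.execute(sql, params)
```

Note that `FROM_UNIXTIME` converts using the session time zone, so the day boundaries depend on how the connection is configured.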
If the data lies in MySQL, then that is where I would implement my solution. It is trivial to use various MySQL date/time functions to aggregate this data. Let's take a simplistic example assuming a table structure like this:
id: autoincrement primary key
your_datetime: datetime or timestamp field
the_data: the data items you are trying to summarize
A query to summarize by day (most recent first) would look like this:
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM `table`
GROUP BY `day`
ORDER BY `day` DESC
If you wanted to limit it by some period of time (last 7 days for example) you can simply add a where condition
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM `table`
WHERE your_datetime > DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY `day`
ORDER BY `day` DESC
Here is another example where you specify a range of datetimes
SELECT
DATE(your_datetime) as `day`,
SUM(the_data) as `data_sum`
FROM `table`
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`
ORDER BY `day` DESC
Sum by hour:
SELECT
DATE(your_datetime) as `day`,
HOUR(your_datetime) as `hour`,
SUM(the_data) as `data_sum`
FROM `table`
WHERE your_datetime BETWEEN '2014-08-01 00:00:00' AND '2014-08-31 23:59:59'
GROUP BY `day`, `hour`
ORDER BY `day` DESC, `hour` DESC
Sum by month:
SELECT
YEAR(your_datetime) as `year`,
MONTH(your_datetime) as `month`,
SUM(the_data) as `data_sum`
FROM `table`
GROUP BY `year`, `month`
ORDER BY `year` DESC, `month` DESC
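One difference from the temporary-array approach in the question: GROUP BY only returns groups that contain at least one row, so a day without measurements is simply missing from the result. If every day of the period must appear, the gaps can be filled on the application side. A minimal Python sketch, assuming the driver returns the `day` column as `datetime.date` objects; `fill_missing_days` is a hypothetical helper, and it returns the days in ascending order.

```python
from datetime import date, timedelta

def fill_missing_days(rows, first, last):
    """Return (day, sum) for every day from first to last, using 0 for gaps."""
    totals = dict(rows)  # rows: iterable of (date, sum) as fetched from the query
    result = []
    day = first
    while day <= last:
        result.append((day, totals.get(day, 0)))
        day += timedelta(days=1)
    return result

# Made-up query result with no rows for 2 and 4 August
rows = [(date(2014, 8, 1), 10), (date(2014, 8, 3), 5)]
filled = fill_missing_days(rows, date(2014, 8, 1), date(2014, 8, 4))
```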
Here is a reference to the MySQL Date/Time functions:
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_date-sub
Source: https://stackoverflow.com/questions/26240718/efficient-way-to-sum-measurements-time-series-by-given-interval-in-php