问题
horrible title, but let me explain: i've got this django model containing a timestamp (date) and the attribute to log - f.e. the number of users consuming some ressource - (value).
class Viewers(models.Model):
date = models.DateTimeField()
value = models.IntegerField()
for each 10seconds the table contains the number of users. something like this:
| date | value |
|------|-------|
| t1 | 15 |
| t2 | 18 |
| t3 | 27 |
| t4 | 25 |
| .. | .. |
| t30 | 38 |
| t31 | 36 |
| .. | .. |
now i want to generate different statistics from this data, each with another resolution. f.e. for a chart of the last day i don't need the 10second resolution, so i want 5 minute steps (that are build by averaging the values (and maybe also the date) of the rows from t1 to t29, t30 to t59, ...), so that i'll get:
| date | value |
|------|-------|
| t15 | 21 |
| t45 | 32 |
| .. | .. |
the attributes to keep variable are start & end timestamp and the resolution (like 5 minutes). is there a way using the django orm/queryset api and if not, how to reach this with custom sql?
回答1:
I've been trying to solve this problem in the most 'django' way possible. I've settled for the following. It averages the values for 15minute time slots between start_date and end_date where the column name is'date':
readings = Reading.objects.filter(date__range=(start_date, end_date)) \
.extra(select={'date_slice': "FLOOR (EXTRACT (EPOCH FROM date) / '900' )"}) \
.values('date_slice') \
.annotate(value_avg=Avg('value'))
It returns a dictionary:
{'value_avg': 1116.4925373134329, 'date_slice': 1546512.0}
{'value_avg': 1001.2028985507246, 'date_slice': 1546513.0}
{'value_avg': 1180.6285714285714, 'date_slice': 1546514.0}
The core of the idea comes from this answer to the same question for PHP/SQL. The code passed to extra is for a Postgres DB.
回答2:
from django.db.models import Avg
Viewers.objects.filter(date__range=(start_time, end_time)).aggregate(average=Avg('value'))
That will get you the average of all the values
between start_time
and end_time
, returned as a dictionary in the form of { 'average': <the average> }
.
start_time
and end_time
need to be Python datetime objects. So if you have a timestamp, or something, you'll need to convert it first. You can also use datetime.timedelta
to calculate the end_time
based on the start_time. For a five minute resolution, something like this:
from datetime import timedelta
end_time = start_time + timedelta(minutes=5)
回答3:
have you looked at the range filter?
https://docs.djangoproject.com/en/dev/ref/models/querysets/#range
The example given in the doc's seems similar to your situation.
回答4:
After long trying i made it as SQL-statement:
SELECT FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(date))), SUM(value)
FROM `my_table`
WHERE date BETWEEN SUBTIME(NOW( ), '0:30:00') AND NOW()
GROUP BY UNIX_TIMESTAMP(date) DIV 300
ORDER BY date DESC
with
start_time = SUBTIME(NOW( ), '0:30:00')
end_time = NOW()
period = 300 # in seconds
in the end - not really hard - and indeed independent from the time resolution of the samplings in the origin table.
来源:https://stackoverflow.com/questions/6266397/django-aggregation-to-lower-resolution-using-grouping-by-a-date-range