Grouping by week, and padding out 'missing' weeks

2018-06-20 13:14:04

In my Django app, I have a very simple model which represents a single occurrence of an event (such as a server alert occurring):

class EventOccurrence(models.Model):
    event = models.ForeignKey(Event)
    time = models.DateTimeField()

My end goal is to produce a table or graph that shows how many times an event occurred over the past n weeks.

So my question has two parts:

How can I group_by the week of the time field?

How can I "pad out" the result of this group_by to add a zero-value for any missing weeks?

For example, for the second part, I'd like to transform a result like this:

| week | count |                   | week | count |
| 2    | 3     |                   | 2    | 3     |
| 3    | 5     |   —— becomes —>   | 3    | 5     |
| 5    | 1     |                   | 4    | 0     |
                                   | 5    | 1     |

What's the best way to do this in Django? General Python solutions are also OK.

Neither Django's DateField nor Python's datetime has a week attribute. To fetch everything in one query, you can use raw SQL (WEEK() is MySQL syntax) and then fill in the missing weeks in Python:

from django.db import connection

# WEEK() is MySQL-specific; the table name is taken from the model's metadata.
query = (
    "SELECT WEEK(`time`) AS `week`, COUNT(*) AS `count` "
    "FROM %s GROUP BY WEEK(`time`) ORDER BY WEEK(`time`)"
    % EventOccurrence._meta.db_table
)
cursor = connection.cursor()
cursor.execute(query)

data = []
results = cursor.fetchall()
for i, row in enumerate(results[:-1]):
    data.append(row)

    # Add a zero-count row for every week missing before the next result.
    week = row[0] + 1
    next_week = results[i + 1][0]
    while week < next_week:
        data.append((week, 0))
        week += 1
if results:
    data.append(results[-1])  # the last row has no successor to compare with

print(data)
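
The gap-filling loop above can also be written as a standalone function that needs no database. This is a minimal sketch, assuming the rows are (week, count) pairs from a single year; `pad_weeks` and its optional bounds are hypothetical names, not part of Django:

```python
def pad_weeks(rows, first_week=None, last_week=None):
    """Return (week, count) tuples for every week in the range, with 0 for gaps.

    rows: iterable of (week, count) pairs, e.g. the result of cursor.fetchall().
    first_week / last_week: optional bounds; default to the weeks seen in rows.
    """
    counts = dict(rows)
    if not counts and (first_week is None or last_week is None):
        return []  # nothing to infer a range from
    start = first_week if first_week is not None else min(counts)
    end = last_week if last_week is not None else max(counts)
    return [(week, counts.get(week, 0)) for week in range(start, end + 1)]


padded = pad_weeks([(2, 3), (3, 5), (5, 1)])
print(padded)  # [(2, 3), (3, 5), (4, 0), (5, 1)]
```

Passing explicit bounds, for example `pad_weeks(rows, first_week=1, last_week=8)`, also pads weeks before the first and after the last event, which the loop above does not do.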

After digging through the Django query API docs, I haven't found a way to run this query through the Django ORM. A raw cursor is a workaround if your database is MySQL:

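from django.db import connection
cursor = connection.cursor()

# Replace EventOccurrence with the real table name
# (see EventOccurrence._meta.db_table).
cursor.execute("""
   select 
      week(time) as `week`, 
      count(*) as `count` 
   from EventOccurrence 
   group by week(time)
   order by 1;""")

# dictfetchall is not built in: it is the helper from Django's raw SQL docs
# that turns each row into a dict, and must be defined in your own code.
myData = dictfetchall(cursor)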
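
`dictfetchall` is not defined in the snippet above; it matches the helper of that name shown in Django's documentation on executing raw SQL. Here is a sketch of that helper, demonstrated against an in-memory `sqlite3` database. The database, the `occurrence` table and its rows are assumptions made only so the example runs without a Django project:

```python
import sqlite3


def dictfetchall(cursor):
    """Return all rows from a DB-API cursor as a list of dicts keyed by column name."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in data: an in-memory table holding precomputed week numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrence (week INTEGER)")
conn.executemany("INSERT INTO occurrence VALUES (?)", [(2,), (2,), (3,), (5,)])

cursor = conn.cursor()
cursor.execute(
    'SELECT week, COUNT(*) AS "count" FROM occurrence GROUP BY week ORDER BY week'
)
my_data = dictfetchall(cursor)
print(my_data)
```

The helper only relies on `cursor.description` and `cursor.fetchall()`, so it should work the same way with the cursor returned by Django's `connection.cursor()`.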

This is, in my opinion, the best-performing solution. But note that it doesn't pad missing weeks.

EDITED: Database-independent solution in Python (slower)

If you are looking for database-independent code, then you should fetch the dates and aggregate them in Python. In that case, the code may look like this:

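# Collect the week numbers from the start date up to today.
import datetime
weeks = set()
d7 = datetime.timedelta(days=7)
iterDay = datetime.date(2012, 1, 1)
while iterDay <= datetime.date.today():
    weeks.add(iterDay.isocalendar()[1])
    iterDay += d7

# Get all event timestamps.
allEvents = EventOccurrence.objects.values_list('time', flat=True)

# Aggregate events by week, starting every known week at zero.
# Note: week numbers repeat each year, so events from different years
# that share a week number are counted together.
result = dict()
for w in weeks:
    result.setdefault(w, 0)

for e in allEvents:
    week = e.isocalendar()[1]
    result[week] = result.get(week, 0) + 1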

(Disclaimer: not tested)
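
The aggregation above keys on the week number alone, which recurs every year. Below is a sketch that keys on the ISO (year, week) pair instead and zero-fills the last n weeks, which is closer to the "past n weeks" goal in the question. `weekly_counts` and its parameters are hypothetical names, and in practice the timestamps would come from something like `values_list('time', flat=True)`:

```python
import datetime
from collections import Counter


def weekly_counts(times, n_weeks, today=None):
    """Count events per ISO (year, week) for the last n_weeks, including empty weeks."""
    today = today or datetime.date.today()
    # Monday of the current ISO week; earlier weeks are found by stepping back 7 days.
    monday = today - datetime.timedelta(days=today.weekday())
    weeks = [
        tuple((monday - datetime.timedelta(weeks=i)).isocalendar()[:2])
        for i in range(n_weeks - 1, -1, -1)
    ]
    counts = Counter(tuple(t.isocalendar()[:2]) for t in times)
    # Events outside the requested window are counted but never reported.
    return [(week, counts.get(week, 0)) for week in weeks]


events = [
    datetime.datetime(2012, 12, 20, 9, 0),
    datetime.datetime(2013, 1, 1, 10, 30),
    datetime.datetime(2013, 1, 2, 23, 15),
]
report = weekly_counts(events, n_weeks=4, today=datetime.date(2013, 1, 9))
print(report)
# [((2012, 51), 1), ((2012, 52), 0), ((2013, 1), 2), ((2013, 2), 0)]
```

Because the keys carry the ISO year, the window can cross a year boundary without week 1 of one year being merged into week 1 of another.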

Since I have to query multiple tables by joining them, I'm using a database view to meet these requirements.

CREATE VIEW my_view
  AS
  SELECT
    *, -- other fields go here
    YEAR(time_field) as year,
    WEEK(time_field) as week
  FROM my_table;

and the model is:

from django.db import models

class MyView(models.Model):
    # other fields go here
    year = models.IntegerField()
    week = models.IntegerField()

    class Meta:
        managed = False  # Django will not create or migrate this table
        db_table = 'my_view'

def query():
    rows = MyView.objects.filter(week__range=[2, 5])
    # handle the rows here
    return rows

After getting the rows from this view, use @danihp's approach to pad the missing ("hole") weeks/months with zeros.

NOTE: this has only been tested with the MySQL backend; I'm not sure whether it works with MS SQL Server or other databases.

Link: http://www.djcxy.com/p/57786.html
