How to do GROUP BY queries with annotate and aggregate in Django's ORM

Problem description:

I haven't really grokked how to translate GROUP BY and HAVING to Django's QuerySet.annotate and QuerySet.aggregate. I'm trying to translate this SQL query into ORM speak:

SELECT EXTRACT(year FROM pub_date) as year, EXTRACT(month from pub_date) as month, COUNT(*) as article_count FROM articles_article GROUP BY year,month;

Which outputs:

[(2008.0, 10.0, 1L), # year, month, number of articles
(2009.0, 2.0, 1L),
(2009.0, 7.0, 1L),
(2008.0, 5.0, 3L),
(2008.0, 9.0, 1L),
(2008.0, 7.0, 1L),
(2009.0, 5.0, 1L),
(2008.0, 8.0, 1L),
(2009.0, 12.0, 2L),
(2009.0, 3.0, 1L),
(2007.0, 12.0, 1L),
(2008.0, 6.0, 1L),
(2009.0, 4.0, 2L),
(2008.0, 3.0, 1L)]

My Django model:

from django.db import models
from django.utils.translation import gettext_lazy as _

class Article(models.Model):
    title = models.CharField(max_length=150, verbose_name=_("title"))
    # ... more
    pub_date = models.DateTimeField(verbose_name=_('publishing date'))

This project should run on a couple of different database systems, so I'm trying to stay away from raw SQL as much as possible.

I think that to do it in one query you might have to store month and year as separate fields...

Article.objects.values('pub_date').annotate(article_count=Count('title'))

That would group by pub_date, that is, by the full timestamp rather than by year and month. But I can't think of a way to do the equivalent of the EXTRACT function clause inline there.
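
To make the difference concrete, here is a small plain-Python sketch (no Django involved) of what the two groupings compute. The sample dates are made up for illustration; they stand in for Article.pub_date values.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data standing in for Article.pub_date values.
pub_dates = [
    datetime(2008, 5, 1, 9, 0),
    datetime(2008, 5, 17, 14, 30),
    datetime(2008, 5, 17, 18, 45),
    datetime(2009, 12, 24, 8, 15),
]

# Grouping on the raw timestamp: every distinct value is its own group.
by_timestamp = Counter(pub_dates)

# Grouping on (year, month): what the SQL query in the question asks for.
by_year_month = Counter((d.year, d.month) for d in pub_dates)

print(len(by_timestamp))              # number of groups when grouping by pub_date
print(sorted(by_year_month.items()))  # article counts per (year, month)
```

With timestamps that differ by hours or minutes, grouping on pub_date itself yields roughly one group per article, which is why the year and month have to be extracted first.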

If your model were:

class Article(models.Model):
    title = models.CharField(max_length=150, verbose_name=_("title"))
    # ... more 
    pub_date = models.DateTimeField(verbose_name=_('publishing date'))
    pub_year = models.IntegerField()
    pub_month = models.IntegerField()

Then you could do:

Article.objects.values('pub_year', 'pub_month').annotate(article_count=Count('title'))

If you are going to do this, I would recommend having pub_year and pub_month populated automatically by overriding the save() method of Article and extracting the values from pub_date.
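
A minimal sketch of that save() override, written as a mixin so it can run here without a database. FakeModel is a hypothetical stand-in for django.db.models.Model, and PubDatePartsMixin is a name made up for this example; in a real project the class would read class Article(PubDatePartsMixin, models.Model) with the fields declared as above.

```python
from datetime import datetime


class PubDatePartsMixin:
    """Copy the year and month of pub_date into their own fields on save."""

    def save(self, *args, **kwargs):
        self.pub_year = self.pub_date.year
        self.pub_month = self.pub_date.month
        super().save(*args, **kwargs)


class FakeModel:
    """Hypothetical stand-in for django.db.models.Model (no database needed)."""

    saved = False

    def save(self, *args, **kwargs):
        # A real model would write the row to the database here.
        self.saved = True


class Article(PubDatePartsMixin, FakeModel):
    def __init__(self, title, pub_date):
        self.title = title
        self.pub_date = pub_date


article = Article("Hello", datetime(2008, 5, 17, 14, 30))
article.save()
print(article.pub_year, article.pub_month)
```

One caveat with this approach: QuerySet.update() and bulk_create() do not call save(), so rows written that way would need pub_year and pub_month set explicitly.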

EDIT

One way to do it is to use extra, but it won't give you database independence...

Article.objects.extra(select={'year': "EXTRACT(year FROM pub_date)", 'month': "EXTRACT(month FROM pub_date)"}).values('year', 'month').annotate(Count('title'))

While I think this will work (untested), it will require you to modify the extra fields if you ever change database servers. For instance, in SQL Server you would write year(pub_date) instead of extract(year from pub_date)...

This might not be so bad if you put it in a custom model manager that you prominently tag as requiring such database-engine-dependent changes.
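
As a rough sketch of that idea, the engine-specific SQL can live in one clearly marked table that a custom manager method reads from. Everything here is an assumption for illustration: the dialect labels "extract" and "sqlserver", the name YEAR_MONTH_SQL, and the helper year_month_select are all made up, and only the two SQL spellings mentioned above are covered.

```python
# DATABASE-ENGINE-DEPENDENT: extend this table when adding a new database server.
# The keys are hypothetical labels for this sketch, not Django settings values.
YEAR_MONTH_SQL = {
    "extract": {
        "year": "EXTRACT(year FROM pub_date)",
        "month": "EXTRACT(month FROM pub_date)",
    },
    "sqlserver": {
        "year": "YEAR(pub_date)",
        "month": "MONTH(pub_date)",
    },
}


def year_month_select(dialect):
    """Return the dict to pass as extra(select=...) for the given dialect."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return dict(YEAR_MONTH_SQL[dialect])
    except KeyError:
        raise NotImplementedError("no year/month SQL defined for %r" % dialect)


print(year_month_select("sqlserver"))
```

A manager method would then pass the returned dict to extra(select=...) before calling values('year', 'month') and annotate(...), so a change of database server touches only this one table.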