Select COUNT from different conditions on a table

Question:

I have a table named 'jobs'. For a particular user, a job can be active, archived, overdue, pending, or closed. Right now every page request generates 5 COUNT queries, and in an attempt at optimization I'm trying to reduce this to a single query. This is what I have so far, but it is barely faster than the 5 individual queries. Note that I've simplified the conditions for each subquery to make them easier to understand; the full query behaves the same way.

Is there a way to get these 5 counts in the same query without using the inefficient subqueries?

SELECT
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.status_id NOT IN (8,3,11) /* 8,3,11 being 'inactive' related statuses */
  ) AS active_count, 
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date < '2011-06-14' AND
      jobs.status_id NOT IN(8,11,5,3) /* Grabs the overdue active jobs
                                      ('5' means completed successfully) */
  ) AS overdue_count,
  (SELECT count(*)
    FROM "jobs"
    WHERE
      jobs.creator_id = 5 AND
      jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
  ) AS due_today_count

This goes on for 2 more subqueries, but I think you get the idea.

Is there an easier way to collect this data, since it's basically 5 different COUNTs off the same subset of data from the jobs table?

The subset of data is 'creator_id = 5'; after that, each count is basically just 1-2 additional conditions. Note that right now we're using Postgres but may be moving to MySQL in the near future, so if you can provide an ANSI-compatible solution I'd be grateful :)

Answer:

This is the typical solution. Use a CASE expression to break out the different conditions: a record that meets a condition gets a 1, otherwise a 0. Then do a SUM on the values.

  SELECT
    SUM(active_count) AS active_count,
    SUM(overdue_count) AS overdue_count,
    SUM(due_today_count) AS due_today_count
  FROM
  (
    /* One row per job for this creator, with a 1/0 flag per condition */
    SELECT
      CASE WHEN jobs.status_id NOT IN (8,3,11) THEN 1 ELSE 0 END AS active_count,
      CASE WHEN jobs.due_date < '2011-06-14' AND jobs.status_id NOT IN (8,11,5,3) THEN 1 ELSE 0 END AS overdue_count,
      CASE WHEN jobs.due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000' THEN 1 ELSE 0 END AS due_today_count
    FROM "jobs"
    WHERE
      jobs.creator_id = 5
  ) t
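
As a sketch of the same technique, the CASE expressions can also be placed directly inside the aggregates, which avoids the derived table. This assumes the simplified conditions from the question and an unquoted table name (double-quoted identifiers are not accepted by MySQL unless ANSI_QUOTES is enabled). Like the nested form, it returns NULLs when no rows match the WHERE clause.

```sql
-- Conditional aggregation in a single pass over the creator's jobs.
SELECT
  SUM(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 ELSE 0 END) AS active_count,
  SUM(CASE WHEN due_date < '2011-06-14'
            AND status_id NOT IN (8, 11, 5, 3) THEN 1 ELSE 0 END) AS overdue_count,
  SUM(CASE WHEN due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
           THEN 1 ELSE 0 END) AS due_today_count
FROM jobs
WHERE creator_id = 5;
```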

UPDATE: As noted, when 0 records are returned as t, the result is a single row with NULL in every column. You have three options:

1) Add a HAVING clause so that no records are returned rather than a result of all NULLs

   HAVING SUM(active_count) is not null

2) If you want all zeros returned, then you could add COALESCE to all your sums

e.g.

 SELECT
      COALESCE(SUM(active_count), 0) AS active_count,
      COALESCE(SUM(overdue_count), 0) AS overdue_count,
      COALESCE(SUM(due_today_count), 0) AS due_today_count

3) Take advantage of the fact that COUNT(NULL) = 0, as sbarro demonstrated. Note that the non-NULL value can be anything; it doesn't have to be a 1

e.g.

 SELECT
      COUNT(CASE WHEN
            jobs.status_id NOT IN (8,3,11) THEN 'Manticores Rock' ELSE NULL
       END) AS active_count
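
Putting option 3 together for all three counts might look like the sketch below. It assumes the simplified conditions from the question and an unquoted table name; the ELSE branch is omitted because a CASE with no matching branch already yields NULL. Since COUNT over zero rows is 0, every column comes back as 0 rather than NULL when the creator has no jobs.

```sql
-- COUNT ignores NULLs, so only rows where the CASE yields a value are counted.
SELECT
  COUNT(CASE WHEN status_id NOT IN (8, 3, 11) THEN 1 END) AS active_count,
  COUNT(CASE WHEN due_date < '2011-06-14'
              AND status_id NOT IN (8, 11, 5, 3) THEN 1 END) AS overdue_count,
  COUNT(CASE WHEN due_date BETWEEN '2011-06-14' AND '2011-06-15 06:00:00.000000'
             THEN 1 END) AS due_today_count
FROM jobs
WHERE creator_id = 5;
```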
