Oracle SQL查询按日期对连续记录进行分组

问题描述:

使用以下示例数据,我试图以相同的速率对记录进行分组.

With the below sample data, I am trying to group record with same rate.

   id   start_date              end_date                rate
  -----------------------------------------------------------------

    1   01/01/2017 12:00:00 am  01/01/2017 12:00:00 am  300
    1   02/01/2017 12:00:00 am  02/01/2017 12:00:00 am  300
    1   03/01/2017 12:00:00 am  03/01/2017 12:00:00 am  300
    1   04/01/2017 12:00:00 am  04/01/2017 12:00:00 am  1000
    1   05/01/2017 12:00:00 am  05/01/2017 12:00:00 am  500
    1   06/01/2017 12:00:00 am  06/01/2017 12:00:00 am  500
    1   07/01/2017 12:00:00 am  07/01/2017 12:00:00 am  1000
    1   08/01/2017 12:00:00 am  08/01/2017 12:00:00 am  1000
    1   09/01/2017 12:00:00 am  09/01/2017 12:00:00 am  300

我尝试过的:

select distinct id, mn_date, mx_date,rate
from (
    select id, min(start_date) over (partition by grp order by start_date) mn_date,
    max(end_date) over(partition by grp order by start_date desc) mx_date, rate
    from (
    select t.*, row_number() over(partition by id order by start_date) -row_number() over(partition by rate order by start_date)grp
    from t
        )
    )
order by mn_date;

输出:

id  mn_date                 mx_date                 rate
--------------------------------------------------------
1   01/01/2017 12:00:00 am  03/01/2017 12:00:00 am  300
1   04/01/2017 12:00:00 am  04/01/2017 12:00:00 am  1000
1   05/01/2017 12:00:00 am  06/01/2017 12:00:00 am  500
1   07/01/2017 12:00:00 am  09/01/2017 12:00:00 am  300
1   07/01/2017 12:00:00 am  09/01/2017 12:00:00 am  1000

所需的输出:

id  mn_date                 mx_date                 rate
--------------------------------------------------------
1   01/01/2017 12:00:00 am  03/01/2017 12:00:00 am 300
1   04/01/2017 12:00:00 am  04/01/2017 12:00:00 am 1000
1   05/01/2017 12:00:00 am  06/01/2017 12:00:00 am 500
1   07/01/2017 12:00:00 am  08/01/2017 12:00:00 am 1000
1   09/01/2017 12:00:00 am  09/01/2017 12:00:00 am 300

按连续日期分组的最终结果:(感谢Gordon)

Final result to group by consecutive dates: (Thanks to Gordon )

select id, min(start_date), max(end_date), rate
from (
select id, start_date, end_date, rate, seqnum_i-seqnum_ir grp, sum(x) over(partition by id order by start_date) grp1
from (
select t.*,
             row_number() over (partition by id order by start_date) as seqnum_i,
             row_number() over (partition by id, rate order by start_date) as seqnum_ir,
             case when LEAD(start_date) over (partition by id order by start_date)= end_date + 1 
             then 0 
             else 1
             end x
from t
)
)
group by id, grp+grp1, rate
order by min(start_date);

假定我们可以仅使用start_date来标识相邻记录(即没有空格),则可以使用行数差法:

Assuming we can just use start_date to identify the adjacent records (i.e., there are no gaps), then you can use the difference of row numbers approach:

select id, min(start_date) as mn_date, max(end_date) as mx_date, rate
from (select t.*,
             row_number() over (partition by id order by start_date) as seqnum_i,
             row_number() over (partition by id, rate order by start_date) as seqnum_ir
      from t
     ) t
group by id (seqnum_i - seqnum_ir), rate;

要查看其工作原理,请查看子查询的结果.您应该能够看到"两个行号的差值如何以相同的速率定义相邻记录的组.

To see how this works, look at the results of the subquery. You should be able to "see" how the difference of the two row numbers defines the groups of adjacent records with the same rate.