数据湖探索 DLI-窗口:OVER WINDOW

时间:2023-11-10 09:20:50

OVER WINDOW

Over Window与Group Window区别在于Over window每一行都会输出一条记录。

语法格式

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT agg1(attr1) OVER (
  [PARTITION BY partition_name]
  ORDER BY proctime|rowtime 
  ROWS  
 BETWEEN (UNBOUNDED|rowCOUNT) PRECEDING AND CURRENT ROW FROM TABLENAME

SELECT agg1(attr1) OVER (
  [PARTITION BY partition_name]
  ORDER BY proctime|rowtime 
  RANGE  
  BETWEEN (UNBOUNDED|timeInterval) PRECEDING AND CURRENT ROW FROM TABLENAME

语法说明

表5 参数说明

参数

参数说明

PARTITION BY

指定分组的主键,每个分组各自进行计算。

ORDER BY

指定数据按processing time或event time作为时间戳。

ROWS

个数窗口。

RANGE

时间窗口。

注意事项

  • 所有的聚合必须定义到同一个窗口中,即相同的分区、排序和区间。
  • 当前仅支持 PRECEDING (无界或有界) 到 CURRENT ROW 范围内的窗口、FOLLOWING 所描述的区间并未支持。
  • ORDER BY 必须指定于单个的时间属性。

示例

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
// 计算从规则启动到目前为止的计数及总和(in proctime)
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY proctime RANGE UNBOUNDED preceding) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY proctime RANGE UNBOUNDED preceding) as cnt2
    FROM Orders;
  
// 计算最近四条记录的计数及总和(in proctime)
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY proctime ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY proctime ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) as cnt2
    FROM Orders;

// 计算最近60s的计数及总和(in eventtime),基于事件时间处理,事件时间为Orders中的timeattr字段。
insert into temp SELECT name,
    count(amount) OVER (PARTITION BY name ORDER BY timeattr RANGE BETWEEN INTERVAL '60' SECOND PRECEDING AND CURRENT ROW) as cnt1,
    sum(amount) OVER (PARTITION BY name ORDER BY timeattr RANGE BETWEEN INTERVAL '60' SECOND PRECEDING AND CURRENT ROW) as cnt2
    FROM Orders;
support.huaweicloud.com/sqlref-flink-dli/dli_08_0419.html