MapReduce Service (MRS) - Connecting FlinkServer Jobs to Hudi Tables: Notes on Using FlinkSQL Lookup Join with Hudi
Time: 2025-06-10 14:43:26
Notes on Using FlinkSQL Lookup Join with Hudi
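Applies to MRS 3.5.0 and later versions.
- Use the lookup.join.cache.ttl parameter to control how often the dimension table data is reloaded. The default value is 60min.
- Hudi dimension table data is loaded into the Flink TaskManager heap, so Hudi tables with more than 100,000 rows are not recommended as dimension tables.
- Rows added to or updated in the dimension table take part in the computation only after the next reload cycle has loaded them.
The following is an SQL example: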
CREATE TABLE hudimor (
  uuid VARCHAR(20),
  name VARCHAR(10),
  age INT,
  ts INT,
  `p` VARCHAR(20),
  PRIMARY KEY (uuid) NOT ENFORCED
) PARTITIONED BY (`p`) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs://hacluster/tmp/hudimor',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.datasource.write.recordkey.field' = 'uuid',
  'write.precombine.field' = 'ts',
  'lookup.join.cache.ttl' = '60min'
);

CREATE TABLE datagen (
  uuid VARCHAR(20),
  proctime AS PROCTIME()
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1'
);

CREATE TABLE blackhole (
  uuid VARCHAR(20),
  name VARCHAR(10),
  age INT,
  ts INT,
  `p` VARCHAR(20)
) WITH (
  'connector' = 'blackhole'
);

INSERT INTO blackhole
SELECT
  t1.uuid AS uuid,
  t2.name AS name,
  t2.age AS age,
  t2.ts AS ts,
  t2.p AS p
FROM datagen AS t1
LEFT JOIN hudimor FOR SYSTEM_TIME AS OF t1.proctime AS t2
  ON t1.uuid = t2.uuid;
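
The reload period in the example is fixed in the dimension table definition. As a minimal sketch, assuming the dimension data changes more often than once an hour and the table is small enough to reload cheaply, the same Hudi path could be declared with a shorter lookup.join.cache.ttl. The table name hudimor_10min and the value 10min are illustrative assumptions, not recommended settings. The final count query is only a rough way to compare the table against the 100,000-row guideline and assumes an environment that allows ad hoc SELECT statements.

```sql
-- Hypothetical variant of the hudimor dimension table:
-- same underlying Hudi data, shorter reload period.
CREATE TABLE hudimor_10min (
  uuid VARCHAR(20),
  name VARCHAR(10),
  age INT,
  ts INT,
  `p` VARCHAR(20),
  PRIMARY KEY (uuid) NOT ENFORCED
) PARTITIONED BY (`p`) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs://hacluster/tmp/hudimor',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.datasource.write.recordkey.field' = 'uuid',
  'write.precombine.field' = 'ts',
  -- Assumed value: reload the cached dimension data every 10 minutes instead of 60.
  'lookup.join.cache.ttl' = '10min'
);

-- Rough size check against the 100,000-row guideline
-- (assumes ad hoc queries are allowed in the environment).
SELECT COUNT(*) AS row_cnt FROM hudimor_10min;
```

A shorter reload period reduces how stale the joined dimension values can be, but the full table is reloaded into the TaskManager heap more often, so it suits small tables best.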
support.huaweicloud.com/cmpntguide-lts-mrs/mrs_01_24180.html