示例 下面这个例子,你可以看到每个阶段(Stage)的CPU时间消耗,每个计划节点相应的代价。 这个代价是基于现实时间(wall time),而非CPU的相关时间。 对每一个计划节点,都可以看到额外的统计信息,例如每个节点实例的输入平均值,哈希碰撞(hash collisions)的平均次数。这些统计信息对于分析一条SQL语句中的数据异常情况(skewness数据倾斜,abnormal hash collisions)非常有用。 EXPLAIN ANALYZE SELECT count(*),sum(totalprice) FROM new_orders GROUP BY orderstatus;
Query Plan
-------------------------------------------------------------------------------------------------------------------------------
Fragment 1 [HASH]
CPU: 29.19ms, Scheduled: 134.78ms, Input: 2 rows (77B); per task: avg.: 1.00 std.dev.: 1.00, Output: 2 rows (36B)
Output layout: [count, sum]
Output partitioning: SINGLE []
Stage Execution Strategy: UNGROUPED_EXECUTION
Project[]
│ Layout: [count:bigint, sum:double]
│ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
│ CPU: 4.00ms (2.34%), Scheduled: 10.00ms (33.33%), Output: 2 rows (36B)
│ Input avg.: 0.06 rows, Input std.dev.: 387.30%
└─ Aggregate(FINAL)[orderstatus][$hashvalue]
│ Layout: [orderstatus:varchar, $hashvalue:bigint, count:bigint, sum:double]
│ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
│ CPU: 6.00ms (3.51%), Scheduled: 17.00ms (56.67%), Output: 2 rows (77B)
│ Input avg.: 0.06 rows, Input std.dev.: 387.30%
│ count := count("count_9")
│ sum := sum("sum_10")
└─ LocalExchange[HASH][$hashvalue] ("orderstatus")
│ Layout: [orderstatus:varchar, sum_10:double, count_9:bigint, $hashvalue:bigint]
│ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
│ CPU: 2.00ms (1.17%), Scheduled: 3.00ms (10.00%), Output: 2 rows (77B)
│ Input avg.: 0.06 rows, Input std.dev.: 556.78%
└─ RemoteSource[2]
Layout: [orderstatus:varchar, sum_10:double, count_9:bigint, $hashvalue_11:bigint]
CPU: 1.00ms (0.58%), Scheduled: 3.00ms (10.00%), Output: 2 rows (77B)
Input avg.: 0.06 rows, Input std.dev.: 556.78%
Fragment 2 [SOURCE]
CPU: 17.35ms, Scheduled: 80.04ms, Input: 4 rows (81B); per task: avg.: 4.00 std.dev.: 0.00, Output: 2 rows (77B)
Output layout: [orderstatus, sum_10, count_9, $hashvalue_12]
Output partitioning: HASH [orderstatus][$hashvalue_12]
Stage Execution Strategy: UNGROUPED_EXECUTION
Aggregate(PARTIAL)[orderstatus][$hashvalue_12]
│ Layout: [orderstatus:varchar, $hashvalue_12:bigint, sum_10:double, count_9:bigint]
│ CPU: 1.00ms (0.58%), Scheduled: 6.00ms (20.00%), Output: 2 rows (77B)
│ Input avg.: 4.00 rows, Input std.dev.: 0.00%
│ sum_10 := sum("totalprice")
│ count_9 := count(*)
└─ ScanProject[table = hive:default:new_orders, grouped = false]
Layout: [orderstatus:varchar, totalprice:double, $hashvalue_12:bigint]
Estimates: {rows: 4 (292B), cpu: 256, memory: 0B, network: 0B}/{rows: 4 (292B), cpu: 548, memory: 0B, network: 0B}
CPU: 16.00ms (9.36%), Scheduled: 132.00ms (440.00%), Output: 4 rows (117B)
Input avg.: 4.00 rows, Input std.dev.: 0.00%
$hashvalue_12 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("orderstatus"), 0))
orderstatus := orderstatus:string:1:REGULAR
totalprice := totalprice:double:2:REGULAR
Input: 4 rows (81B), Filtered: 0.00%
(1 row)