华为云用户手册

  • TPC-DS测试结果 本测试主要包括使用存算一体和存算分离两种部署架构下,TPC-DS 1T规模数据集的开箱查询性能,共99个查询。存算一体查询总耗时为622.21s,存算分离查询总耗时为645.42s。详细结果见下表。 表1 TPC-DS测试结果 TPC-DS查询 8.3.0 9.1.0.210 - 存算一体(s) 存算一体(s) 存算分离(s) Q1 2.03 0.513 0.546 Q2 9.14 4.036 4.286 Q3 1.52 1.228 1.426 Q4 200.44 125.877 122.118 Q5 3.11 1.403 1.509 Q6 0.39 0.399 0.366 Q7 1.85 1.665 2.227 Q8 0.72 0.481 0.483 Q9 11.14 13.604 12.354 Q10 1.08 0.711 0.876 Q11 92.20 62.143 63.736 Q12 0.20 0.131 0.132 Q13 2.50 3.502 3.853 Q14 74.98 17.622 18.407 Q15 2.11 0.412 0.434 Q16 6.51 3.148 3.256 Q17 2.93 1.168 2.618 Q18 2.16 1.191 1.443 Q19 0.70 0.665 0.771 Q20 0.18 0.108 0.086 Q21 0.18 0.084 0.084 Q22 5.84 1.089 1.286 Q23 144.71 85.005 90.962 Q24 8.55 9.033 9.438 Q25 3.10 1.262 2.580 Q26 0.63 0.449 0.516 Q27 2.03 1.739 2.219 Q28 13.32 6.293 6.623 Q29 2.50 2.049 2.364 Q30 0.73 0.404 0.402 Q31 3.73 3.045 2.902 Q32 0.18 0.089 0.096 Q33 1.28 0.821 0.815 Q34 2.13 2.653 3.011 Q35 3.00 2.132 1.743 Q36 6.21 1.766 1.971 Q37 0.34 0.231 0.268 Q38 52.24 10.150 11.110 Q39 5.57 5.057 5.475 Q40 0.66 0.128 0.126 Q41 0.04 0.036 0.033 Q42 0.70 0.389 0.436 Q43 1.99 1.256 1.410 Q44 3.25 2.218 2.381 Q45 0.82 0.399 0.404 Q46 4.63 3.569 4.154 Q47 6.91 4.542 5.156 Q48 2.37 3.129 3.508 Q49 3.01 2.291 2.478 Q50 5.82 3.694 4.204 Q51 10.67 4.368 4.630 Q52 0.70 0.384 0.457 Q53 0.88 0.715 0.794 Q54 3.96 0.609 4.123 Q55 0.66 0.402 0.421 Q56 0.84 0.751 0.707 Q57 3.12 1.643 1.826 Q58 0.76 0.548 0.449 Q59 17.30 8.824 9.302 Q60 1.75 0.920 0.988 Q61 1.10 0.980 1.013 Q62 1.29 0.813 0.779 Q63 0.86 0.697 0.799 Q64 14.20 6.504 7.643 Q65 6.92 3.988 3.630 Q66 1.50 1.018 1.077 Q67 153.90 54.645 56.221 Q68 3.65 2.764 3.395 Q69 0.93 0.720 0.689 Q70 23.13 3.016 3.255 Q71 2.33 2.211 2.196 Q72 3.69 2.270 2.521 Q73 1.40 1.743 2.068 Q74 37.27 34.956 29.475 Q75 11.85 5.517 5.769 Q76 3.37 2.364 2.730 Q77 1.30 1.112 1.141 Q78 152.21 12.058 14.874 Q79 4.49 3.766 4.449 Q80 3.56 1.696 1.770 Q81 0.64 0.390 0.409 Q82 0.75 0.655 0.735 Q83 0.15 0.085 0.216 Q84 0.24 0.221 0.234 Q85 2.78 1.151 1.512 Q86 2.59 0.345 0.364 Q87 78.17 10.557 11.300 Q88 6.90 9.081 9.688 Q89 2.61 0.918 1.058 Q90 0.64 0.569 0.787 Q91 0.15 0.135 0.114 Q92 0.20 0.118 0.127 Q93 7.53 3.977 4.413 Q94 3.49 1.746 2.109 Q95 29.57 27.629 25.971 Q96 1.49 1.844 2.062 Q97 8.49 3.341 3.674 Q98 1.09 0.938 0.865 Q99 2.37 1.503 1.516 SUM 1321.757 645.424 622.214 父主题: TPC-DS性能测试
  • Q8 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031323334353637 SELECTo_year,sum(casewhen nation = 'BRAZIL' then volumeelse 0end) / sum(volume) as mkt_shareFROM(SELECTextract(year FROM o_orderdate) as o_year,l_extendedprice * (1 - l_discount) as volume,n2.n_name as nationFROMpart,supplier,lineitem,orders,customer,nation n1,nation n2,regionWHEREp_partkey = l_partkeyand s_suppkey = l_suppkeyand l_orderkey = o_orderkeyand o_custkey = c_custkeyand c_nationkey = n1.n_nationkeyand n1.n_regionkey = r_regionkeyand r_name = 'AMERICA'and s_nationkey = n2.n_nationkeyand o_orderdate between date '1995-01-01' and date '1996-12-31'and p_type = 'ECONOMY ANODIZED STEEL') as all_nationsGROUP BYo_yearORDER BYo_year;
  • Q13 1 2 3 4 5 6 7 8 91011121314151617181920 SELECTc_count,count(*) as custdistFROM(SELECTc_custkey,count(o_orderkey)FROMcustomer left outer join orders onc_custkey = o_custkeyand o_comment not like '%special%requests%'GROUP BYc_custkey) as c_orders (c_custkey, c_count)GROUP BYc_countORDER BYcustdist desc,c_count desc;
  • Q9 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132 SELECTnation,o_year,sum(amount) as sum_profitFROM(SELECTn_name as nation,extract(year FROM o_orderdate) as o_year,l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amountFROMpart,supplier,lineitem,partsupp,orders,nationWHEREs_suppkey = l_suppkeyand ps_suppkey = l_suppkeyand ps_partkey = l_partkeyand p_partkey = l_partkeyand o_orderkey = l_orderkeyand s_nationkey = n_nationkeyand p_name like '%green%') as profitGROUP BYnation,o_yearORDER BYnation,o_year desc;
  • Q2 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132333435363738394041424344 SELECTs_acctbal,s_name,n_name,p_partkey,p_mfgr,s_address,s_phone,s_commentFROMpart,supplier,partsupp,nation,regionWHEREp_partkey = ps_partkeyand s_suppkey = ps_suppkeyand p_size = 15and p_type like '%BRASS'and s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'EUROPE'and ps_supplycost = (SELECTmin(ps_supplycost)FROMpartsupp,supplier,nation,regionWHEREp_partkey = ps_partkeyand s_suppkey = ps_suppkeyand s_nationkey = n_nationkeyand n_regionkey = r_regionkeyand r_name = 'EUROPE')ORDER BYs_acctbal desc,n_name,s_name,p_partkeyLIMIT 100;
  • Q7 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627282930313233343536373839 SELECTsupp_nation,cust_nation,l_year,sum(volume) as revenueFROM(SELECTn1.n_name as supp_nation,n2.n_name as cust_nation,extract(year FROM l_shipdate) as l_year,l_extendedprice * (1 - l_discount) as volumeFROMsupplier,lineitem,orders,customer,nation n1,nation n2WHEREs_suppkey = l_suppkeyand o_orderkey = l_orderkeyand c_custkey = o_custkeyand s_nationkey = n1.n_nationkeyand c_nationkey = n2.n_nationkeyand ((n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE'))and l_shipdate between date '1995-01-01' and date '1996-12-31') as shippingGROUP BYsupp_nation,cust_nation,l_yearORDER BYsupp_nation,cust_nation,l_year;
  • 目的 GaussDB (DWS)是云端托管的PB级高并发实时 数据仓库 ,专注于服务OLAP领域,提供按量付费和包年包月两种付费方式。 本次性能测试基于华为云基础环境,分别在同等硬件配置和同等数据规模下,基于TPC-H、TPC-DS标准测试集,对DWS 9.1.0版本和8.3.0版本进行性能对比测试。基于SSB-Flat测试集,对DWS 9.1.0版本和开源OLAP产品ClickHouse进行对比测试。本次性能测试时间为2024年11月。
  • TPC-H测试数据 表1 TPC-H测试数据 序号 表名 行数 表大小 1 region 5 294KB 2 nation 25 298KB 3 supplier 10,000,000 1020MB 4 customer 150,000,000 8226MB 5 part 200,000,000 5216MB 6 partsupp 800,000,000 29GB 7 orders 1,500,000,000 43GB 8 lineitem 5,999,989,709 171GB 父主题: TPC-H测试过程
  • TPC-H测试结果 DWS测试了使用存算一体和存算分离两种部署架构下,TPC-H 1T规模数据集的开箱查询性能,共22个查询。存算一体查询总耗时为170.08s,存算分离查询总耗时为172.62s。 图1 TPC-H 1000X开箱性能 详细性能数据见下表。 表1 TPC-H测试结果 TPC-H查询 8.3.0 9.1.0.200 - 存算一体 存算一体 存算分离 Q1 16.97 5.82 6.25 Q2 1.44 1.11 1.15 Q3 13.02 6.17 6.64 Q4 65.56 15.39 15.26 Q5 21.90 12.72 13.40 Q6 0.84 0.42 0.46 Q7 11.08 5.55 5.65 Q8 87.18 8.62 9.37 Q9 55.98 23.46 24.69 Q10 12.20 6.75 6.76 Q11 3.57 3.52 3.41 Q12 5.92 5.05 3.60 Q13 14.28 12.16 12.31 Q14 1.87 1.29 1.26 Q15 2.24 1.25 1.44 Q16 4.52 2.19 2.14 Q17 10.54 8.40 9.42 Q18 47.97 18.34 15.16 Q19 110.03 6.88 7.50 Q20 10.07 5.19 5.60 Q21 28.82 16.25 17.37 Q22 7.04 3.53 3.76 总时长(s) 533.05 170.08 172.62 父主题: TPC-H性能测试
  • 导入TPC-H数据 执行以下SQL语句导入数据。 12345678 INSERT INTO region SELECT * FROM region_load;INSERT INTO nation SELECT * FROM nation_load;INSERT INTO supplier SELECT * FROM supplier_load;INSERT INTO customer SELECT * FROM customer_load;INSERT INTO part SELECT * FROM part_load;INSERT INTO partsupp SELECT * FROM partsupp_load;INSERT INTO orders SELECT * FROM orders_load;INSERT INTO lineitem SELECT * FROM lineitem_load;
  • 设置GUC参数 在执行测试之前,可以设置如下几个参数,这将给性能带来5-10%的提升。 123456789 cpu_tuple_cost=0.0058401054702699184cpu_operator_cost=0.002450424712151289cpu_index_tuple_cost=0.004948411136865616seq_page_cost=0.3447857201099396random_page_cost=2.3901453018188477allocate_mem_cost=0.5275554060935974effective_cache_size=502smp_thread_cost=278.525634765625stream_multiple=0.5654585361480713
  • 创建TPC-H数据集的GDS外表 连接DWS数据库后执行以下SQL语句创建。 以下每个外表的“gsfs://192.168.0.90:500x/xxx | gsfs://192.168.0.90:500x/xxx”中的IP地址和端口,请替换成安装和启动GDS中的对应的GDS的监听IP和端口。如启动两个GDS,则使用“|”区分。如果启动多个GDS,需要将所有GDS的监听IP和端口配置到外表中。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147 DROP FOREIGN TABLE IF EXISTS region_load;CREATE FOREIGN TABLE region_load(R_REGIONKEY INT,R_NAME CHAR(25),R_COMMENT VARCHAR(152)) SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/region.tbl* | gsfs://192.168.0.90:5001/region.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS nation_load;CREATE FOREIGN TABLE nation_load(N_NATIONKEY INT,N_NAME CHAR(25),N_REGIONKEY INT,N_COMMENT VARCHAR(152)) SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/nation.tbl* | gsfs://192.168.0.90:5001/nation.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS supplier_load;CREATE FOREIGN TABLE supplier_load(S_SUPPKEY INT,S_NAME CHAR(25),S_ADDRESS VARCHAR(40),S_NATIONKEY INT,S_PHONE CHAR(15),S_ACCTBAL DECIMAL(15,2),S_COMMENT VARCHAR(101))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/supplier.tbl* | gsfs://192.168.0.90:5001/supplier.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS customer_load;CREATE FOREIGN TABLE customer_load(C_CUSTKEY INT,C_NAME VARCHAR(25),C_ADDRESS VARCHAR(40),C_NATIONKEY INT,C_PHONE CHAR(15),C_ACCTBAL DECIMAL(15,2),C_MKTSEGMENT CHAR(10),C_COMMENT VARCHAR(117))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/customer.tbl* | gsfs://192.168.0.90:5001/customer.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS part_load;CREATE FOREIGN TABLE part_load(P_PARTKEY INT,P_NAME VARCHAR(55),P_MFGR CHAR(25),P_BRAND CHAR(10),P_TYPE VARCHAR(25),P_SIZE INT,P_CONTAINER CHAR(10),P_RETAILPRICE DECIMAL(15,2),P_COMMENT VARCHAR(23))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/part.tbl* | gsfs://192.168.0.90:5001/part.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS partsupp_load;CREATE FOREIGN TABLE partsupp_load(PS_PARTKEY INT,PS_SUPPKEY INT,PS_AVAILQTY INT,PS_SUPPLYCOST DECIMAL(15,2),PS_COMMENT VARCHAR(199))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/partsupp.tbl* | gsfs://192.168.0.90:5001/partsupp.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS orders_load;CREATE FOREIGN TABLE orders_load(O_ORDERKEY BIGINT,O_CUSTKEY INT,O_ORDERSTATUS CHAR(1),O_TOTALPRICE DECIMAL(15,2),O_ORDERDATE DATE,O_ORDERPRIORITY CHAR(15),O_CLERK CHAR(15),O_SHIPPRIORITY INT,O_COMMENT VARCHAR(79))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/orders.tbl* | gsfs://192.168.0.90:5001/orders.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');DROP FOREIGN TABLE IF EXISTS lineitem_load;CREATE FOREIGN TABLE lineitem_load(L_ORDERKEY BIGINT,L_PARTKEY INT,L_SUPPKEY INT,L_LINENUMBER INT,L_QUANTITY DECIMAL(15,2),L_EXTENDEDPRICE DECIMAL(15,2),L_DISCOUNT DECIMAL(15,2),L_TAX DECIMAL(15,2),L_RETURNFLAG CHAR(1),L_LINESTATUS CHAR(1),L_SHIPDATE DATE,L_COMMITDATE DATE,L_RECEIPTDATE DATE,L_SHIPINSTRUCT CHAR(25),L_SHIPMODE CHAR(10),L_COMMENT VARCHAR(44))SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5000/lineitem.tbl* | gsfs://192.168.0.90:5001/lineitem.tbl*',format 'text',deLIMITer '|',encoding 'utf8',mode 'Normal');
  • 安装和启动GDS 参见工具下载下载GDS客户端(与gsql客户端在一个包)。 将GDS工具包上传至E CS 的/opt目录中,本例以上传Euler Kunpeng版本的工具包为例。 在工具包所在目录下,解压工具包。 cd /opt/unzip dws_client_8.1.x_euler_kunpeng_x64.zip 创建用户gds_user及其所属的用户组gdsgrp。此用户用于启动GDS,且需要拥有读取数据源文件目录的权限。 groupadd gdsgrpuseradd -g gdsgrp gds_user 修改工具包以及数据源文件目录属主为创建的用户gds_user及其所属的用户组gdsgrp。 chown -R gds_user:gdsgrp /opt/chown -R gds_user:gdsgrp /data1chown -R gds_user:gdsgrp /data2 切换到gds_user用户。 su - gds_user 执行环境依赖脚本(仅8.1.x版本适用)。 cd /opt/gds/binsource gds_env 启动GDS。 /opt/gds/bin/gds -d /data1/script/tpch-kit/tpch1000X -p 192.168.0.90:5000 -H 192.168.0.0/24 -l /opt/gds/gds01_log.txt -D #TPC-H使用/opt/gds/bin/gds -d /data2/script/tpch-kit/tpch1000X -p 192.168.0.90:5001 -H 192.168.0.0/24 -l /opt/gds/gds02_log.txt -D #TPC-H使用/opt/gds/bin/gds -d /data1/script/tpcds-kit/tpcds1000X/ -p 192.168.0.90:5002 -H 192.168.0.0/24 -l /opt/gds/gds03_log.txt -D #TPC-DS使用/opt/gds/bin/gds -d /data2/script/tpcds-kit/tpcds1000X/ -p 192.168.0.90:5003 -H 192.168.0.0/24 -l /opt/gds/gds04_log.txt -D #TPC-DS使用/opt/gds/bin/gds -d /data1/script/ssb-kit/ssb100X/ -p 192.168.0.90:5004 -H 192.168.0.0/24 -l /opt/gds/gds05_log.txt -D #SSB使用 命令中的斜体部分请根据实际填写,如果数据分片存放至多个数据盘目录,需要启动对应目录数量的GDS。 如果TPC-H和TPC-DS数据同时测试,需要启动以上4个GDS,如果只测试TPC-DS或TPC-H数据,请根据后面的“#xxx”备注启动对应的GDS服务即可。 -d dir:保存有待导入数据的数据文件所在目录。 -p ip:port:GDS监听IP和监听端口。IP替换为ECS的内网IP,确保GaussDB(DWS)能通过此IP与GDS的通讯;端口对于TPC-H取5000、5001,对于TPC-DS取5002、5003。 -H address_string:允许哪些主机连接和使用GDS服务。参数需为CIDR格式。此地址配置成GaussDB(DWS)的集群内网网段(即GDS所在的ECS与GaussDB(DWS)在同一个VPC下,以内网通讯即可),例如192.168.0.0/24。 -l log_file:存放GDS的日志文件路径及文件名。 -D:后台运行GDS。仅支持Linux操作系统下使用。
  • 创建TPC-H目标表 连接DWS数据库后执行以下命令创建目标表。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198 CREATE TABLE REGION( R_REGIONKEY INT NOT NULL , R_NAME VARCHAR(25) NOT NULL , R_COMMENT VARCHAR(152)) WITH (orientation=column, colversion=2.0, enable_hstore=true, enable_hstore_opt=true,bucketnums=16384)DISTRIBUTE BY replication; CREATE TABLE NATION( N_NATIONKEY INT NOT NULL , N_NAME VARCHAR(25) NOT NULL , N_REGIONKEY INT NOT NULL , N_COMMENT VARCHAR(152)) WITH (orientation=column, colversion=2.0, enable_hstore=true, enable_hstore_opt=true,bucketnums=16384)DISTRIBUTE BY replication; CREATE TABLE SUPPLIER( S_SUPPKEY BIGINT NOT NULL , S_NAME VARCHAR(25) NOT NULL , S_ADDRESS VARCHAR(40) NOT NULL , S_NATIONKEY INT NOT NULL , S_PHONE VARCHAR(15) NOT NULL , S_ACCTBAL DECIMAL(15,2) NOT NULL , S_COMMENT VARCHAR(101) NOT NULL) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='s_suppkey',secondary_part_num=72)DISTRIBUTE BY hash(S_SUPPKEY)PARTITION BY RANGE(S_NATIONKEY)(PARTITION S_NATIONKEY_1 VALUES LESS THAN(1),PARTITION S_NATIONKEY_2 VALUES LESS THAN(2),PARTITION S_NATIONKEY_3 VALUES LESS THAN(3),PARTITION S_NATIONKEY_4 VALUES LESS THAN(4),PARTITION S_NATIONKEY_5 VALUES LESS THAN(5),PARTITION S_NATIONKEY_6 VALUES LESS THAN(6),PARTITION S_NATIONKEY_7 VALUES LESS THAN(7),PARTITION S_NATIONKEY_8 VALUES LESS THAN(8),PARTITION S_NATIONKEY_9 VALUES LESS THAN(9),PARTITION S_NATIONKEY_10 VALUES LESS THAN(10),PARTITION S_NATIONKEY_11 VALUES LESS THAN(11),PARTITION S_NATIONKEY_12 VALUES LESS THAN(12),PARTITION S_NATIONKEY_13 VALUES LESS THAN(13),PARTITION S_NATIONKEY_14 VALUES LESS THAN(14),PARTITION S_NATIONKEY_15 VALUES LESS THAN(15),PARTITION S_NATIONKEY_16 VALUES LESS THAN(16),PARTITION S_NATIONKEY_17 VALUES LESS THAN(17),PARTITION S_NATIONKEY_18 VALUES LESS THAN(18),PARTITION S_NATIONKEY_19 VALUES LESS THAN(19),PARTITION S_NATIONKEY_20 VALUES LESS THAN(20),PARTITION S_NATIONKEY_21 VALUES LESS THAN(21),PARTITION S_NATIONKEY_22 VALUES LESS THAN(22),PARTITION S_NATIONKEY_23 VALUES LESS THAN(23),PARTITION S_NATIONKEY_24 VALUES LESS THAN(24),PARTITION S_NATIONKEY_25 VALUES LESS THAN(25)); CREATE TABLE CUSTOMER( C_CUSTKEY BIGINT NOT NULL , C_NAME VARCHAR(25) NOT NULL , C_ADDRESS VARCHAR(40) NOT NULL , C_NATIONKEY INT NOT NULL , C_PHONE VARCHAR(15) NOT NULL , C_ACCTBAL DECIMAL(15,2) NOT NULL , C_MKTSEGMENT VARCHAR(10) NOT NULL , C_COMMENT VARCHAR(117) NOT NULL) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='c_custkey',secondary_part_num=72)DISTRIBUTE BY hash(C_CUSTKEY)PARTITION BY RANGE(C_NATIONKEY)(PARTITION C_NATIONKEY_1 VALUES LESS THAN(1),PARTITION C_NATIONKEY_2 VALUES LESS THAN(2),PARTITION C_NATIONKEY_3 VALUES LESS THAN(3),PARTITION C_NATIONKEY_4 VALUES LESS THAN(4),PARTITION C_NATIONKEY_5 VALUES LESS THAN(5),PARTITION C_NATIONKEY_6 VALUES LESS THAN(6),PARTITION C_NATIONKEY_7 VALUES LESS THAN(7),PARTITION C_NATIONKEY_8 VALUES LESS THAN(8),PARTITION C_NATIONKEY_9 VALUES LESS THAN(9),PARTITION C_NATIONKEY_10 VALUES LESS THAN(10),PARTITION C_NATIONKEY_11 VALUES LESS THAN(11),PARTITION C_NATIONKEY_12 VALUES LESS THAN(12),PARTITION C_NATIONKEY_13 VALUES LESS THAN(13),PARTITION C_NATIONKEY_14 VALUES LESS THAN(14),PARTITION C_NATIONKEY_15 VALUES LESS THAN(15),PARTITION C_NATIONKEY_16 VALUES LESS THAN(16),PARTITION C_NATIONKEY_17 VALUES LESS THAN(17),PARTITION C_NATIONKEY_18 VALUES LESS THAN(18),PARTITION C_NATIONKEY_19 VALUES LESS THAN(19),PARTITION C_NATIONKEY_20 VALUES LESS THAN(20),PARTITION C_NATIONKEY_21 VALUES LESS THAN(21),PARTITION C_NATIONKEY_22 VALUES LESS THAN(22),PARTITION C_NATIONKEY_23 VALUES LESS THAN(23),PARTITION C_NATIONKEY_24 VALUES LESS THAN(24),PARTITION C_NATIONKEY_25 VALUES LESS THAN(25)); CREATE TABLE PART( P_PARTKEY BIGINT NOT NULL , P_NAME VARCHAR(55) NOT NULL , P_MFGR VARCHAR(25) NOT NULL , P_BRAND VARCHAR(10) NOT NULL , P_TYPE VARCHAR(25) NOT NULL , P_SIZE BIGINT NOT NULL , P_CONTAINER VARCHAR(10) NOT NULL , P_RETAILPRICE DECIMAL(15,2) NOT NULL , P_COMMENT VARCHAR(23) NOT NULL ) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='p_partkey',secondary_part_num=72)DISTRIBUTE BY hash(P_PARTKEY)PARTITION BY RANGE(P_SIZE)(PARTITION P_SIZE_1 VALUES LESS THAN(11),PARTITION P_SIZE_2 VALUES LESS THAN(21),PARTITION P_SIZE_3 VALUES LESS THAN(31),PARTITION P_SIZE_4 VALUES LESS THAN(41),PARTITION P_SIZE_5 VALUES LESS THAN(51)); CREATE TABLE PARTSUPP( PS_PARTKEY BIGINT NOT NULL , PS_SUPPKEY BIGINT NOT NULL , PS_AVAILQTY BIGINT NOT NULL , PS_SUPPLYCOST DECIMAL(15,2) NOT NULL , PS_COMMENT VARCHAR(199) NOT NULL) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='ps_partkey',secondary_part_num=72)DISTRIBUTE BY hash(PS_PARTKEY)PARTITION BY RANGE(PS_AVAILQTY)(PARTITION PS_AVAILQTY_1 VALUES LESS THAN(1000),PARTITION PS_AVAILQTY_2 VALUES LESS THAN(2000),PARTITION PS_AVAILQTY_3 VALUES LESS THAN(3000),PARTITION PS_AVAILQTY_4 VALUES LESS THAN(4000),PARTITION PS_AVAILQTY_5 VALUES LESS THAN(5000),PARTITION PS_AVAILQTY_6 VALUES LESS THAN(6000),PARTITION PS_AVAILQTY_7 VALUES LESS THAN(7000),PARTITION PS_AVAILQTY_8 VALUES LESS THAN(8000),PARTITION PS_AVAILQTY_9 VALUES LESS THAN(9000),PARTITION PS_AVAILQTY_10 VALUES LESS THAN(10000)) ; CREATE TABLE ORDERS( O_ORDERKEY BIGINT NOT NULL , O_CUSTKEY BIGINT NOT NULL , O_ORDERSTATUS VARCHAR(1) NOT NULL , O_TOTALPRICE DECIMAL(15,2) NOT NULL , O_ORDERDATE DATE NOT NULL , O_ORDERPRIORITY VARCHAR(15) NOT NULL , O_CLERK VARCHAR(15) NOT NULL , O_SHIPPRIORITY BIGINT NOT NULL , O_COMMENT VARCHAR(79) NOT NULL) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='o_orderkey',secondary_part_num=72)DISTRIBUTE BY hash(O_ORDERKEY)PARTITION BY RANGE(O_ORDERDATE)(PARTITION O_ORDERDATE_1 VALUES LESS THAN('1993-01-01 00:00:00'),PARTITION O_ORDERDATE_2 VALUES LESS THAN('1994-01-01 00:00:00'),PARTITION O_ORDERDATE_3 VALUES LESS THAN('1995-01-01 00:00:00'),PARTITION O_ORDERDATE_4 VALUES LESS THAN('1996-01-01 00:00:00'),PARTITION O_ORDERDATE_5 VALUES LESS THAN('1997-01-01 00:00:00'),PARTITION O_ORDERDATE_6 VALUES LESS THAN('1998-01-01 00:00:00'),PARTITION O_ORDERDATE_7 VALUES LESS THAN('1999-01-01 00:00:00')) ; CREATE TABLE LINEITEM( L_ORDERKEY BIGINT NOT NULL , L_PARTKEY BIGINT NOT NULL , L_SUPPKEY BIGINT NOT NULL , L_LINENUMBER BIGINT NOT NULL , L_QUANTITY DECIMAL(15,2) NOT NULL , L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL , L_DISCOUNT DECIMAL(15,2) NOT NULL , L_TAX DECIMAL(15,2) NOT NULL , L_RETURNFLAG VARCHAR(1) NOT NULL , L_LINESTATUS VARCHAR(1) NOT NULL , L_SHIPDATE DATE NOT NULL , L_COMMITDATE DATE NOT NULL , L_RECEIPTDATE DATE NOT NULL , L_SHIPINSTRUCT VARCHAR(25) NOT NULL , L_SHIPMODE VARCHAR(10) NOT NULL , L_COMMENT VARCHAR(44) NOT NULL) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,bucketnums=16384,secondary_part_column='l_orderkey',secondary_part_num=72)DISTRIBUTE BY hash(L_ORDERKEY)PARTITION BY RANGE(L_SHIPDATE)(PARTITION L_SHIPDATE_1 VALUES LESS THAN('1993-01-01 00:00:00'),PARTITION L_SHIPDATE_2 VALUES LESS THAN('1994-01-01 00:00:00'),PARTITION L_SHIPDATE_3 VALUES LESS THAN('1995-01-01 00:00:00'),PARTITION L_SHIPDATE_4 VALUES LESS THAN('1996-01-01 00:00:00'),PARTITION L_SHIPDATE_5 VALUES LESS THAN('1997-01-01 00:00:00'),PARTITION L_SHIPDATE_6 VALUES LESS THAN('1998-01-01 00:00:00'),PARTITION L_SHIPDATE_7 VALUES LESS THAN('1999-01-01 00:00:00')) ;
  • Q4.1 1 2 3 4 5 6 7 8 910 SELECT div(LO_ORDERDATE,10000) AS YEAR,C_NATION,SUM(LO_REVENUE - LO_SUPPLYCOST) AS profitFROM lineorder_flatWHEREC_REGION = 'AMERICA'AND S_REGION = 'AMERICA'AND P_MFGR IN ('MFGR#1', 'MFGR#2')GROUP BY YEAR, C_NATIONORDER BY YEAR ASC, C_NATION ASC;
  • SQL18 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031 select i_item_id, ca_country, ca_state, ca_county, avg( cast(cs_quantity as decimal(12,2))) agg1, avg( cast(cs_list_price as decimal(12,2))) agg2, avg( cast(cs_coupon_amt as decimal(12,2))) agg3, avg( cast(cs_sales_price as decimal(12,2))) agg4, avg( cast(cs_net_profit as decimal(12,2))) agg5, avg( cast(c_birth_year as decimal(12,2))) agg6, avg( cast(cd1.cd_dep_count as decimal(12,2))) agg7 from catalog_sales, customer_demographics cd1, customer_demographics cd2, customer, customer_address, date_dim, item where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd1.cd_demo_sk and cs_bill_customer_sk = c_customer_sk and cd1.cd_gender = 'M' and cd1.cd_education_status = 'Primary' and c_current_cdemo_sk = cd2.cd_demo_sk and c_current_addr_sk = ca_address_sk and c_birth_month in (10,1,8,7,3,5) and d_year = 1998 and ca_state in ('NE','OK','NC' ,'CO','ID','AR','MO') group by rollup (i_item_id, ca_country, ca_state, ca_county) order by ca_country, ca_state, ca_county,i_item_id limit 100;
  • SQL17 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627282930313233343536373839404142 select i_item_id ,i_item_desc ,s_state ,count(ss_quantity) as store_sales_quantitycount ,avg(ss_quantity) as store_sales_quantityave ,stddev_samp(ss_quantity) as store_sales_quantitystdev ,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov ,count(sr_return_quantity) as store_returns_quantitycount ,avg(sr_return_quantity) as store_returns_quantityave ,stddev_samp(sr_return_quantity) as store_returns_quantitystdev ,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov ,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave ,stddev_samp(cs_quantity) as catalog_sales_quantitystdev ,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov from store_sales ,store_returns ,catalog_sales ,date_dim d1 ,date_dim d2 ,date_dim d3 ,store ,item where d1.d_quarter_name = '2000Q1' and d1.d_date_sk = ss_sold_date_sk and i_item_sk = ss_item_sk and s_store_sk = ss_store_sk and ss_customer_sk = sr_customer_sk and ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number and sr_returned_date_sk = d2.d_date_sk and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3') and sr_customer_sk = cs_bill_customer_sk and sr_item_sk = cs_item_sk and cs_sold_date_sk = d3.d_date_sk and d3.d_quarter_name in ('2000Q1','2000Q2','2000Q3') group by i_item_id ,i_item_desc ,s_state order by i_item_id ,i_item_desc ,s_statelimit 100;
  • SQL20 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627 select i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price ,sum(cs_ext_sales_price) as itemrevenue ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over (partition by i_class) as revenueratio fromcatalog_sales ,item ,date_dim where cs_item_sk = i_item_sk and i_category in ('Sports', 'Shoes', 'Women') and cs_sold_date_sk = d_date_sk and d_date between cast('2001-03-21' as date) and (cast('2001-03-21' as date) + 30) group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratiolimit 100;
  • SQL10 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556 select cd_gender, cd_marital_status, cd_education_status, count(*) cnt1, cd_purchase_estimate, count(*) cnt2, cd_credit_rating, count(*) cnt3, cd_dep_count, count(*) cnt4, cd_dep_employed_count, count(*) cnt5, cd_dep_college_count, count(*) cnt6 from customer c,customer_address ca,customer_demographics where c.c_current_addr_sk = ca.ca_address_sk and ca_county in ('Clark County','Richardson County','Tom Green County','Sullivan County','Cass County') and cd_demo_sk = c.c_current_cdemo_sk and exists (select * from store_sales,date_dim where c.c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year = 2000 and d_moy between 1 and 1+3) and (exists (select * from web_sales,date_dim where c.c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year = 2000 and d_moy between 1 ANd 1+3) or exists (select * from catalog_sales,date_dim where c.c_customer_sk = cs_ship_customer_sk and cs_sold_date_sk = d_date_sk and d_year = 2000 and d_moy between 1 and 1+3)) group by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_count order by cd_gender, cd_marital_status, cd_education_status, cd_purchase_estimate, cd_credit_rating, cd_dep_count, cd_dep_employed_count, cd_dep_college_countlimit 100;
  • SQL12 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031 select i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price ,sum(ws_ext_sales_price) as itemrevenue ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over (partition by i_class) as revenueratiofromweb_sales,item ,date_dimwhere ws_item_sk = i_item_sk and i_category in ('Music', 'Shoes', 'Children')and ws_sold_date_sk = d_date_skand d_date between cast('2000-05-14' as date) and (cast('2000-05-14' as date) + 30 )group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_priceorder by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratiolimit 100;
  • SQL7 1 2 3 4 5 6 7 8 9101112131415161718 select i_item_id, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 from store_sales, customer_demographics, date_dim, item, promotion where ss_sold_date_sk = d_date_sk and ss_item_sk = i_item_sk and ss_cdemo_sk = cd_demo_sk and ss_promo_sk = p_promo_sk and cd_gender = 'M' and cd_marital_status = 'U' and cd_education_status = 'College' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = 1999 group by i_item_id order by i_item_id limit 100;
  • SQL13 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031323334353637383940414243444546474849 select avg(ss_quantity) ,avg(ss_ext_sales_price) ,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and((ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'U' and cd_education_status = '4 yr Degree' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3 )or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'D' and cd_education_status = '2 yr Degree' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1 ) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'S' and cd_education_status = 'Advanced Degree' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1 )) and((ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('IL', 'WI', 'TN') and ss_net_profit between 100 and 200 ) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('MO', 'OK', 'WA') and ss_net_profit between 150 and 300 ) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('NE', 'VA', 'GA') and ss_net_profit between 50 and 250 ));
  • SQL3 1 2 3 4 5 6 7 8 9101112131415161718 select dt.d_year ,item.i_brand_id brand_id ,item.i_brand brand ,sum(ss_ext_sales_price) sum_agg from date_dim dt ,store_sales ,item where dt.d_date_sk = store_sales.ss_sold_date_sk and store_sales.ss_item_sk = item.i_item_sk and item.i_manufact_id = 125 and dt.d_moy=11 group by dt.d_year ,item.i_brand ,item.i_brand_id order by dt.d_year ,sum_agg desc ,brand_id limit 100;
  • SQL5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100101102103104105106107108109110111112113114115116117118119120121122123124125 with ssr as (select s_store_id, sum(sales_price) as sales, sum(profit) as profit, sum(return_amt) as returns, sum(net_loss) as profit_loss from ( select ss_store_sk as store_sk, ss_sold_date_sk as date_sk, ss_ext_sales_price as sales_price, ss_net_profit as profit, cast(0 as decimal(7,2)) as return_amt, cast(0 as decimal(7,2)) as net_loss from store_sales union all select sr_store_sk as store_sk, sr_returned_date_sk as date_sk, cast(0 as decimal(7,2)) as sales_price, cast(0 as decimal(7,2)) as profit, sr_return_amt as return_amt, sr_net_loss as net_loss from store_returns ) salesreturns, date_dim, store where date_sk = d_date_sk and d_date between cast('2002-08-05' as date) and (cast('2002-08-05' as date) + 14 ) and store_sk = s_store_sk group by s_store_id) , csr as (select cp_catalog_page_id, sum(sales_price) as sales, sum(profit) as profit, sum(return_amt) as returns, sum(net_loss) as profit_loss from ( select cs_catalog_page_sk as page_sk, cs_sold_date_sk as date_sk, cs_ext_sales_price as sales_price, cs_net_profit as profit, cast(0 as decimal(7,2)) as return_amt, cast(0 as decimal(7,2)) as net_loss from catalog_sales union all select cr_catalog_page_sk as page_sk, cr_returned_date_sk as date_sk, cast(0 as decimal(7,2)) as sales_price, cast(0 as decimal(7,2)) as profit, cr_return_amount as return_amt, cr_net_loss as net_loss from catalog_returns ) salesreturns, date_dim, catalog_page where date_sk = d_date_sk and d_date between cast('2002-08-05' as date) and (cast('2002-08-05' as date) + 14 ) and page_sk = cp_catalog_page_sk group by cp_catalog_page_id) , wsr as (select web_site_id, sum(sales_price) as sales, sum(profit) as profit, sum(return_amt) as returns, sum(net_loss) as profit_loss from ( select ws_web_site_sk as wsr_web_site_sk, ws_sold_date_sk as date_sk, ws_ext_sales_price as sales_price, ws_net_profit as profit, cast(0 as decimal(7,2)) as return_amt, cast(0 as decimal(7,2)) as net_loss from web_sales union all select ws_web_site_sk as wsr_web_site_sk, wr_returned_date_sk as date_sk, cast(0 as decimal(7,2)) as sales_price, cast(0 as decimal(7,2)) as profit, wr_return_amt as return_amt, wr_net_loss as net_loss from web_returns left outer join web_sales on ( wr_item_sk = ws_item_sk and wr_order_number = ws_order_number) ) salesreturns, date_dim, web_site where date_sk = d_date_sk and d_date between cast('2002-08-05' as date) and (cast('2002-08-05' as date) + 14 ) and wsr_web_site_sk = web_site_sk group by web_site_id) select channel , id , sum(sales) as sales , sum(returns) as returns , sum(profit) as profit from (select 'store channel' as channel , 'store' || s_store_id as id , sales , returns , (profit - profit_loss) as profit from ssr union all select 'catalog channel' as channel , 'catalog_page' || cp_catalog_page_id as id , sales , returns , (profit - profit_loss) as profit from csr union all select 'web channel' as channel , 'web_site' || web_site_id as id , sales , returns , (profit - profit_loss) as profit from wsr ) x group by rollup (channel, id) order by channel ,id limit 100;
  • 命令生成方法 TPC-DS标准99个SQL查询语句可用如下方法生成: 准备工作。生成TPC-DS查询语句前需要修改query_templates目录下的文件: 登录测试过程申请的ECS,进入/data1/script/tpcds-kit/DSGen-software-code-3.2.0rc1/query_templates目录: 1 cd /data1/script/tpcds-kit/DSGen-software-code-3.2.0rc1/query_templates 新建文件hwdws.tpl,内容为: 12345 define __LIMITA = "";define __LIMITB = "";define __LIMITC = "limit %d";define _BEGIN = "-- begin query " + [_QUERY] + " in stream " + [_STREAM] + " using template " + [_TEMPLATE];define _END = "-- end query " + [_QUERY] + " in stream " + [_STREAM] + " using template " + [_TEMPLATE]; 因TPC-DS工具中SQL语句生成模板有语法错误,需修改query77.tpl,将135行的‘, coalesce(returns, 0) returns’改为‘, coalesce(returns, 0) as returns’。 执行以下命令生成查询语句: 12 cd /data1/script/tpcds-kit/DSGen-software-code-3.2.0rc1/tools./dsqgen -input ../query_templates/templates.lst -directory ../query_templates/ -scale 1000 -dialect hwdws 执行后会生成query_0.sql文件,里面放着99个标准SQL语句,需要手动去切分成99个文件。 生成的标准查询中如下日期函数语法在GaussDB(DWS)暂不支持,需要手动进行修改: 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667 Q5: and (cast('2001-08-19' as date) + 14 days)修改为and (cast('2001-08-19' as date) + 14)Q12:and (cast('1999-02-28' as date) + 30 days)修改为and (cast('1999-02-28' as date) + 30)Q16:(cast('1999-4-01' as date) + 60 days)修改为(cast('1999-4-01' as date) + 60)Q20:and (cast('1998-05-05' as date) + 30 days)修改为and (cast('1998-05-05' as date) + 30)Q21:and d_date between (cast ('2000-05-19' as date) - 30 days)修改为and d_date between (cast ('2000-05-19' as date) - 30)and (cast ('2000-05-19' as date) + 30 days)修改为and (cast ('2000-05-19' as date) + 30)Q32:(cast('1999-02-22' as date) + 90 days)修改为(cast('1999-02-22' as date) + 90)Q37:and d_date between cast('1998-04-29' as date) and (cast('1998-04-29' as date) + 60 days)修改为and d_date between cast('1998-04-29' as date) and (cast('1998-04-29' as date) + 60)Q40:and d_date between (cast ('2002-05-10' as date) - 30 days)修改为and d_date between (cast ('2002-05-10' as date) - 30)and (cast ('2002-05-10' as date) + 30 days)修改为and (cast ('2002-05-10' as date) + 30)Q77:and (cast('1999-08-29' as date) + 30 days)修改为and (cast('1999-08-29' as date) + 30)Q80:and (cast('2002-08-04' as date) + 30 days)修改为and (cast('2002-08-04' as date) + 30)Q82:and d_date between cast('1998-01-18' as date) and (cast('1998-01-18' as date) + 60 days)修改为and d_date between cast('1998-01-18' as date) and (cast('1998-01-18' as date) + 60)Q92:(cast('2001-01-26' as date) + 90 days)修改为(cast('2001-01-26' as date) + 90)Q94:(cast('1999-5-01' as date) + 60 days)修改为(cast('1999-5-01' as date) + 60)Q95:(cast('1999-4-01' as date) + 60 days)修改为(cast('1999-4-01' as date) + 60)Q98:and (cast('2002-04-01' as date) + 30 days)修改为and (cast('2002-04-01' as date) + 30)
  • SQL2 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657 with wscs as (select sold_date_sk ,sales_price from (select ws_sold_date_sk sold_date_sk ,ws_ext_sales_price sales_price from web_sales union all select cs_sold_date_sk sold_date_sk ,cs_ext_sales_price sales_price from catalog_sales)), wswscs as (select d_week_seq, sum(case when (d_day_name='Sunday') then sales_price else null end) sun_sales, sum(case when (d_day_name='Monday') then sales_price else null end) mon_sales, sum(case when (d_day_name='Tuesday') then sales_price else null end) tue_sales, sum(case when (d_day_name='Wednesday') then sales_price else null end) wed_sales, sum(case when (d_day_name='Thursday') then sales_price else null end) thu_sales, sum(case when (d_day_name='Friday') then sales_price else null end) fri_sales, sum(case when (d_day_name='Saturday') then sales_price else null end) sat_sales from wscs ,date_dim where d_date_sk = sold_date_sk group by d_week_seq) select d_week_seq1 ,round(sun_sales1/sun_sales2,2) ,round(mon_sales1/mon_sales2,2) ,round(tue_sales1/tue_sales2,2) ,round(wed_sales1/wed_sales2,2) ,round(thu_sales1/thu_sales2,2) ,round(fri_sales1/fri_sales2,2) ,round(sat_sales1/sat_sales2,2) from (select wswscs.d_week_seq d_week_seq1 ,sun_sales sun_sales1 ,mon_sales mon_sales1 ,tue_sales tue_sales1 ,wed_sales wed_sales1 ,thu_sales thu_sales1 ,fri_sales fri_sales1 ,sat_sales sat_sales1 from wswscs,date_dim where date_dim.d_week_seq = wswscs.d_week_seq and d_year = 1999) y, (select wswscs.d_week_seq d_week_seq2 ,sun_sales sun_sales2 ,mon_sales mon_sales2 ,tue_sales tue_sales2 ,wed_sales wed_sales2 ,thu_sales thu_sales2 ,fri_sales fri_sales2 ,sat_sales sat_sales2 from wswscs ,date_dim where date_dim.d_week_seq = wswscs.d_week_seq and d_year = 1999+1) z where d_week_seq1=d_week_seq2-53 order by d_week_seq1;
  • 导入SSB数据 执行以下命令导入数据。 1 2 3 4 5 6 7 8 910111213141516171819202122232425262728293031323334353637383940414243444546474849 INSERT INTO customer SELECT * FROM customer_load;INSERT INTO supplier SELECT * FROM supplier_load;INSERT INTO part SELECT * FROM part_load;INSERT INTO lineorder SELECT * FROM lineorder_load; INSERT INTO lineorder_flatSELECT l.LO_ORDERKEY AS LO_ORDERKEY, l.LO_LINENUMBER AS LO_LINENUMBER, l.LO_CUSTKEY AS LO_CUSTKEY, l.LO_PARTKEY AS LO_PARTKEY, l.LO_SUPPKEY AS LO_SUPPKEY, l.LO_ORDERDATE AS LO_ORDERDATE, l.LO_ORDERPRIORITY AS LO_ORDERPRIORITY, l.LO_SHIPPRIORITY AS LO_SHIPPRIORITY, l.LO_QUANTITY AS LO_QUANTITY, l.LO_EXTENDEDPRICE AS LO_EXTENDEDPRICE, l.LO_ORDTOTALPRICE AS LO_ORDTOTALPRICE, l.LO_DISCOUNT AS LO_DISCOUNT, l.LO_REVENUE AS LO_REVENUE, l.LO_SUPPLYCOST AS LO_SUPPLYCOST, l.LO_TAX AS LO_TAX, l.LO_COMMITDATE AS LO_COMMITDATE, l.LO_SHIPMODE AS LO_SHIPMODE, c.C_NAME AS C_NAME, c.C_ADDRESS AS C_ADDRESS, c.C_CITY AS C_CITY, c.C_NATION AS C_NATION, c.C_REGION AS C_REGION, c.C_PHONE AS C_PHONE, c.C_MKTSEGMENT AS C_MKTSEGMENT, s.S_NAME AS S_NAME, s.S_ADDRESS AS S_ADDRESS, s.S_CITY AS S_CITY, s.S_NATION AS S_NATION, s.S_REGION AS S_REGION, s.S_PHONE AS S_PHONE, p.P_NAME AS P_NAME, p.P_MFGR AS P_MFGR, p.P_CATEGORY AS P_CATEGORY, p.P_BRAND AS P_BRAND, p.P_COLOR AS P_COLOR, p.P_TYPE AS P_TYPE, p.P_SIZE AS P_SIZE, p.P_CONTAINER AS P_CONTAINERFROM lineorder AS lINNER JOIN customer AS c ON c.C_CUSTKEY = l.LO_CUSTKEYINNER JOIN supplier AS s ON s.S_SUPPKEY = l.LO_SUPPKEYINNER JOIN part AS p ON p.P_PARTKEY = l.LO_PARTKEY;
  • 创建SSB数据集的GDS外表 连接DWS数据库后执行以下SQL语句。 以下每个外表的“gsfs://192.168.0.90:500x/xxx | gsfs://192.168.0.90:500x/xxx”中的IP地址和端口,请替换成安装和启动GDS中的对应的GDS的监听IP和端口。如启动两个GDS,则使用“|”区分。如果启动多个GDS,需要将所有GDS的监听IP和端口配置到外表中。 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384 DROP FOREIGN TABLE IF EXISTS customer_load;CREATE FOREIGN TABLE customer_load( C_CUSTKEY BIGINT NOT NULL , C_NAME VARCHAR(25) NOT NULL , C_ADDRESS VARCHAR(40) NOT NULL , C_CITY VARCHAR(25) NOT NULL , C_NATION VARCHAR(25) NOT NULL , C_REGION VARCHAR(25) NOT NULL , C_PHONE VARCHAR(15) NOT NULL , C_MKTSEGMENT VARCHAR(10) NOT NULL)SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5004/customer.tbl*',format 'text',delimiter '|',encoding 'utf8',mode 'Normal'); DROP FOREIGN TABLE IF EXISTS supplier_load;CREATE FOREIGN TABLE supplier_load( S_SUPPKEY BIGINT NOT NULL , S_NAME VARCHAR(25) NOT NULL , S_ADDRESS VARCHAR(40) NOT NULL , S_CITY VARCHAR(25) NOT NULL , S_NATION VARCHAR(25) NOT NULL , S_REGION VARCHAR(25) NOT NULL , S_PHONE VARCHAR(15) NOT NULL)SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5004/supplier.tbl*',format 'text',delimiter '|',encoding 'utf8',mode 'Normal'); DROP FOREIGN TABLE IF EXISTS part_load;CREATE FOREIGN TABLE part_load( P_PARTKEY BIGINT NOT NULL , P_NAME VARCHAR(55) NOT NULL , P_MFGR VARCHAR(25) NOT NULL , P_CATEGORY VARCHAR(25) NOT NULL , P_BRAND VARCHAR(10) NOT NULL , P_COLOR VARCHAR(20) NOT NULL , P_TYPE VARCHAR(25) NOT NULL , P_SIZE BIGINT NOT NULL , P_CONTAINER VARCHAR(10) NOT NULL)SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5004/part.tbl*',format 'text',delimiter '|',encoding 'utf8',mode 'Normal'); DROP FOREIGN TABLE IF EXISTS lineorder_load;CREATE FOREIGN TABLE lineorder_load( LO_ORDERKEY BIGINT NOT NULL, LO_LINENUMBER BIGINT NOT NULL, LO_CUSTKEY BIGINT NOT NULL, LO_PARTKEY BIGINT NOT NULL, LO_SUPPKEY BIGINT NOT NULL, LO_ORDERDATE DATE NOT NULL, LO_ORDERPRIORITY VARCHAR(15) NOT NULL, LO_SHIPPRIORITY BIGINT NOT NULL, LO_QUANTITY BIGINT NOT NULL, LO_EXTENDEDPRICE BIGINT NOT NULL, LO_ORDTOTALPRICE BIGINT NOT NULL, LO_DISCOUNT BIGINT NOT NULL, LO_REVENUE BIGINT NOT NULL, LO_SUPPLYCOST BIGINT NOT NULL, LO_TAX BIGINT NOT NULL, LO_COMMITDATE DATE NOT NULL, LO_SHIPMODE VARCHAR(10) NOT NULL)SERVER gsmpp_serverOPTIONS(location 'gsfs://192.168.0.90:5004/lineorder.tbl*',format 'text',delimiter '|',encoding 'utf8',mode 'Normal');
  • 创建SSB目标表 连接DWS数据库后执行以下SQL语句。 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168 CREATE TABLE CUSTOMER(C_CUSTKEY BIGINT NOT NULL,C_NAME VARCHAR(25) NOT NULL,C_ADDRESS VARCHAR(40) NOT NULL,C_CITY VARCHAR(25) NOT NULL,C_NATION VARCHAR(25) NOT NULL,C_REGION VARCHAR(25) NOT NULL,C_PHONE VARCHAR(15) NOT NULL,C_MKTSEGMENT VARCHAR(10) NOT NULL)WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true)DISTRIBUTE BY hash(C_CUSTKEY) ;CREATE TABLE SUPPLIER(S_SUPPKEY BIGINT NOT NULL, S_NAME VARCHAR(25) NOT NULL, S_ADDRESS VARCHAR(40) NOT NULL, S_CITY VARCHAR(25) NOT NULL, S_NATION VARCHAR(25) NOT NULL, S_REGION VARCHAR(25) NOT NULL, S_PHONE VARCHAR(15) NOT NULL)WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true)DISTRIBUTE BY hash(S_SUPPKEY);CREATE TABLE PART(P_PARTKEY BIGINT NOT NULL, P_NAME VARCHAR(55) NOT NULL, P_MFGR VARCHAR(25) NOT NULL, P_CATEGORY VARCHAR(25) NOT NULL, P_BRAND VARCHAR(10) NOT NULL, P_COLOR VARCHAR(20) NOT NULL, P_TYPE VARCHAR(25) NOT NULL, P_SIZE BIGINT NOT NULL, P_CONTAINER VARCHAR(10) NOT NULL)WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true)DISTRIBUTE BY hash(P_PARTKEY);CREATE TABLE lineorder(LO_ORDERKEY BIGINT NOT NULL,LO_LINENUMBER BIGINT NOT NULL,LO_CUSTKEY BIGINT NOT NULL,LO_PARTKEY BIGINT NOT NULL,LO_SUPPKEY BIGINT NOT NULL,LO_ORDERDATE DATE NOT NULL,LO_ORDERPRIORITY VARCHAR(15) NOT NULL,LO_SHIPPRIORITY BIGINT NOT NULL,LO_QUANTITY BIGINT NOT NULL,LO_EXTENDEDPRICE BIGINT NOT NULL,LO_ORDTOTALPRICE BIGINT NOT NULL,LO_DISCOUNT BIGINT NOT NULL,LO_REVENUE BIGINT NOT NULL,LO_SUPPLYCOST BIGINT NOT NULL,LO_TAX BIGINT NOT NULL,LO_COMMITDATE DATE NOT NULL,LO_SHIPMODE VARCHAR(10) NOT NULL)WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true)DISTRIBUTE BY hash(LO_ORDERKEY)PARTITION BY RANGE(LO_ORDERDATE)(PARTITION LO_ORDERDATE_1 VALUES LESS THAN('1992-04-01 00:00:00'),PARTITION LO_ORDERDATE_2 VALUES LESS THAN('1992-07-01 00:00:00'),PARTITION LO_ORDERDATE_3 VALUES LESS THAN('1992-10-01 00:00:00'),PARTITION LO_ORDERDATE_4 VALUES LESS THAN('1993-01-01 00:00:00'),PARTITION LO_ORDERDATE_5 VALUES LESS THAN('1993-04-01 00:00:00'),PARTITION LO_ORDERDATE_6 VALUES LESS THAN('1993-07-01 00:00:00'),PARTITION LO_ORDERDATE_7 VALUES LESS THAN('1993-10-01 00:00:00'),PARTITION LO_ORDERDATE_8 VALUES LESS THAN('1994-01-01 00:00:00'),PARTITION LO_ORDERDATE_9 VALUES LESS THAN('1994-04-01 00:00:00'),PARTITION LO_ORDERDATE_10 VALUES LESS THAN('1994-07-01 00:00:00'),PARTITION LO_ORDERDATE_11 VALUES LESS THAN('1994-10-01 00:00:00'),PARTITION LO_ORDERDATE_12 VALUES LESS THAN('1995-01-01 00:00:00'),PARTITION LO_ORDERDATE_13 VALUES LESS THAN('1995-04-01 00:00:00'),PARTITION LO_ORDERDATE_14 VALUES LESS THAN('1995-07-01 00:00:00'),PARTITION LO_ORDERDATE_15 VALUES LESS THAN('1995-10-01 00:00:00'),PARTITION LO_ORDERDATE_16 VALUES LESS THAN('1996-01-01 00:00:00'),PARTITION LO_ORDERDATE_17 VALUES LESS THAN('1996-04-01 00:00:00'),PARTITION LO_ORDERDATE_18 VALUES LESS THAN('1996-07-01 00:00:00'),PARTITION LO_ORDERDATE_19 VALUES LESS THAN('1996-10-01 00:00:00'),PARTITION LO_ORDERDATE_20 VALUES LESS THAN('1997-01-01 00:00:00'),PARTITION LO_ORDERDATE_21 VALUES LESS THAN('1997-04-01 00:00:00'),PARTITION LO_ORDERDATE_22 VALUES LESS THAN('1997-07-01 00:00:00'),PARTITION LO_ORDERDATE_23 VALUES LESS THAN('1997-10-01 00:00:00'),PARTITION LO_ORDERDATE_24 VALUES LESS THAN('1998-01-01 00:00:00'),PARTITION LO_ORDERDATE_25 VALUES LESS THAN('1998-04-01 00:00:00'),PARTITION LO_ORDERDATE_26 VALUES LESS THAN('1998-07-01 00:00:00'),PARTITION LO_ORDERDATE_27 VALUES LESS THAN('1998-10-01 00:00:00'),PARTITION LO_ORDERDATE_28 VALUES LESS THAN('1999-01-01 00:00:00'));SET enable_hstoreopt_auto_bitmap=true;CREATE TABLE lineorder_flat(LO_ORDERKEY BIGINT NOT NULL,LO_LINENUMBER BIGINT NOT NULL,LO_CUSTKEY BIGINT NOT NULL,LO_PARTKEY BIGINT NOT NULL,LO_SUPPKEY BIGINT NOT NULL,LO_ORDERDATE DATE NOT NULL,LO_ORDERPRIORITY VARCHAR(15) NOT NULL,LO_SHIPPRIORITY BIGINT NOT NULL,LO_QUANTITY BIGINT NOT NULL,LO_EXTENDEDPRICE BIGINT NOT NULL,LO_ORDTOTALPRICE BIGINT NOT NULL,LO_DISCOUNT BIGINT NOT NULL,LO_REVENUE BIGINT NOT NULL,LO_SUPPLYCOST BIGINT NOT NULL,LO_TAX BIGINT NOT NULL,LO_COMMITDATE DATE NOT NULL,LO_SHIPMODE VARCHAR(10) NOT NULL,C_NAME VARCHAR(25) NOT NULL, C_ADDRESS VARCHAR(40) NOT NULL, C_CITY VARCHAR(25) NOT NULL, C_NATION VARCHAR(25) NOT NULL, C_REGION VARCHAR(25) NOT NULL, C_PHONE VARCHAR(15) NOT NULL, C_MKTSEGMENT VARCHAR(10) NOT NULL, S_NAME VARCHAR(25) NOT NULL, S_ADDRESS VARCHAR(40) NOT NULL, S_CITY VARCHAR(25) NOT NULL, S_NATION VARCHAR(25) NOT NULL, S_REGION VARCHAR(25) NOT NULL, S_PHONE VARCHAR(15) NOT NULL, P_NAME VARCHAR(55) NOT NULL, P_MFGR VARCHAR(25) NOT NULL, P_CATEGORY VARCHAR(25) NOT NULL, P_BRAND VARCHAR(10) NOT NULL, P_COLOR VARCHAR(20) NOT NULL, P_TYPE VARCHAR(25) NOT NULL, P_SIZE BIGINT NOT NULL, P_CONTAINER VARCHAR(10) NOT NULL, Partial Cluster Key(s_region,s_nation,s_city)) WITH (orientation=column, colversion=2.0,enable_hstore=true,enable_hstore_opt=true,secondary_part_column='p_mfgr', secondary_part_num=8)DISTRIBUTE BY hash(LO_ORDERKEY)PARTITION BY RANGE(LO_ORDERDATE)(PARTITION LO_ORDERDATE_1 VALUES LESS THAN('1992-04-01 00:00:00'),PARTITION LO_ORDERDATE_2 VALUES LESS THAN('1992-07-01 00:00:00'),PARTITION LO_ORDERDATE_3 VALUES LESS THAN('1992-10-01 00:00:00'),PARTITION LO_ORDERDATE_4 VALUES LESS THAN('1993-01-01 00:00:00'),PARTITION LO_ORDERDATE_5 VALUES LESS THAN('1993-04-01 00:00:00'),PARTITION LO_ORDERDATE_6 VALUES LESS THAN('1993-07-01 00:00:00'),PARTITION LO_ORDERDATE_7 VALUES LESS THAN('1993-10-01 00:00:00'),PARTITION LO_ORDERDATE_8 VALUES LESS THAN('1994-01-01 00:00:00'),PARTITION LO_ORDERDATE_9 VALUES LESS THAN('1994-04-01 00:00:00'),PARTITION LO_ORDERDATE_10 VALUES LESS THAN('1994-07-01 00:00:00'),PARTITION LO_ORDERDATE_11 VALUES LESS THAN('1994-10-01 00:00:00'),PARTITION LO_ORDERDATE_12 VALUES LESS THAN('1995-01-01 00:00:00'),PARTITION LO_ORDERDATE_13 VALUES LESS THAN('1995-04-01 00:00:00'),PARTITION LO_ORDERDATE_14 VALUES LESS THAN('1995-07-01 00:00:00'),PARTITION LO_ORDERDATE_15 VALUES LESS THAN('1995-10-01 00:00:00'),PARTITION LO_ORDERDATE_16 VALUES LESS THAN('1996-01-01 00:00:00'),PARTITION LO_ORDERDATE_17 VALUES LESS THAN('1996-04-01 00:00:00'),PARTITION LO_ORDERDATE_18 VALUES LESS THAN('1996-07-01 00:00:00'),PARTITION LO_ORDERDATE_19 VALUES LESS THAN('1996-10-01 00:00:00'),PARTITION LO_ORDERDATE_20 VALUES LESS THAN('1997-01-01 00:00:00'),PARTITION LO_ORDERDATE_21 VALUES LESS THAN('1997-04-01 00:00:00'),PARTITION LO_ORDERDATE_22 VALUES LESS THAN('1997-07-01 00:00:00'),PARTITION LO_ORDERDATE_23 VALUES LESS THAN('1997-10-01 00:00:00'),PARTITION LO_ORDERDATE_24 VALUES LESS THAN('1998-01-01 00:00:00'),PARTITION LO_ORDERDATE_25 VALUES LESS THAN('1998-04-01 00:00:00'),PARTITION LO_ORDERDATE_26 VALUES LESS THAN('1998-07-01 00:00:00'),PARTITION LO_ORDERDATE_27 VALUES LESS THAN('1998-10-01 00:00:00'),PARTITION LO_ORDERDATE_28 VALUES LESS THAN('1999-01-01 00:00:00')) ;SET enable_hstoreopt_auto_bitmap=false;
  • API接口返回的中文字符为乱码,如何解决? 当API接口返回的中文字符出现乱码时,通常是因为字符编码格式不匹配。 DLI 接口返回的结果编码格式为“UTF-8”,在调用接口获取返回结果时需要对返回的信息编码转换为“UTF-8”。 例如,参考如下实现对返回的response.content内容做编码格式转换,确保返回的中文格式不会乱码。 print(response.content.decode("utf-8")) 父主题: DLI API类
共100000条
提示

您即将访问非华为云网站,请注意账号财产安全