Submitting a Spark job Upload the Java code file to the OBS bucket. In the Spark job editor, select the corresponding dependency module and execute the Spark job.
Click the name of the corresponding Flink job, choose Run Log, click OBS Bucket, and locate the folder of the log you want to view according to the date. Go to the folder of the date, find the folder whose name contains taskmanager, download the .out file, and view result logs.
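Once the object keys under the log bucket have been listed, the manual lookup described above can be sketched as a small filter. This is a minimal illustration only: the key layout (a date folder followed by a folder whose name contains taskmanager) is taken from the description above, while the function name, the sample keys, and the date format are hypothetical.

```python
from typing import Iterable, List


def find_taskmanager_out_files(keys: Iterable[str], date_folder: str) -> List[str]:
    """Return keys under date_folder that sit in a taskmanager folder and end in .out."""
    matches = []
    for key in keys:
        parts = key.split("/")
        if date_folder not in parts:
            continue  # not under the requested date folder
        # Path components that follow the date folder.
        after_date = parts[parts.index(date_folder) + 1:]
        if not after_date:
            continue
        # Some folder below the date folder must contain "taskmanager".
        in_taskmanager_folder = any("taskmanager" in p for p in after_date[:-1])
        if in_taskmanager_folder and after_date[-1].endswith(".out"):
            matches.append(key)
    return sorted(matches)


# Hypothetical object keys, for illustration only.
sample_keys = [
    "jobs/logs/2024-05-01/job_1_taskmanager_0/taskmanager.out",
    "jobs/logs/2024-05-01/job_1_jobmanager/jobmanager.out",
    "jobs/logs/2024-05-02/job_1_taskmanager_0/taskmanager.out",
    "jobs/logs/2024-05-01/job_1_taskmanager_0/taskmanager.log",
]
out_files = find_taskmanager_out_files(sample_keys, "2024-05-01")
```

The filter only narrows down candidates; the .out file still has to be downloaded from OBS to view the result logs.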
Configure Path for Partition: This permission allows you to set the path of a partition in a partition table to a specified OBS path. Rename Table Partition: This permission allows you to rename partitions in a partition table.
Select Save Job Log, and specify the OBS bucket for saving job logs. Change the values of the parameters in bold as needed in the following script.
DLI Spark Job The DLI team has extensively optimized and transformed open-source Spark to provide batch processing capabilities while remaining compatible with the Apache Spark
file="obs://your_bucket/your_spark_app.jar", # (Mandatory) Location of the JAR file on OBS. class_name="your_class_fullname", # (Mandatory) Class path of the main class ( --class), for example, org.example.DliCatalogTest.
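Because both parameters are mandatory, it can help to check them before submitting the job. The sketch below is a hypothetical local pre-check, not part of the DLI SDK: the function name and the validation rules (an obs:// prefix, a .jar suffix, and a dotted Java-style class name) are assumptions drawn from the comments above.

```python
import re
from typing import List

# Java-style identifier segments separated by dots, e.g. org.example.DliCatalogTest.
_CLASS_NAME = re.compile(r"^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$")


def check_spark_job_params(file: str, class_name: str) -> List[str]:
    """Return a list of problems found in the two mandatory parameters (empty if none)."""
    problems = []
    if not file.startswith("obs://"):
        problems.append("file must be an OBS path starting with obs://")
    elif not file.endswith(".jar"):
        problems.append("file should point to a .jar package")
    if not _CLASS_NAME.match(class_name):
        problems.append("class_name must be a fully qualified class name")
    return problems


# An empty list means the parameters passed this local check.
ok = check_spark_job_params("obs://your_bucket/your_spark_app.jar",
                            "org.example.DliCatalogTest")
bad = check_spark_job_params("/tmp/your_spark_app.jar", "org..Bad")
```

A check like this catches only obvious mistakes; whether the JAR file and the main class actually exist is determined when the job runs.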
Submitting a Spark job Upload the Python code file to the OBS bucket. In the Spark job editor, select the corresponding dependency module and execute the Spark job. After the Spark job is created, click Execute in the upper right corner of the console to submit the job.
Submitting a Spark Job Upload the Python code file to the OBS bucket. In the Spark job editor, select the corresponding dependency module and execute the Spark job. For Spark 2.3.2 (soon to be taken offline) or 2.4.5, set Module to sys.datasource.dws when submitting a job.
person')".stripMargin) Insert data: sparkSession.sql("INSERT INTO TABLE person VALUES ('John', 30),('Peter', 45)".stripMargin) Query data: sparkSession.sql("SELECT * FROM person".stripMargin).collect().foreach(println) Submitting a Spark job Upload the Python code file to the OBS
Insert data: sparkSession.sql("insert into test_dds values('3', 'Ann',23)") Query data: sparkSession.sql("select * from test_dds").show() Submitting a Spark job Upload the Python code file to the OBS bucket.
If this parameter is not required, set it to false or leave it blank (the default value is false). compression: If the created OBS table needs to be compressed, you can use the keyword compression to configure the compression format.
ctopentsdb" map("tags") = "city,location" map("Host") = "opentsdb-3xcl8dir15m58z3.cloudtable.com:4242" sparkSession.read.format("opentsdb").options(map.toMap).load().show() Response Submitting a Spark job Generate a JAR file based on the code file and upload the JAR file to the OBS
Create a Hive OBS external table using Spark SQL and insert data.
url", url) .option("uri", uri) .option("database", database) .option("collection", collection) .option("user", user) .option("password", password) .load() Operation result Submitting a Spark job Generate a JAR file based on the code file and upload the JAR file to the OBS
Submitting a Spark job Upload the Python code file to the OBS bucket. (Optional) Add the krb5.conf and user.keytab files to other dependency files of the job when creating a Spark job in an MRS cluster with Kerberos authentication enabled.
Select Save Job Log, and specify the OBS bucket for saving job logs. Storing authentication credentials such as usernames and passwords in code or plaintext poses significant security risks. You are advised to use DEW to manage credentials instead.
Flink jobs can directly access DIS, OBS, and SMN data sources without using datasource connections. When compute resources in non-elastic resource pools are used, enhanced datasource connections can only be created for yearly/monthly and pay-per-use dedicated queues.
Sample code: Prepare data: create table test_null2(str1 string,str2 string,str3 string,str4 string); insert into test_null2 select "a\rb", null, "1\n2", "ab"; Execute SQL: SELECT * FROM test_null2; Spark 2.4.5 a b 1 2 ab Spark 3.3.1 a b 1 2 ab Export query results to OBS and check
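The sample row contains an embedded carriage return (\r) and line feed (\n), so one record can span several physical lines once it is written to a text file. The sketch below uses Python's standard csv module, not DLI's export, to show how such values behave in a CSV file. Mapping the null column to an empty string is an assumption made for illustration.

```python
import csv
import io

# Same values as the sample insert: "a\rb", null, "1\n2", "ab".
row = ["a\rb", None, "1\n2", "ab"]

# newline="" leaves line endings untouched, as the csv module documentation advises.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["" if value is None else value for value in row])
exported = buffer.getvalue()

# Reading the text back: quoted fields keep their embedded \r and \n.
buffer.seek(0)
round_tripped = next(csv.reader(buffer))

# A plain line split breaks the single record into several physical lines.
physical_lines = exported.splitlines()
```

A tool that splits the exported file on line breaks without honoring quotes would see more lines than records, so it is worth checking exported results that contain such characters.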