Prerequisites
You have uploaded the program packages and data files required by jobs to OBS or HDFS. If the job program needs to read and analyze data in the OBS file system, you need to configure storage-compute decoupling for the MRS cluster.
MRS allows you to store data in OBS and use an MRS cluster for data computing only, so that storage and compute are decoupled. For details about the components supported by each MRS version, see List of MRS Component Versions.
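As a quick illustration of the upload prerequisite above, the following minimal sketch copies a local program package to OBS through the Hadoop FileSystem API. It assumes the hadoop-huaweicloud (OBSA) connector is on the classpath and that OBS credentials are already configured for the cluster; the bucket name obs-demo-bucket and both file paths are placeholders.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadProgramPackage {
    public static void main(String[] args) throws Exception {
        // Cluster-side Hadoop configuration; OBS credentials are assumed to be
        // configured already (for example, through an MRS agency).
        Configuration conf = new Configuration();

        // "obs-demo-bucket" and both paths are placeholders for this sketch.
        try (FileSystem obs = FileSystem.get(URI.create("obs://obs-demo-bucket"), conf)) {
            obs.copyFromLocalFile(
                    new Path("/opt/client/jobs/wordcount.jar"),   // local program package
                    new Path("obs://obs-demo-bucket/program/"));  // OBS target directory
        }
    }
}
```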
How Do I Migrate Data from OBS/S3 to ClickHouse?
An Error Is Reported in Logs When the Auxiliary ZooKeeper or Replica Data Is Used to Synchronize Table Data
How Do I Grant the Select Permission at the Database Level to ClickHouse Users? (see the sketch after this list)
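For the database-level Select permission question above, ClickHouse accepts a grant of the form GRANT SELECT ON <db>.* TO <user>. The sketch below issues it over JDBC; the host, port, database demo_db, and user analyst are placeholders, and it assumes the ClickHouse JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class GrantDatabaseSelect {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders; requires the ClickHouse JDBC driver.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:clickhouse://clickhouse-host:8123/default", "admin", "admin-password");
             Statement stmt = conn.createStatement()) {
            // Database-level grant: SELECT on every table in demo_db.
            stmt.execute("GRANT SELECT ON demo_db.* TO analyst");
        }
    }
}
```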
Import Data from HBase to HDFS
A Data Format Error Is Reported When Data Is Exported from Hive to MySQL 8.0 Using Sqoop
An Error Is Reported When the sqoop import Command Is Executed to Extract Data from PgSQL to Hive
Failed to Use Sqoop to Read MySQL Data and Write Parquet Files to OBS
Application Development
How Do I Get My Data into OBS or HDFS?
Can MRS Write Data to HBase Through an HBase External Table of Hive?
Where Can I Download the Dependency Package (com.huawei.gaussc10) in the Hive Sample Project?
Does MRS Support Python Code?
Parameter Description
Table 1 HDFS parameters
fs.obs.security.provider: implementation method of obtaining the key for accessing the OBS file system. Value options are as follows:
com.huawei.mrs.MrsObsCredentialsProvider: obtains a credential through ...
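To show where this parameter takes effect, here is a minimal client-side sketch that sets the credential provider from Table 1 and reads an object from OBS through the Hadoop FileSystem API. It assumes the hadoop-huaweicloud (OBSA) connector is available; the bucket and object names are placeholders.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFromObs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Use the credential provider from Table 1 to obtain the OBS access key.
        conf.set("fs.obs.security.provider", "com.huawei.mrs.MrsObsCredentialsProvider");

        // "obs-demo-bucket/input/sample.txt" is a placeholder object.
        try (FileSystem fs = FileSystem.get(URI.create("obs://obs-demo-bucket"), conf);
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     fs.open(new Path("obs://obs-demo-bucket/input/sample.txt")),
                     StandardCharsets.UTF_8))) {
            System.out.println(in.readLine());
        }
    }
}
```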
Link Configurations of Loader Jobs
Destination Link Configurations of Loader Jobs
Managing Loader Jobs
Preparing a Driver for MySQL Database Link
Importing Data
Exporting Data
Managing Jobs
Operator Help
Client Tools
Loader Log Overview
Example: Using Loader to Import Data from OBS
If HBase data is stored in OBS, data backup is not supported. To back up data to OBS, ensure that the current cluster has been connected to OBS and that you have permission to access OBS.
File Format
Loader supports the following file formats for data stored in OBS:
CSV_FILE: specifies a text file. When the destination link is a database link, only text files are supported.
BINARY_FILE: specifies binary files, excluding text files.
Backup destinations: LocalDir, LocalHDFS, RemoteHDFS, NFS, CIFS, SFTP, OBS
Flink (applicable to MRS 3.2.0 and later versions): Flink metadata. Backup destinations: LocalDir, LocalHDFS, RemoteHDFS, OBS (available in MRS 3.5.0 and later)
Kafka: Kafka metadata. Backup destinations: LocalDir, LocalHDFS, RemoteHDFS, NFS, CIFS, OBS
NameNode: HDFS metadata.
Enable OBS Local Cache
OBS provides a local cache that improves read speed. For example, you can configure a 100 GB local cache on a single disk with data_cache=/srv/BigData/data1/impala:100GB.
OBS Permission Control: mapping between MRS users and OBS permissions.
Data Connection: type of the data connection associated with the cluster.
Agency: agency bound to or modified for the cluster.
Key Pair: name of the key pair, which is set during cluster creation.
What Should I Do If Data Failed to Be Synchronized to a Hive Table on OBS Using hive-table?
What Should I Do If Data Failed to Be Synchronized to an ORC or Parquet Table Using hive-table?
What Should I Do If Data Failed to Be Synchronized Using hive-table?
The OBS program path should start with obs://, for example, obs://wordcount/program/XXX.jar. The HDFS program path should start with hdfs://, for example, hdfs://hacluster/user/XXX.jar.
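As an illustration only (not part of the MRS API), the following hypothetical helper checks that a submitted program path uses one of the two schemes described above; the file names are placeholders.

```java
import java.net.URI;

public class ProgramPathCheck {
    /** Returns true if the path uses the obs:// or hdfs:// scheme described above. */
    static boolean isValidProgramPath(String path) {
        String scheme = URI.create(path).getScheme();
        return "obs".equals(scheme) || "hdfs".equals(scheme);
    }

    public static void main(String[] args) {
        System.out.println(isValidProgramPath("obs://wordcount/program/demo.jar")); // true
        System.out.println(isValidProgramPath("hdfs://hacluster/user/demo.jar"));   // true
        System.out.println(isValidProgramPath("/tmp/demo.jar"));                    // false
    }
}
```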
OBS: indicates that backup files are stored in an OBS directory. This option is available only for MRS 3.3.0-LTS.1 or later.
Target Path: indicates the OBS directory for storing backup data.
If a low-permission user lacks access permission on the OBS path of the default database, Spark displays a permission error message but still creates the database successfully.
In the environmental protection industry, climate data is stored on OBS and periodically dumped into HDFS for batch analysis. 10 TB of climate data can be analyzed in 1 hour.
It provides a data abstraction layer for computing frameworks including Apache Spark, Presto, MapReduce, and Apache Hive, so that upper-layer computing applications can access persistent storage systems including HDFS and OBS through unified client APIs and a global namespace.
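A minimal sketch of that unified access path, assuming the Alluxio Java client is on the classpath: the application addresses a file through the Alluxio namespace and does not need to know whether the file is persisted in HDFS, OBS, or another mounted store. The path /data/sample.txt is a placeholder.

```java
import alluxio.AlluxioURI;
import alluxio.client.file.FileInStream;
import alluxio.client.file.FileSystem;

public class AlluxioUnifiedRead {
    public static void main(String[] args) throws Exception {
        // One client API and one global namespace, regardless of the backing store.
        FileSystem fs = FileSystem.Factory.get();

        // "/data/sample.txt" is a placeholder path in the Alluxio namespace; the
        // under-storage mounted behind it may be HDFS, OBS, or another system.
        try (FileInStream in = fs.openFile(new AlluxioURI("/data/sample.txt"))) {
            byte[] buf = new byte[128];
            int n = in.read(buf);
            System.out.println(n > 0 ? new String(buf, 0, n) : "");
        }
    }
}
```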
Set this parameter to a valid OBS path. Files or programs encrypted by KMS are not supported in the OBS path. The value can contain a maximum of 1,023 characters, cannot contain special characters (;|&>'<$), and can be left blank.
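To make the constraint concrete, here is an illustrative validation sketch (a hypothetical helper, not an MRS API) that enforces the length limit and forbidden characters listed above.

```java
public class ObsPathRule {
    private static final String FORBIDDEN = ";|&>'<$";

    /** Illustrative check of the constraints above: the value is optional,
     *  at most 1,023 characters, and contains no forbidden special characters. */
    static boolean isAcceptable(String value) {
        if (value == null || value.isEmpty()) {
            return true; // the value can be left blank
        }
        if (value.length() > 1023) {
            return false;
        }
        for (char c : value.toCharArray()) {
            if (FORBIDDEN.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("obs://demo-bucket/program/demo.jar")); // true
        System.out.println(isAcceptable("obs://demo-bucket/a;b"));              // false
    }
}
```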