Huawei Cloud User Manual

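  • URI
      DELETE /v1/{project_id}/instances/{instance_id}/databases/{ddm_dbname}
    Table 1 Path parameters
      Parameter | Mandatory | Type | Description
      project_id | Yes | String | Project ID. To obtain it, see "Obtaining a Project ID".
      instance_id | Yes | String | DDM instance ID.
      ddm_dbname | Yes | String | Name of the schema to delete. Case-insensitive.
    Table 2 Query parameters
      Parameter | Mandatory | Type | Description
      delete_rds_data | No | String | Whether to also delete the data stored on the associated backend database instances. "true": delete the data. Empty or "false": keep the data. Default: empty. Enumerated values: true, false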
  • Example responses
    Status code: 200
      OK
      { }
    Status code: 400
      bad request
      { "externalMessage" : "Parameter error.", "errCode" : "DBS.280001" }
    Status code: 500
      server error
      { "externalMessage" : "Server failure.", "errCode" : "DBS.200412" }
  • Example requests
    Delete a DDM schema and the data stored on the associated backend database instances.
      DELETE https://{endpoint}/v1/{project_id}/instances/{instance_id}/databases/{ddm_dbname}?delete_rds_data=true
    Delete a DDM schema but keep the data stored on the associated backend database instances.
      DELETE https://{endpoint}/v1/{project_id}/instances/{instance_id}/databases/{ddm_dbname}?delete_rds_data=false
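The two request forms above differ only in the delete_rds_data query parameter. A minimal Python sketch of assembling the URL follows. The helper name build_delete_schema_url and the host ddm.example.com are hypothetical; authentication and the actual HTTP call are deliberately left out.

```python
from urllib.parse import quote, urlencode


def build_delete_schema_url(endpoint, project_id, instance_id, ddm_dbname,
                            delete_rds_data=None):
    """Build the DELETE URL for a DDM schema (hypothetical helper).

    delete_rds_data: True -> "true", False -> "false",
    None -> the query parameter is omitted (the documented default is empty).
    """
    path = "/v1/{}/instances/{}/databases/{}".format(
        quote(project_id, safe=""),
        quote(instance_id, safe=""),
        quote(ddm_dbname, safe=""),
    )
    url = "https://" + endpoint + path
    if delete_rds_data is not None:
        flag = "true" if delete_rds_data else "false"
        url += "?" + urlencode({"delete_rds_data": flag})
    return url


# "ddm.example.com", "p1", "i1" and "db_test" are placeholder values.
url = build_delete_schema_url("ddm.example.com", "p1", "i1", "db_test",
                              delete_rds_data=True)
print(url)
```

Passing None leaves the parameter out entirely, which per Table 2 behaves like "false" and keeps the backend data.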
  • Procedure
    Delete a DDM instance and the data stored on the associated RDS instances.
      API information
        URI format: DELETE /v1/{project_id}/instances/{instance_id}?delete_rds_data=true
        For details, see "Deleting a DDM Instance".
      Example request
        DELETE https://{endpoint}/v1/743b4c0428d945316666666666666666/instances/d0b008c1ee95479d8799710d9f3a4097in09?delete_rds_data=true
        Obtain {endpoint} from "Regions and Endpoints".
      Example response
        { "id":"d0b008c1ee95479d8799710d9f3a4097in09"}
    Delete a DDM instance but keep the data stored on the associated RDS instances.
      API information
        URI format: DELETE /v1/{project_id}/instances/{instance_id}?delete_rds_data=false
        For details, see "Deleting a DDM Instance".
      Example request
        DELETE https://{endpoint}/v1/743b4c0428d945316666666666666666/instances/d0b008c1ee95479d8799710d9f3a4097in09?delete_rds_data=false
        Obtain {endpoint} from "Regions and Endpoints".
      Example response
        { "id":"d0b008c1ee95479d8799710d9f3a4097in09"}
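  • Procedure
    Change the name of a DDM instance.
      API information
        URI format: PUT /v1/{project_id}/instances/{instance_id}/modify_name
        For details, see "Changing the Name of a DDM Instance".
      Example request
        PUT https://{endpoint}/v1/743b4c0428d945316666666666666666/instances/{instance_id}/modify_name
        {"name": "ddm-testaa"}
        Obtain {endpoint} from "Regions and Endpoints".
      Example response
        {"name":"ddm-testaa"}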
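The rename call is a PUT with a small JSON body. The sketch below builds such a request with the Python standard library without sending it. The helper name, the placeholder host and IDs, and the use of an X-Auth-Token header for authentication are assumptions that this section does not state.

```python
import json
from urllib.request import Request


def build_modify_name_request(endpoint, project_id, instance_id, new_name, token):
    """Build (but do not send) the PUT request that renames a DDM instance."""
    url = "https://{}/v1/{}/instances/{}/modify_name".format(
        endpoint, project_id, instance_id)
    body = json.dumps({"name": new_name}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Auth-Token": token,  # assumed authentication header
    }
    return Request(url, data=body, headers=headers, method="PUT")


# All argument values below are placeholders.
req = build_modify_name_request("ddm.example.com", "p1", "i1", "ddm-testaa", "TOKEN")
print(req.get_method(), req.full_url)
```

Sending the request would be a call to urllib.request.urlopen(req). It is omitted here because it needs a real endpoint and credentials.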
  • Procedure
    1. Query the DDM instance list with a given limit and offset.
      API information
        URI format: GET /v1/{project_id}/instances
        For details, see "Querying DDM Instances".
      Example request
        GET https://{endpoint}/v1/743b4c0428d945316666666666666666/instances?offset=0&limit=1
        Obtain {endpoint} from "Regions and Endpoints".
        Adjust limit according to the number of DDM instances.
      Example response
        {"instance_num":10,"instances":[{"id":"cab932b426ed4215a8d76b9d71322661in09","status":"RUNNING","name":"ddm-20-single-2u4g-1-202010231552401522260","created":"2020-10-23T07:52:46+0000","updated":"2020-10-23T07:59:56+0000","available_zone":"az1xahz","vpc_id":"9cf0f8f5-9748-4ebb-9905-bbe429182bd6","subnet_id":"b35a4be7-65a5-4176-bec9-7a437493c498","security_group_id":"9d10da6d-38cc-4cf0-8f96-c34940a3fd15","node_count":1,"access_ip":"192.168.60.13","access_port":"5066","core_count":"2","ram_capacity":"4","node_status":"RUNNING","enterprise_project_id":"0","project_id":"070c071d8e80d58c2f42c0121b10cf9f","engine_version":"2.5.10.10222119"}],"page_no":1,"page_size":1,"total_record":10,"total_page":10}
    2. Aggregate the query results.
      Repeat step 1 with successive offsets. When the returned DDM instance list is empty, or the response body contains no instances field, all DDM instances have been queried.
      The instances collected across all calls are the complete set of DDM instances matching the current query conditions.
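The query-and-aggregate procedure above can be sketched as a loop in Python. The function fetch_page is a hypothetical callable standing in for the GET request and returning the parsed response body. The sketch also assumes that offset counts records rather than pages. fake_fetch_page only simulates a service holding five instances.

```python
def list_all_instances(fetch_page, limit=10):
    """Collect every instance by paging until the list is empty or missing."""
    all_instances = []
    offset = 0
    while True:
        body = fetch_page(offset, limit)
        # Stop when "instances" is absent or empty, as the procedure describes.
        instances = body.get("instances") or []
        if not instances:
            break
        all_instances.extend(instances)
        offset += len(instances)  # assumption: offset is a record offset
    return all_instances


def fake_fetch_page(offset, limit):
    """Stand-in for the real API call; serves five fake instances."""
    data = [{"id": "inst-%d" % i} for i in range(5)]
    page = data[offset:offset + limit]
    if not page:
        return {"instance_num": len(data)}  # no "instances" field at the end
    return {"instance_num": len(data), "instances": page}


result = list_all_instances(fake_fetch_page, limit=2)
print([item["id"] for item in result])
```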
  • DDM schema management
    Table 1 DDM schema management
      Permission | API | Action | IAM Project | Enterprise Project
      Create a DDM schema | POST /v1/{project_id}/instances/{instance_id}/databases | ddm:database:create | √ | √
      Query the DDM schema list | GET /v1/{project_id}/instances/{instance_id}/databases?offset={offset}&limit={limit} | ddm:database:list | √ | √
      Query DDM schema details | GET /v1/{project_id}/instances/{instance_id}/databases/{ddm_dbname} | ddm:database:get | √ | √
      Delete a DDM schema | DELETE /v1/{project_id}/instances/{instance_id}/databases/{ddm_dbname}?delete_rds_data=true | ddm:database:delete | √ | √
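  • User role
    A user role is the role type a user holds in a room. Each role type has its own permission model. There are three role types:
      Publisher (publisher): a host-type role that only sends streams and does not receive them. This role type is reserved by SparkRTC.
      Interactive viewer (joiner): an interactive role that both sends and receives streams.
      Viewer (player): a viewing role that only receives streams.
    In the SparkRTC sample demo, role switching, going on or off stage, and taking or leaving the mic all mean switching between the joiner and player roles.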
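The permission model described above can be written down as a small lookup table. This is an illustrative Python sketch only: Role, can_send, can_receive and switch_role are hypothetical names and are not part of the SparkRTC SDK.

```python
from enum import Enum


class Role(Enum):
    PUBLISHER = "publisher"  # sends streams only (reserved role type)
    JOINER = "joiner"        # sends and receives streams
    PLAYER = "player"        # receives streams only


# (may send, may receive) for each role, as listed in the text above.
_PERMISSIONS = {
    Role.PUBLISHER: (True, False),
    Role.JOINER: (True, True),
    Role.PLAYER: (False, True),
}


def can_send(role):
    return _PERMISSIONS[role][0]


def can_receive(role):
    return _PERMISSIONS[role][1]


def switch_role(current):
    """Toggle between joiner and player, the only switch the demo performs."""
    if current is Role.JOINER:
        return Role.PLAYER
    if current is Role.PLAYER:
        return Role.JOINER
    raise ValueError("switching is only defined between joiner and player")


print(switch_role(Role.PLAYER), can_send(Role.PLAYER), can_send(Role.JOINER))
```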
  • Sample
      params = {
          "input_path": "",  # @param {"label":"input_path","type":"path","required":"true","helpTip":""}
          "line_separator": "\n",  # @param {"label":"line_separator","type":"string","required":"false","helpTip":""}
          "columns_str": "text_col"  # @param {"label":"columns_str","type":"string","required":"false","helpTip":""}
      }
      read_text_data____id___ = MLSReadTextData(**params)
      read_text_data____id___.run()
      # @output {"label":"dataframe","name":"read_text_data____id___.get_outputs()['output_port_1']","type":"DataFrame"}
  • Sample
      params = {
          "input_model_path": ""  # @param {"label":"input_model_path","type":"path","required":"true","helpTip":""}
      }
      read_model____id___ = MLSReadPipelineModel(**params)
      read_model____id___.run()
      # @output {"label":"pipeline_model","name":"read_model____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "output_file_path": "",  # @param {"label":"output_file_path","type":"path","required":"true","helpTip":""}
          "save_mode": "overwrite",  # @param {"label":"save_mode","type":"string","required":"true","helpTip":""}
          "has_header": True  # @param {"label":"has_header","type":"boolean","required":"true","helpTip":""}
      }
      save_data____id___ = MLSSaveData(**params)
      save_data____id___.run()
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "output_file_path": "",  # @param {"label":"output_file_path","type":"string","required":"true","helpTip":""}
          "encoding": "utf-8"  # @param {"label":"encoding","type":"string","required":"true","helpTip":""}
      }
      mls_save_data____id___ = MLSSaveDataToOBS(**params)
      mls_save_data____id___.run()
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "DLI_database": None,  # @param {"label":"DLI_database","type":"string","required":"true","helpTip":""}
          "DLI_table": None,  # @param {"label":"DLI_table","type":"string","required":"true","helpTip":""}
          "file_format": "parquet",  # @param {"label":"file_format","type":"enum","options":"orc,parquet,json,csv,carbon,avro","required":"true","helpTip":""}
          "mode": "overwrite",  # @param {"label":"mode","type":"enum","options":"overwrite,append","required":"true","helpTip":""}
          "OBS_path": ""  # @param {"label":"OBS_path","type":"string","required":"true","helpTip":""}
      }
      save_DLI_table____id___ = MLSSaveDLITable(**params)
      save_DLI_table____id___.run()
  • Sample
      inputs = {
          "pipeline_model": None  # @input {"label":"pipeline_model","type":"PipelineModel"}
      }
      params = {
          "inputs": inputs,
          "output_model_path": ""  # @param {"label":"output_model_path","type":"path","required":"true","helpTip":""}
      }
      save_model____id___ = MLSSavePipelineModel(**params)
      save_model____id___.run()
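  • Parameters
      Parameter | Sub-parameter | Description
      DLI_database | - | Name of the target DLI database.
      DLI_table | - | Name of the target DLI foreign table in the target DLI database, or of the DLI foreign table to create.
      file_format | - | Data format used by the DLI foreign table.
      mode | - | Write mode, append or overwrite. The default is overwrite. Because the PySpark insertInto function is used, both modes require the feature columns to match the table in number and order.
      OBS_path | - | OBS storage path of the target DLI foreign table.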
  • Sample
      inputs = {
          "pipeline_model": None  # @input {"label":"pipeline_model","type":"PipelineModel"}
      }
      params = {
          "inputs": inputs,
          "obs_model_path": ""  # @param {"label":"obs_model_path","type":"string","required":"true","helpTip":""}
      }
      mls_save_model____id___ = MLSSavePipelineModelToOBS(**params)
      mls_save_model____id___.run()
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "output_file_path": "",  # @param {"label":"output_file_path","type":"path","required":"true","helpTip":""}
          "save_mode": "overwrite"  # @param {"label":"save_mode","type":"string","required":"false","helpTip":""}
      }
      save_parquet_data____id___ = MLSSaveParquetData(**params)
      save_parquet_data____id___.run()
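  • Parameters
      Parameter | Sub-parameter | Description
      input_col | - | Name of the input column.
      output_col | - | Name of the output column holding the discretized values. Default: "quantile_discretizer_result".
      num_buckets | - | Number of buckets. Default: 2.
      handle_invalid | - | Strategy for handling invalid values. Supported values: skip, keep, error. Default: skip.
      relative_error | - | Relative error. Value range: [0, 1]. Default: 0.001.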
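To make num_buckets concrete, the sketch below discretizes a column by exact quantiles in plain Python. It illustrates the idea only and is not the operator's implementation: the operator presumably computes approximate quantiles governed by relative_error, and handle_invalid is not modeled. The function names are hypothetical.

```python
from bisect import bisect_right


def quantile_splits(values, num_buckets=2):
    """Return the inner split points that cut the data into num_buckets parts."""
    ordered = sorted(values)
    n = len(ordered)
    splits = []
    for i in range(1, num_buckets):
        rank = -(-i * n // num_buckets)  # ceil(i * n / num_buckets)
        splits.append(ordered[max(rank, 1) - 1])
    return sorted(set(splits))  # drop duplicate split points


def discretize(values, num_buckets=2):
    """Map each value to a bucket index; a value equal to a split goes up."""
    splits = quantile_splits(values, num_buckets)
    return [bisect_right(splits, v) for v in values]


buckets = discretize([1, 2, 3, 4, 5, 6, 7, 8], num_buckets=4)
print(buckets)
```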
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "outer_pipeline_stages": None,
          "input_col": "",  # @param {"label":"input_col","type":"string","required":"true","helpTip":""}
          "output_col": "quantile_discretizer_result",  # @param {"label":"output_col","type":"string","required":"true","helpTip":""}
          "num_buckets": 2,  # @param {"label":"num_buckets","type":"integer","required":"true","range":"(0,2147483647]","helpTip": ""}
          "handle_invalid": "skip",  # @param {"label":"handle_invalid","type":"enum","options":"skip,keep,error","required":"true","helpTip":""}
          "relative_error": 0.001  # @param {"label":"relative_error","type":"number","required":"true","range":"[0,1]","helpTip":""}
      }
      quantile_discretizer____id___ = MLSQuantileDiscretizer(**params)
      quantile_discretizer____id___.run()
      # @output {"label":"dataframe","name":"quantile_discretizer____id___.get_outputs()['output_port_1']","type":"DataFrame"}
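  • Parameters
      Parameter | Sub-parameter | Description
      input_features_str | - | Formatted string of input feature column names separated by commas, for example "column_a" or "column_a,column_b".
      input_vector_column | - | Name of the operator's input vector column. Default: "input_features".
      output_vector_column | - | Name of the operator's output vector column. Default: "standard_features".
      with_std | - | Whether to scale the data by its variance. Default: True.
      with_mean | - | Whether to center the data by its mean. Default: False.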
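The effect of with_std and with_mean on a single feature column can be sketched in plain Python as follows. The sketch assumes the sample standard deviation (n - 1 in the denominator) is used for scaling. That is a common convention, not something this section states, and standard_scale is a hypothetical name.

```python
import statistics


def standard_scale(column, with_std=True, with_mean=False):
    """Optionally center by the mean and scale by the standard deviation."""
    mean = statistics.mean(column) if with_mean else 0.0
    std = statistics.stdev(column) if with_std else 1.0  # sample std (assumed)
    if std == 0.0:
        std = 1.0  # constant column: avoid division by zero
    return [(x - mean) / std for x in column]


scaled_default = standard_scale([2.0, 4.0, 6.0])
scaled_centered = standard_scale([2.0, 4.0, 6.0], with_mean=True)
print(scaled_default, scaled_centered)
```

With the defaults (with_std=True, with_mean=False) values are rescaled but not shifted, which keeps sparse vectors sparse. Centering turns zeros into non-zero values.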
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "outer_pipeline_stages": None,
          "input_features_str": "",  # @param {"label":"input_features_str","type": "string","required":"false","helpTip": ""}
          "input_vector_column": "input_features",  # @param {"label":"input_vector_column","type":"string","required":"true","helpTip":""}
          "output_vector_column": "standard_features",  # @param {"label":"output_vector_column","type":"string","required":"true","helpTip":""}
          "with_std": True,  # @param {"label":"with_std","type":"boolean","required":"true","helpTip":""}
          "with_mean": False  # @param {"label":"with_mean","type":"boolean","required":"true","helpTip":""}
      }
      standard_scaler____id___ = MLSStandardScaler(**params)
      standard_scaler____id___.run()
      # @output {"label":"pipeline_model","name":"standard_scaler____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
      # @output {"label":"dataframe","name":"standard_scaler____id___.get_outputs()['output_port_2']","type":"DataFrame"}
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "b_use_default_encoder": True,  # @param {"label": "b_use_default_encoder", "type": "boolean", "required": "true", "helpTip": ""}
          "input_features_str": "",  # @param {"label": "input_features_str", "type": "string", "required": "false", "helpTip": ""}
          "outer_pipeline_stages": None,
          "label_col": "",  # @param {"label": "label_col", "type": "string", "required": "true", "helpTip": ""}
          "classifier_label_index_col": "label_index",  # @param {"label": "classifier_label_index_col", "type": "string", "required": "true", "helpTip": ""}
          "classifier_feature_vector_col": "model_features",  # @param {"label": "classifier_feature_vector_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_index_col": "prediction_index",  # @param {"label": "prediction_index_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_col": "prediction",  # @param {"label": "prediction_col", "type": "string", "required": "true", "helpTip": ""}
          "max_depth": 5,  # @param {"label": "max_depth", "type": "integer", "required": "true", "range":"(0,2147483647]", "helpTip": ""}
          "max_bins": 32,  # @param {"label": "max_bins", "type": "integer", "required": "true", "range":"(0,2147483647]", "helpTip": ""}
          "min_instances_per_node": 1,  # @param {"label": "min_instances_per_node", "type": "integer", "required": "true", "range": "[1,2147483647]", "helpTip": ""}
          "min_info_gain": 0.0,  # @param {"label": "min_info_gain", "type": "number", "required": "true", "range": "[0,none)", "helpTip": ""}
          "impurity": "gini"  # @param {"label": "impurity", "type": "enum", "required": "true", "options": "entropy,gini", "helpTip": ""}
      }
      dt_classifier____id___ = MLSDecisionTreeClassifier(**params)
      dt_classifier____id___.run()
      # @output {"label":"pipeline_model","name":"dt_classifier____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
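  • Parameters
      Parameter | Sub-parameter | Description
      b_use_default_encoder | - | Whether to use the default encoder. Default: True.
      input_features_str | - | Formatted string of input feature column names separated by commas, for example "column_a" or "column_a,column_b".
      label_col | - | Target column.
      classifier_label_index_col | - | Name of the new column holding the label-encoded target column. Default: "label_index".
      classifier_feature_vector_col | - | Name of the operator's input feature vector column. Default: "model_features".
      prediction_index_col | - | Name of the column holding the index of the predicted label. Default: "prediction_index".
      prediction_col | - | Name of the column holding the predicted label. Default: "prediction".
      max_depth | - | Maximum depth of the tree. Default: 5.
      max_bins | - | Maximum number of bins. Default: 32.
      min_instances_per_node | - | Minimum number of instances each child node must contain when a tree node is split. Default: 1.
      min_info_gain | - | Minimum information gain. Default: 0.
      impurity | - | Impurity measure. Supported values: entropy, gini. Default: "gini".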
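The impurity, min_info_gain and min_instances_per_node parameters interact when a candidate split is evaluated. The following Python sketch uses the textbook definitions of Gini impurity, entropy and information gain. It illustrates the concepts and is not the operator's code. All function names are hypothetical, and how the operator compares the gain with min_info_gain at the boundary is not specified here.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum of squared class frequencies."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: -sum of p * log2(p) over class frequencies."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(parent, left, right, impurity=gini):
    """Impurity of the parent minus the weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def split_allowed(parent, left, right, min_instances_per_node=1,
                  min_info_gain=0.0, impurity=gini):
    """Reject splits with too-small children or too little gain (sketch)."""
    if min(len(left), len(right)) < min_instances_per_node:
        return False
    return info_gain(parent, left, right, impurity) >= min_info_gain


parent = ["a", "a", "b", "b"]
print(gini(parent), entropy(parent), info_gain(parent, ["a", "a"], ["b", "b"]))
```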
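  • Overview
    The "Gradient-Boosted Tree Classification" node generates a binary classification model. It is an iterative classification algorithm based on decision trees. The algorithm builds decision tree models iteratively, and each tree is built by gradient-based optimization of the loss function, so that the prediction moves from a baseline value toward the target value. Put simply, each new model corrects the cases that the previous model predicted wrongly, so the model keeps improving over the iterations and reaches good prediction performance.
    The loss function of gradient-boosted tree classification is the log-likelihood loss, computed over N samples, where xi denotes the features of sample i, yi the label of sample i, and F(xi) the predicted label of sample i.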
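For reference, Spark MLlib documents the log loss of its gradient-boosted trees in the form below, using the same symbols. That this node follows the Spark definition is an assumption, based on its parameters matching Spark's GBT classifier. In the Spark formulation the labels are encoded as -1 and +1.

```latex
L(y, F) = 2 \sum_{i=1}^{N} \log\left(1 + \exp\left(-2\, y_i\, F(x_i)\right)\right),
\qquad y_i \in \{-1, +1\}
```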
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "b_use_default_encoder": True,
          "input_features_str": "",  # @param {"label": "input_features_str", "type": "string", "required": "false", "helpTip": ""}
          "outer_pipeline_stages": None,
          "label_col": "",  # @param {"label": "label_col", "type": "string", "required": "true", "helpTip": "target label column"}
          "classifier_label_index_col": "label_index",  # @param {"label": "classifier_label_index_col", "type": "string", "required": "true", "helpTip": ""}
          "classifier_feature_vector_col": "model_features",  # @param {"label": "classifier_feature_vector_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_index_col": "prediction_index",  # @param {"label": "prediction_index_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_col": "prediction",  # @param {"label": "prediction_col", "type": "string", "required": "true", "helpTip": ""}
          "max_depth": 5,  # @param {"label": "max_depth", "type": "integer", "required": "true", "range": "(0,2147483647]", "helpTip": ""}
          "max_bins": 32,  # @param {"label": "max_bins", "type": "integer", "required": "true", "range": "(0,2147483647]", "helpTip": ""}
          "min_instances_per_node": 1,  # @param {"label": "min_instances_per_node", "type": "integer", "required": "true", "range":"(0,2147483647]", "helpTip": ""}
          "min_info_gain": 0.0,  # @param {"label": "min_info_gain", "type": "number", "required": "true", "range": "[0,none)", "helpTip": ""}
          "loss_type": "logistic",
          "max_iter": 20,  # @param {"label": "max_iter", "type": "integer", "required": "true", "range": "(0,2147483647]", "helpTip": ""}
          "step_size": 0.1,  # @param {"label": "step_size", "type": "number", "required": "true", "range": "(0,none)", "helpTip": ""}
          "subsampling_rate": 1.0  # @param {"label": "subsampling_rate", "type": "number", "required": "true", "range": "(0,1.0]", "helpTip": ""}
      }
      gbt_classifier____id___ = MLSGBTClassifier(**params)
      gbt_classifier____id___.run()
      # @output {"label":"pipeline_model","name":"gbt_classifier____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
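  • Parameters
      Parameter | Sub-parameter | Description
      input_features_str | - | String of input column names separated by commas, for example "column_a" or "column_a,column_b".
      label_col | - | Target column.
      classifier_label_index_col | - | Name of the new column holding the label-encoded target column. Default: "label_index".
      classifier_feature_vector_col | - | Name of the operator's input feature vector column. Default: "model_features".
      prediction_index_col | - | Name of the column holding the index of the predicted label. Default: "prediction_index".
      prediction_col | - | Name of the column holding the predicted label. Default: "prediction".
      max_depth | - | Maximum depth of the tree. Default: 5.
      max_bins | - | Maximum number of bins. Default: 32.
      min_instances_per_node | - | Minimum number of instances each child node must contain when a tree node is split. Default: 1.
      min_info_gain | - | Minimum information gain. Default: 0.
      max_iter | - | Maximum number of iterations. Default: 20.
      step_size | - | Step size. Default: 0.1.
      subsampling_rate | - | Fraction of the training set sampled for training each tree. Default: 1.0.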
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "outer_pipeline_stages": None,
          "input_features_str": "",  # @param {"label":"input_features_str","type":"string","required":"false","helpTip":""}
          "label_col": "",  # @param {"label":"label_col","type":"string","required":"true","helpTip":""}
          "classifier_label_index_col": "label_index",  # @param {"label":"classifier_label_index_col","type":"string","required":"false","helpTip":""}
          "classifier_feature_vector_col": "model_features",  # @param {"label":"classifier_feature_vector_col","type":"string","required":"false","helpTip":""}
          "prediction_index_col": "prediction_index",  # @param {"label":"prediction_index_col","type":"string","required":"false","helpTip":""}
          "prediction_col": "prediction",  # @param {"label":"prediction_col","type":"string","required":"false","helpTip":""}
          "probability_col": "probability",  # @param {"label":"probability_col","type":"string","required":"false","helpTip":""}
          "is_unbalance": False,  # @param {"label":"is_unbalance","type":"boolean","required":"false","helpTip":""}
          "timeout": 1200.0,  # @param {"label":"timeout","type":"number","required":"false","helpTip":""}
          "objective": "binary",  # @param {"label":"objective","type":"string","required":"false","helpTip":""}
          "max_depth": -1,  # @param {"label":"max_depth","type":"integer","required":"false","range":"[-1,2147483647]","helpTip":""}
          "num_iteration": 100,  # @param {"label":"num_iteration","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
          "learning_rate": 0.1,  # @param {"label":"learning_rate","type":"number","required":"false","helpTip":""}
          "num_leaves": 31,  # @param {"label":"num_leaves","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
          "max_bin": 255,  # @param {"label":"max_bin","type":"integer","required":"false","range":"(0,2147483647]","helpTip":""}
          "bagging_fraction": 1.0,  # @param {"label":"bagging_fraction","type":"number","required":"false","helpTip":""}
          "bagging_freq": 0,  # @param {"label":"bagging_freq","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
          "bagging_seed": 3,  # @param {"label":"bagging_seed","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
          "early_stopping_round": 0,  # @param {"label":"early_stopping_round","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
          "feature_fraction": 1.0,  # @param {"label":"feature_fraction","type":"number","required":"false","helpTip":""}
          "min_sum_hessian_in_leaf": 1e-3,  # @param {"label":"min_sum_hessian_in_leaf","type":"number","required":"false","helpTip":""}
          "boost_from_average": True,  # @param {"label":"boost_from_average","type":"boolean","required":"false","helpTip":""}
          "boosting_type": "gbdt",  # @param {"label":"boosting_type","type":"string","required":"false","helpTip":""}
          "lambda_l1": 0.0,  # @param {"label":"lambda_l1","type":"number","required":"false","helpTip":""}
          "lambda_l2": 0.0,  # @param {"label":"lambda_l2","type":"number","required":"false","helpTip":""}
          "num_batches": 0,  # @param {"label":"num_batches","type":"integer","required":"false","range":"[0,2147483647]","helpTip":""}
          "parallelism": "data_parallel",  # @param {"label":"parallelism","type":"string","required":"false","helpTip":""}
          "thresholds_str": ""  # @param {"label":"thresholds_str","type":"string","required":"false","helpTip":""}
      }
      lightgbm_classifier____id___ = MLSLightGBMClassifier(**params)
      lightgbm_classifier____id___.run()
      # @output {"label":"pipeline_model","name":"lightgbm_classifier____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
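  • Parameters
      Parameter | Sub-parameter | Description
      input_features_str | - | String of input column names separated by commas, for example "column_a" or "column_a,column_b".
      label_col | - | Target column.
      classifier_label_index_col | - | Name of the new column holding the label-encoded target column. Default: "label_index".
      classifier_feature_vector_col | - | Name of the operator's input feature vector column. Default: "model_features".
      prediction_index_col | - | Name of the column holding the index of the predicted label. Default: "prediction_index".
      prediction_col | - | Name of the column holding the predicted label. Default: "prediction".
      probability_col | - | Name of the output probability column. Default: "probability".
      is_unbalance | - | Whether the dataset is unbalanced. Default: False.
      timeout | - | Timeout. Default: 1200 seconds.
      objective | - | Objective function. Supported values: binary, multiclass, multiclassova. Default: "binary".
      max_depth | - | Maximum depth of the tree. Default: -1.
      num_iteration | - | Number of iterations. Default: 100.
      learning_rate | - | Learning rate. Default: 0.1.
      num_leaves | - | Number of leaves. Default: 31.
      max_bin | - | Maximum number of bins. Default: 255.
      bagging_fraction | - | Bagging fraction. Default: 1.
      bagging_freq | - | Bagging frequency. Default: 0.
      bagging_seed | - | Random seed used for bagging. Default: 3.
      early_stopping_round | - | Number of rounds after which iteration stops early. Default: 0.
      feature_fraction | - | Fraction of features used. Default: 1.0.
      min_sum_hessian_in_leaf | - | Minimum sum of hessians in one leaf. Value range: [0, 1]. Default: 1e-3.
      boost_from_average | - | Whether to adjust the initial score to the mean of the labels to speed up convergence. Default: True.
      boosting_type | - | Boosting type. Allowed values: gbdt, gbrt, rf, dart, goss. Default: "gbdt".
      lambda_l1 | - | L1 regularization coefficient. Default: 0.0.
      lambda_l2 | - | L2 regularization coefficient. Default: 0.0.
      num_batches | - | If greater than 0, the dataset is split into that many batches during training. Default: 0.
      parallelism | - | Parallelization method used when learning trees. Supported values: data_parallel, voting_parallel. Default: "data_parallel".
      thresholds_str | - | Used for multi-class classification. An array of preset probability thresholds, one per class, given as a comma-separated string.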
  • Sample
      inputs = {
          "dataframe": None  # @input {"label":"dataframe","type":"DataFrame"}
      }
      params = {
          "inputs": inputs,
          "b_output_action": True,
          "b_use_default_encoder": True,  # @param {"label": "b_use_default_encoder", "type": "boolean", "required": "true", "helpTip": ""}
          "input_features_str": "",  # @param {"label": "input_features_str", "type": "string", "required": "false", "helpTip": ""}
          "outer_pipeline_stages": None,
          "label_col": "",  # @param {"label": "label_col", "type": "string", "required": "true", "helpTip": ""}
          "classifier_label_index_col": "label_index",  # @param {"label": "classifier_label_index_col", "type": "string", "required": "true", "helpTip": ""}
          "classifier_feature_vector_col": "model_features",  # @param {"label": "classifier_feature_vector_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_col": "prediction",  # @param {"label": "prediction_col", "type": "string", "required": "true", "helpTip": ""}
          "prediction_index_col": "prediction_index",  # @param {"label": "prediction_index_col", "type": "string", "required": "true", "helpTip": ""}
          "smoothing": 1.0,  # @param {"label": "smoothing", "type": "number", "required": "true", "range": "[0,none)", "helpTip": ""}
          "model_type": "multinomial"  # @param {"label": "model_type", "type": "enum", "required": "true", "options": "multinomial,bernoulli", "helpTip": ""}
      }
      naive_bayes_classifier____id___ = MLSNaiveBayesClassifier(**params)
      naive_bayes_classifier____id___.run()
      # @output {"label":"pipeline_model","name":"naive_bayes_classifier____id___.get_outputs()['output_port_1']","type":"PipelineModel"}
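  • Overview
    The "Naive Bayes" node produces a multi-class classification model. When using it, you must specify the "Role" field of the data. Four types are supported by default, "Input", "Target", "Rejected", and "ID", and exactly one of them must be selected.
    Naive Bayes is a classification method based on Bayes' theorem and the assumption that features are conditionally independent.
    It is simple to implement and efficient in both learning and prediction, which makes it a commonly used method. For a given training dataset:
      First, the joint probability distribution of input and output is learned under the conditional independence assumption.
      Then, based on this model, Bayes' theorem is used to find the output y with the largest posterior probability for a given input x.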
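The two steps above can be sketched as a tiny multinomial Naive Bayes in plain Python with additive smoothing. This is an illustration under simplifying assumptions (features are non-negative counts, class priors are unsmoothed) and not the operator's implementation. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def train_multinomial_nb(rows, labels, smoothing=1.0):
    """Step 1: estimate log priors and smoothed per-class feature log-likelihoods."""
    n_features = len(rows[0])
    class_counts = defaultdict(int)
    feature_sums = defaultdict(lambda: [0.0] * n_features)
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        for j, value in enumerate(row):
            feature_sums[label][j] += value
    model = {}
    for label, count in class_counts.items():
        sums = feature_sums[label]
        denom = sum(sums) + smoothing * n_features
        log_prior = math.log(count / len(rows))
        log_likelihood = [math.log((s + smoothing) / denom) for s in sums]
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, row):
    """Step 2: return the class with the largest (log) posterior for row."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(v * w for v, w in zip(row, log_likelihood))
    return max(model, key=score)


# Toy data: two count features, two classes.
rows = [[3, 0], [2, 1], [0, 3], [1, 2]]
labels = ["a", "a", "b", "b"]
model = train_multinomial_nb(rows, labels, smoothing=1.0)
print(predict(model, [4, 0]), predict(model, [0, 4]))
```

Working in log space turns the product of conditional probabilities into a sum, which avoids numerical underflow when there are many features.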