Cloud Search Service (CSS) - Configuring Model Services: Managing Model Services

Time: 2025-08-22 17:22:53

Managing Model Services

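The search LLM plugin is deeply integrated with the Kibana command-line interface (CLI) and supports full-lifecycle management of model services, including updating, monitoring, and scaling. As shown in Table 4, you can manage model services with standard CLI commands for core operations such as update and delete.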

Table 4 Core operations for managing model services

Operation type

API command

Request example

Response example

Update a model service

POST _inference/model_service/{service_name}/update

Update the Embedding model service:

POST _inference/model_service/pangu_vector/update
{
  "description": "Search LLM - semantic vectorization model update",
  "service_config": {
    "semantic_vector": {
      "service_urls": ["http://{endpoint}/app/search/v1/vector"],
      "timeout_ms": 60000
    }
  }
}

Returns the updated model service information:

{
  "service_name" : "pangu_vector",
  "service_type" : "remote",
  "description" : "Search LLM - semantic vectorization model update",
  "create_time" : 1747966388508,
  "service_config" : {
    "semantic_vector" : {
      "embedding_type" : "query2doc",
      "service_urls" : ["http://{endpoint}/app/search/v1/vector"],
      "method" : "POST",
      "timeout_ms" : 60000,
      "max_conn" : 200,
      "security" : false,
      "dimension" : "768",
      "algorithm" : "GRAPH",
      "metric" : "inner_product"
    }
  }
}
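The update request can also be assembled programmatically before it is sent. The following is a minimal Python sketch that only builds the method, path, and body shown in the example; the function name `build_update_request` is hypothetical, and how the request is actually sent (host, authentication, HTTP client) is not covered here and depends on your cluster.

```python
import json


def build_update_request(service_name, description, service_urls, timeout_ms=60000):
    """Return (method, path, body) for updating a semantic-vector model service.

    Hypothetical helper: it mirrors the documented request shape and does not
    send anything to the cluster.
    """
    if not service_name or "/" in service_name:
        raise ValueError("service_name must be a non-empty name without '/'")
    body = {
        "description": description,
        "service_config": {
            "semantic_vector": {
                "service_urls": list(service_urls),
                "timeout_ms": int(timeout_ms),
            }
        },
    }
    return "POST", f"_inference/model_service/{service_name}/update", body


method, path, body = build_update_request(
    "pangu_vector",
    "Search LLM - semantic vectorization model update",
    ["http://{endpoint}/app/search/v1/vector"],  # {endpoint} is left as a placeholder
)
print(method, path)
print(json.dumps(body, indent=2, ensure_ascii=False))
```

Keeping request construction separate from sending makes the body easy to review or log before it reaches the cluster.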

Check model service connectivity

GET _inference/model_service/{service_name}/check

Check the connectivity of the Embedding model service:

GET _inference/model_service/pangu_vector/check

Response:

{
  "acknowledged" : true
}
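A script that runs the connectivity check needs to interpret the response. The sketch below assumes the response text has already been fetched and treats only `"acknowledged": true` as success. The source shows only the success response, so the shape of a failure response is unknown; this hypothetical helper simply treats anything else as a failed check.

```python
import json


def is_service_reachable(response_text):
    """Return True only if the check response is JSON with acknowledged == true.

    Assumption: any other payload (including invalid JSON) counts as a failed check.
    """
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("acknowledged") is True


print(is_service_reachable('{"acknowledged" : true}'))
```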

View model services

  • View the configuration of all model services
    GET _inference/model_service
  • View the configuration of a single model service
    GET _inference/model_service/{service_name}

View the configuration of the Embedding model service:

GET _inference/model_service/pangu_vector

Returns the model service information:

{
  "count" : 1,
  "model_service_configs" : [
    {
      "service_name" : "pangu_vector",
      "service_type" : "remote",
      "description" : "Search LLM - semantic vectorization model",
      "create_time" : 1747966388508,
      "service_config" : {
        "semantic_vector" : {
          "embedding_type" : "query2doc",
          "service_urls" : ["http://{endpoint}/app/search/v1/vector"],
          "method" : "POST",
          "timeout_ms" : 60000,
          "max_conn" : 200,
          "security" : false,
          "dimension" : "768",
          "algorithm" : "GRAPH",
          "metric" : "inner_product"
        }
      }
    }
  ]
}
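The view response nests each service's settings under `model_service_configs`. The following Python sketch flattens that structure into one summary per service. The helper name `summarize_services` is hypothetical, and the `sample` dictionary is an abbreviated copy of the response above rather than live output.

```python
def summarize_services(response):
    """Return one summary dict per entry in 'model_service_configs'."""
    summaries = []
    for cfg in response.get("model_service_configs", []):
        vector_cfg = cfg.get("service_config", {}).get("semantic_vector", {})
        summaries.append({
            "name": cfg.get("service_name"),
            "type": cfg.get("service_type"),
            "urls": vector_cfg.get("service_urls", []),
            # The sample response returns the dimension as a string ("768"),
            # so it is converted to an integer here.
            "dimension": int(vector_cfg["dimension"]) if "dimension" in vector_cfg else None,
            "metric": vector_cfg.get("metric"),
        })
    return summaries


# Abbreviated copy of the documented response, used only for illustration.
sample = {
    "count": 1,
    "model_service_configs": [
        {
            "service_name": "pangu_vector",
            "service_type": "remote",
            "service_config": {
                "semantic_vector": {
                    "service_urls": ["http://{endpoint}/app/search/v1/vector"],
                    "dimension": "768",
                    "metric": "inner_product",
                }
            },
        }
    ],
}

for summary in summarize_services(sample):
    print(summary)
```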

Delete a model service configuration (after deletion, indexes can no longer use this model service)

DELETE _inference/model_service/{service_name}

Delete the Embedding model service configuration:

DELETE _inference/model_service/pangu_vector

Response:

{
  "acknowledged" : true
}
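Because indexes can no longer use a model service once its configuration is deleted, automation may want an explicit confirmation step before issuing the request. The sketch below is one possible guard, not part of the plugin: the function name and the `confirm` flag are hypothetical, and the function only builds the method and path without contacting the cluster.

```python
def build_delete_request(service_name, confirm=False):
    """Return (method, path) for deleting a model service configuration.

    Hypothetical safety guard: refuses to build the request unless the caller
    passes confirm=True, since deletion makes the service unusable by indexes.
    """
    if not service_name or "/" in service_name:
        raise ValueError("service_name must be a non-empty name without '/'")
    if not confirm:
        raise RuntimeError(
            f"refusing to delete '{service_name}' without confirm=True"
        )
    return "DELETE", f"_inference/model_service/{service_name}"


print(build_delete_request("pangu_vector", confirm=True))
```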

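Set the maximum number of model services (the maximum number of model services that can be created)

PUT _cluster/settings
{
  "transient": {
    "pg_search.inference.max_inference_model_service": 100
  }
}

The maximum value is 1000, the minimum value is 1, and the default value is 100.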

Set the maximum number of model services to 10:

PUT _cluster/settings
{
  "transient": {
    "pg_search.inference.max_inference_model_service": 10
  }
}

Response:

{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {
    "pg_search" : {
      "inference" : {
        "max_inference_model_service" : "10"
      }
    }
  }
}
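Since the limit accepts only values from 1 to 1000, a script can validate the value before building the settings body. The following is a minimal Python sketch with a hypothetical helper name; it produces the same `transient` body as the request above and does not send it.

```python
SETTING_KEY = "pg_search.inference.max_inference_model_service"
MIN_LIMIT, MAX_LIMIT = 1, 1000  # documented range; the default is 100


def build_limit_settings(limit):
    """Return the body for PUT _cluster/settings that sets the service limit."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise TypeError("limit must be an integer")
    if not MIN_LIMIT <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}")
    return {"transient": {SETTING_KEY: limit}}


print(build_limit_settings(10))
```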