DataArtsFabric - Creating an Inference Service: Procedure
Procedure
- Call the create-workspace API to create a workspace, and record the workspace ID returned in the response.
Request example:
POST https://{hostname}/v1/workspaces
Body:
{
  "name": "apieworkspace",
  "description": "apie test workspace"
}
Response example:
{
  "id": "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",
  "name": "fabric",
  "description": "fabric",
  "create_time": "2023-05-30T12:24:30.401Z",
  "create_domain_name": "admin",
  "create_user_name": "user",
  "metastore_id": "2180518f-42b8-4947-b20b-adfc53981a25",
  "access_url": "https://:test.fabric.com/",
  "enterprise_project_id": "01049549-82cd-4b2b-9733-ddb94350c125"
}
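As a minimal sketch, the workspace request above could be built in Python with the standard library. The host name `fabric.example.com` and the helper names `build_create_workspace_request` and `extract_id` are hypothetical and used only for illustration. This document does not show authentication headers, so none are set here; add whatever your deployment requires before sending.

```python
import json
import urllib.request


def build_create_workspace_request(hostname, name, description):
    """Build (but do not send) the POST /v1/workspaces request."""
    url = f"https://{hostname}/v1/workspaces"
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_id(response_text):
    """Return the "id" field of a JSON response body."""
    return json.loads(response_text)["id"]


# Hypothetical host name; replace it with your real service address.
request = build_create_workspace_request(
    "fabric.example.com", "apieworkspace", "apie test workspace"
)

# Sending is left commented out because it needs a reachable service
# and credentials that this document does not describe:
# with urllib.request.urlopen(request) as resp:
#     workspace_id = extract_id(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before any network call is made.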
- Call the create-endpoint API to create an inference endpoint, and record the endpoint ID returned in the response.
Request example:
POST https://{hostname}/v1/workspaces/{workspace_id}/endpoints
workspace_id: the workspace ID recorded in step 1.
Body:
{
  "name": "apie_test",
  "description": "apie test endpoint",
  "type": "inference",
  "reserved_resource": {
    "mu": { "spec_code": "mu.llama3.8b", "min": 0, "max": 1 }
  }
}
Response example:
{
  "visibility": "PRIVATE",
  "id": "0b5633ba2b904511ad514346f4d23d4b",
  "name": "endpoint1",
  "type": "inference",
  "status": "CREATING",
  "description": "description",
  "create_time": "2023-05-30T12:24:30.401Z",
  "update_time": "2023-05-30T12:24:30.401Z",
  "owner": {
    "domain_name": "string",
    "domain_id": "xxx",
    "user_name": "string",
    "user_id": "xxx"
  },
  "reserved_resource": {
    "mu": { "spec_code": "mu.llama3.8b", "min": 0, "max": 1 }
  }
}
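The endpoint request can be sketched the same way. The helper name `build_create_endpoint_request`, its parameter names, and the host name `fabric.example.com` are hypothetical; the field names and values come from the request example above. Authentication headers are again omitted because this document does not specify them.

```python
import json
import urllib.request
from urllib.parse import quote


def build_create_endpoint_request(hostname, workspace_id, name, description,
                                  spec_code, min_units, max_units):
    """Build (but do not send) the create-endpoint request."""
    # Percent-encode the workspace ID so it is safe to place in the path.
    url = (
        f"https://{hostname}/v1/workspaces/"
        f"{quote(workspace_id, safe='')}/endpoints"
    )
    payload = {
        "name": name,
        "description": description,
        "type": "inference",
        "reserved_resource": {
            "mu": {"spec_code": spec_code, "min": min_units, "max": max_units}
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host name; the workspace ID is the one from the step 1 example.
request = build_create_endpoint_request(
    "fabric.example.com",
    "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",
    "apie_test",
    "apie test endpoint",
    "mu.llama3.8b",
    0,
    1,
)
```

The response example reports `"status": "CREATING"`, so a caller may need to wait until the endpoint is ready before using it; how to check readiness is not covered in this section.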
- Call the create-model API to create a user-private model, and record the model ID returned in the response.
Request example:
POST https://{hostname}/v1/workspaces/{workspace_id}/models
workspace_id: the workspace ID recorded in step 1.
Body:
{
  "name": "LLama3-8b",
  "description": "this is a apie test model",
  "type": "LLM_MODEL",
  "version": {
    "name": "v1",
    "description": "test description",
    "config": {
      "llm_model_config": { "base_model_type": "", "model_path": "" }
    }
  }
}
Response example:
{ "id": "ac8111bf-3601-4905-8ddd-b41d3e636a4e" }
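A minimal sketch of the model request follows. The helper name `build_create_model_request` and the host name `fabric.example.com` are hypothetical. The request example above leaves `base_model_type` and `model_path` empty, so the sketch takes them as parameters and does not guess valid values.

```python
import json
import urllib.request
from urllib.parse import quote


def build_create_model_request(hostname, workspace_id, name, description,
                               version_name, version_description,
                               base_model_type, model_path):
    """Build (but do not send) the create-model request."""
    url = (
        f"https://{hostname}/v1/workspaces/"
        f"{quote(workspace_id, safe='')}/models"
    )
    payload = {
        "name": name,
        "description": description,
        "type": "LLM_MODEL",
        "version": {
            "name": version_name,
            "description": version_description,
            "config": {
                "llm_model_config": {
                    "base_model_type": base_model_type,
                    "model_path": model_path,
                }
            },
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The two empty strings mirror the request example; supply real values
# for base_model_type and model_path in an actual call.
request = build_create_model_request(
    "fabric.example.com",
    "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",
    "LLama3-8b",
    "this is a apie test model",
    "v1",
    "test description",
    "",
    "",
)
```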
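- Call the create-inference-service API to create an inference service, and record the inference service ID returned in the response.
Request example:
POST https://{hostname}/v1/workspaces/{workspace_id}/services/instances
workspace_id: the workspace ID recorded in step 1.
Body:
{
  "source": { "id": "" },
  "name": "test_serviceInstanceName",
  "description": "description",
  "endpoint_id": ""
}
source.id: the model ID recorded in step 3.
endpoint_id: the inference endpoint ID recorded in step 2.
Response example:
{ "id": "b935d0ef-f4eb-4b95-aff1-9d33ae9f57b6" }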
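The sketch below shows how the IDs recorded in the earlier steps feed into this request. The helper name `build_create_service_request` and the host name `fabric.example.com` are hypothetical; the ID values are taken from the response examples in steps 1 to 3.

```python
import json
import urllib.request
from urllib.parse import quote


def build_create_service_request(hostname, workspace_id, model_id,
                                 endpoint_id, name, description):
    """Build (but do not send) the create-inference-service request."""
    url = (
        f"https://{hostname}/v1/workspaces/"
        f"{quote(workspace_id, safe='')}/services/instances"
    )
    payload = {
        "source": {"id": model_id},    # model ID recorded in step 3
        "name": name,
        "description": description,
        "endpoint_id": endpoint_id,    # endpoint ID recorded in step 2
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_service_request(
    "fabric.example.com",                      # hypothetical host name
    "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",    # workspace ID from step 1
    "ac8111bf-3601-4905-8ddd-b41d3e636a4e",    # model ID from step 3
    "0b5633ba2b904511ad514346f4d23d4b",        # endpoint ID from step 2
    "test_serviceInstanceName",
    "description",
)
```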
- Call the inference request API to send an inference request.
Request example:
POST https://{hostname}/v1/workspaces/{workspace_id}/services/instances/{instance_id}/invocations
workspace_id: the workspace ID recorded in step 1.
instance_id: the inference service ID recorded in step 4.
Body:
{
  "messages": [
    { "role": "user", "content": "hello" }
  ]
}
Response example (the inference request API returns a streaming response; one chunk is shown):
{
  "id": "chatcmpl-62dda7304f53451c9477e0",
  "object": "chat.completion.chunk",
  "created": 1730120529,
  "model": "ada1d67d-f2a1-4e77-838f-0d8688d756f4",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant",
        "content": "\n\nHello! LLM stands for Large Language Model. It refers to artificial intelligence models, like myself,"
      },
      "finish_reason": null
    }
  ],
  "system_fingerprint": null,
  "usage": null
}
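Because the response is streamed as a sequence of chunks, a client has to join the `delta.content` pieces to recover the full reply. The sketch below builds the invocation request and reassembles text from chunk lines. The helper names and the host name `fabric.example.com` are hypothetical. The wire framing is not described in this document: tolerating a `data:` prefix and a `[DONE]` marker is an assumption based on common server-sent-event conventions, not a confirmed behavior of this API.

```python
import json
import urllib.request
from urllib.parse import quote


def build_invocation_request(hostname, workspace_id, instance_id, user_text):
    """Build (but do not send) an invocation request with one user message."""
    url = (
        f"https://{hostname}/v1/workspaces/{quote(workspace_id, safe='')}"
        f"/services/instances/{quote(instance_id, safe='')}/invocations"
    )
    payload = {"messages": [{"role": "user", "content": user_text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def collect_stream_content(lines):
    """Concatenate delta.content from an iterable of streamed chunk lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Assumption: accept an optional SSE-style "data:" prefix.
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        # Assumption: "[DONE]" may mark the end of the stream.
        if line == "[DONE]":
            break
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


request = build_invocation_request(
    "fabric.example.com",                      # hypothetical host name
    "e935d0ef-f4eb-4b95-aff1-9d33ae9f57a6",    # workspace ID from step 1
    "b935d0ef-f4eb-4b95-aff1-9d33ae9f57b6",    # service ID from step 4
    "hello",
)
```

When the request is actually sent, the lines of the HTTP response body, decoded to text, can be passed to `collect_stream_content` as they arrive.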