# Deploy a Model API

This API deploys an open-source or fine-tuned model that is in the ***Ready to Deploy*** state. The request body configures the deployment: hyperparameters, scaling parameters, and device and optimization settings.

The API response includes the **model ID** and the **model deployment status**. After receiving the response, pass the `dockStatusId` to the [Get Dock Status API](/agent-platform/apis/apis-list/get-dock-status) to verify that the model deployed successfully.

| Method        | POST                                                                                     |
| :------------ | :--------------------------------------------------------------------------------------- |
| Endpoint      | `https://{host}/api/public/models/{modelId}/deploy?modelType={modelType}`                |
| Content Type  | application/json                                                                         |
| Authorization | `X-api-key` - The API key used for authentication.                                       |

To use the API, [create an API key](/agent-platform/apis#how-to-create-the-api-key).

## Query Parameters

| PARAMETER | DESCRIPTION                                                            | TYPE   | REQUIRED/OPTIONAL | ENUM VALUES                 |
| :-------- | :--------------------------------------------------------------------- | :----- | :---------------- | :-------------------------- |
| host      | The environment host name. For example, `agent-platform.domain.ai`.    | String | Required          | N/A                         |
| modelId   | The model ID to deploy.                                                | String | Required          | N/A                         |
| modelType | Type of model being deployed.                                          | String | Required          | \["openSource", "fineTune"] |
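The parameters above combine into a single URL. The following is a minimal sketch of that assembly, not an official client: the function name `build_deploy_url` and the model ID `cm-123` are hypothetical, and accepting a host with or without the `https://` prefix is an assumption made for convenience.

```python
from urllib.parse import quote, urlencode

# Allowed values for modelType, taken from the table above.
MODEL_TYPES = ("openSource", "fineTune")


def build_deploy_url(host: str, model_id: str, model_type: str) -> str:
    """Return the deploy endpoint URL for one model (hypothetical helper)."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"modelType must be one of {MODEL_TYPES}, got {model_type!r}")
    # Assumption: tolerate a host passed with a scheme and/or a trailing slash.
    bare_host = host.removeprefix("https://").rstrip("/")
    path = f"/api/public/models/{quote(model_id, safe='')}/deploy"
    return f"https://{bare_host}{path}?{urlencode({'modelType': model_type})}"


# "cm-123" is a placeholder model ID used only for illustration.
print(build_deploy_url("agent-platform.domain.ai", "cm-123", "fineTune"))
```

Rejecting an unknown `modelType` before sending the request surfaces a typo locally instead of as a server-side error.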

## Sample Request

**For an Open-Source Model**

```js theme={null}
curl --location 'https://{host}/api/public/models/cm-2xxxxxxxxxxxxxxxxxx0/deploy?modelType=openSource' \
--header 'x-api-key: kg-axxxxxxx-5xx3-5xx8-bxxb-9xxxxxxxxxx-ebxxxxxx-5xxb-4xxb-9xx5-cxxxxxxxxx3' \
--header 'Content-Type: application/json' \
--data '{
    "name": "Flant5_model",
    "hyperParameters": {
      "temperature": 1,
      "maxTokens": 512,
      "topP": 1,
      "topK": 50,
      "stopSequence": []
    },
    "scalingParameters": {
      "maxBatchSize": 10,
      "minReplicas": 1,
      "maxReplicas": 2,
      "scaleUpDelay": 30,
      "scaleDownDelay": 600
    },
    "deviceType": "g5.xlarge",
    "optimizationInfo": {
      "optimizationType": "",
      "quantizationType": ""
    },
    "isDeployedPreviously": true
  }'
```

**For a Fine-Tuned Model**

```js theme={null}
curl --location 'https://{host}/api/public/models/cm-6xxxxxxxxxxxxxxxxxx9/deploy?modelType=fineTune' \
--header 'x-api-key: kg-2xxxxxxxxxxxxxxxxxxf-7xxxxxxx-7xx8-4xxf-8xx7-dxxxxxxxxxx3' \
--header 'Content-Type: application/json' \
--data '{
    "name": "gpt2",
    "hyperParameters": {
      "temperature": 1,
      "maxTokens": 512,
      "topP": 1,
      "topK": 50,
      "stopSequence": []
    },
    "scalingParameters": {
      "maxBatchSize": 10,
      "minReplicas": 1,
      "maxReplicas": 2,
      "scaleUpDelay": 30,
      "scaleDownDelay": 600
    },
    "deviceType": "g5.xlarge",
    "optimizationInfo": {
      "optimizationType": "",
      "quantizationType": ""
    },
    "isDeployedPreviously": true
  }'
```
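The same request can be assembled programmatically. The sketch below uses only the Python standard library and builds the request without sending it. The helper name `build_deploy_request`, the host, the model ID `cm-123`, and the `YOUR_API_KEY` placeholder are illustrative assumptions; the body mirrors the open-source sample above.

```python
import json
import urllib.request


def build_deploy_request(url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Assemble the POST request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Body copied from the open-source sample request.
body = {
    "name": "Flant5_model",
    "hyperParameters": {
        "temperature": 1,
        "maxTokens": 512,
        "topP": 1,
        "topK": 50,
        "stopSequence": [],
    },
    "scalingParameters": {
        "maxBatchSize": 10,
        "minReplicas": 1,
        "maxReplicas": 2,
        "scaleUpDelay": 30,
        "scaleDownDelay": 600,
    },
    "deviceType": "g5.xlarge",
    "optimizationInfo": {"optimizationType": "", "quantizationType": ""},
    "isDeployedPreviously": True,
}

# Placeholder host, model ID, and key; substitute real values before sending.
request = build_deploy_request(
    "https://agent-platform.domain.ai/api/public/models/cm-123/deploy?modelType=openSource",
    "YOUR_API_KEY",
    body,
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it; the sketch stops short of that so it can run without network access or a valid key.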

## Body Parameters

Pass the following deployment parameters in the request body:

**General Parameters**

| PARAMETER            | DESCRIPTION                                 | TYPE    | REQUIRED/OPTIONAL | ENUM VALUES    |
| :------------------- | :------------------------------------------ | :------ | :---------------- | :------------- |
| name                 | Name of the model to deploy.                | String  | Required          | N/A            |
| isDeployedPreviously | Indicates if the model was deployed before. | Boolean | Optional          | \[true, false] |

**Hyperparameters**

| PARAMETER    | DESCRIPTION                    | TYPE  | REQUIRED/OPTIONAL | RANGE       |
| :----------- | :----------------------------- | :---- | :---------------- | :---------- |
| temperature  | Controls randomness of output. | Float | Required          | 0-2         |
| maxTokens    | Maximum tokens allowed.        | Int   | Required          | 0-512       |
| topP         | Controls nucleus sampling.     | Float | Required          | 0-1         |
| topK         | Controls top-K sampling.       | Int   | Required          | 1-100       |
| stopSequence | Stop sequences for the model.  | Array | Optional          | N/A         |

**Scaling Parameters**

| PARAMETER      | DESCRIPTION                     | TYPE | REQUIRED/OPTIONAL | RANGE   |
| :------------- | :------------------------------ | :--- | :---------------- | :------ |
| maxBatchSize   | Maximum batch size.             | Int  | Optional          | 1-256   |
| minReplicas    | Minimum replicas.               | Int  | Optional          | 1-10    |
| maxReplicas    | Maximum replicas.               | Int  | Optional          | 1-50    |
| scaleUpDelay   | Delay before scaling up (ms).   | Int  | Optional          | 1-1000  |
| scaleDownDelay | Delay before scaling down (ms). | Int  | Optional          | 50-2000 |

**Deployment Device & Optimization**

| PARAMETER        | DESCRIPTION                                         | TYPE   | REQUIRED/OPTIONAL | ENUM VALUES                                                                                                            |
| :--------------- | :-------------------------------------------------- | :----- | :---------------- | :--------------------------------------------------------------------------------------------------------------------- |
| deviceType       | Device type for deployment.                         | String | Required          | \["g4dn.xlarge", "g5.xlarge", "g5.2xlarge", "g6e.xlarge", "g4dn.12xlarge", "g5.12xlarge", "g5.48xlarge", "g4dn.metal"] |
| optimizationInfo | Optimization details.                               | Object | Optional          | N/A                                                                                                                    |
| optimizationType | Type of optimization; nested in `optimizationInfo`. | String | Optional          | \["ctranslate2", "vllm"]                                                                                               |
| quantizationType | Type of quantization; nested in `optimizationInfo`. | String | Optional          | \["no\_quantization", "int8\_float16"]                                                                                 |
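The documented ranges and required fields can be checked on the client before the request is sent. This is a minimal sketch under two assumptions: the ranges in the tables are inclusive at both ends (the sample requests use `maxTokens: 512` and `topP: 1`, which sit on the upper bounds), and the helper name `validate_deploy_body` is hypothetical. The `optimizationInfo` fields are left unchecked because the sample requests pass empty strings for them.

```python
# Ranges and allowed values copied from the tables above (assumed inclusive).
HYPER_RANGES = {"temperature": (0, 2), "maxTokens": (0, 512), "topP": (0, 1), "topK": (1, 100)}
SCALING_RANGES = {
    "maxBatchSize": (1, 256),
    "minReplicas": (1, 10),
    "maxReplicas": (1, 50),
    "scaleUpDelay": (1, 1000),
    "scaleDownDelay": (50, 2000),
}
DEVICE_TYPES = {
    "g4dn.xlarge", "g5.xlarge", "g5.2xlarge", "g6e.xlarge",
    "g4dn.12xlarge", "g5.12xlarge", "g5.48xlarge", "g4dn.metal",
}


def _in_range(value, low, high) -> bool:
    """True when value is a number within [low, high]."""
    return isinstance(value, (int, float)) and low <= value <= high


def validate_deploy_body(body: dict) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not body.get("name"):
        problems.append("name is required")
    if body.get("deviceType") not in DEVICE_TYPES:
        problems.append("deviceType must be one of the documented device types")
    hyper = body.get("hyperParameters", {})
    for field, (low, high) in HYPER_RANGES.items():
        if field not in hyper:
            problems.append(f"hyperParameters.{field} is required")
        elif not _in_range(hyper[field], low, high):
            problems.append(f"hyperParameters.{field} must be in {low}-{high}")
    # Scaling parameters are optional, so only the ones present are checked.
    scaling = body.get("scalingParameters", {})
    for field, (low, high) in SCALING_RANGES.items():
        if field in scaling and not _in_range(scaling[field], low, high):
            problems.append(f"scalingParameters.{field} must be in {low}-{high}")
    return problems


# Values taken from the sample requests.
sample_body = {
    "name": "gpt2",
    "hyperParameters": {"temperature": 1, "maxTokens": 512, "topP": 1, "topK": 50, "stopSequence": []},
    "scalingParameters": {"maxBatchSize": 10, "minReplicas": 1, "maxReplicas": 2,
                          "scaleUpDelay": 30, "scaleDownDelay": 600},
    "deviceType": "g5.xlarge",
}
print(validate_deploy_body(sample_body))
```

Returning a list of messages rather than raising on the first failure lets a caller report every out-of-range field at once.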

## Sample Response

```js theme={null}
{
  "dock-statusId": "ds-d0xxxxxd-bxx9-5xx0-8xx5-5bxxxxxxxxx1",
  "modelId": "cm-77xxxxxb-exx9-5xxc-8xx6-5xxxxxxxxxx1",
  "jobType": "MODELS",
  "action": "DEPLOY",
  "status": "IN_PROGRESS"
}
```

## Response Parameters

| PARAMETER           | DESCRIPTION                                                | TYPE   |
| :------------------ | :--------------------------------------------------------- | :----- |
| <b>dockStatusId</b> | The unique identifier for tracking the model deployment.   | String |
| <b>modelId</b>      | The ID of the model being deployed.                        | String |
| <b>jobType</b>      | Specifies the type of job (for example, `MODELS`).         | String |
| <b>action</b>       | Indicates the performed action (`DEPLOY`).                 | String |
| <b>status</b>       | Deployment status (`SUCCESS`, `IN_PROGRESS`, or `FAILED`). | String |
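A client typically reads the tracking ID from this response and then checks the status until it leaves `IN_PROGRESS`. The sketch below illustrates that flow under stated assumptions: `fetch_status` is a hypothetical callable that wraps the Get Dock Status API and returns one of the status strings listed above, and the polling interval and attempt limit are arbitrary. Because the sample response spells the key `dock-statusId` while the table lists `dockStatusId`, the reader function accepts either spelling.

```python
import time
from typing import Callable

# Statuses after which polling can stop, taken from the table above.
TERMINAL_STATUSES = ("SUCCESS", "FAILED")


def read_deploy_response(payload: dict) -> tuple[str, str]:
    """Return (dock_status_id, status) from a deploy response."""
    # Accept both spellings that appear in this page.
    dock_status_id = payload.get("dockStatusId") or payload.get("dock-statusId")
    if not dock_status_id:
        raise KeyError("response contains no dock status ID")
    return dock_status_id, payload["status"]


def wait_for_deployment(
    fetch_status: Callable[[str], str],
    dock_status_id: str,
    interval_seconds: float = 10.0,
    max_attempts: int = 30,
) -> str:
    """Poll fetch_status (hypothetical wrapper) until a terminal status arrives."""
    for attempt in range(max_attempts):
        status = fetch_status(dock_status_id)
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"deployment still in progress after {max_attempts} checks")


# The sample response from this page.
sample_response = {
    "dock-statusId": "ds-d0xxxxxd-bxx9-5xx0-8xx5-5bxxxxxxxxx1",
    "modelId": "cm-77xxxxxb-exx9-5xxc-8xx6-5xxxxxxxxxx1",
    "jobType": "MODELS",
    "action": "DEPLOY",
    "status": "IN_PROGRESS",
}
dock_status_id, status = read_deploy_response(sample_response)
print(dock_status_id, status)
```

Injecting `fetch_status` keeps the polling loop independent of how the status call is made, which this page does not specify.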
