dockStatusId to call the Get Dock Status API and verify the successful deployment of the model.
| Method | POST |
|---|---|
| Endpoint | https://{host}/api/public/models/:{<i>modelId</i>}/deploy?modelType={<i>modelType</i>} |
| Content Type | application/json |
| Authorization | X-api-key - The API key used for authentication. |
Query Parameters
| PARAMETER | DESCRIPTION | TYPE | REQUIRED/OPTIONAL | ENUM VALUES |
|---|---|---|---|---|
| host | The environment URL. For example, https://agent-platform.domain.ai/. | String | Required | N/A |
| modelId | The model ID to deploy. | String | Required | N/A |
| modelType | Type of model being deployed. | String | Required | ["openSource", "fineTune"] |
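As a rough illustration of how the endpoint and its parameters fit together, the following Python sketch assembles the request URL. The function name `build_deploy_url` is hypothetical. The sketch assumes that the colon before `modelId` in the endpoint template is route-parameter notation rather than a literal character, and that `host` may be supplied with or without the `https://` prefix and a trailing slash.

```python
from urllib.parse import quote, urlencode

# Allowed values taken from the modelType row of the table above.
ALLOWED_MODEL_TYPES = ("openSource", "fineTune")


def build_deploy_url(host: str, model_id: str, model_type: str) -> str:
    """Assemble the deploy URL (hypothetical helper, not part of the API)."""
    if model_type not in ALLOWED_MODEL_TYPES:
        raise ValueError(f"modelType must be one of {ALLOWED_MODEL_TYPES}")
    # Assumption: accept the host with or without a scheme and trailing slash.
    bare_host = host.split("://", 1)[-1].rstrip("/")
    # Percent-encode the model ID so it is safe to place in the path.
    path = f"/api/public/models/{quote(model_id, safe='')}/deploy"
    query = urlencode({"modelType": model_type})
    return f"https://{bare_host}{path}?{query}"
```

The resulting URL is then used with the POST method, the `application/json` content type, and the `X-api-key` header listed above.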
Sample Request
For an Open-Source Model Source
Body Parameters
The following deployment parameters can be configured and passed in the body:
General Parameters
| PARAMETER | DESCRIPTION | TYPE | REQUIRED/OPTIONAL | ENUM VALUES |
|---|---|---|---|---|
| name | Name of the model to deploy. | String | Required | N/A |
| isDeployedPreviously | Indicates if the model was deployed before. | Boolean | Optional | [true, false] |

| PARAMETER | DESCRIPTION | TYPE | REQUIRED/OPTIONAL | RANGE |
|---|---|---|---|---|
| temperature | Controls randomness of output. | Float | Required | 0-2 |
| maxTokens | Maximum tokens allowed. | Int | Required | 0-512 |
| topP | Controls nucleus sampling. | Float | Required | 0-1 |
| topK | Controls top-K sampling. | Int | Required | 1-100 |
| stopSequence | Stop sequences for the model. | Array | Optional | N/A |

| PARAMETER | DESCRIPTION | TYPE | REQUIRED/OPTIONAL | RANGE |
|---|---|---|---|---|
| maxBatchSize | Maximum batch size. | Int | Optional | 1-256 |
| minReplicas | Minimum replicas. | Int | Optional | 1-10 |
| maxReplicas | Maximum replicas. | Int | Optional | 1-50 |
| scaleUpDelay | Delay before scaling up (ms). | Int | Optional | 1-1000 |
| scaleDownDelay | Delay before scaling down (ms). | Int | Optional | 50-2000 |

| PARAMETER | DESCRIPTION | TYPE | REQUIRED/OPTIONAL | ENUM VALUES |
|---|---|---|---|---|
| deviceType | Device type for deployment. | String | Required | ["g4dn.xlarge", "g5.xlarge", "g5.2xlarge", "g6e.xlarge", "g4dn.12xlarge", "g5.12xlarge", "g5.48xlarge", "g4dn.metal"] |
| optimizationInfo | Optimization details. | Object | Optional | N/A |
| optimizationType | Type of optimization. | String | Optional | ["ctranslate2", "vllm"] |
| quantizationType | Type of quantization. | String | Optional | ["no_quantization", "int8_float16"] |
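The constraints in the body parameter tables can be checked on the client before the request is sent. The sketch below is a minimal, hypothetical validator (`validate_deploy_params` is not part of the API). It assumes the documented ranges are inclusive and takes a flat dictionary keyed by parameter name; the actual nesting of the request body, for example whether `optimizationType` and `quantizationType` sit inside `optimizationInfo`, is not shown in this section.

```python
# Parameters marked "Required" in the tables above.
REQUIRED = ("name", "temperature", "maxTokens", "topP", "topK", "deviceType")

# Numeric ranges from the tables; treated as inclusive (an assumption).
RANGES = {
    "temperature": (0, 2),
    "maxTokens": (0, 512),
    "topP": (0, 1),
    "topK": (1, 100),
    "maxBatchSize": (1, 256),
    "minReplicas": (1, 10),
    "maxReplicas": (1, 50),
    "scaleUpDelay": (1, 1000),
    "scaleDownDelay": (50, 2000),
}

# Enumerated values from the tables.
ENUMS = {
    "deviceType": {
        "g4dn.xlarge", "g5.xlarge", "g5.2xlarge", "g6e.xlarge",
        "g4dn.12xlarge", "g5.12xlarge", "g5.48xlarge", "g4dn.metal",
    },
    "optimizationType": {"ctranslate2", "vllm"},
    "quantizationType": {"no_quantization", "int8_float16"},
}


def validate_deploy_params(params: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    for key in REQUIRED:
        if key not in params:
            errors.append(f"missing required parameter: {key}")
    for key, (low, high) in RANGES.items():
        if key in params and not (low <= params[key] <= high):
            errors.append(f"{key}={params[key]} is outside {low}-{high}")
    for key, allowed in ENUMS.items():
        if key in params and params[key] not in allowed:
            errors.append(f"{key}={params[key]!r} is not an allowed value")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every invalid field at once.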
Sample Response
Response Parameters
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| dockStatusId | The unique identifier for tracking the model deployment. | String |
| modelId | The model that was deployed. | String |
| jobType | Specifies the type of job (for example, MODELS). | String |
| action | Indicates the performed action (DEPLOY). | String |
| status | Deployment status (SUCCESS, IN_PROGRESS, or FAILED). | String |
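To show how the response fields might be used together, here is a small hypothetical helper that reads a parsed response and suggests the next step. It assumes the response is a flat JSON object with the fields listed above; the function name and the ID values in the usage example are placeholders, not real API output.

```python
def next_step(response: dict) -> str:
    """Suggest what to do after a deploy call (hypothetical helper)."""
    status = response.get("status")
    model_id = response.get("modelId")
    dock_status_id = response.get("dockStatusId")
    if status == "SUCCESS":
        return f"Model {model_id} is deployed."
    if status == "IN_PROGRESS":
        # The dockStatusId is the handle for the Get Dock Status API.
        return f"Check Get Dock Status with dockStatusId {dock_status_id}."
    if status == "FAILED":
        return f"Deployment of model {model_id} failed."
    raise ValueError(f"unexpected status: {status!r}")


# Placeholder response for illustration only.
example = {
    "dockStatusId": "dock-123",
    "modelId": "model-abc",
    "jobType": "MODELS",
    "action": "DEPLOY",
    "status": "IN_PROGRESS",
}
print(next_step(example))
```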