
Models

Configure and manage AI models for your agents, tools, and workflows.

Overview

Agent Platform is model-agnostic, supporting a wide range of AI models from commercial providers, open-source repositories, and custom fine-tuned deployments. Access Model Hub from the top navigation to manage all your models in one place. Model Hub provides three ways to work with models:
| Model Type | Description | Best For |
| --- | --- | --- |
| External Models | Commercial models from OpenAI, Anthropic, Google, Azure, Cohere, and Amazon Bedrock | Production workloads requiring proven reliability |
| Open-Source Models | 30+ curated models plus any Hugging Face text generation model | Cost control, customization, data privacy |
| Fine-Tuned Models | Custom models trained on your enterprise data | Domain-specific tasks, consistent outputs |

External Models

Connect commercial models with minimal setup using Easy Integration or API Integration.

Supported Providers

| Provider | Popular Models | Tool Calling |
| --- | --- | --- |
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1 series | Yes |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus/Sonnet/Haiku | Yes |
| Google | Gemini 1.5 Pro, Gemini 1.5 Flash | Yes |
| Azure OpenAI | GPT-4o, GPT-4, GPT-3.5 Turbo | Yes |
| Cohere | Command R+, Command R | Yes |
| Amazon Bedrock | Claude, Titan, Llama models via AWS | Yes (varies by model) |

Adding an External Model

Easy Integration (recommended):
  1. Go to Models → External Models → Add a model
  2. Select Easy Integration and choose your provider
  3. Enter your API credentials
  4. Select the model and confirm
API Integration (for custom endpoints):
  1. Select API Integration when adding a model
  2. Configure the endpoint URL and authentication
  3. Map request/response parameters
  4. Test and save
Note: Custom models must support tool calling and follow OpenAI or Anthropic API structures to work with Agentic Apps.
For detailed setup instructions, see Add a Model using Easy Integration or Add a Model using API Integration.
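Because custom endpoints must follow the OpenAI or Anthropic API structure, one way to verify an endpoint before registering it is to send it a minimal OpenAI-style chat completion request and check the response shape. This is only a sketch; the URL, API key, and model name below are placeholders:

```python
import requests

# Hypothetical endpoint and credentials; substitute your own values.
ENDPOINT = "https://models.example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

resp = requests.post(
    ENDPOINT,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "my-custom-model",  # placeholder model name
        "messages": [{"role": "user", "content": "Ping"}],
        "max_tokens": 64,
    },
    timeout=30,
)
resp.raise_for_status()
# An OpenAI-style endpoint returns choices[0].message with role/content.
# Tool calling (see the note above) additionally requires that the endpoint
# accept a "tools" array in the request and return "tool_calls" when used.
print(resp.json()["choices"][0]["message"]["content"])
```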

Open-Source Models

Deploy from 30+ curated models or import any text generation model from Hugging Face.

Deployment Options

| Option | Description |
| --- | --- |
| Kore-Hosted | Select from curated models, optimize, and deploy on managed infrastructure |
| Hugging Face Import | Bring any compatible model from Hugging Face (public or private) |

Optimization Techniques

Before deployment, optimize models for better performance:
  • vLLM: High-throughput inference optimization
  • CTranslate2 (CT2): Efficient inference with reduced memory footprint
  • No Optimization: Deploy as-is for maximum compatibility

Quick Deploy Steps

  1. Go to Models → Open-source models → Deploy a model
  2. Select a Kore-hosted model or import from Hugging Face
  3. Choose optimization technique (optional)
  4. Configure parameters and hardware
  5. Click Deploy
For the complete model list and detailed instructions, see Deploy an Open-Source Model.

Fine-Tuned Models

Create custom models trained on your enterprise data for domain-specific tasks.

When to Fine-Tune

  • Consistent output format required across responses
  • Domain-specific terminology or jargon
  • Unique tone, style, or brand voice
  • Improved accuracy for specialized tasks

Fine-Tuning Process

  1. Prepare Data: Format training data as JSONL with conversation examples
  2. Select Base Model: Choose from supported Kore-hosted or Hugging Face models
  3. Configure Training: Select fine-tuning type (Full, LoRA, or QLoRA)
  4. Monitor Progress: Track metrics via Weights & Biases integration
  5. Deploy: Make the model available across Agent Platform

Training Data Format

{"messages": [{"role": "system", "content": "You are a support agent."}, {"role": "user", "content": "Order status?"}, {"role": "assistant", "content": "I'll check your order status right away."}]}
{"messages": [{"role": "system", "content": "You are a support agent."}, {"role": "user", "content": "Return request"}, {"role": "assistant", "content": "I can help process your return."}]}
For step-by-step instructions, see Create a Fine-Tuned Model.

Model Parameters

Configure generation behavior when using models across Agent Platform:
| Parameter | Description | Typical Range |
| --- | --- | --- |
| Temperature | Controls randomness. Lower = focused, higher = creative | 0.0–2.0 (default: 0.7) |
| Top P | Nucleus sampling; considers tokens within the top probability mass | 0.0–1.0 (default: 0.9) |
| Top K | Limits selection to the top K most likely tokens | 1–100 (default: 50) |
| Max Tokens | Maximum output length | Varies by model |
| Stop Sequences | Strings that stop generation | Custom list |
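As a reference for how these map onto an API request, the sketch below places each parameter in an OpenAI-style request body, using the defaults from the table. Note that not every provider exposes every parameter; top_k in particular is not part of the OpenAI API itself:

```python
# Generation parameters as they would appear in an OpenAI-style request
# body; defaults taken from the table above. top_k is an assumption here:
# some providers expose it, the OpenAI API itself does not.
generation_config = {
    "temperature": 0.7,    # 0.0-2.0: lower = more focused, higher = more creative
    "top_p": 0.9,          # nucleus sampling: restrict to top probability mass
    "top_k": 50,           # limit sampling to the 50 most likely tokens
    "max_tokens": 512,     # cap on output length (ceiling varies by model)
    "stop": ["\nUser:"],   # generation halts when a stop sequence appears
}
# These keys are merged into the request alongside "model" and "messages".
```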

Tool Calling Support

Not all models support tool calling, which is required for Agentic Apps. Use models with tool calling support for agent orchestration.
Supported for Tool Calling:
  • OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo
  • Anthropic: Claude 3.5 Sonnet, Claude 3 series
  • Google: Gemini 1.5 Pro, Gemini 1.5 Flash
  • Azure OpenAI: GPT-4o, GPT-4, GPT-3.5 Turbo
  • Amazon Bedrock: Models via supported providers
Not Supported:
  • Kore-hosted open-source models
  • Most Hugging Face imports
  • Models without function calling capabilities
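To make "tool calling support" concrete, the sketch below sends one OpenAI-style tool definition and inspects whether the model chose to call it. The endpoint, key, and tool are placeholders; any of the supported models listed above accepts this request shape:

```python
import json
import requests

# Placeholder endpoint and key; any OpenAI-compatible tool-calling model
# listed above would accept this request shape.
ENDPOINT = "https://api.example.com/v1/chat/completions"

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical tool
        "description": "Look up the status of an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = requests.post(
    ENDPOINT,
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Where is order #1234?"}],
        "tools": tools,
    },
    timeout=30,
).json()

message = resp["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    # The model returns the tool name and JSON-encoded arguments;
    # the orchestrator executes the tool and sends the result back.
    print(call["function"]["name"], json.loads(call["function"]["arguments"]))
```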

Model Selection Guide

Choose the right model based on your use case:
| Use Case | Recommended Models | Why |
| --- | --- | --- |
| Complex reasoning | GPT-4o, Claude 3 Opus | Highest accuracy |
| Fast responses | GPT-3.5, Claude 3 Haiku, Gemini Flash | Low latency |
| Code generation | GPT-4o, Claude 3.5 Sonnet | Best code quality |
| Cost-sensitive | GPT-3.5, Claude 3 Haiku | Lower token cost |
| Long context | Claude 3, Gemini 1.5 | 100K+ token windows |
| Data privacy | Open-source (Kore-hosted) | No external API calls |
| Real-time voice | GPT-4o Realtime Preview | Native voice support |

Structured Output

Enable consistent, parseable responses using JSON schemas.
Supported:
  • External models (OpenAI, Anthropic, Google)
  • Kore-hosted open-source models with vLLM or no optimization
Not Supported:
  • CT2-optimized models
  • Fine-tuned models
  • Hugging Face imports
  • Locally imported models
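As one concrete illustration, OpenAI-style APIs accept a JSON Schema via the response_format request field. The schema and field names below are invented for the example, so treat this as a sketch rather than the platform's exact configuration surface:

```python
# OpenAI-style structured output: constrain the response to a JSON Schema.
# Schema name and fields are illustrative only.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "order_status",
        "schema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
            },
            "required": ["order_id", "status"],
            "additionalProperties": False,
        },
    },
}
# Pass response_format in the request body alongside "model" and "messages";
# the reply's message content is then guaranteed to parse against the schema.
```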

Model Endpoint & API Keys

After deployment, each model provides:
  • API Endpoint: Use models externally via REST API
  • API Keys: Secure access tokens for endpoint authentication
  • Deployment History: Track version changes
Access these from the model’s three-dot menu → API Endpoint.
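For example, the deployed model's REST endpoint accepts ordinary authenticated HTTP requests. The URL and request body below are placeholders; copy the real values from the API Endpoint panel:

```python
import requests

# Placeholder URL and key; copy the real values from the model's
# three-dot menu -> API Endpoint.
resp = requests.post(
    "https://platform.example.com/models/<model-id>/predict",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"messages": [{"role": "user", "content": "Hello"}]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```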

Monitoring

Track model performance across Agent Platform:
  • Model Analytics Dashboard: Token usage, latency, error rates
  • Model Traces: Detailed request/response logs
  • Usage Summary: Cost tracking by model
Access via Settings → Monitoring → Analytics.