# Prompts and LLM Configuration

> Configure LLM providers, model parameters, and prompt templates for agents in the AgenticAI Core SDK.

Use `LlmModel` and `Prompt` to control which model your agent uses, how it generates responses, and what instructions it follows.

## Prerequisites

* AgenticAI Core SDK installed and configured.
* A valid connection configured for your LLM provider (OpenAI, Anthropic, or Azure OpenAI).

## Configure the LLM model

### Basic configuration

```python theme={null}
from agenticai_core.designtime.models.llm_model import LlmModel, LlmModelConfig

llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="Default Connection",
    max_timeout="60 Secs",
    max_iterations="25",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        top_p=1.0,
        frequency_penalty=0.0,
        presence_penalty=0.0
    )
)
```

### Builder pattern

Use `LlmModelBuilder` and `LlmModelConfigBuilder` for a fluent configuration style:

```python theme={null}
from agenticai_core.designtime.models.llm_model import (
    LlmModel, LlmModelBuilder, LlmModelConfig, LlmModelConfigBuilder
)

# Build config
config_dict = LlmModelConfigBuilder() \
    .set_temperature(0.7) \
    .set_max_tokens(1600) \
    .set_top_p(0.9) \
    .build()

config = LlmModelConfig(**config_dict)

# Build model
llm_dict = LlmModelBuilder() \
    .set_model("gpt-4o") \
    .set_provider("Open AI") \
    .set_connection_name("Default") \
    .set_model_config(config) \
    .build()

llm = LlmModel(**llm_dict)
```

## Supported providers

### OpenAI

```python theme={null}
llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="OpenAI Connection",
    modelConfig=LlmModelConfig(
        temperature=0.7,
        max_tokens=1600,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        top_p=1.0
    )
)
```

### Anthropic (Claude)

```python theme={null}
llm = LlmModel(
    model="claude-3-5-sonnet-20240620",
    provider="Anthropic",
    connection_name="Anthropic Connection",
    modelConfig=LlmModelConfig(
        temperature=1.0,
        max_tokens=1024,
        top_p=0.7,
        top_k=5  # Anthropic-specific
    )
)
```

### Azure OpenAI

```python theme={null}
llm = LlmModel(
    model="gpt-4",
    provider="Azure OpenAI",
    connection_name="Azure Connection",
    modelConfig=LlmModelConfig(
        temperature=0.8,
        max_tokens=2048
    )
)
```

## LLM parameters

### Temperature (0.0–2.0)

Controls output randomness. Lower values produce more predictable responses; higher values produce more varied ones.

| Range   | Behavior                    | Use for                           |
| ------- | --------------------------- | --------------------------------- |
| 0.0–0.3 | Near-deterministic, focused | Factual queries, data extraction  |
| 0.4–0.7 | Balanced                    | General-purpose agents            |
| 0.8–1.5 | Creative, diverse           | Brainstorming, content generation |
| 1.6–2.0 | Highly random               | Experimental use cases            |

```python theme={null}
# Factual task
config = LlmModelConfig(temperature=0.1)

# Balanced
config = LlmModelConfig(temperature=0.7)

# Creative
config = LlmModelConfig(temperature=1.2)
```
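
To see why lower values are more predictable, the following sketch applies a temperature to a made-up set of token scores. It illustrates the general technique (dividing the scores by the temperature before a softmax) and is not the SDK's or any provider's internal code; `softmax_with_temperature` and the scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Providers typically treat 0 as "always pick the top token";
        # this sketch only handles positive values.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for t in (0.1, 0.7, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.1 almost all of the probability lands on the highest-scoring token, while at 1.2 it is spread across all three.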

### Max tokens

Sets the maximum number of tokens the model generates per response.

| Response type      | Recommended range |
| ------------------ | ----------------- |
| Short answers      | 500–1000          |
| Detailed responses | 1000–2000         |
| Long-form content  | 2000–4000         |

```python theme={null}
config = LlmModelConfig(
    max_tokens=1600  # Moderate response length
)
```
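
When choosing a limit, it can help to estimate how many tokens a typical response needs. The sketch below uses the common rule of thumb of about four characters per token for English text. That ratio is an assumption, it varies by model and language, and `estimate_tokens` and `fits_budget` are hypothetical helpers rather than SDK functions. Use the provider's tokenizer when you need exact counts.

```python
import math

# Rough rule of thumb for English text; an assumption, not an SDK constant.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Return a rough token estimate based on the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_budget(text, max_tokens):
    """Check whether the estimated token count stays within max_tokens."""
    return estimate_tokens(text) <= max_tokens


sample_answer = "Your checking account balance is $1,250.00. " * 20
print(estimate_tokens(sample_answer), fits_budget(sample_answer, 500))
```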

### Top P (0.0–1.0)

Nucleus sampling parameter: the model samples only from the smallest set of most likely tokens whose cumulative probability reaches `top_p`.

* **0.1–0.5**: Focused, less diverse sampling.
* **0.6–0.9**: Balanced diversity.
* **0.95–1.0**: Maximum diversity.

```python theme={null}
config = LlmModelConfig(top_p=0.9)
```
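
The effect of `top_p` on the candidate pool can be sketched as follows. This is a minimal illustration of nucleus sampling over a made-up distribution, not the provider's implementation; `nucleus_filter` and the probabilities are hypothetical.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the survivors."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1] in this sketch")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Made-up next-token distribution.
probs = {"the": 0.5, "a": 0.3, "this": 0.15, "zebra": 0.05}

print(nucleus_filter(probs, 0.5))  # only the top token survives
print(nucleus_filter(probs, 0.9))  # the unlikely tail is dropped
```

Lower values shrink the pool to the few most likely tokens; a value of 1.0 leaves the full distribution in play.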

### Penalties (−2.0 to 2.0)

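Reduce repetition in responses. Positive values discourage repetition; negative values encourage it.

* `frequency_penalty`: Penalizes a token in proportion to how many times it has already appeared in the output.
* `presence_penalty`: Applies a flat penalty to any token that has already appeared, which encourages the model to introduce new topics.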

```python theme={null}
config = LlmModelConfig(
    frequency_penalty=0.5,  # Penalize frequent tokens
    presence_penalty=0.3    # Encourage topic diversity
)
```
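
The difference between the two penalties is easier to see in code. The sketch below follows the adjustment OpenAI describes in its API documentation (subtract `count * frequency_penalty` plus a flat `presence_penalty` from the score of any token that has already been generated). Whether other providers apply the same formula is not covered here, and `apply_penalties` and the scores are hypothetical.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that already appear in the output."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts[token]
        adjusted[token] = (
            logit
            - count * frequency_penalty                        # grows with each repeat
            - (1.0 if count > 0 else 0.0) * presence_penalty   # flat, applied once
        )
    return adjusted


logits = {"loan": 2.0, "rate": 1.5, "account": 1.0}  # made-up scores
history = ["loan", "loan", "rate"]                   # tokens generated so far

print(apply_penalties(logits, history,
                      frequency_penalty=0.5, presence_penalty=0.3))
```

The token that has not appeared yet keeps its original score, so it becomes relatively more likely than the repeated ones.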

## Configure prompts

### System prompt

Sets the base role for the agent:

```python theme={null}
from agenticai_core.designtime.models.prompt import Prompt

prompt = Prompt(
    system="You are a helpful assistant."
)
```

### Custom prompt

Provides detailed instructions and context beyond the system role:

```python theme={null}
prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an intelligent banking assistant designed to help
    customers manage their financial needs efficiently and securely.

    ## Your Capabilities
    - Check account balances
    - Process transactions
    - Answer banking policy questions
    - Provide loan information

    ## Customer Context
    You have access to:
    {{memory.accountInfo.accounts}}

    Use this information for quick responses.
    """
)
```

### Instructions

Pass structured rules as a list. Use instructions for compliance, tone, and handling guidelines — especially for sensitive domains:

```python theme={null}
prompt = Prompt(
    system="You are a banking assistant.",
    custom="Help customers with account management.",
    instructions=[
        """### Security Protocols
        - Never ask for passwords, PINs, or CVV numbers
        - If a request seems suspicious, politely decline""",

        """### Speaking Style
        - Use natural, conversational language
        - Keep responses concise
        - Provide key information first""",

        """### Handling Requests
        1. Greet the customer warmly
        2. Identify their need
        3. Execute the request efficiently
        4. Summarize and ask if anything else is needed"""
    ]
)
```

**Security guidance**: Always include a security instruction block for apps that handle sensitive data:

```python theme={null}
instructions=[
    """### Security
    - Never ask for passwords, PINs, CVV, or OTPs
    - Verify unusual requests
    - Escalate suspicious activity"""
]
```

**Voice agent guidance**: For voice or audio agents, add a speaking style instruction:

```python theme={null}
instructions=[
    """### Speaking Style
    - Use natural, conversational language
    - Avoid markdown formatting
    - Speak numbers clearly
    - Use pauses with commas
    - Keep responses concise"""
]
```
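
Triple-quoted strings keep the indentation of the surrounding source code, so every line after the first in the blocks above starts with extra spaces. This page does not say whether the SDK normalizes that whitespace. If you want to strip it yourself, a minimal sketch using the standard library's `inspect.cleandoc` looks like this (`instruction_block` is a hypothetical helper, not part of the SDK):

```python
import inspect


def instruction_block(text):
    """Strip source-code indentation from a triple-quoted instruction string."""
    return inspect.cleandoc(text)


instructions = [
    instruction_block("""### Security
        - Never ask for passwords, PINs, CVV, or OTPs
        - Verify unusual requests
        - Escalate suspicious activity"""),
    instruction_block("""### Speaking Style
        - Use natural, conversational language
        - Keep responses concise"""),
]

for block in instructions:
    print(block)
    print("---")
```

The resulting list of plain strings has the same shape as the `instructions` lists shown above.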

## Template variables

Prompts support runtime variable substitution using `{{variable}}` syntax:

| Variable                 | Description                 |
| ------------------------ | --------------------------- |
| `{{app_name}}`           | Application name.           |
| `{{app_description}}`    | Application description.    |
| `{{agent_name}}`         | Current agent name.         |
| `{{memory.store.field}}` | Access memory store data.   |
| `{{session_id}}`         | Current session identifier. |

```python theme={null}
prompt = Prompt(
    custom="""You are acting as {{agent_name}} for the application "{{app_name}}".

    Application Description:
    {{app_description}}

    Customer Account Information:
    {{memory.accountInfo.accounts}}

    Use the above context to provide quick, accurate responses.
    """
)
```
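
The platform performs this substitution at runtime. To preview roughly what a rendered prompt might look like, here is a minimal local sketch of `{{variable}}` substitution with dotted-path lookup. It is not the SDK's renderer: `render_template`, the context values, and the choice to leave unknown placeholders untouched are all assumptions made for illustration.

```python
import re

# Matches {{name}} and {{dotted.path}}, allowing spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render_template(template, context):
    """Replace {{dotted.path}} placeholders with values from a nested dict."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return match.group(0)  # leave unknown placeholders untouched
            value = value[key]
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


# Hypothetical runtime values.
context = {
    "app_name": "Banking App",
    "agent_name": "AccountAgent",
    "memory": {"accountInfo": {"accounts": ["checking", "savings"]}},
}

template = (
    'You are acting as {{agent_name}} for "{{app_name}}". '
    "Accounts: {{memory.accountInfo.accounts}}. Session: {{session_id}}"
)

print(render_template(template, context))
```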

## Orchestrator prompts

For supervisor or orchestrator agents, define routing rules in the custom prompt:

```python theme={null}
supervisor_prompt = Prompt(
    system="You are a helpful assistant.",
    custom="""You are an AI Supervisor for "{{app_name}}".

    ### Your Team
    You manage multiple workers:
    - BillingAgent: Handles payments and billing
    - SupportAgent: General customer support
    - TechnicalAgent: Technical issues

    ### Routing Rules
    1. **Small-talk**: Route to user with friendly response
    2. **Direct Routing**: Match requests to worker expertise
    3. **Follow-up**: Route responses to same worker
    4. **Route to user**: When unrelated or complete
    5. **Multi-Intent**: Break into sequential requests
    """
)
```
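
If the set of workers changes often, the team listing can be generated from data so the prompt stays in sync with the agents you register. The sketch below only builds the string; `build_team_section` is a hypothetical helper, and the result would be passed as the `custom` argument of `Prompt` as in the example above.

```python
def build_team_section(workers):
    """Format a name -> description mapping as the '### Your Team' block."""
    lines = ["### Your Team", "You manage multiple workers:"]
    for name, description in workers.items():
        lines.append(f"- {name}: {description}")
    return "\n".join(lines)


workers = {
    "BillingAgent": "Handles payments and billing",
    "SupportAgent": "General customer support",
    "TechnicalAgent": "Technical issues",
}

# {{app_name}} is left as a template variable for runtime substitution.
custom_prompt = (
    'You are an AI Supervisor for "{{app_name}}".\n\n'
    + build_team_section(workers)
)

print(custom_prompt)
```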

## Task-specific configurations

Match your `LlmModelConfig` to the nature of the agent's task:

**Factual tasks** — use low temperature for consistent, accurate responses:

```python theme={null}
LlmModelConfig(
    temperature=0.1,  # Low for consistency
    max_tokens=800
)
```

**Creative tasks** — use higher temperature for varied output:

```python theme={null}
LlmModelConfig(
    temperature=1.0,  # Higher for creativity
    max_tokens=2000
)
```

**Balanced (general-purpose)**:

```python theme={null}
LlmModelConfig(
    temperature=0.7,
    max_tokens=1600,
    top_p=0.9
)
```
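
These presets can be kept in one place and checked against the parameter ranges listed earlier on this page. The following is a sketch under assumptions: `TASK_PRESETS`, `PARAMETER_RANGES`, and `config_kwargs` are hypothetical names, and the range check runs locally rather than reflecting any validation the SDK itself performs.

```python
# Ranges as documented in the "LLM parameters" section.
PARAMETER_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}

# Values taken from the task-specific examples above.
TASK_PRESETS = {
    "factual": {"temperature": 0.1, "max_tokens": 800},
    "creative": {"temperature": 1.0, "max_tokens": 2000},
    "balanced": {"temperature": 0.7, "max_tokens": 1600, "top_p": 0.9},
}


def config_kwargs(task, **overrides):
    """Return keyword arguments for a task preset, with range checks."""
    if task not in TASK_PRESETS:
        raise KeyError(f"unknown task preset: {task!r}")
    params = {**TASK_PRESETS[task], **overrides}
    for name, value in params.items():
        if name in PARAMETER_RANGES:
            low, high = PARAMETER_RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low} to {high}")
    if params.get("max_tokens", 1) <= 0:
        raise ValueError("max_tokens must be positive")
    return params


print(config_kwargs("balanced"))
print(config_kwargs("factual", max_tokens=500))
```

Assuming `LlmModelConfig` accepts these keyword names as in the examples on this page, the result can be unpacked directly, for example `LlmModelConfig(**config_kwargs("balanced"))`.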

## Optimization tips

**Cost**

* Use smaller models for simple, repetitive tasks.
* Set `max_tokens` to the minimum needed for the expected response length.
* Set `max_iterations` to limit unnecessary tool calls.
* Set `max_timeout` to a reasonable value to avoid runaway sessions.

**Quality**

* Use the latest model versions for your provider.
* Increase `max_tokens` when detailed responses are required.
* Lower temperature for tasks that require consistency.
* Increase `max_iterations` for complex multi-step workflows.

## Related resources

* [LLM Model API Reference](/agent-platform/sdk/api-reference#llmmodel)
* [Prompt API Reference](/agent-platform/sdk/api-reference#prompt)
* [Agent API Reference](/agent-platform/sdk/api-reference#agent)
* [Creating Agents](/agent-platform/sdk/guide/creating-agents)
