# Evaluation Studio

Evaluation Studio is a unified workspace for evaluating AI system performance across two areas: **Model Evaluation** and **Agentic Evaluation**. It enables users to systematically assess both the quality of large language model (LLM) outputs and the end-to-end behavior of agentic applications in real-world scenarios.

```
Define Criteria → Load Test Data → Run Evaluations → Analyze Results → Iterate
```

***

## Evaluation Types

### Model Evaluation

Model Evaluation enables you to assess LLM performance using configurable quality and safety metrics. Organize datasets into projects and evaluations, apply built-in or custom evaluators, and analyze results through visual scoring and collaborative workflows.

**Key capabilities:**

* Upload datasets with input-output pairs, or generate outputs from a deployed model.
* Apply Quality, Safety, and RAGAS evaluators, or create custom evaluators tailored to your needs.
* Integrate external outputs via Run a Prompt, Run an API, or Run Search AI.
* Analyze model effectiveness through visual scoring, pass/fail thresholds, and exportable results.

**Use when**: Fine-tuning, comparing, or validating models before or after deployment.
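To make the ideas of input-output pairs and pass/fail thresholds concrete, here is a minimal Python sketch. It is not Evaluation Studio's API: the `Row` fields, the `expected` reference column, the `token_overlap` evaluator, and the `0.5` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One dataset row: a model input paired with the output to be scored."""
    input_text: str
    output_text: str
    expected: str  # hypothetical reference column used by this toy evaluator


def token_overlap(output_text: str, expected: str) -> float:
    """Toy custom evaluator: fraction of expected tokens found in the output."""
    expected_tokens = set(expected.lower().split())
    if not expected_tokens:
        return 0.0
    output_tokens = set(output_text.lower().split())
    return len(expected_tokens & output_tokens) / len(expected_tokens)


def evaluate(rows: list[Row], threshold: float) -> list[dict]:
    """Score each row and mark it pass/fail against the threshold."""
    results = []
    for row in rows:
        score = token_overlap(row.output_text, row.expected)
        results.append(
            {"input": row.input_text, "score": score, "passed": score >= threshold}
        )
    return results


# Two illustrative rows; the data is made up for this sketch.
rows = [
    Row("Capital of France?", "Paris is the capital", "Paris"),
    Row("Capital of Japan?", "I am not sure", "Tokyo"),
]
results = evaluate(rows, threshold=0.5)  # 0.5 is an assumed threshold
pass_rate = sum(r["passed"] for r in results) / len(results)
```

In this sketch the threshold converts a continuous score into a pass/fail label per row, and the pass rate summarizes the whole dataset. The built-in Quality, Safety, and RAGAS evaluators may compute scores very differently.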

[Learn more →](/agent-platform/evaluation/model-evaluation)

### Agentic Evaluation

Agentic Evaluation assesses how effectively an agentic app performs in both production and pre-production environments. Import real session data from deployed apps or generate simulated sessions using Personas and Test Scenarios to validate behavior before go-live.

**Key capabilities:**

* Import app sessions and traces from production or simulations.
* Generate simulated sessions using Personas and Test Scenarios to test agent behavior before deployment.
* Run multi-level evaluations across sessions and traces to assess goal achievement, workflow adherence, and tool usage.
* Analyze inputs and outputs across supervisors, agents, and tools to uncover coordination issues and optimization opportunities.

**Use when**: Pre-deployment validation, debugging agentic workflows, or ongoing production quality monitoring.
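The multi-level idea described above (one score per session, plus finer-grained scores per trace) can be sketched in Python as follows. This is an illustration under assumptions, not the platform's data model: the `Session` and `Trace` classes, their fields, and both report functions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """One step inside a session (hypothetical shape)."""
    component: str   # e.g. "supervisor", "agent", or "tool"
    name: str
    succeeded: bool


@dataclass
class Session:
    """One end-to-end conversation with an agentic app (hypothetical shape)."""
    session_id: str
    goal_achieved: bool
    traces: list[Trace] = field(default_factory=list)


def trace_level_report(session: Session) -> dict[str, float]:
    """Trace level: success rate per component type within one session."""
    counts: dict[str, int] = {}
    successes: dict[str, int] = {}
    for trace in session.traces:
        counts[trace.component] = counts.get(trace.component, 0) + 1
        successes[trace.component] = (
            successes.get(trace.component, 0) + int(trace.succeeded)
        )
    return {component: successes[component] / counts[component] for component in counts}


def session_level_report(sessions: list[Session]) -> float:
    """Session level: fraction of sessions in which the goal was achieved."""
    if not sessions:
        return 0.0
    return sum(s.goal_achieved for s in sessions) / len(sessions)


# Made-up sessions for illustration only.
sessions = [
    Session("s1", True, [
        Trace("supervisor", "router", True),
        Trace("tool", "lookup_order", True),
    ]),
    Session("s2", False, [
        Trace("agent", "billing", True),
        Trace("tool", "lookup_order", False),
        Trace("tool", "refund", True),
    ]),
]
goal_rate = session_level_report(sessions)
s2_breakdown = trace_level_report(sessions[1])
```

Looking at both levels together is what helps locate a problem: in this made-up data, session `s2` misses its goal, and the trace-level breakdown points to a failed tool call rather than to the agent.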

[Learn more →](/agent-platform/evaluation/agentic-evaluation)

***

## Accessing Evaluation Studio

1. [Log in](/agent-platform/getting-started) to your Platform account.
2. From the modules menu, select **Evaluation Studio**.
3. On the Evaluation page, select **Model Evaluation** or **Agentic Evaluation** to begin.

   <img src="https://mintcdn.com/koreai/3WP-UBS_WnsMhm49/agent-platform/evaluation/images/eval_studio_page.png?fit=max&auto=format&n=3WP-UBS_WnsMhm49&q=85&s=34644f837642fb16617d72910f72c43e" alt="Evaluation Studio Page" width="1545" height="817" data-path="agent-platform/evaluation/images/eval_studio_page.png" />
