The Runs page is a central observability hub within your project. It displays all telemetry data captured from your connected AI agents, so you can monitor real-time agent activity, inspect individual sessions and traces, debug failures, and curate datasets for evaluation.
Navigation: Select a Project, then go to Observability → Runs in the left sidebar.
Overview
| Section | Purpose |
|---|---|
| Key Capabilities | What you can do on the Runs page |
| Sessions, Traces, and Spans | The three-level telemetry hierarchy |
| Streaming and Paused Mode | How live data and frozen views differ |
| The Sessions Grid | The main data grid and its columns |
| Inspect Session Details | Drill into I/O, logs, policies, and metadata |
| Traces and Spans | Navigate the session hierarchy |
| Filter Runs | Narrow down sessions by time, status, and metadata |
| Create Datasets | Group sessions for evaluation and regression testing |
| Common Use Cases | Real-world workflows on the Runs page |
Key Capabilities
- Stream incoming telemetry in real time with automatic updates.
- Drill down from Sessions → Traces → Spans for granular inspection.
- Filter data by time range, status, evaluation results, metadata, and natural language queries.
- Create static or dynamic datasets directly from sessions for evaluation and regression testing.
- Visualize trace timelines in a List or Waterfall view with color-coded span types.
Sessions, Traces, and Spans
The Runs page organizes telemetry data into a three-level hierarchy following the OpenTelemetry standard.
| Concept | Definition | Example |
|---|---|---|
| Session | A collection of related traces representing a complete user interaction or conversation | A multi-turn customer support conversation |
| Trace | A single agent workflow from input to output within a session. Each trace represents one request-response cycle | A user asks, “What is my account balance?” and the agent responds |
| Span | An individual operation within a trace. Spans nest hierarchically to form a tree structure | An LLM API call, tool invocation, or database query within a single trace |
A session contains one or more traces. Each trace contains one or more spans. You can drill down through each level to inspect execution details, timing, inputs, outputs, and errors.
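As a conceptual aid, the hierarchy above can be sketched in plain Python. This is a minimal, dependency-free model; the class names, fields, and example values are illustrative and are not the platform's actual SDK types.

```python
# Illustrative model of the Session -> Trace -> Span hierarchy.
# Names and values are hypothetical, for explanation only.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Span:
    name: str                                  # e.g. an LLM call or DB query
    duration_ms: float = 0.0
    children: List["Span"] = field(default_factory=list)  # spans nest as a tree

@dataclass
class Trace:
    input: str                                 # one request-response cycle
    output: str
    root_span: Optional[Span] = None

@dataclass
class Session:
    session_id: str                            # a complete conversation
    traces: List[Trace] = field(default_factory=list)

# One conversational turn: a trace with nested spans for each operation.
llm = Span("llm_call", 820.0)
tool = Span("balance_lookup", 45.0, children=[Span("db_query", 12.0)])
turn = Trace(
    input="What is my account balance?",
    output="Your balance is $1,024.50.",
    root_span=Span("agent_run", 900.0, children=[llm, tool]),
)
session = Session("sess-123", traces=[turn])
```

Drilling down on the Runs page mirrors walking this structure: from `session` to `session.traces[0]` to the span tree under `root_span`.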
Streaming and Paused Mode
The Runs page defaults to Streaming mode, where sessions update automatically. The streaming indicator in the top-right corner displays the current state and how recently the data was refreshed (for example, “Streaming · Updated 6s ago”).
Click the Streaming indicator to pause the data feed. While paused:
- The data grid freezes at the current point in time.
- A counter shows how many new sessions have arrived since you paused (for example, +1, +2, +3).
- Advanced filtering, bulk selection, and actions such as Save become available.
Click the indicator again to resume streaming. Any sessions that arrived while paused load into the view.
You can apply filters and perform bulk actions only while the stream is paused.
Streaming vs. Paused Mode
| Feature | Streaming Mode | Paused Mode |
|---|---|---|
| Data updates | Auto-refresh (real-time) | Static (frozen at the point of pause) |
| Search bar | Disabled | Enabled |
| Time range picker | Disabled | Enabled (defaults to Last 30 days) |
| Row checkboxes | Hidden | Visible |
| Save button | Disabled | Enabled (when rows are selected) |
| Policies tab in Detail Panel | Enabled | Enabled |
The Sessions Grid
The main area of the Runs page displays a data grid listing all sessions within the project.
Click Columns in the top-right corner of the grid to open the Toggle columns panel. Select or clear checkboxes to show or hide columns. Available columns include:
ID, Start Time, Last Updated Time, Duration, View Traces, Policies, Cost, Input Tokens, Output Tokens, Avg Latency, and PII.
Inspect Session Details
Click any session row in the data grid to open a detail panel with a complete breakdown of everything that happened during that session. The panel header displays the Session ID (with a copy icon), Latency, and Total Cost.
The detail panel has two areas:
- Timeline Visualization (left) — Displays all traces, agents, and spans that executed during the session. Toggle between List view (a vertical list with durations) and Waterfall view (a Gantt chart showing start times and durations relative to the root session). Click any item in the timeline to load its details on the right.
- Data Tabs (right) — Four tabs organize the session data: I/O, Log View, Policies, and Metadata. The content updates based on the item you select in the timeline.
I/O
Displays the execution hierarchy as interactive cards. Each card shows the item name, type (session, trace, agent, or chat), duration, cost, and ID. Click any card to navigate deeper into that item.
Log View
Displays the structured telemetry data for the selected item. Toggle between Formatted view (human-readable key-value layout) and JSON view (raw telemetry structure as sent by the SDK). Key fields include:
| Field Group | Fields |
|---|---|
| Identifiers | session_id, type, status |
| Performance | latency_ms, total_cost_usd |
| Tokens | total_input_tokens, total_output_tokens |
| Custom | metadata (expandable object containing custom keys) |
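Putting the field groups above together, a session payload in JSON view might look along these lines (values are illustrative, and the exact shape depends on your SDK version):

```json
{
  "session_id": "sess-123",
  "type": "session",
  "status": "success",
  "latency_ms": 1840,
  "total_cost_usd": 0.0042,
  "total_input_tokens": 512,
  "total_output_tokens": 128,
  "metadata": {
    "customer_tier": "premium",
    "channel": "web"
  }
}
```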
Policies
Displays the evaluation results for all policies applied to the session. Each policy appears as a card showing:
- Policy name and status badge (Pass, Fail, or Inconclusive).
- Version, severity level, and category.
- A Metrics Evaluated table with the metric name, threshold, and actual value.
- Evaluation timestamp.
Metadata
Organizes all attributes associated with the selected item into collapsible sections:
| Section | Contents |
|---|---|
| METADATA | Session ID, Type, Duration, and Total Cost |
| TOKEN USAGE | Input and output token counts |
| TIMING | Timing-related attributes |
| CONTEXT | Contextual information passed with the session |
Traces and Spans
The Runs page lets you navigate through the session hierarchy to isolate exactly where an issue occurred. This drill-down workflow helps you trace a failure from the session level all the way to the specific span — such as a failed LLM call or a timed-out tool invocation — that caused the issue.
- From the Sessions grid, click the View Traces link on any row to see all traces within that session.
- From the Traces list, click View Spans on any trace to see the individual operations that executed within it.
- Use the Back button at each level to navigate up to the previous level.
Filter Runs
The Runs page provides several ways to narrow down data so you can focus on the sessions that matter most. Filtering is available only in Paused mode, so pause the stream before applying filters.
Click the Filters button to open the Filter runs panel:
| Filter | Description |
|---|---|
| Time Range | Select a preset (Lifetime, Last 15 minutes, Last hour, Last 24 hours, Last 7 days, Last 30 days) or define a Custom range. A histogram above the grid shows session distribution across the period |
| Session Status | Filter by Success, Failure, or In Progress |
| Evaluation Status | Filter by Pass or Fail |
| Policy Name | Select a specific policy to filter sessions evaluated by that policy |
| Input Tokens | Enter min/max values to filter sessions by input token count |
| Avg Latency | Enter min/max values to filter sessions by average latency |
Click Apply filters to apply your selections, or Reset to clear all active filters.
Natural Language Search
In Paused mode, use the Search telemetry bar to type a natural language query. The platform parses your input and maps it to structured filters automatically.
Examples:
- duration greater than 15 seconds
- errors in the last hour
- payment_failed
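To make the "parses your input and maps it to structured filters" step concrete, here is a toy sketch of that idea. This is not the platform's parser; the filter keys are hypothetical, though the status values mirror the Session Status filter described above.

```python
# Toy natural-language-to-filter mapping (illustrative only; the
# platform's actual parser and filter schema are not documented here).
import re

def parse_query(q: str) -> dict:
    filters = {}
    # "duration greater than N seconds" -> a minimum-duration filter
    m = re.search(r"duration greater than (\d+) seconds", q)
    if m:
        filters["min_duration_ms"] = int(m.group(1)) * 1000
    # "errors" -> filter on failed sessions
    if "errors" in q:
        filters["session_status"] = "Failure"
    # "last hour" -> a time-range preset
    if "last hour" in q:
        filters["time_range"] = "Last hour"
    return filters

parse_query("duration greater than 15 seconds")  # -> {'min_duration_ms': 15000}
```

The real implementation is more capable, but the principle is the same: free-form text becomes the same structured criteria you could set manually in the Filter runs panel.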
Save Quick Filters
When you find yourself applying the same combination of filters repeatedly, save them as a Quick Filter for one-click reuse.
- Apply your desired filters.
- Click Save as Quick Filter at the bottom of the filter panel.
- Enter a name (for example, “Last 24h Failures”) and save.
The saved filter appears as a pill below the search bar. Click any quick filter pill to instantly apply all its associated settings.
Create Datasets
Datasets let you group sessions for focused analysis, evaluation, or regression testing. You can create them directly from the Runs page without leaving your workflow.
| Type | Description |
|---|---|
| Static | Fixed collection built by manually selecting sessions. You can add more sessions over time |
| Auto-update | Defined by filter criteria. The platform automatically adds all matching sessions, including new ones that arrive later |
| Static-Simulated | Created automatically when you trigger a simulation, saving those sessions as a dataset |
Create a Static Dataset
- Pause the stream.
- Select one or more sessions using the row checkboxes.
- Click the Save dropdown in the header bar and select Save Selection to Dataset.
- Choose an existing dataset or create a new one by providing a name and description.
To add more sessions later, repeat the same steps and select the same dataset as the destination.
Create an Auto-update Dataset
- Apply the filters that define the sessions you want to track (for example, Status = Error, Time Range = Last 7 Days).
- Click the Save dropdown and select Save current filters as Auto-update Dataset.
- Review the active filters in the confirmation modal, then provide a name and description.
The platform continuously adds new sessions matching the criteria. This is useful for tracking trends — for example, monitoring whether error rates decrease after deploying a fix.
You cannot manually add or remove sessions from an auto-update dataset. The filter criteria control its contents entirely.
Access all your datasets by navigating to Evaluations → Datasets in the left sidebar.
Common Use Cases
| Use Case | Workflow |
|---|---|
| Spot errors in real time | Keep the Runs page in Streaming mode. Watch for sessions with failure status indicators in the Policies column. When you notice a spike, pause the stream and drill down into affected sessions to inspect traces, spans, and error details |
| Debug a slow agent workflow | Pause the stream and filter for high-latency sessions. Open a session’s Detail Panel and switch to Waterfall view to identify which spans consume the most time. Check the Token Usage section on the Metadata tab to review token consumption |
| Validate a new agent version | After deploying a new version to staging, keep the Runs page in Streaming mode. Pause after collecting sufficient data. Filter by error status and review the failure rate. Save the filtered results as an auto-update dataset to track improvements over time |
| Build a regression test dataset | Pause the stream. Apply filters to select representative sessions. Save the selection as a static dataset. Use this dataset to run evaluations whenever you update an agent version or policy |
| Monitor policy compliance | Create an auto-update dataset with filters for failed evaluation status. The platform automatically captures all non-compliant sessions. Review the dataset periodically and drill into sessions to understand root causes using the Policies tab |