# XO GPT User Query Paraphrasing Model

The XO GPT User Query Paraphrasing model improves NLP accuracy by expanding and rephrasing user queries using conversation context. It resolves co-references and ambiguities before the query reaches downstream NLP components, enabling more accurate intent detection and entity recognition.

***

## Challenges with Commercial Models

| Challenge                  | Impact                                                                                          |
| -------------------------- | ----------------------------------------------------------------------------------------------- |
| **Latency**                | High processing times affect user experience in real-time or high-volume scenarios.             |
| **Cost**                   | Per-request pricing scales poorly for large deployments.                                        |
| **Data Governance**        | Sending queries to external models raises privacy and security concerns.                        |
| **Lack of Customization**  | General-purpose models aren't tuned for specific industries or use cases.                       |
| **Limited Control**        | Minimal ability to correct or refine model behavior for incorrect outputs.                      |
| **Compliance Constraints** | Some industries have regulatory requirements that commercial LLM providers don't fully support. |

***

## Key Assumptions

* Designed for text-based conversations only.
* Paraphrases the user query only when it refers back to details from earlier in the conversation. All other queries pass through unchanged.

***

## Benefits

<img src="https://mintcdn.com/koreai/eMSfxjuT2g-7-Hla/ai-for-service/generative-ai-tools/images/answer03.png?fit=max&auto=format&n=eMSfxjuT2g-7-Hla&q=85&s=3f72875c7590bbdd23820416e2fe2bd1" alt="XO GPT Benefits" width="1828" height="970" data-path="ai-for-service/generative-ai-tools/images/answer03.png" />

### Contextual Communication

Adapts user queries to the conversation context, enabling accurate intent interpretation for meaningful interactions. See [Model Benchmarks](#model-benchmarks) for performance insights.

### Cost-Effective

For Enterprise Tier customers, XO GPT eliminates commercial model usage costs. The table below estimates the annual cost of commercial models for a sample workload of 10,000 interactions per day, with 100 input tokens and 15 output tokens per interaction:

| Model       | Input \$/MTok | Output \$/MTok | Input \$/Year | Output \$/Year | Total \$/Year |
| ----------- | ------------- | -------------- | ------------- | -------------- | ------------- |
| GPT-4       | \$30          | \$60           | \$10,950      | \$3,285        | \$14,235      |
| GPT-4 Turbo | \$10          | \$30           | \$3,650       | \$1,643        | \$5,293       |
| GPT-4o Mini | \$0.15        | \$0.60         | \$54.75       | \$32.85        | \$87.60       |
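
The yearly figures follow directly from the workload assumptions and the per-million-token prices. The sketch below reproduces that arithmetic so you can substitute your own traffic numbers. The function and constant names are illustrative and are not part of any Kore.ai API.

```python
# Workload assumptions taken from the example above.
INPUT_TOKENS_PER_INTERACTION = 100
OUTPUT_TOKENS_PER_INTERACTION = 15
INTERACTIONS_PER_DAY = 10_000
DAYS_PER_YEAR = 365


def annual_cost(input_price_per_mtok, output_price_per_mtok):
    """Return (input, output, total) yearly cost in USD.

    Prices are expressed in USD per million tokens (MTok).
    """
    interactions = INTERACTIONS_PER_DAY * DAYS_PER_YEAR
    input_mtok = INPUT_TOKENS_PER_INTERACTION * interactions / 1_000_000
    output_mtok = OUTPUT_TOKENS_PER_INTERACTION * interactions / 1_000_000
    input_cost = input_mtok * input_price_per_mtok
    output_cost = output_mtok * output_price_per_mtok
    return input_cost, output_cost, input_cost + output_cost


# (input $/MTok, output $/MTok) pairs as listed in the table.
for in_price, out_price in [(30, 60), (10, 30), (0.15, 0.60)]:
    i, o, t = annual_cost(in_price, out_price)
    print(f"${in_price}/${out_price} per MTok: "
          f"input ${i:,.2f}, output ${o:,.2f}, total ${t:,.2f}")
```

The \$10/\$30 row works out to \$1,642.50 for output and \$5,292.50 in total, which the table rounds to the nearest dollar.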

### Enhanced Security

No client or user data is used for model retraining.

**Guardrails:** Content moderation, behavioral guidelines, response oversight, input validation, and usage controls.

**AI Safety:** Ethical guidelines, bias monitoring, transparency, and continuous improvement.

<Note>
  Performance, features, and language support may vary by implementation. Test thoroughly in your environment before production use.
</Note>

***

## Use Cases

| Domain               | Use Cases                                                                                                                      |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| Customer Support     | Simplify complex queries for accurate intent detection; remove ambiguous references; enable contextual continuity across turns |
| Healthcare           | Simplify complex patient inquiries; eliminate co-references for precise understanding of patient history or treatments         |
| Banking & Finance    | Clarify account action queries; simplify follow-up queries about previous interactions; clarify ambiguous product inquiries    |
| Education            | Clarify multi-part or context-heavy student queries; simplify questions about schedules or course content                      |
| Human Resources      | Clarify ambiguous HR questions about benefits or leave policies; rephrase workplace policy questions                           |
| Legal                | Simplify user queries about legal contracts or policies; clarify complex legal questions                                       |
| E-commerce           | Rephrase follow-up queries about orders or shipments; eliminate ambiguities in return/refund queries                           |
| Social Media         | Clarify questions about flagged content or platform policies; simplify account settings or privacy queries                     |
| IT Support           | Rephrase vague or context-dependent technical queries; eliminate co-references in user-reported issues                         |
| Travel & Hospitality | Clarify multi-part or ambiguous booking inquiries; simplify questions about changes to travel plans                            |

***

## Sample Output

**Conversation:**

```
User: Hi, can you help me select a University for studying Physics?
AI Agent: Sure, here are some top universities for Physics:
         1. Harvard University  2. MIT  3. Stanford  4. University of Cambridge.
         Which sounds best to you?
User: Which one is best in fee structure?
AI Agent: Generally, the most affordable undergraduate tuition for Physics is at Stanford.
User: Ok, I'll choose that one.
```

**Paraphrased query:**

> User: Ok, I will choose to apply at Stanford University for a Physics course.
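
When you evaluate paraphrasing in your own environment, it can help to store examples like this one as structured test cases. The sketch below is one possible representation, not the request format used by XO GPT. `ParaphraseCase` and `render` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ParaphraseCase:
    """Hypothetical fixture: conversation context, latest query, expected paraphrase."""
    history: List[Tuple[str, str]]  # (speaker, text) pairs, oldest first
    query: str                      # latest user message, may contain co-references
    expected: str                   # paraphrase with the references resolved


def render(case):
    """Render the history plus the latest query as 'Speaker: text' lines."""
    lines = [f"{speaker}: {text}" for speaker, text in case.history]
    lines.append(f"User: {case.query}")
    return "\n".join(lines)


case = ParaphraseCase(
    history=[
        ("User", "Hi, can you help me select a University for studying Physics?"),
        ("AI Agent", "Sure, here are some top universities for Physics: "
                     "1. Harvard University  2. MIT  3. Stanford  "
                     "4. University of Cambridge. Which sounds best to you?"),
        ("User", "Which one is best in fee structure?"),
        ("AI Agent", "Generally, the most affordable undergraduate tuition "
                     "for Physics is at Stanford."),
    ],
    query="Ok, I'll choose that one.",
    expected="Ok, I will choose to apply at Stanford University for a Physics course.",
)

print(render(case))
```

Here "that one" in the latest query only makes sense with the preceding turns, which is why the expected paraphrase names Stanford University and the Physics course explicitly.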

***

## Model Building Process

See [Model Building Process](/ai-for-service/generative-ai-tools/xogpt-model-specifications#model-building-process).

***

## Model Benchmarks

| Version | Accuracy | Tokens per Second (TPS) | Latency (s) | Benchmark                            | Test Data                                                                                                                                                                           |
| ------- | -------- | --- | ----------- | ------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| v1.0    | 97%      | 43  | 0.54        | [Summary v1](#benchmarks-summary-v1) | [Results v1](https://raw.githubusercontent.com/Koredotcom/docs-v2/refs/heads/main/ai-for-service/generative-ai-tools/test-date-and-results/xogpt-user-query-paraphrasing-v1.0.xlsx) |

***

## Version 1.0

### Model Choice

Base model: [Mistral 7B Instruct v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

| Base Model               | Developer  | Language      | Release Date | Status | Knowledge Cutoff |
| ------------------------ | ---------- | ------------- | ------------ | ------ | ---------------- |
| Mistral 7B Instruct v0.2 | Mistral AI | Multi-lingual | December 2023 | Static | Not published    |

### Fine-Tuning Parameters

| Parameter               | Description                               | Value                 |
| ----------------------- | ----------------------------------------- | --------------------- |
| Load in 4-bit Precision | Reduce memory by loading weights at 4-bit | True                  |
| Use Double Quantization | Quantize scaling constants to save memory | True                  |
| 4-bit Quantization Type | Type of 4-bit quantization                | nf4                   |
| Computation Data Type   | Data type used during computation         | torch.float16         |
| LoRA Rank               | Rank of low-rank decomposition            | 32                    |
| LoRA Alpha              | LoRA scaling factor                       | 16                    |
| LoRA Dropout Rate       | Dropout to prevent overfitting            | 0.05                  |
| Bias Term Inclusion     | Add bias terms in LoRA layers             | —                     |
| Task Type               | LoRA task type                            | CAUSAL\_LM            |
| Targeted Modules        | Layers where LoRA is applied              | `["query_key_value"]` |
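
The parameters above correspond to keyword arguments commonly passed to Hugging Face `BitsAndBytesConfig` and PEFT `LoraConfig`. The sketch below records them as plain dictionaries and derives two quantities that follow from them. It assumes the default LoRA scaling of `lora_alpha / r`; the 4096 x 4096 layer shape is illustrative only, and the actual training script is not shown in this document.

```python
# Keyword names mirror BitsAndBytesConfig (transformers); values come from the table.
quantization_kwargs = {
    "load_in_4bit": True,
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",  # pass torch.float16 in a real config
}

# Keyword names mirror LoraConfig (peft); values come from the table.
# The bias setting is omitted because the table does not list a value for it.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",
    "target_modules": ["query_key_value"],  # reproduced exactly as listed
}


def lora_scaling(r, lora_alpha):
    """Factor applied to the low-rank update under the default scaling rule."""
    return lora_alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters added to one adapted linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


print(lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"]))
print(lora_param_count(4096, 4096, lora_kwargs["r"]))  # illustrative layer shape
```

With rank 32 and alpha 16, the low-rank update is scaled by 0.5. Module names in `target_modules` depend on the model architecture, so confirm them against the loaded model before reusing this configuration.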

### General Parameters

Infrastructure: NVIDIA A10G GPU (AWS g5.xlarge).

| Parameter           | Description                | Value              |
| ------------------- | -------------------------- | ------------------ |
| Learning Rate       | Step size for each update  | 2e-4 (0.0002)      |
| Batch Size          | Examples per training step | 2                  |
| Epochs              | Passes over training data  | 4                  |
| Max Sequence Length | Max input length in tokens | 32k                |
| Optimizer           | Optimization algorithm     | paged\_adamw\_8bit |
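
To get a feel for what these settings imply, the sketch below maps them to the corresponding Hugging Face `TrainingArguments` keyword names and estimates the number of optimizer steps for a given dataset size. It assumes a single GPU with no gradient accumulation, and reads "32k" as 32,768 tokens. The dataset size of 1,000 examples is hypothetical, since the training set size is not stated here.

```python
import math

# Keyword names mirror TrainingArguments (transformers); values come from the table.
training_kwargs = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 2,
    "num_train_epochs": 4,
    "optim": "paged_adamw_8bit",
}

# Assumption: "32k" in the table means 32 * 1024 tokens.
MAX_SEQ_LENGTH = 32 * 1024


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps on one device without gradient accumulation.

    The last batch of each epoch may be smaller, hence the ceiling.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset size, used only to illustrate the arithmetic.
steps = total_optimizer_steps(
    num_examples=1_000,
    batch_size=training_kwargs["per_device_train_batch_size"],
    epochs=training_kwargs["num_train_epochs"],
)
print(steps)
```

A batch size of 2 keeps memory use low on a single 24 GB GPU, at the cost of more optimizer steps per epoch.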

### Benchmarks Summary v1

Comparison models: Flan-T5, GPT-4.

<img src="https://mintcdn.com/koreai/s3bkaKmzowgJ31et/ai-for-service/generative-ai-tools/images/user01.png?fit=max&auto=format&n=s3bkaKmzowgJ31et&q=85&s=b29239d43263d709f9af2929f12b57ba" alt="Benchmarks Summary v1" width="1920" height="1080" data-path="ai-for-service/generative-ai-tools/images/user01.png" />

See [Test Data and Results v1](https://raw.githubusercontent.com/Koredotcom/docs-v2/refs/heads/main/ai-for-service/generative-ai-tools/test-date-and-results/xogpt-user-query-paraphrasing-v1.0.xlsx) for full details.