XO GPT User Query Paraphrasing Model

The XO GPT User Query Paraphrasing model improves NLP accuracy by expanding and rephrasing user queries using conversation context. It resolves co-references and ambiguities before the query reaches downstream NLP components, enabling more accurate intent detection and entity recognition.

Challenges with Commercial Models

| Challenge | Impact |
|---|---|
| Latency | High processing times affect user experience in real-time or high-volume scenarios. |
| Cost | Per-request pricing scales poorly for large deployments. |
| Data Governance | Sending queries to external models raises privacy and security concerns. |
| Lack of Customization | General-purpose models are not tuned for specific industries or use cases. |
| Limited Control | Minimal ability to correct or refine model behavior for incorrect outputs. |
| Compliance Constraints | Some industries have regulatory requirements that commercial LLM providers don’t fully support. |

Key Assumptions

  • Designed for text-based conversations only.
  • Paraphrases the user query only when it references or co-refers to details from prior conversation context. Other queries are passed through unchanged.
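The pass-through behavior above can be pictured with a simple heuristic: rewrite a query only when it appears to co-refer to earlier turns. The sketch below is purely illustrative — the pronoun list, function names, and detection logic are assumptions, not the model's actual mechanism.

```python
import re

# Hypothetical co-reference cues; the real model makes this decision itself.
ANAPHORA = {"it", "that", "this", "they", "them", "those", "these", "one"}

def needs_paraphrasing(query: str, history: list[str]) -> bool:
    """Return True when the query likely references earlier context."""
    if not history:
        return False  # nothing earlier to co-refer to
    tokens = {t.lower() for t in re.findall(r"[A-Za-z']+", query)}
    return bool(tokens & ANAPHORA)

def preprocess(query: str, history: list[str], paraphrase) -> str:
    """Paraphrase context-dependent queries; pass others through unchanged."""
    return paraphrase(query, history) if needs_paraphrasing(query, history) else query
```

A self-contained query like "What universities teach Physics?" would pass through untouched, while "Which one is best?" would be routed to the paraphraser.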

Benefits

XO GPT Benefits

Contextual Communication

Adapts user queries to the conversation context, enabling accurate intent interpretation for meaningful interactions. See Model Benchmarks for performance insights.

Cost-Effective

For Enterprise Tier customers, XO GPT eliminates commercial model usage costs. Example comparison (100 input tokens/conversation, 10,000 daily interactions, 15 tokens/response):
| Model | Input $/MTok | Output $/MTok | Input $/Year | Output $/Year | Total $/Year |
|---|---|---|---|---|---|
| GPT-4 | $30 | $60 | $10,950 | $3,285 | $14,235 |
| GPT-4 Turbo | $10 | $30 | $3,650 | $1,643 | $5,293 |
| GPT-4o Mini | $0.15 | $0.60 | $54.75 | $32.85 | $87.60 |
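The annual figures follow directly from the stated volumes (100 input tokens and 15 output tokens per conversation, 10,000 conversations per day, 365 days). A quick sanity check, with the per-million-token rates taken from the table:

```python
DAILY_CONVERSATIONS = 10_000
INPUT_TOKENS = 100   # input tokens per conversation
OUTPUT_TOKENS = 15   # output tokens per response

def annual_cost(input_rate: float, output_rate: float) -> tuple[float, float, float]:
    """Yearly (input, output, total) cost in USD, given $/MTok rates."""
    yearly_in = DAILY_CONVERSATIONS * INPUT_TOKENS * 365 / 1e6 * input_rate
    yearly_out = DAILY_CONVERSATIONS * OUTPUT_TOKENS * 365 / 1e6 * output_rate
    return yearly_in, yearly_out, yearly_in + yearly_out
```

For example, `annual_cost(30, 60)` reproduces the $10,950 + $3,285 = $14,235 row above.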

Enhanced Security

No client or user data is used for model retraining.

  • Guardrails: content moderation, behavioral guidelines, response oversight, input validation, and usage controls.
  • AI Safety: ethical guidelines, bias monitoring, transparency, and continuous improvement.

Note: Performance, features, and language support may vary by implementation. Test thoroughly in your environment before production use.

Use Cases

| Domain | Use Cases |
|---|---|
| Customer Support | Simplify complex queries for accurate intent detection; remove ambiguous references; enable contextual continuity across turns |
| Healthcare | Simplify complex patient inquiries; eliminate co-references for precise understanding of patient history or treatments |
| Banking & Finance | Clarify account action queries; simplify follow-up queries about previous interactions; clarify ambiguous product inquiries |
| Education | Clarify multi-part or context-heavy student queries; simplify questions about schedules or course content |
| Human Resources | Clarify ambiguous HR questions about benefits or leave policies; rephrase workplace policy questions |
| Legal | Simplify user queries about legal contracts or policies; clarify complex legal questions |
| E-commerce | Rephrase follow-up queries about orders or shipments; eliminate ambiguities in return/refund queries |
| Social Media | Clarify questions about flagged content or platform policies; simplify account settings or privacy queries |
| IT Support | Rephrase vague or context-dependent technical queries; eliminate co-references in user-reported issues |
| Travel & Hospitality | Clarify multi-part or ambiguous booking inquiries; simplify questions about changes to travel plans |

Sample Output

Conversation:
User: Hi, can you help me select a University for studying Physics?
AI Agent: Sure, here are some top universities for Physics:
         1. Harvard University  2. MIT  3. Stanford  4. University of Cambridge.
         Which sounds best to you?
User: Which one is best in fee structure?
AI Agent: Generally, the most affordable undergraduate tuition for Physics is at Stanford.
User: Ok, I'll choose that one.
Paraphrased query:
User: Ok, I will choose to apply at Stanford University for a Physics course.
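One way to picture the model's input is a prompt that pairs the running conversation with the latest query to be rewritten. The template below is an illustrative assumption — the actual XO GPT prompt format is not published:

```python
def build_paraphrase_prompt(history: list[tuple[str, str]], query: str) -> str:
    """Assemble conversation context plus the final query to rewrite.

    `history` holds (speaker, utterance) pairs, e.g. ("User", "..."),
    ("AI Agent", "..."). The instruction wording here is hypothetical.
    """
    context = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return (
        "Rewrite the final user query as a standalone question, "
        "resolving any references to the conversation below.\n\n"
        f"{context}\n\nFinal user query: {query}\nStandalone query:"
    )
```

Given the conversation above, such a prompt lets the model expand "Ok, I'll choose that one" into the fully resolved query shown.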

Model Building Process

See Model Building Process.

Model Benchmarks

| Version | Accuracy | TPS | Latency (s) | Benchmark | Test Data |
|---|---|---|---|---|---|
| v1.0 | 97% | 43 | 0.54 | Summary v1 | Results v1 |

Version 1.0

Model Choice

Base model: Mistral 7B Instruct v0.2
| Base Model | Developer | Language | Release Date | Status | Knowledge Cutoff |
|---|---|---|---|---|---|
| Mistral 7B Instruct v0.2 | Mistral AI | Multilingual | March 2024 | Static | September 2024 |

Fine-Tuning Parameters

| Parameter | Description | Value |
|---|---|---|
| Load in 4-bit Precision | Reduce memory by loading weights in 4-bit precision | True |
| Use Double Quantization | Improve accuracy with double quantization | True |
| 4-bit Quantization Type | Type of 4-bit quantization | nf4 |
| Computation Data Type | Data type for computation on 4-bit quantized weights | torch.float16 |
| LoRA Rank | Rank of the low-rank decomposition | 32 |
| LoRA Alpha | LoRA scaling factor | 16 |
| LoRA Dropout Rate | Dropout to prevent overfitting | 0.05 |
| Bias Term Inclusion | Whether bias terms are added in LoRA layers | — |
| Task Type | LoRA task type | CAUSAL_LM |
| Targeted Modules | Layers where LoRA is applied | ["query_key_value"] |
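Using the Hugging Face transformers/peft stack, the parameters above map onto configuration objects roughly as follows. This is a sketch under the assumption that these libraries were used — the source does not name the training framework, and the bias setting (absent from the table) is assumed:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit loading parameters from the table above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # Load in 4-bit Precision
    bnb_4bit_use_double_quant=True,        # Use Double Quantization
    bnb_4bit_quant_type="nf4",             # 4-bit Quantization Type
    bnb_4bit_compute_dtype=torch.float16,  # Computation Data Type
)

# LoRA parameters from the table above
lora_config = LoraConfig(
    r=32,                                  # LoRA Rank
    lora_alpha=16,                         # LoRA Alpha
    lora_dropout=0.05,                     # LoRA Dropout Rate
    bias="none",                           # assumed; value absent in the table
    task_type="CAUSAL_LM",                 # Task Type
    target_modules=["query_key_value"],    # Targeted Modules
)
```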

General Parameters

Infrastructure: A10 GPU (g5.xlarge).

| Parameter | Description | Value |
|---|---|---|
| Learning Rate | Step size toward the loss minimum | 2e-4 (0.0002) |
| Batch Size | Examples per training step | 2 |
| Epochs | Passes over the training data | 4 |
| Max Sequence Length | Maximum input length | 32k |
| Optimizer | Optimization algorithm | paged_adamw_8bit |
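Under the same assumption of a Hugging Face training setup, these general parameters would correspond to a `TrainingArguments` configuration along these lines (the output path is hypothetical; the 32k maximum sequence length is typically passed to the trainer or tokenizer rather than to `TrainingArguments`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xo-gpt-paraphraser",   # hypothetical path
    learning_rate=2e-4,                # Learning Rate
    per_device_train_batch_size=2,     # Batch Size
    num_train_epochs=4,                # Epochs
    optim="paged_adamw_8bit",          # Optimizer
)
```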

Benchmarks Summary v1

Comparison models: Flan-T5, GPT-4. See Test Data and Results v1 for full details.