The XO GPT Conversation Summarization model generates concise, context-aware summaries of agent-customer interactions. It uses abstractive summarization, context analysis, and sentiment detection to transform lengthy dialogues into actionable insights.

Challenges with Commercial Models

| Challenge | Impact |
| --- | --- |
| Latency | High processing times affect user experience in real-time or high-volume scenarios. |
| Cost | Per-request pricing scales poorly for large deployments. |
| Data Governance | Sending conversations to external models raises privacy and security concerns. |
| Lack of Customization | General-purpose models are not tuned for specific industries or use cases. |
| Limited Control | Minimal ability to correct or refine model behavior for incorrect outputs. |
| Compliance Constraints | Some industries have regulatory requirements that commercial LLM providers don't fully support. |

Key Assumptions

  • Designed for text-based conversations only.
  • Assumes structured conversational data with clear speaker delineation.

Benefits

XO GPT Benefits

Consistent and Accurate

Delivers precise, contextually relevant summaries for conversation transcripts. See Model Benchmarks for latency and accuracy metrics.

Cost-Effective

For Enterprise Tier customers, XO GPT eliminates commercial model usage costs. Example comparison (250 input tokens/conversation, 1,000 daily summaries, 120 tokens/summary):
| Model | Input $/MTok | Output $/MTok | Input $/Year | Output $/Year | Total $/Year |
| --- | --- | --- | --- | --- | --- |
| GPT-4 | $30 | $60 | $2,738 | $2,628 | $5,366 |
| GPT-4 Turbo | $10 | $30 | $913 | $1,314 | $2,227 |
| GPT-4o Mini | $0.15 | $0.60 | $13.69 | $26.28 | $39.97 |
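The yearly figures follow directly from the stated volumes. A quick pure-Python sketch (variable and function names are mine) reproduces them from each model's per-MTok prices; the table rounds the larger figures to whole dollars:

```python
# Reproduce the yearly cost figures from the stated volumes:
# 250 input tokens/conversation, 1,000 summaries/day, 120 output tokens/summary.
DAYS_PER_YEAR = 365
INPUT_MTOK = 250 * 1_000 * DAYS_PER_YEAR / 1e6    # ~91.25 MTok of input per year
OUTPUT_MTOK = 120 * 1_000 * DAYS_PER_YEAR / 1e6   # ~43.8 MTok of output per year

def yearly_cost(input_per_mtok: float, output_per_mtok: float):
    """Return (input $, output $, total $) per year, rounded to cents."""
    inp = round(INPUT_MTOK * input_per_mtok, 2)
    out = round(OUTPUT_MTOK * output_per_mtok, 2)
    return inp, out, round(inp + out, 2)

print(yearly_cost(30, 60))      # (2737.5, 2628.0, 5365.5)
print(yearly_cost(10, 30))      # (912.5, 1314.0, 2226.5)
print(yearly_cost(0.15, 0.60))  # (13.69, 26.28, 39.97)
```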

Enhanced Security

No client or user data is used for model retraining.

  • Guardrails: content moderation, behavioral guidelines, response oversight, input validation, and usage controls.
  • AI Safety: ethical guidelines, bias monitoring, transparency, and continuous improvement.
Performance, features, and language support may vary by implementation. Test thoroughly in your environment before production use.

Use Cases

| Domain | Use Case |
| --- | --- |
| Healthcare | Summarize patient inquiries about symptoms, medications, and follow-up instructions. |
| Banking | Summarize conversations about account issues, transaction disputes, or loan applications. |
| E-commerce | Summarize inquiries about product availability, order status, returns, and refunds. |
| Insurance | Summarize policyholder interactions about claims, policy updates, and coverage questions. |
| IT Support | Summarize troubleshooting steps, error reports, and resolutions for technical issues. |
| Telecommunications | Summarize complaints and service requests about network issues, billing errors, and plan changes. |
| Travel & Hospitality | Summarize queries about booking modifications, cancellations, and special requests. |
| Retail | Summarize interactions about store policies, promotions, and product exchanges. |
| Education | Summarize inquiries about course enrollments, schedules, and academic records. |
| Utilities | Summarize communications about service outages, bill inquiries, and usage reports. |

Sample Output

Conversation:
App: Hello! How can I help you today?
Customer: I need to check the status of my order.
App: Sure! Please provide your order reference number.
Customer: It's 12345-67890.
[Identity verified; order confirmed shipping within 48 hours]
Customer: Yes, I want to speak with an agent.
Agent: Hi, this is John from XYZ Support. How can I assist you today?
Customer: I wanted to confirm the shipping address on my order.
Agent: The address on file is 123 Elm Street, Springfield, IL.
Customer: That's correct. Thanks!
Generated Summary:
The customer contacted support to check the status of their order. The AI Agent verified the customer’s identity and informed them their order would ship within 48 hours. The customer then requested to speak with an agent to verify their shipping address. The agent confirmed the address on file was correct. The conversation ended with the customer satisfied.
Conversation Summary Example
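The Key Assumptions above require structured conversational data with clear speaker delineation. The production prompt format is not published; the sketch below is only a hypothetical illustration of how such a transcript might be flattened into a summarization prompt:

```python
# Hypothetical sketch: flatten a speaker-delineated transcript into a
# summarization prompt. The actual XO GPT prompt format is not published.
turns = [
    ("App", "Hello! How can I help you today?"),
    ("Customer", "I need to check the status of my order."),
    ("App", "Sure! Please provide your order reference number."),
    ("Customer", "It's 12345-67890."),
]

def build_prompt(turns):
    """Join (speaker, text) pairs into a single summarization prompt."""
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in turns)
    return ("Summarize the following agent-customer conversation "
            "in 2-3 sentences:\n\n" + transcript)

prompt = build_prompt(turns)
```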

Model Building Process

See Model Building Process.

Model Benchmarks

| Version | Accuracy | TPS | Latency (s) | Benchmark | Test Data |
| --- | --- | --- | --- | --- | --- |
| v2.0 | 100% | 71 | 2.00 | Summary v2 | Results v2 |
| v1.0 | 98% | 40 | 3.04 | Summary v1 | Results v1 |

Version 2.0

Model Choice

Base model: Mistral 7B Instruct v0.2
| Base Model | Developer | Language | Release Date | Status | Knowledge Cutoff |
| --- | --- | --- | --- | --- | --- |
| Mistral 7B Instruct v0.2 | Mistral AI | Multi-lingual | September 2024 | Static | September 2024 |

Fine-Tuning Parameters

| Parameter | Description | Value |
| --- | --- | --- |
| Fine-Tuning Type | Method used | peft-qlora |
| Quantization | Bits for loading parameters | 4-bit |
| Rank | Rank of the LoRA update matrices | 32 |
| LoRA Dropout | Prevents co-adaptation in neural network | 0.05 |
| LoRA Alpha | Scaling factor | |
| Learning Rate | Rate toward loss minimum | 2e-4 (0.0002) |
| Batch Size | Examples per training step | 2 |
| Epochs | Passes over training data | 3 |
| Max Sequence Length | Maximum input length | 32768 |
| Optimizer | Optimization algorithm | paged_adamw_8bit |
| Task Type | LoRA task type | CAUSAL_LM |
| Targeted Modules | Layers where LoRA is applied | ["up_proj", "o_proj", "down_proj", "gate_proj", "q_proj", "k_proj", "v_proj"] |
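For readers reproducing a comparable setup, the table maps roughly onto Hugging Face peft-style QLoRA settings. The dicts below are a hypothetical sketch of that mapping (the actual training script is not published; in real code these would feed `LoraConfig`, `BitsAndBytesConfig`, and trainer arguments):

```python
# Hypothetical mapping of the v2.0 fine-tuning table onto peft-style
# QLoRA settings. Shown as plain dicts so the mapping stays explicit.
lora_config = {
    "r": 32,                   # Rank
    "lora_dropout": 0.05,      # LoRA Dropout
    "task_type": "CAUSAL_LM",  # Task Type
    "target_modules": [        # Targeted Modules
        "up_proj", "o_proj", "down_proj", "gate_proj",
        "q_proj", "k_proj", "v_proj",
    ],
}

training_args = {
    "learning_rate": 2e-4,              # Learning Rate
    "per_device_train_batch_size": 2,   # Batch Size
    "num_train_epochs": 3,              # Epochs
    "max_seq_length": 32768,            # Max Sequence Length
    "optim": "paged_adamw_8bit",        # Optimizer
}

quantization = {"load_in_4bit": True}   # 4-bit base-model loading for QLoRA
```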

General Parameters

Infrastructure: 2× A10 GPUs. Requires an Agent AI License.
| Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate toward loss minimum | 2e-4 (0.0002) |
| Batch Size | Examples per training step | 2 |
| Epochs | Passes over training data | 3 |
| Max Sequence Length | Maximum input length | 32768 |
| Optimizer | Optimization algorithm | paged_adamw_8bit |

AWQ Model Quantization

| Parameter | Description | Value |
| --- | --- | --- |
| Zero Point | Include zero-point for better weight representation | True |
| Quantization Group Size | Weight group size | 128 |
| Weight Precision | Bits for weight representation | 4 |
| Quantization Version | AWQ version for GEMM operations | "GEMM" |
| Computation Data Type | Data type for inference | torch.float16 |
| Model Loading | Reduced CPU memory usage | {"low_cpu_mem_usage": True} |
| Tokenizer Loading | Remote code compatibility | trust_remote_code=True |
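These values correspond to the quantization config and loading keyword arguments used by AutoAWQ-style tooling. The dicts below are a hypothetical sketch of that correspondence (in real code, `torch.float16` would be the actual torch dtype object rather than a string):

```python
# Hypothetical AutoAWQ-style settings matching the quantization table above.
quant_config = {
    "zero_point": True,   # Zero Point
    "q_group_size": 128,  # Quantization Group Size
    "w_bit": 4,           # Weight Precision
    "version": "GEMM",    # Quantization Version
}

model_kwargs = {
    "torch_dtype": "torch.float16",  # Computation Data Type (dtype object in real code)
    "low_cpu_mem_usage": True,       # Model Loading
}

tokenizer_kwargs = {
    "trust_remote_code": True,       # Tokenizer Loading
}
```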

Benchmarks Summary v2

Comparison models: Llama-8B, GPT-4, Claude 3 Sonnet. XO GPT's overall score places it alongside Llama-8B and ahead of Claude 3 Sonnet and GPT-4. It delivers strong results across accuracy, fluency, and robustness in English, French, German, Japanese, Turkish, and Spanish. See Test Data and Results v2 for full details.

Version 1.0

Model Choice

Base model: Mistral 7B Instruct v0.2
| Base Model | Developer | Language | Release Date | Status | Knowledge Cutoff |
| --- | --- | --- | --- | --- | --- |
| Mistral 7B Instruct v0.2 | Mistral AI | Multi-lingual | March 2024 | Static | September 2024 |

Fine-Tuning Parameters

| Parameter | Description | Value |
| --- | --- | --- |
| Fine-Tuning Type | Method used | peft-qlora |
| Quantization | Bits for loading parameters | 4-bit |
| Rank | Rank of the LoRA update matrices | 32 |
| LoRA Dropout | Prevents co-adaptation in neural network | 0.05 |
| Learning Rate | Rate toward loss minimum | 1e-3 (0.001) |
| Batch Size | Examples per training step | 2 |
| Epochs | Passes over training data | 4 |
| Max Sequence Length | Maximum input length | 32768 |
| Optimizer | Optimization algorithm | paged_adamw_8bit |
| Task Type | LoRA task type | CAUSAL_LM |

General Parameters

Infrastructure: 2× A10 GPUs. Requires an Agent AI License.
| Parameter | Description | Value |
| --- | --- | --- |
| Learning Rate | Rate toward loss minimum | 1e-3 (0.001) |
| Batch Size | Examples per training step | 2 |
| Epochs | Passes over training data | 4 |
| Max Sequence Length | Maximum input length | 32768 |
| Optimizer | Optimization algorithm | paged_adamw_8bit |

Benchmarks Summary v1

Comparison models: Llama 3 8B (Ctranslate), Claude 3.5 Sonnet, GPT-4o. XO GPT demonstrates strong performance in English, French, German, and Spanish, with notable results in bias detection, sentiment analysis, and negation detection. See Test Data and Results v1 for full details.