Search AI is an advanced RAG-based search solution that enables organizations to search across large datasets conversationally. It combines LLM capabilities with intelligent retrieval to generate accurate, context-aware answers.

Answers in Search AI are specific pieces of information extracted or generated in response to user queries. Unlike traditional search results that present ranked lists of documents, answers provide precise, tailored information directly addressing the user’s question.

This guide covers architecture, key concepts, the answer generation pipeline, and the complete setup workflow for configuring Search AI.

Architecture

```
┌────────────────────────────────────────────────────────────┐
│                         User Query                         │
│        “What’s the return policy for electronics?”         │
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                      Query Processing                      │
│  Query understanding → Intent detection → Query expansion  │
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                         Retrieval                          │
│       Vector search → Keyword search → Hybrid ranking      │
│     ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐     │
│     │ Chunk 1 │  │ Chunk 2 │  │ Chunk 3 │  │ Chunk 4 │     │
│     └─────────┘  └─────────┘  └─────────┘  └─────────┘     │
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                      Answer Generation                     │
│ LLM synthesizes answer from retrieved chunks with citations│
└─────────────────────────────┬──────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│                          Response                          │
│  “Electronics can be returned within 30 days of purchase”  │
│        [Source: Return Policy Document, Section 2.3]       │
└────────────────────────────────────────────────────────────┘
```

Key Terminology

| Term | Description |
| --- | --- |
| RAG (Retrieval Augmented Generation) | A method that retrieves relevant data from fragmented, unstructured content and formulates responses based on the retrieved information. RAG retrieves both content and context from a dataset, significantly enhancing the precision and relevance of generated answers. |
| Chunking | The process of segmenting large content units into smaller, searchable segments. Search AI offers multiple chunking strategies based on content format. |
| Embeddings | Multi-dimensional vector representations of chunks, stored in a vector database. Different embedding models can be used to generate these vectors, enabling semantic similarity comparison. |
| Tokens | The smallest unit of text an LLM processes; a token typically corresponds to a short group of characters or a subword. Estimation: ~1 token ≈ 4 characters of English text. |
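
The 4-characters-per-token estimate above can be applied directly. A minimal sketch (the `estimate_tokens` helper is illustrative, not part of the Search AI API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~1 token ≈ 4 characters rule of thumb."""
    return max(1, round(len(text) / 4))

# A 41-character query comes out to roughly 10 tokens:
estimate_tokens("What's the return policy for electronics?")
```

Actual token counts depend on the tokenizer used by the chosen embedding or LLM model, so treat this only as a capacity-planning approximation.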

Answer Generation Pipeline

The answer generation process consists of five sequential steps:
| Step | Process | Description |
| --- | --- | --- |
| 1 | Content Ingestion | Importing the source documents that will be used to generate answers |
| 2 | Chunking | Breaking down source documents into smaller, meaningful units called chunks |
| 3 | Vector Embedding Generation | Converting chunks into multi-dimensional vectors for storage and comparison |
| 4 | Chunk Retrieval | Selecting the chunks most relevant to the user query by similarity |
| 5 | Answer Generation | Generating a response to the user query using the retrieved chunks |
Pipeline Flow:
Content Ingestion → Chunking → Vector Embeddings → Chunk Retrieval → Answer Generation
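
The five steps above can be sketched as a toy end-to-end pipeline. Every function name here is illustrative (Search AI performs these stages internally), the embedding is a deliberately trivial stand-in for a real model, and the final step is a placeholder for the LLM synthesis call:

```python
def ingest(sources):
    """Step 1: Content Ingestion - flatten documents from all sources."""
    return [doc for source in sources for doc in source]

def chunk(doc, size=8):
    """Step 2: Chunking - fixed-size segments (words stand in for tokens)."""
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Step 3: Vector Embedding Generation - toy vector (vowel counts)."""
    return [text.lower().count(v) for v in "aeiou"]

def retrieve(query_vec, index, k=2):
    """Step 4: Chunk Retrieval - rank stored chunks by dot-product similarity."""
    def dot(vec):
        return sum(a * b for a, b in zip(query_vec, vec))
    ranked = sorted(index, key=lambda item: dot(item[0]), reverse=True)
    return [text for vec, text in ranked[:k]]

def generate_answer(query, chunks):
    """Step 5: Answer Generation - placeholder for the LLM synthesis call."""
    return f"Answer to {query!r}, synthesized from {len(chunks)} chunks"

# Wire the steps together end to end:
docs = ingest([["Electronics can be returned within 30 days of purchase."]])
index = [(embed(c), c) for d in docs for c in chunk(d)]
generate_answer("return policy?", retrieve(embed("return policy?"), index))
```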

Setup Workflow

The Search AI setup consists of three main stages:
| Stage | Purpose | Key Activities |
| --- | --- | --- |
| Ingestion | Build a unified knowledge base | Add content sources (websites, documents, connectors) |
| Enhancement | Refine and optimize content | Configure extraction, process via Workbench, set up vector generation |
| Retrieval | Configure answer delivery | Set retrieval strategy, query processing, answer generation |

Stage 1: Content Ingestion

Integrate and index content from diverse data sources to build a unified knowledge base.

Content Source Types

| Source Type | Description | Use Case |
| --- | --- | --- |
| Web Crawl | Index content from websites | Public websites, knowledge portals |
| Directory | Upload and index documents | PDFs, Word docs, presentations |
| Connectors | Connect third-party applications | SharePoint, Confluence, ServiceNow, Google Drive |

Supported Connectors

Search AI provides 100+ connectors including: SharePoint, Confluence, ServiceNow, OneDrive, Google Drive, Slack, Teams, Salesforce, Jira, HubSpot, Zendesk, and more.

Stage 2: Content Enhancement

Refine and enrich ingested content to improve answer quality.

Chunking Strategies

| Strategy | Method | Best For |
| --- | --- | --- |
| Text-based | Token-based segmentation; a fixed number of consecutive tokens per chunk | General text content |
| Rule-based | Header-based segmentation; content between headers is treated as a chunk | Structured documents with clear headings |
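
The two strategies can be illustrated as follows (a simplified sketch: words stand in for tokens, and `#` lines stand in for document headers; Search AI's actual segmentation is configured in the product, not written by hand):

```python
def text_chunks(text, tokens_per_chunk=50):
    """Text-based strategy: a fixed number of consecutive tokens per chunk."""
    words = text.split()
    return [" ".join(words[i:i + tokens_per_chunk])
            for i in range(0, len(words), tokens_per_chunk)]

def rule_chunks(document):
    """Rule-based strategy: content between headers becomes one chunk."""
    chunks, current = [], []
    for line in document.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))  # close the previous section
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Text-based chunking guarantees uniform chunk sizes but can split a topic mid-sentence; rule-based chunking keeps each section intact, which is why it suits documents with clear headings.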

Workbench Processing

The Workbench tool enables content transformation through various stages:
| Stage Type | Purpose |
| --- | --- |
| Field Mapping | Add, update, or delete fields |
| Custom Script | Apply JavaScript transformations |
| Exclude Documents | Filter content before indexing |
| API Stage | Enrich content via external APIs |
| LLM Stage | Refine content using LLM processing |

Vector Configuration

Configure vector generation by selecting:
| Configuration | Description |
| --- | --- |
| Embedding Model | Choose from the supported models for vector generation |
| Fields for Vectorization | Specify which content parts (title, body, metadata) to use for embeddings |
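
Conceptually, the selected fields are combined into the text that gets embedded, and the resulting vectors are compared by semantic similarity. A minimal sketch, assuming documents are plain dicts (the field names and helper functions are illustrative, not Search AI's API):

```python
import math

def vector_input(doc, fields=("title", "body")):
    """Concatenate only the configured fields before embedding."""
    return " ".join(str(doc.get(f, "")) for f in fields if doc.get(f))

def cosine(u, v):
    """Cosine similarity: the standard semantic-similarity measure for embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Restricting vectorization to high-signal fields (e.g. title and body, not raw metadata) keeps the embedding focused on content that users actually query.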

Stage 3: Retrieval Configuration

Set up retrieval and answer generation strategies for optimal results.

Retrieval Strategies

| Strategy | Description | Best For |
| --- | --- | --- |
| Vector Retrieval | Finds the stored vectors most similar to the query vector | Semantic similarity, contextual queries |
| Hybrid Retrieval | Combines keyword-based and vector retrieval techniques | Balanced precision and recall |
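
Hybrid retrieval typically blends a keyword score and a vector-similarity score into one ranking. A simplified sketch (the scoring functions and the `alpha` weighting are illustrative assumptions, not Search AI's internal formula):

```python
def hybrid_rank(query_terms, query_vec, index, alpha=0.5, k=4):
    """Rank (text, vector) entries by a weighted blend of keyword overlap
    and vector similarity; alpha controls the keyword/vector balance."""
    def score(item):
        text, vec = item
        keyword = sum(t in text.lower() for t in query_terms) / max(1, len(query_terms))
        vector = sum(a * b for a, b in zip(query_vec, vec))
        return alpha * keyword + (1 - alpha) * vector
    return [text for text, _ in sorted(index, key=score, reverse=True)[:k]]
```

With `alpha` near 1 the ranking behaves like keyword search (high precision on exact terms); near 0 it behaves like pure vector search (better recall on paraphrased queries).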

Query Processing (Agentic RAG)

Agentic RAG leverages an LLM to enhance retrieval:

| Capability | Description |
| --- | --- |
| Context Understanding | Interprets user intent from the query context |
| Key Term Identification | Extracts the important terms within queries |
| Noise Removal | Filters irrelevant content from queries |
| Accuracy Improvement | Enhances overall retrieval precision |
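
In Agentic RAG an LLM performs key-term identification and noise removal; the effect can be approximated with a simple stop-word heuristic (the word list and function below are illustrative only):

```python
import re

STOPWORDS = {"what", "s", "the", "a", "an", "is", "for", "of", "to", "please"}

def preprocess_query(query):
    """Key-term identification with noise removal: keep only the terms
    that carry retrieval signal (an LLM does this far more robustly)."""
    return [t for t in re.findall(r"[a-z0-9]+", query.lower()) if t not in STOPWORDS]
```

For example, "What's the return policy for electronics?" reduces to the key terms `return`, `policy`, `electronics`, which is what the retrieval stage actually matches against.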

Answer Generation Types

| Type | Description |
| --- | --- |
| Extractive | Returns answer text verbatim from the source content |
| Generative | Summarizes and rephrases content into a response using an LLM |
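
The distinction can be sketched as follows. The extractive helper pulls a matching sentence verbatim; the generative helper builds a prompt and delegates to an LLM, represented here by a plain callable since the actual model invocation is internal to Search AI (both function names are illustrative):

```python
def extractive_answer(query_terms, chunks):
    """Extractive: return a sentence from the best-matching chunk verbatim."""
    best = max(chunks, key=lambda c: sum(t in c.lower() for t in query_terms))
    for sentence in best.split(". "):
        if any(t in sentence.lower() for t in query_terms):
            return sentence
    return best

def generative_answer(query, chunks, llm):
    """Generative: prompt an LLM (any callable taking a prompt string)
    to summarize and rephrase the retrieved chunks."""
    prompt = ("Answer the question using only the context.\n"
              f"Context: {' '.join(chunks)}\nQuestion: {query}")
    return llm(prompt)
```

Extractive answers are easy to audit because they quote the source exactly; generative answers read more naturally but depend on the LLM staying grounded in the retrieved context.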

Integration with Automation AI

Configure Search AI as a response method within AI Agents.

Answers Configuration

Navigate to: App Settings > App Profile > Enable Answers Feature

Intent Identification Priority Options

| Option | Behavior |
| --- | --- |
| Automation first, Search AI as Fallback | Prioritizes the automation framework; uses Search AI if intent identification fails |
| Search AI first, Automation as Fallback | Uses Search AI first; falls back to automation if no satisfactory match is found |
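
The two priority options amount to simple fallback routing, which can be sketched as below (the handlers and the `priority` values are illustrative; returning `None` stands in for "no match found"):

```python
def route(query, automation, search_ai, priority="automation_first"):
    """Try the prioritized handler first; fall back to the other on no match."""
    if priority == "automation_first":
        first, second = automation, search_ai
    else:
        first, second = search_ai, automation
    return first(query) or second(query)
```

In practice this means "Automation first" keeps transactional flows (bookings, tickets) deterministic, while "Search AI first" favors knowledge-base answers and reserves automation for explicitly recognized intents.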

Additional Settings

| Setting | Description |
| --- | --- |
| Use Search AI for Unrecognized Inputs During Dialogs | Passes unidentified user inputs to Search AI during Dialog Tasks, allowing knowledge base access mid-dialog |

Setup Checklist

| Step | Action | Location |
| --- | --- | --- |
| 1 | Add content sources | Content > Sources |
| 2 | Configure extraction strategy | Index > Extraction |
| 3 | Process content via Workbench | Index > Workbench |
| 4 | Review chunks | Index > Content Browser |
| 5 | Configure vector settings | Index > Vector Configuration |
| 6 | Set retrieval strategy | Configuration > Retrieval Strategies |
| 7 | Configure answer generation | Configuration > Answer Generation |
| 8 | Enable Answers in App Profile | App Settings > App Profile |
| 9 | Test and debug | Configuration > Testing and Debugging |

Key Capabilities Summary

| Capability | Description |
| --- | --- |
| Extensive Data Connectivity | 100+ connectors to enterprise content repositories |
| Enterprise Application Integration | Real-time access to business data from Salesforce, ServiceNow, Jira, etc. |
| Advanced Hybrid Search | Text and multi-vector weighted search with customizable pipelines |
| Enterprise Security Compliance | Integration with existing access control mechanisms |
| Contextual Intelligence | Dynamic enterprise and user context for personalized responses |