This API allows you to ingest and index data into the SearchAI application. You can ingest structured data as chunk fields, ingest an uploaded document, or perform incremental web crawling on existing web sources.

Ingesting Documents

  • To ingest content from a file, use the Upload File API to upload the file to the application.
  • After uploading, include the fileId from the Upload File API response in the Ingest API to process the file content.
  • Supported file formats: PDF, DOCX, PPT, and TXT. Other file types return an error.

Ingesting Structured Data

  • To ingest structured data, add the content to the request body using the Chunk Fields listed in the table below.
  • File Structure: The JSON file must follow a specific structure:
    • The file name is used as the recordTitle.
    • The JSON file must be an array of objects, where each object represents a chunk.
    • Each chunk’s fields must correspond to the configured chunk fields.
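The structure rules above can be checked before sending a request. The following is a minimal sketch, assuming the configured chunk fields are chunkText, recordUrl, and chunkTitle (the fields used in the sample later on this page); adjust CHUNK_FIELDS to match the fields shown in your Chunk Browser.

```python
# Assumed chunk-field configuration; replace with the fields
# configured for your application (see the Chunk Browser).
CHUNK_FIELDS = {"chunkText", "recordUrl", "chunkTitle"}

def validate_chunks(document: dict) -> list[str]:
    """Return a list of problems found in one document's chunk payload."""
    problems = []
    chunks = document.get("chunks")
    if not isinstance(chunks, list) or not chunks:
        return ["'chunks' must be a non-empty array of objects"]
    for i, chunk in enumerate(chunks):
        if not isinstance(chunk, dict):
            problems.append(f"chunk {i} is not an object")
            continue
        # Flag any field that is not part of the configured chunk fields.
        unknown = set(chunk) - CHUNK_FIELDS
        if unknown:
            problems.append(f"chunk {i} has unconfigured fields: {sorted(unknown)}")
    return problems
```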

Crawling Web Pages

  • This API supports incremental web crawling by adding content for an existing web source in Search AI.
    • The sourceName must match the Source Title for the web domain in Search AI.
    • Set sourceType to "web".
    • Provide the URLs to crawl in the urls array under the documents field.
  • The web crawl uses the crawl configuration set in Search AI for that source.
  • Existing URLs are re-crawled; new URLs are crawled if the crawl configuration permits.
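The steps above can be sketched as a small helper that assembles the crawl request body; this is a sketch under the field names documented on this page, not an official client.

```python
def build_web_crawl_body(source_name: str, urls: list[str]) -> dict:
    """Assemble the request body for an incremental web crawl.

    source_name must match the Source Title of an existing web
    domain in Search AI; urls lists the pages to (re)crawl.
    """
    if not urls:
        raise ValueError("provide at least one URL to crawl")
    return {
        "sourceName": source_name,
        "sourceType": "web",
        "documents": [{"urls": list(urls)}],
    }
```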

API Specifications

Field           Value
Method          POST
Endpoint        https://{{host}}/api/public/bot/:botId/ingest-data
Content Type    application/json
Authorization   auth: {{JWT Token}}
API Scope       Ingest data
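Putting the table together, a request can be composed as follows. This is a minimal Python sketch using only the standard library; the host, bot ID, and token values are placeholders, and the request is built but not sent.

```python
import json
import urllib.request

def build_ingest_request(host: str, bot_id: str, jwt_token: str, body: dict):
    """Compose (but do not send) an Ingest API call per the spec above."""
    url = f"https://{host}/api/public/bot/{bot_id}/ingest-data"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "auth": jwt_token},
        method="POST",
    )
```

To actually send it, pass the returned object to urllib.request.urlopen (or use your preferred HTTP client with the same URL, headers, and body).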

Query Parameters

Parameter   Required   Description
host        Required   The environment URL. For example, https://platform.example.org.
Bot ID      Required   Unique identifier of your application. To view it, go to Dev Tools under App Settings and check the API scopes.

Request Parameters

Parameter    Required   Description
sourceName   Yes        Name of the data source. If a source with the given name does not exist, a new source is created automatically.
sourceType   Yes        Accepted values:
                        • "json": ingest structured chunk fields from the request body (fileId is ignored).
                        • "file": ingest uploaded documents using a fileId (any chunk payload is ignored).
                        • "web": crawl web pages at the provided URLs.
documents    Yes        Contents depend on sourceType:
                        • json: pass the structured content directly, with a title and a chunks array.
                        • web: pass a urls array listing the pages to crawl.
                        • file: pass objects with a fileId and optional fileName, permissions, category, and priority.

Sample Request — Ingesting Chunks Directly

{
  "sourceName": "Abc",
  "sourceType": "json",
  "documents": [
    {
      "title": "Cybersecurity",
      "chunks": [
        {
          "chunkText": "Cybersecurity is the practice of protecting systems, networks, and programs from digital attacks.",
          "recordUrl": "https://www.example.com/cybersecurity",
          "chunkTitle": "The Importance of Cybersecurity"
        }
      ]
    }
  ]
}
The fields inside each object in the chunks array must correspond to the configured chunk fields. To view the chunk fields, refer to the Chunk Browser.

Sample Request — Incremental Web Crawl

{
  "sourceName": "myWebDomain",
  "sourceType": "web",
  "documents": [
    {
      "urls": [
        "https://example.com/docs/",
        "https://example.com/product-guide/",
        "https://example.com/user-guide/"
      ]
    }
  ]
}
If a URL has already been crawled, it is re-crawled. New URLs are crawled if the source’s crawl configuration permits.

Sample Request — Ingesting Content from Files

{
  "sourceName": "Abc",
  "sourceType": "file",
  "documents": [
    {
      "fileId": "f12455",
      "permissions": {
        "allowedUsers": ["john@example.com", "jane@example.com"],
        "allowedGroups": ["Engineering", "Management"]
      }
    }
  ]
}
Use the Upload File API to upload the file and obtain the fileId. Pass that fileId here to ingest and index the file contents.