This API allows you to ingest and index data into the SearchAI application. You can ingest structured data as chunk fields, ingest an uploaded document, or perform incremental web crawling on existing web sources.

Ingesting Documents

  • To ingest content from a file, use the Upload File API to upload the file to the application.
  • After uploading, include the fileId from the Upload File API response in the Ingest API to process the file content.
  • Supported file formats: PDF, DOCX, PPT, and TXT. Other file types return an error.

Ingesting Structured Data

  • To ingest structured data, add the content to the request body using the Chunk Fields listed in the table below.
  • File Structure: The JSON file must follow a specific structure:
    • The file name is used as the recordTitle.
    • The JSON file must be an array of objects, where each object represents a chunk.
    • Each chunk’s fields must correspond to the configured chunk fields.
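The structure rules above can be checked before sending a request. The following is a minimal sketch, assuming the configured chunk fields are chunkText, recordUrl, and chunkTitle (the fields used in the sample later on this page); adjust CHUNK_FIELDS to match the fields shown in your Chunk Browser.

```python
# Assumed chunk-field configuration; replace with the fields
# configured for your application (see the Chunk Browser).
CHUNK_FIELDS = {"chunkText", "recordUrl", "chunkTitle"}

def validate_chunks(document: dict) -> list[str]:
    """Return a list of problems found in one document's chunk payload."""
    problems = []
    chunks = document.get("chunks")
    if not isinstance(chunks, list) or not chunks:
        return ["'chunks' must be a non-empty array of objects"]
    for i, chunk in enumerate(chunks):
        if not isinstance(chunk, dict):
            problems.append(f"chunk {i} is not an object")
            continue
        # Flag any field that is not part of the configured chunk fields.
        unknown = set(chunk) - CHUNK_FIELDS
        if unknown:
            problems.append(f"chunk {i} has unconfigured fields: {sorted(unknown)}")
    return problems
```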

Crawling Web Pages

  • This API supports incremental web crawling by adding content for an existing web source in Search AI.
    • The sourceName must match the Source Title for the web domain in Search AI.
    • Set sourceType to "web".
    • Provide the URLs to crawl in the urls array under the documents field.
  • The web crawl uses the crawl configuration set in Search AI for that source.
  • Existing URLs are re-crawled; new URLs are crawled if the crawl configuration permits.
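The steps above can be sketched as a small helper that assembles the crawl request body; this is a sketch under the field names documented on this page, not an official client.

```python
def build_web_crawl_body(source_name: str, urls: list[str]) -> dict:
    """Assemble the request body for an incremental web crawl.

    source_name must match the Source Title of an existing web
    domain in Search AI; urls lists the pages to (re)crawl.
    """
    if not urls:
        raise ValueError("provide at least one URL to crawl")
    return {
        "sourceName": source_name,
        "sourceType": "web",
        "documents": [{"urls": list(urls)}],
    }
```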

API Specifications

Field           Value
Method          POST
Endpoint        https://{{host}}/api/public/bot/:botId/ingest-data
Content Type    application/json
Authorization   auth: {{JWT Token}}
API Scope       Ingest data
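Putting the table together, a request can be composed as follows. This is a minimal Python sketch using only the standard library; the host, bot ID, and token values are placeholders, and the request is built but not sent.

```python
import json
import urllib.request

def build_ingest_request(host: str, bot_id: str, jwt_token: str, body: dict):
    """Compose (but do not send) an Ingest API call per the spec above."""
    url = f"https://{host}/api/public/bot/{bot_id}/ingest-data"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "auth": jwt_token},
        method="POST",
    )
```

To actually send it, pass the returned object to urllib.request.urlopen (or use your preferred HTTP client with the same URL, headers, and body).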

Query Parameters

Parameter   Required   Description
host        Required   The environment URL. For example, https://platform.example.org.
Bot ID      Required   Unique identifier of your application. To view it, go to Dev Tools under App Settings and check the API scopes.

Request Parameters

Parameter    Required   Description
sourceName   Yes        Name of the data source. If a source with the given name does not exist, a new source is created automatically.
sourceType   Yes        Accepted values:
                        • "json": ingest structured chunk fields from the request body (fileId is ignored).
                        • "file": ingest uploaded documents using a fileId (any chunk payload is ignored).
                        • "web": crawl web pages at the provided URLs.
documents    Yes        Contents depend on sourceType:
                        • json: pass the structured content directly, with a title and a chunks array.
                        • web: pass a urls array listing the pages to crawl.
                        • file: pass objects with a fileId and optional fileName, permissions, category, and priority.

Sample Request — Ingesting Chunks Directly

{
  "sourceName": "Abc",
  "sourceType": "json",
  "documents": [
    {
      "title": "Cybersecurity",
      "chunks": [
        {
          "chunkText": "Cybersecurity is the practice of protecting systems, networks, and programs from digital attacks.",
          "recordUrl": "https://www.example.com/cybersecurity",
          "chunkTitle": "The Importance of Cybersecurity"
        }
      ]
    }
  ]
}
The fields inside each object in the chunks array must correspond to the configured chunk fields. To view the chunk fields, refer to the Chunk Browser.

Sample Request — Incremental Web Crawl

{
  "sourceName": "myWebDomain",
  "sourceType": "web",
  "documents": [
    {
      "urls": [
        "https://example.com/docs/",
        "https://example.com/product-guide/",
        "https://example.com/user-guide/"
      ]
    }
  ]
}
If a URL has already been crawled, it is re-crawled. New URLs are crawled if the source’s crawl configuration permits.

Sample Request — Ingesting Content from Files

{
  "sourceName": "Abc",
  "sourceType": "file",
  "documents": [
    {
      "fileId": "f12455",
      "permissions": {
        "allowedUsers": ["john@example.com", "jane@example.com"],
        "allowedGroups": ["Engineering", "Management"]
      }
    }
  ]
}
Use the Upload File API to upload the file and obtain the fileId. Pass that fileId here to ingest and index the file contents.