Ingesting Documents
- To ingest content from a file, use the Upload File API to upload the file to the application.
- After uploading, include the `fileId` from the Upload File API response in the Ingest API to process the file content.
- Supported file formats: PDF, docx, ppt, or txt. Other file types will cause an error.
Ingesting Structured Data
- To ingest structured data, add the content to the request body using the Chunk Fields listed in the table below.
- File Structure: The JSON file must follow a specific structure:
  - The file name is used as the `recordTitle`.
  - The JSON file must be an array of objects, where each object represents a chunk.
  - Each chunk's fields must correspond to the configured chunk fields.
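The structure rules above can be checked locally before anything is sent. The following is a minimal sketch, assuming a hypothetical set of configured chunk fields (`chunkTitle`, `chunkText`) and reading "correspond to the configured chunk fields" as "use only configured field names". The helper `validate_chunk_file` is illustrative and not part of the API.

```python
import json

# Hypothetical chunk field names; replace with the fields configured in your application.
CONFIGURED_CHUNK_FIELDS = {"chunkTitle", "chunkText"}


def validate_chunk_file(text, allowed_fields=CONFIGURED_CHUNK_FIELDS):
    """Return a list of problems found in the JSON text (an empty list means it passed)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return ["top level must be an array of chunk objects"]
    problems = []
    for index, chunk in enumerate(data):
        if not isinstance(chunk, dict):
            problems.append(f"item {index} is not an object")
            continue
        # Flag any key that is not one of the configured chunk fields.
        unknown = sorted(set(chunk) - set(allowed_fields))
        if unknown:
            problems.append(f"item {index} has unconfigured fields: {unknown}")
    return problems


sample = json.dumps([
    {"chunkTitle": "Returns", "chunkText": "Items can be returned within 30 days."},
    {"chunkTitle": "Shipping", "summary": "Not a configured field."},
])
print(validate_chunk_file(sample))
```

Here the second object is reported because `summary` is not in the assumed field set; the first object passes.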
Crawling Web Pages
- This API supports incremental web crawling by adding content for an existing web source in Search AI.
  - The `sourceName` must match the Source Title for the web domain in Search AI.
  - Set `sourceType` to `"web"`.
  - Provide the URLs to crawl in the `urls` array under the `documents` field.
- The web crawl uses the crawl configuration set in Search AI for that source.
- Existing URLs are re-crawled; new URLs are crawled if the crawl configuration permits.
API Specifications
| Field | Value |
|---|---|
| Method | POST |
| Endpoint | https://{{host}}/api/public/bot/:botId/ingest-data |
| Content Type | application/json |
| Authorization | auth: {{JWT Token}} |
| API Scope | Ingest data |
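As a rough illustration of the specification above, the sketch below builds the POST request with Python's standard library without sending it. The host, bot ID, token, and payload values are placeholders, and `build_ingest_request` is a hypothetical helper. It assumes `host` is supplied without the `https://` scheme, since the endpoint template already includes it.

```python
import json
import urllib.request


def build_ingest_request(host, bot_id, jwt_token, payload):
    """Build (but do not send) a POST request for the ingest-data endpoint."""
    url = f"https://{host}/api/public/bot/{bot_id}/ingest-data"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # The JWT goes in the "auth" header, per the Authorization row above.
        headers={"Content-Type": "application/json", "auth": jwt_token},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_ingest_request(
    host="platform.example.org",
    bot_id="st-0000",
    jwt_token="<JWT token>",
    payload={"sourceName": "Example Source", "sourceType": "web", "documents": []},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```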
Query Parameters
| Parameter | Required | Description |
|---|---|---|
| host | Required | The environment URL. For example, https://platform.example.org. |
| Bot ID | Required | Unique identifier of your application. To view it, go to Dev Tools under App Settings and check the API scopes. |
Request Parameters
| Parameter | Required | Description |
|---|---|---|
| sourceName | Yes | If the given name does not exist, a new source is created automatically. |
| sourceType | Yes | Accepted values: "json" (upload structured chunk fields via the request object; fileId is ignored), "file" (upload documents using a fileId; chunk payload is ignored), and "web" (crawl web pages using the provided URLs). |
| documents | Yes | Depending on sourceType: for json, pass structured content directly with a title and chunks array; for web, pass a urls array of pages to crawl; for file, pass objects with a fileId and optional fileName, permissions, category, and priority. |
Sample Request — Ingesting Chunks Directly
The fields in each `chunks` object must correspond to the configured chunk fields. To view chunk fields, refer to the Chunk Browser.
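A request body for this case might be assembled as in the sketch below. The top-level keys come from the Request Parameters table. The exact nesting of `documents` (a list of objects, each with `title` and `chunks`) and the chunk field names `chunkTitle` and `chunkText` are assumptions for illustration; `build_chunk_payload` is a hypothetical helper.

```python
import json


def build_chunk_payload(source_name, title, chunks):
    """Assemble a request body for sourceType "json"."""
    return {
        "sourceName": source_name,
        "sourceType": "json",
        # Assumed shape: a list of documents, each with a title and its chunks.
        "documents": [{"title": title, "chunks": list(chunks)}],
    }


payload = build_chunk_payload(
    "Example Source",
    "Return Policy",
    # Hypothetical chunk fields; use the fields configured in your application.
    [{"chunkTitle": "Returns", "chunkText": "Items can be returned within 30 days."}],
)
print(json.dumps(payload, indent=2))
```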
Sample Request — Incremental Web Crawl
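A web crawl request body could look like the following sketch. The source states that `sourceType` is `"web"`, that `sourceName` must match the Source Title, and that the URLs go in a `urls` array under `documents`; wrapping `urls` in a single object inside a `documents` list is an assumption. The helper name and the example URLs are hypothetical.

```python
import json


def build_web_crawl_payload(source_title, urls):
    """Assemble a request body for sourceType "web"."""
    return {
        # Must match the Source Title of the existing web source in Search AI.
        "sourceName": source_title,
        "sourceType": "web",
        # Assumed nesting: one object holding the urls array.
        "documents": [{"urls": list(urls)}],
    }


payload = build_web_crawl_payload(
    "Example Docs Site",
    ["https://docs.example.org/getting-started", "https://docs.example.org/faq"],
)
print(json.dumps(payload, indent=2))
```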
Sample Request — Ingesting Content from Files
The Upload File API response includes a `fileId`. Pass that `fileId` here to ingest and index the file contents.
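The sketch below shows one way to assemble the request body for this case. The keys `fileId` and `fileName` come from the Request Parameters table; treating `documents` as a list with one object per file is an assumption, and `build_file_payload` and the example `fileId` value are hypothetical.

```python
import json


def build_file_payload(source_name, file_id, file_name=None):
    """Assemble a request body for sourceType "file"."""
    document = {"fileId": file_id}
    if file_name is not None:
        # fileName is optional, so include it only when provided.
        document["fileName"] = file_name
    return {
        "sourceName": source_name,
        "sourceType": "file",
        "documents": [document],  # assumed shape: one object per uploaded file
    }


# "file-123" stands in for the fileId returned by the Upload File API.
payload = build_file_payload("Example Source", "file-123", file_name="handbook.pdf")
print(json.dumps(payload, indent=2))
```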