Answer Streaming delivers generative responses incrementally, token by token, in real time instead of waiting for the full answer to be generated. This reduces perceived latency and creates a faster, more interactive experience, especially for longer answers. Supported channel: Web/Mobile SDK only.
Note: When streaming is enabled, Search API responses are not returned. Streaming works only through compatible SDK-based channel integrations where responses can be rendered progressively.
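The effect of token-by-token delivery can be sketched with a small simulation. `stream_tokens` and `render_progressively` are illustrative names for this sketch, not part of the SDK:

```python
import time

def stream_tokens(answer, delay=0.0):
    """Yield an answer one token at a time, simulating a streamed response."""
    for token in answer.split():
        time.sleep(delay)  # stands in for per-token generation/network latency
        yield token + " "

def render_progressively(token_stream):
    """Accumulate tokens as they arrive, the way a streaming UI renders them."""
    rendered = ""
    for token in token_stream:
        rendered += token
        # a real client would update the visible answer here, on every token
    return rendered.strip()

answer = render_progressively(stream_tokens("Streaming reduces perceived latency"))
```

The first tokens reach the user as soon as they are generated, which is why the benefit grows with answer length.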

Supported Models

| Model | Default Prompts | Custom Prompts |
| --- | --- | --- |
| OpenAI | ✅ Streaming-enabled prompts available by default | ✅ Supported |
| Azure OpenAI | ❌ No default prompts | ✅ Supported |
| Other providers | ❌ Not supported | ❌ Not supported |
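The support matrix above can be expressed as a small lookup, which is handy for validating a configuration up front. The dictionary and function are illustrative helpers, not part of the product API:

```python
# Capability matrix mirroring the table above (illustrative, not a product API).
STREAMING_SUPPORT = {
    "openai": {"default_prompts": True, "custom_prompts": True},
    "azure_openai": {"default_prompts": False, "custom_prompts": True},
}

def supports_streaming(provider: str, prompt_type: str) -> bool:
    """Return True if the provider supports streaming for the given prompt type."""
    caps = STREAMING_SUPPORT.get(provider.lower().replace(" ", "_"))
    if caps is None:  # any other provider: streaming is not supported
        return False
    return caps.get(prompt_type, False)
```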

Enabling Streaming

Streaming is enabled through prompt configuration, not a global setting. The steps differ depending on whether you use a default or custom prompt.

Using Default Prompts (OpenAI only)

For OpenAI models, streaming-enabled prompts are available out of the box. Navigate to Configuration > Answer Generation, select the generative model, and choose a streaming-enabled prompt from the list. Selecting it automatically enables streaming; no additional steps are needed.

Using Custom Prompts (OpenAI and Azure OpenAI)

Both of the following steps are required for streaming to take effect.

Step 1: Add the streaming options to the prompt body:
"stream": true,
"stream_options": {
  "include_usage": true
}

Step 2: Enable the Streaming toggle in the UI.

Enabling only the toggle or only the prompt option will not activate streaming.
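As a sanity check, the fields from Step 1 can be merged into a request payload like this. The base prompt body (model name, messages) is a placeholder assumption for illustration; only the `stream` and `stream_options` fields come from the steps above:

```python
import json

# Hypothetical base prompt body; the model and message content are placeholders.
prompt_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "..."}],
}

# Step 1: add the streaming options exactly as shown above.
prompt_body.update({
    "stream": True,
    "stream_options": {"include_usage": True},  # token usage arrives in the final chunk
})

payload = json.dumps(prompt_body)
```

Remember that Step 2 still applies: the prompt-body options alone, without the Streaming toggle enabled in the UI, will not activate streaming.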