Last updated: Feb. 1, 2026
1. Pick a Mode
| Mode | Why choose it? | Notes |
|---|---|---|
| Speech-to-Speech (Multimodal) | Fastest turn-taking and most natural flow | Recommended starting point. Try the Gemini 2.5 engine (beta) for the lowest latency, but note it’s experimental and may be less stable. |
| Pipeline | Maximum control over voice and long-form replies | If you select Pipeline, continue to the Transcriber step below. |
2. Choose a Transcriber (Pipeline only)
| Transcriber | Accuracy | Latency | Best for |
|---|---|---|---|
| Azure | ⭐️⭐️⭐️⭐️ | ⏱️⏱️⏱️ (slower) | Highest transcription fidelity |
| Gladia | ⭐️⭐️⭐️ | ⏱️ (faster) | Good all-rounder for most languages |
| Deepgram | ⭐️⭐️⭐️ | ⏱️ (faster) | Solid alternative—test which performs better for your language and audio setup |
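The dependency between the first two steps (a transcriber is only chosen in Pipeline mode) can be written down as a small consistency check. This is a minimal sketch: `AssistantSetup`, `validate`, and the lowercase identifiers are hypothetical names made up for illustration, not part of the product's API.

```python
from dataclasses import dataclass
from typing import Optional

# Options named in the tables above; the identifiers themselves are hypothetical.
MODES = {"speech_to_speech", "pipeline"}
TRANSCRIBERS = {"azure", "gladia", "deepgram"}


@dataclass(frozen=True)
class AssistantSetup:
    mode: str
    transcriber: Optional[str] = None  # only chosen in Pipeline mode


def validate(setup: AssistantSetup) -> list[str]:
    """Return a list of problems; an empty list means the setup is consistent."""
    problems = []
    if setup.mode not in MODES:
        problems.append(f"unknown mode: {setup.mode}")
    elif setup.mode == "pipeline":
        if setup.transcriber not in TRANSCRIBERS:
            problems.append("pipeline mode needs one of: " + ", ".join(sorted(TRANSCRIBERS)))
    elif setup.transcriber is not None:
        problems.append("transcriber applies to pipeline mode only")
    return problems
```

For example, `validate(AssistantSetup("pipeline", "azure"))` returns an empty list, while `validate(AssistantSetup("pipeline"))` reports the missing transcriber.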
3. Select an LLM Model
| Model | Strengths | Trade-offs |
|---|---|---|
| GPT-4o | Smartest reasoning, handles complex prompts | Slightly higher latency and cost |
| Gemini 2.5-Flash-Lite | Blazing-fast, still highly capable | May miss nuance in very complex tasks—test for your use case |
4. Noise Cancellation
If callers are on speakerphone or in a noisy environment, keep noise cancellation ON. If your call volume is low or some words are “clipped,” turn it OFF so the transcriber gets the full waveform.
5. Conversation Timers
| Parameter | Recommended | Why |
|---|---|---|
| Re-engagement | ≈ 30 s | Gives callers enough time to think. Lower values can feel pushy. |
| Max silence duration | ≈ 60 s | Prevents premature hang-ups while still ending truly silent calls. |
Test different values in real calls—too low can interrupt, too high leaves awkward gaps.
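To reason about how the two timers interact, it can help to model them as thresholds on the caller's silence. The sketch below assumes that the assistant re-engages once silence reaches the re-engagement value and ends the call once it reaches the max silence duration. That interplay, and the names `ConversationTimers` and `next_action`, are assumptions for illustration only; the platform's actual behavior may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConversationTimers:
    # Hypothetical field names; defaults mirror the recommended values above.
    re_engagement_s: float = 30.0
    max_silence_s: float = 60.0


def next_action(silence_s: float, timers: ConversationTimers = ConversationTimers()) -> str:
    """Return the assumed action after `silence_s` seconds of caller silence."""
    if silence_s >= timers.max_silence_s:
        return "hang_up"      # call is treated as truly silent
    if silence_s >= timers.re_engagement_s:
        return "re_engage"    # assistant prompts the caller again
    return "wait"             # caller still has time to think


for seconds in (10, 30, 45, 60):
    print(seconds, next_action(seconds))
```

Passing a custom `ConversationTimers(re_engagement_s=15, max_silence_s=40)` makes it easy to see on paper how a lower setting would have played out against silences observed in real calls.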
6. Initial Message
| Mode | How it’s used | Best practice |
|---|---|---|
| Pipeline | Read exactly as written (converted by TTS) | Write the greeting verbatim: “Hello, this is Alex from …”. |
| Speech-to-Speech | Interpreted as a prompt by the model | Include instructions like “Greet the customer and say …”, or prepend “say exactly:” to ensure literal output. |
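The difference between the two modes can be sketched as a small helper that prepares the initial message text before you paste it into the assistant settings. The function name, the mode identifiers, and the sample greeting are hypothetical; only the “say exactly:” prefix comes from the table above.

```python
def build_initial_message(mode: str, greeting: str, literal: bool = True) -> str:
    """Prepare the initial message text for the given mode (illustrative only)."""
    if mode == "pipeline":
        # Pipeline: the text is read out verbatim by TTS, so pass it through.
        return greeting
    if mode == "speech_to_speech":
        if literal:
            # Prefix suggested above to get literal output.
            return f"say exactly: {greeting}"
        # Otherwise phrase it as an instruction the model interprets.
        return f"Greet the customer and say: {greeting}"
    raise ValueError(f"unknown mode: {mode}")


greeting = "Hello, thanks for calling."  # hypothetical sample greeting
print(build_initial_message("pipeline", greeting))
print(build_initial_message("speech_to_speech", greeting))
```

Keeping the greeting itself in one place and deriving the mode-specific wording from it avoids a verbatim greeting being pasted into Speech-to-Speech mode, where it would be treated as a prompt.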
7. Ambient sound
Ambient sound adds subtle background noise to the assistant’s voice and is enabled by default.
8. Endpointing sliders
Control when your assistant starts talking with the endpointing sensitivity slider at the bottom of assistant settings.
| Setting | Effect | Use when |
|---|---|---|
| Lower sensitivity | Assistant responds faster after caller stops speaking | You want snappy, quick-turn conversations |
| Higher sensitivity | Assistant waits longer before responding | Callers give longer, more detailed replies |
9. Debug using the call transcript
10. Still have questions?
If you need help, contact our support team via the chat widget inside the app.
11. Debug with the Test Assistant (chat)
Start a free chat session
Chat with your assistant to validate prompts, tools, and behavior. These chats are free and do not consume call minutes.
Ensure you have chat credits
Make sure you have at least one chat credit available. If you do not have any, go to the credits page and convert your minute credits into chat credits.

