1. Pipeline (classic)
Flow: speech → text (transcription) → language model → reply text → speech (TTS). The model reasons over text, which you can trace and tune. This path takes somewhat longer because each stage runs in turn, but it shines when logic, longer replies, or heavy automation matter. Typical use cases: detailed explanations, structured workflows, and scenarios where you want maximum control over wording and voice.

2. Speech-to-Speech
Here the text stage is skipped: the model listens and speaks directly, similar to ChatGPT’s voice mode (the blue circle). It is usually the fastest mode, with the lowest perceived wait, which makes it a good fit for a snappy outbound caller or short, reactive dialogs.

3. Dualplex
Dualplex combines both approaches: the system decides what fits best at each moment, e.g. fast turn-taking where it helps and more “thinking room” where needed. For most setups, Dualplex is a strong default. If you need more precision, longer answers, or finer control, switch to Pipeline.

Quick decision

| Goal | Lean toward |
|---|---|
| Good default, balanced | Dualplex |
| Maximum speed, short dialogs | Speech-to-Speech |
| Strong reasoning, long/complex replies, full pipeline control | Pipeline |
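
The decision table above can be read as a small selection rule. The following is a minimal sketch in Python, not part of any product API: the `Mode` enum, the `choose_mode` function, and both flag names are hypothetical and exist only for illustration. It also assumes that a need for control outranks a need for speed when both are set, which the table does not state.

```python
from enum import Enum


class Mode(Enum):
    """The three voice modes described in this section (names are illustrative)."""
    PIPELINE = "pipeline"
    SPEECH_TO_SPEECH = "speech-to-speech"
    DUALPLEX = "dualplex"


def choose_mode(needs_max_speed: bool = False,
                needs_deep_control: bool = False) -> Mode:
    """Map the goals from the 'Quick decision' table to a mode.

    Assumption: if both flags are set, control wins over speed.
    """
    if needs_deep_control:
        # Strong reasoning, long/complex replies, full pipeline control
        return Mode.PIPELINE
    if needs_max_speed:
        # Maximum speed, short dialogs
        return Mode.SPEECH_TO_SPEECH
    # Good default, balanced
    return Mode.DUALPLEX
```

For example, `choose_mode(needs_max_speed=True)` returns `Mode.SPEECH_TO_SPEECH`, and calling `choose_mode()` with no arguments falls back to `Mode.DUALPLEX`.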

