Interactive Path
Wake word, VAD, streaming ASR, endpointing, intent detection, retrieval, LLM/tool call, TTS first chunk, playback, and barge-in.
Question: Which metrics belong on the interactive path?
Time to first partial, partial rewrite rate, endpointing delay, first audio byte, complete spoken turn latency, cancellation success, barge-in recovery, task success, correction rate, and cost per successful turn. Final WER alone is too narrow.