From 81e9ea82cf4787b42459e6d6a7e866808dfaefa3 Mon Sep 17 00:00:00 2001
From: mikael-lovqvists-claude-agent
Date: Sun, 7 Jun 2026 09:24:53 +0000
Subject: [PATCH] Add NOTES.md with TranscriptionInfo unused fields

Co-Authored-By: Claude Sonnet 4.6
---
 NOTES.md | 9 +++++++++
 1 file changed, 9 insertions(+)
 create mode 100644 NOTES.md

diff --git a/NOTES.md b/NOTES.md
new file mode 100644
index 0000000..a8bb54c
--- /dev/null
+++ b/NOTES.md
@@ -0,0 +1,9 @@
+# Notes
+
+## TranscriptionInfo — unused fields
+
+`model.transcribe()` returns a `TranscriptionInfo` object as its second value. We currently use `language` and `language_probability`. Other available fields:
+
+- **`all_language_probs`** — full ranked list of `(language, probability)` tuples for the segment. Useful for debugging misdetection — e.g. when the model hallucinates Sinhala on noise, this would show Sinhala at the top with a high probability. Could be included in transcript events or exposed as a diagnostic endpoint.
+- **`duration`** — total audio duration fed to the model.
+- **`duration_after_vad`** — speech duration according to Whisper's internal VAD (not meaningful since we pass `vad_filter=False`).
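
A possible shape for the `all_language_probs` diagnostic mentioned in the notes, as a minimal sketch rather than the project's implementation. The helper name `language_diagnostics`, the `top_n` parameter, and the output dictionary layout are hypothetical. The sketch assumes only that the info object exposes `language`, `language_probability`, and `all_language_probs` as a list of `(language, probability)` tuples, and that the list may be `None` (treated here as empty). A stand-in object with made-up values is used so the snippet runs without loading a model.

```python
from types import SimpleNamespace
from typing import Any, Dict


def language_diagnostics(info: Any, top_n: int = 3) -> Dict[str, Any]:
    """Summarize language detection from a TranscriptionInfo-like object.

    Hypothetical helper: name, parameters, and output layout are illustrative.
    """
    # Assumption: all_language_probs can be None, so fall back to an empty list.
    probs = getattr(info, "all_language_probs", None) or []
    # Sort defensively instead of relying on the order the library returns.
    ranked = sorted(probs, key=lambda pair: pair[1], reverse=True)
    return {
        "language": info.language,
        "language_probability": info.language_probability,
        "top_candidates": [
            {"language": lang, "probability": round(prob, 4)}
            for lang, prob in ranked[:top_n]
        ],
    }


# Stand-in for the second value returned by model.transcribe(); values are made up.
fake_info = SimpleNamespace(
    language="si",
    language_probability=0.91,
    all_language_probs=[("en", 0.04), ("si", 0.91), ("sv", 0.03), ("de", 0.02)],
)
summary = language_diagnostics(fake_info, top_n=2)
```

A dictionary like `summary` could be attached to a transcript event or returned from a diagnostic endpoint. Keeping `top_n` small avoids sending the full language list with every event.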