Files
stt-server/NOTES.md

729 B

Notes

TranscriptionInfo — unused fields

model.transcribe() returns a TranscriptionInfo object as its second value. We currently use language and language_probability. Other available fields:

  • all_language_probs — full ranked list of (language, probability) tuples for the segment. Useful for debugging misdetection — e.g. when the model hallucinates Sinhala on noise, this would show Sinhala at the top with a high probability. Could be included in transcript events or exposed as a diagnostic endpoint.
  • duration — total audio duration fed to the model.
  • duration_after_vad — speech duration according to Whisper's internal VAD (not meaningful since we pass vad_filter=False).