Optional segments
  Optional avg_logprob?: number
    The average log probability of the predictions for the words in this segment, indicating overall confidence.
  Optional compression_ratio?: number
    The compression ratio of the input to the output, measuring how much the text was compressed during the transcription process.
  Optional end?: number
    The ending time of the segment within the audio, in seconds.
  Optional no_speech_prob?: number
    The probability that the segment contains no speech, represented as a decimal between 0 and 1.
  Optional start?: number
    The starting time of the segment within the audio, in seconds.
  Optional temperature?: number
    The temperature used in the decoding process, controlling randomness in predictions. Lower values result in more deterministic outputs.
  Optional text?: string
    The transcription of the segment.
  Optional words?: { end?: number; start?: number; word?: string }[]
    The complete transcription of the audio.

Optional transcription_
  Optional duration?: number
    The total duration of the original audio file, in seconds.
  Optional duration_after_vad?: number
    The duration of the audio after applying Voice Activity Detection (VAD) to remove silent or irrelevant sections, in seconds.
  Optional language?: string
    The language of the audio being transcribed or translated.
  Optional language_probability?: number
    The confidence level or probability of the detected language being accurate, represented as a decimal between 0 and 1.

Optional vtt
  The transcription in WebVTT format, which includes timing and text information for use in subtitles.

Optional word_
  The total number of words in the transcription.
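
The per-segment fields `no_speech_prob` and `avg_logprob`, together with `duration` and `duration_after_vad`, lend themselves to simple post-processing. The sketch below shows one possible way to drop segments that are probably silence or low confidence, and to estimate how much of the audio survived VAD. The type names `Segment` and `DurationInfo`, both function names, and the two thresholds are assumptions made for illustration; they are not part of the documented type.

```typescript
// Hypothetical local types mirroring the fields documented above.
// The names `Segment` and `DurationInfo` are illustrative assumptions.
interface Segment {
  avg_logprob?: number;
  compression_ratio?: number;
  end?: number;
  no_speech_prob?: number;
  start?: number;
  temperature?: number;
  text?: string;
}

interface DurationInfo {
  duration?: number;
  duration_after_vad?: number;
}

// Illustrative thresholds only (assumptions); tune them for your own audio.
const MAX_NO_SPEECH_PROB = 0.6;
const MIN_AVG_LOGPROB = -1.0;

// Keep segments that are likely speech and reasonably confident.
// Every field is optional, so a missing value is treated as
// "no evidence against the segment" and the segment is kept.
function keepConfidentSegments(segments: Segment[] = []): Segment[] {
  return segments.filter((s) => {
    const noSpeech = s.no_speech_prob ?? 0;
    const logprob = s.avg_logprob ?? 0;
    return noSpeech <= MAX_NO_SPEECH_PROB && logprob >= MIN_AVG_LOGPROB;
  });
}

// Fraction of the original audio that remained after VAD.
// Returns undefined when either duration is missing or the total is zero.
function speechFraction(info: DurationInfo = {}): number | undefined {
  const total = info.duration;
  const afterVad = info.duration_after_vad;
  if (total === undefined || afterVad === undefined || total <= 0) {
    return undefined;
  }
  return afterVad / total;
}
```

Treating missing values as acceptable is a design choice made here because every field is optional; a stricter caller might instead discard segments that lack confidence data.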