SongGuess

    Interface Ai_Cf_Openai_Whisper_Large_V3_Turbo_Output

    interface Ai_Cf_Openai_Whisper_Large_V3_Turbo_Output {
        segments?: {
            avg_logprob?: number;
            compression_ratio?: number;
            end?: number;
            no_speech_prob?: number;
            start?: number;
            temperature?: number;
            text?: string;
            words?: { end?: number; start?: number; word?: string }[];
        }[];
        text: string;
        transcription_info?: {
            duration?: number;
            duration_after_vad?: number;
            language?: string;
            language_probability?: number;
        };
        vtt?: string;
        word_count?: number;
    }

    Properties

    segments?: {
        avg_logprob?: number;
        compression_ratio?: number;
        end?: number;
        no_speech_prob?: number;
        start?: number;
        temperature?: number;
        text?: string;
        words?: { end?: number; start?: number; word?: string }[];
    }[]

    Type Declaration

    • Optional avg_logprob?: number

      The average log probability of the predictions for the words in this segment, indicating overall confidence.

    • Optional compression_ratio?: number

      The compression ratio of the input to the output, measuring how much the text was compressed during the transcription process.

    • Optional end?: number

      The ending time of the segment within the audio, in seconds.

    • Optional no_speech_prob?: number

      The probability that the segment contains no speech, represented as a decimal between 0 and 1.

    • Optional start?: number

      The starting time of the segment within the audio, in seconds.

    • Optional temperature?: number

      The temperature used in the decoding process, controlling randomness in predictions. Lower values result in more deterministic outputs.

    • Optional text?: string

      The transcription of the segment.

    • Optional words?: { end?: number; start?: number; word?: string }[]

    text: string

    The complete transcription of the audio.
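
    As a minimal sketch of how the per-segment confidence fields on `segments` might be used, the snippet below drops segments that look like silence or low-confidence output. The `Segment` type is a local copy of the documented shape; `keepConfidentSegments` and both threshold defaults are hypothetical, illustrative choices and are not defined by this interface.

```typescript
// Local copy of the documented segment shape, so the sketch is self-contained.
type Segment = {
  avg_logprob?: number;
  no_speech_prob?: number;
  start?: number;
  end?: number;
  text?: string;
};

// Hypothetical helper: keep segments that are likely speech and reasonably confident.
// Both thresholds are assumptions for illustration; tune them for your audio.
function keepConfidentSegments(
  segments: Segment[] | undefined,
  maxNoSpeechProb = 0.6,
  minAvgLogprob = -1.0,
): Segment[] {
  return (segments ?? []).filter((s) => {
    // Every field is optional, so a missing value is treated as "no evidence against".
    const noSpeech = s.no_speech_prob ?? 0;
    const logprob = s.avg_logprob ?? 0;
    return noSpeech <= maxNoSpeechProb && logprob >= minAvgLogprob;
  });
}
```

    Because `segments` itself is optional, the helper accepts `undefined` and returns an empty array in that case, so callers can pass `output.segments` directly.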

    transcription_info?: {
        duration?: number;
        duration_after_vad?: number;
        language?: string;
        language_probability?: number;
    }

    Type Declaration

    • Optional duration?: number

      The total duration of the original audio file, in seconds.

    • Optional duration_after_vad?: number

      The duration of the audio, in seconds, after Voice Activity Detection (VAD) has removed silent or irrelevant sections.

    • Optional language?: string

      The language of the audio being transcribed or translated.

    • Optional language_probability?: number

      The probability that the detected language is correct, represented as a decimal between 0 and 1.
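
    The `transcription_info` fields can be combined into simple sanity checks. The sketch below is one possible approach, not part of the interface: `TranscriptionInfo` is a local copy of the documented shape, and `speechRatio`, `confidentLanguage`, and the `0.5` default are hypothetical names and values chosen for illustration.

```typescript
// Local copy of the documented transcription_info shape.
type TranscriptionInfo = {
  duration?: number;
  duration_after_vad?: number;
  language?: string;
  language_probability?: number;
};

// Hypothetical helper: fraction of the audio that VAD kept, or undefined when
// either duration is missing or the original duration is not positive.
function speechRatio(info: TranscriptionInfo | undefined): number | undefined {
  const total = info?.duration;
  const voiced = info?.duration_after_vad;
  if (total === undefined || voiced === undefined || total <= 0) return undefined;
  return voiced / total;
}

// Hypothetical helper: return the detected language only if its probability
// meets the caller's threshold (0.5 is an arbitrary illustrative default).
function confidentLanguage(
  info: TranscriptionInfo | undefined,
  minProbability = 0.5,
): string | undefined {
  const language = info?.language;
  const probability = info?.language_probability ?? 0;
  return language !== undefined && probability >= minProbability ? language : undefined;
}
```

    A missing `language_probability` is treated as 0 here, which is the conservative choice; a caller that trusts the service more could default it differently.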

    vtt?: string

    The transcription in WebVTT format, which includes timing and text information for use in subtitles.
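
    Since `vtt` is optional, a caller may want a fallback. The sketch below, under the assumption that segment `start` and `end` values are in seconds as documented, builds a basic WebVTT document from `segments` when `vtt` is absent. `toVttTimestamp` and `vttOrFallback` are hypothetical helpers, and the output is a plain cue list with no styling or cue identifiers.

```typescript
// Minimal local shape: only the fields this sketch needs.
type TimedSegment = { start?: number; end?: number; text?: string };

// Format a non-negative number of seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function toVttTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Hypothetical helper: prefer the provided vtt, otherwise build cues from segments.
function vttOrFallback(output: { vtt?: string; segments?: TimedSegment[] }): string {
  if (output.vtt !== undefined) return output.vtt;
  const cues = (output.segments ?? [])
    // A cue needs a start, an end, and non-empty text; skip segments missing any of them.
    .filter((s) => s.start !== undefined && s.end !== undefined && s.text?.trim())
    .map((s) => `${toVttTimestamp(s.start!)} --> ${toVttTimestamp(s.end!)}\n${s.text!.trim()}`);
  // A WebVTT file starts with the "WEBVTT" header; cues are separated by blank lines.
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```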

    word_count?: number

    The total number of words in the transcription.