Native R 'torch' Implementation of 'OpenAI' 'Whisper'


Documentation for package ‘whisper’ version 0.4.0

Help Pages

.model_sizes                    Model Download Utilities
apply_bpe                       Apply BPE Merges
apply_timestamp_rules           Apply Timestamp Token Rules
audio_duration                  Get Audio Duration
audio_to_mel                    Convert Audio to Mel Spectrogram
beam_search_decode              Beam Search Decode
build_byte_decoder              Build Reverse Byte Decoder
byte_to_token                   Convert Byte to BPE Token
clean_text                      Clean Transcribed Text
compression_ratio               Compression Ratio
compute_stft                    Compute STFT Magnitude
compute_word_timestamps         Compute Word-Level Timestamps
copy_if_exists                  Copy Weight if Exists
create_decoder                  Create Decoder from Config
create_encoder                  Create Encoder from Config
create_mel_filterbank_fallback  Create Mel Filterbank (Fallback)
decode_bpe_bytes                Decode BPE Bytes Back to Text
decode_timestamp                Decode Timestamp Token
decode_with_fallback            Decode with Temperature Fallback
detect_language                 Detect Language
detect_language_from_mel        Detect Language from Mel Spectrogram
detect_language_from_pipeline   Detect Language from Pipeline
download_tokenizer_files        Download Tokenizer Files from HuggingFace
download_whisper_model          Download Model from HuggingFace
dtw_align                       DTW Alignment
ensure_tokenizer_files          Ensure Tokenizer Files are Downloaded
expand_kv_cache                 Expand KV Cache for Beam Search
extract_segments                Extract Segments with Timestamps
forced_decode                   Forced Decode
get_initial_tokens              Get Initial Decoder Tokens
get_model_path                  Get Model Cache Path
get_weights_path                Get Path to Model Weights
greedy_decode                   Greedy Decoding
group_into_words                Group Subword Tokens into Words
hz_to_mel                       Convert Hz to Mel Scale
is_timestamp_token              Check if Token is Timestamp
list_downloaded_models          List Downloaded Models
list_whisper_models             List Available Models
load_audio                      Load and Preprocess Audio
load_decoder_weights            Load Decoder Weights
load_encoder_weights            Load Encoder Weights
load_mel_filterbank             Load Pre-computed Mel Filterbank
load_whisper_model              Load Whisper Model
load_whisper_weights            Load Weights from Safetensors
medfilt1                        1D Median Filter
mel_to_hz                       Convert Mel Scale to Hz
model_exists                    Check if Model is Downloaded
pad_or_trim                     Pad or Trim Audio to Fixed Length
parse_device                    Parse Device Argument
parse_dtype                     Parse Dtype Argument
rearrange_kv_cache              Rearrange KV Cache by Beam Indices
sample_decode                   Sample Decode
serve                           Serve whisper over HTTP
split_audio                     Split Long Audio into Chunks
tokenizer_decode                Decode Token IDs to Text
tokenizer_encode                Encode Text to Token IDs
transcribe                      Transcribe Audio
transcribe_chunk                Transcribe Single Chunk
transcribe_long                 Transcribe Long Audio
whisper_attention               Multi-Head Self-Attention
whisper_config                  Whisper Model Configurations
whisper_decoder                 Text Decoder
whisper_decoder_layer           Decoder Layer
whisper_device                  Get Default Device
whisper_dtype                   Get Default Dtype
whisper_encoder                 Audio Encoder
whisper_encoder_layer           Encoder Layer
whisper_language_table          Whisper Language Table
whisper_lang_from_id            Get Language Code from Token ID
whisper_lang_token              Get Language Token ID
whisper_model                   Whisper Model Module
whisper_pipeline                Create a Whisper Pipeline
WHISPER_SAMPLE_RATE             Whisper Audio Constants
whisper_special_tokens          Special Token IDs
whisper_tokenizer               Create Whisper Tokenizer
whisper_tune_gc                 Tune torch's CUDA garbage collection for whisper inference