AI Model Library
whisperkit-coreml
automatic-speech-recognition: WhisperKit
speaker-diarization-3.1
automatic-speech-recognition: Automatic speech recognition
whisper-large-v3-turbo
automatic-speech-recognition: Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper "Robust Speech Recognition via Large-Scale Weak Supervision" by Alec Radford et
whisper-large-v3
automatic-speech-recognition: Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper "Robust Speech Recognition via Large-Scale Weak Supervision" by Alec Radford et
wav2vec2-large-xlsr-53-russian
automatic-speech-recognition: Fine-tuned XLSR-53 large model for speech recognition in Russian
voice-activity-detection
automatic-speech-recognition: Automatic speech recognition
mms-300m-1130-forced-aligner
automatic-speech-recognition: Forced Alignment with Hugging Face CTC Models. This Python package provides an efficient way to perform forced alignment between text and audio using Hugging Face's pretrained models. It also features
wav2vec2-large-xlsr-53-portuguese
automatic-speech-recognition: Fine-tuned XLSR-53 large model for speech recognition in Portuguese
speaker-diarization-community-1
automatic-speech-recognition: Automatic speech recognition
whisper-small
automatic-speech-recognition: Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many
Qwen3-ASR-1.7B
automatic-speech-recognition: The Qwen3-ASR family includes Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and ASR for 52 languages and dialects. Both leverage large-scale speech training data and the str