AI Model Library

Vendor

All, 01.AI, Alibaba, Alibaba-NLP, Andycurrent, Anthropic, BAAI, Bingsu, Cohere, Comfy-Org, DeepSeek, E-MIMIC, EleutherAI, FacebookAI, Falconsai, FinLang, Google, Kijai, MahmoudAshraf, Marqo, Meta, Midjourney, Mistral AI, NeoQuasar, OpenAI, Perplexity AI, ProsusAI, Qwen, RedHatAI, ResembleAI, Salesforce, Stability AI, TinyLlama, TostAI, Xenova, amazon, apple, argmaxinc, autogluon, cardiffnlp, colbert-ir, coqui, cross-encoder, cyankiwi, daekeun-ml, deepseek-ai, depth-anything, dima806, distilbert, docling-project, dphn, emilyalsentzer, facebook, google, google-bert, google-t5, hexgrad, hmellor, intfloat, jinaai, jonatasgrosman, k2-fsa, laion, llava-hf, lpiccinelli, meta-llama, microsoft, mistralai, mixedbread-ai, neuralmind, nomic-ai, nvidia, openai, openai-community, patrickjohncyh, prajjwal1, pyannote, rhasspy, sentence-transformers, speechbrain, stabilityai, timm, trl-internal-testing, unsloth, usyd-community, vikhyatk, xAI, zai-org, 北京智源研究院, 商汤科技, 字节跳动, 智谱AI, 月之暗面, 百川智能, 百度, 科大讯飞, 稀宇科技, 腾讯, 阶跃星辰, 阿里巴巴

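Task type

All, Text generation 42, Image-text understanding 27, Sentence similarity 18, Fill-mask 12, Feature extraction 12, Speech recognition 11, Time-series forecasting 8, Image classification 7, Zero-shot image classification 6, Text classification 6, Text-to-speech 4, Image feature extraction 4, Text ranking 3, Voice activity detection 2, Translation 2, Image-to-text 2, Multimodal 2, Zero-shot classification 1, Text-to-image 1, Object detection 1, Mask generation 1, Keypoint detection 1, Image-to-3D 1, Depth estimation 1, Audio classification 1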

Filter: automatic-speech-recognition

whisperkit-coreml

automatic-speech-recognition

argmaxinc · argmaxinc/whisperkit-coreml

WhisperKit

10,910,125 515 whisperkit

speaker-diarization-3.1

automatic-speech-recognition

pyannote · pyannote/speaker-diarization-3.1

Automatic speech recognition

10,249,640 2081 pyannote-audio

whisper-large-v3-turbo

automatic-speech-recognition

openai · openai/whisper-large-v3-turbo

Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford et

6,876,575 3002 transformers

whisper-large-v3

automatic-speech-recognition

openai · openai/whisper-large-v3

Whisper is a state-of-the-art model for automatic speech recognition (ASR) and speech translation, proposed in the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford et

4,998,671 5669 transformers

wav2vec2-large-xlsr-53-russian

automatic-speech-recognition

jonatasgrosman · jonatasgrosman/wav2vec2-large-xlsr-53-russian

Fine-tuned XLSR-53 large model for speech recognition in Russian

4,152,128 74 transformers

voice-activity-detection

automatic-speech-recognition

pyannote · pyannote/voice-activity-detection

Automatic speech recognition

3,518,729 233 pyannote-audio

mms-300m-1130-forced-aligner

automatic-speech-recognition

MahmoudAshraf · MahmoudAshraf/mms-300m-1130-forced-aligner

Forced Alignment with Hugging Face CTC Models. This Python package provides an efficient way to perform forced alignment between text and audio using Hugging Face's pretrained models. It also features

3,477,232 87 transformers

wav2vec2-large-xlsr-53-portuguese

automatic-speech-recognition

jonatasgrosman · jonatasgrosman/wav2vec2-large-xlsr-53-portuguese

Fine-tuned XLSR-53 large model for speech recognition in Portuguese

3,458,442 54 transformers

speaker-diarization-community-1

automatic-speech-recognition

pyannote · pyannote/speaker-diarization-community-1

Automatic speech recognition

2,856,155 361 pyannote-audio

whisper-small

automatic-speech-recognition

openai · openai/whisper-small

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many

2,293,475 556 transformers

Qwen3-ASR-1.7B

automatic-speech-recognition

Qwen · Qwen/Qwen3-ASR-1.7B

The Qwen3-ASR family includes Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and ASR for 52 languages and dialects. Both leverage large-scale speech training data and the str

2,021,550 793