AI Model Library
spkrec-ecapa-voxceleb
Speaker Verification with ECAPA-TDNN embeddings on VoxCeleb
Llama-3.2-3B-Instruct
text-generation
TRELLIS-image-large
image-to-3d
esmfold_v1
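ESMFold is a state-of-the-art end-to-end protein folding model based on an ESM-2 backbone. It requires no lookup or multiple sequence alignment step, so it can make predictions without depending on any external database. As a result, its inference is significantly faster than AlphaFold's.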
blip-image-captioning-base
image-to-text
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
all-distilroberta-v1
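sentence-similarity
all-distilroberta-v1 is a sentence-transformers model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering or semantic search.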
chronos-2-small
time-series-forecasting
This is the _small_ variant of the Chronos-2 model with 28M parameters. For usage and details on the Chronos-2 model, please refer to https://huggingface.co/autogluon/chronos-2.
Qwen2.5-Coder-7B-Instruct
text-generation
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 bil
Qwen3-0.6B-FP8
text-generation
Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groun
whisper-small
automatic-speech-recognition
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many
Qwen3.6-27B
image-text-to-text
Note: This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format. These artifacts are compatible with Hugging Face Tra
Qwen3-VL-Embedding-2B
sentence-similarity
The **Qwen3-VL-Embedding** and **Qwen3-VL-Reranker** model series are the latest additions to the Qwen family, built upon the recently open-sourced and powerful Qwen3-VL foundation model. Specifically
Gemma-4-31B-IT-NVFP4
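text-generation
Description: Gemma 4 31B IT is an open multimodal model built by Google DeepMind. It accepts text and image input, can process video as a sequence of frames, and generates text output. The model is designed to deliver frontier performance for reasoning, agentic workflows, coding, and multimodal understanding.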
Bio_ClinicalBERT
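fill-mask
The paper "Publicly Available Clinical BERT Embeddings" contains four unique clinical BERT models: initialized from either BERT-Base (`cased_L-12_H-768_A-12`) or BioBERT (`BioBERT-Base v1.0 + PubMed 200K + PMC 270K`), and trained on either all MIMIC notes or only discharge summaries.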
chatterbox
text-to-speech
**09/04 🔥 Introducing Chatterbox Multilingual in 23 Languages!**
Qwen2.5-0.5B
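text-generation
Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements over Qwen2: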
gte-multilingual-base
sentence-similarity
The **gte-multilingual-base** model is the latest member of the GTE (General Text Embedding) family of models and has the following key features:
vit-base-patch16-224-in21k
image-feature-extraction
Vision Transformer (ViT) model pre-trained on ImageNet-21k (14 million images, 21,843 classes) at resolution 224x224. It was introduced in the paper An Image is Worth 16x16 Words: Transformers for Ima
OmniVoice
text-to-speech
nsfw-image-detection-384
image-classification
__NOTE: Like all models, this one can make mistakes. NSFW content can be subjective and contextual. This model is intended to help identify such content; use it at your own risk.__