AI Model Library

Sort by: Downloads · Favorites · Newest | Filter: text-generation

Qwen3-1.7B

text-generation
Qwen · Qwen/Qwen3-1.7B

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groun

3,332,968 460

Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking_GGUF

text-generation
Andycurrent · Andycurrent/Gemma-3-1B-it-GLM-4.7-Flash-Heretic-Uncensored-Thinking_GGUF

License: gemma. Language: en. Base model: google/gemma-3-1b-it. Tags: uncensored, text-generation, reasoning, instruction-tuned, lightweight. Gemma 3 – 1B IT GLM-4.7 Flash

3,261,957 22

Meta-Llama-3-8B

text-generation
meta-llama · meta-llama/Meta-Llama-3-8B

3,235,442 6530

pythia-160m

text-generation
EleutherAI · EleutherAI/pythia-160m

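The *Pythia Scaling Suite* is a collection of models developed to facilitate interpretability research (see the paper). It contains two sets of eight models, with 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, and 12B parameters. Each size has two models: one trained on the Pile, and the other on the P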

3,095,627 42

Mistral-7B-Instruct-v0.2

text-generation
mistralai · mistralai/Mistral-7B-Instruct-v0.2

Encode and decode with `mistral_common` ```py from mistral_common.tokens.tokenizers.mistral import MistralTokenizer from mistral_common.protocol.instruct.messages import UserMessage from mistral_comm

3,057,726 3133

distilgpt2

text-generation
distilbert · distilbert/distilgpt2

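DistilGPT2 (short for Distilled-GPT2) is an English-language model pre-trained with the supervision of the smallest version of Generative Pre-trained Transformer 2 (GPT-2). Like GPT-2, DistilGPT2 can be used to generate text. Users of this model card should also consider information about the design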

3,000,562 629

tiny-random-LlamaForCausalLM

text-generation
hmellor · hmellor/tiny-random-LlamaForCausalLM

2,988,632 0

TinyLlama-1.1B-Chat-v1.0

text-generation
TinyLlama · TinyLlama/TinyLlama-1.1B-Chat-v1.0

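The TinyLlama project aims to **pretrain a 1.1B-parameter Llama model on 3 trillion tokens**. With proper optimization, this can be achieved in "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on September 1, 2023.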

2,953,734 1580
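
The figures quoted in the TinyLlama description imply a training throughput that is easy to check. The following is a minimal back-of-the-envelope sketch, assuming exactly 3 trillion tokens, 16 GPUs, and 90 full days of uninterrupted training; the variable names are illustrative and do not come from the project.

```python
# Throughput implied by the numbers in the TinyLlama description.
# Assumptions: exactly 3e12 tokens, 16 GPUs, 90 days with no downtime.
TOTAL_TOKENS = 3_000_000_000_000
NUM_GPUS = 16
TRAINING_DAYS = 90
SECONDS_PER_DAY = 24 * 60 * 60

training_seconds = TRAINING_DAYS * SECONDS_PER_DAY
cluster_tokens_per_second = TOTAL_TOKENS / training_seconds
tokens_per_gpu_per_second = cluster_tokens_per_second / NUM_GPUS

print(f"cluster: {cluster_tokens_per_second:,.0f} tokens/s")  # cluster: 385,802 tokens/s
print(f"per GPU: {tokens_per_gpu_per_second:,.0f} tokens/s")  # per GPU: 24,113 tokens/s
```

Under these assumptions, each GPU would need to sustain roughly 24,000 tokens per second for the whole run.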

Qwen3-Coder-30B-A3B-Instruct

text-generation
Qwen · Qwen/Qwen3-Coder-30B-A3B-Instruct

**Qwen3-Coder** is available in multiple sizes. Today, we're excited to introduce **Qwen3-Coder-30B-A3B-Instruct**. This streamlined model maintains impressive performance and efficiency, featuring th

2,677,340 1054

Qwen3-14B

text-generation
Qwen · Qwen/Qwen3-14B

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groun

2,627,221 394

Qwen2.5-14B-Instruct

text-generation
Qwen · Qwen/Qwen2.5-14B-Instruct

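Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 billion to 72 billion parameters. Compared with Qwen2, Qwen2.5 brings the following improvements: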

2,595,179 334

Qwen3Guard-Gen-0.6B

text-generation
Qwen · Qwen/Qwen3Guard-Gen-0.6B

**Qwen3Guard** is a series of safety moderation models built upon Qwen3 and trained on a dataset of 1.19 million prompts and responses labeled for safety. The series includes models of three sizes (0.

2,554,316 71

Llama-3.2-3B-Instruct

text-generation
meta-llama · meta-llama/Llama-3.2-3B-Instruct

2,441,920 2124

Qwen2.5-Coder-7B-Instruct

text-generation
Qwen · Qwen/Qwen2.5-Coder-7B-Instruct

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 bil

2,328,991 705

Qwen3-0.6B-FP8

text-generation
Qwen · Qwen/Qwen3-0.6B-FP8

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groun

2,318,955 59

Gemma-4-31B-IT-NVFP4

text-generation
nvidia · nvidia/Gemma-4-31B-IT-NVFP4

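Description: Gemma 4 31B IT is an open multimodal model built by Google DeepMind. It accepts text and image input, can process video as a sequence of frames, and generates text output. The model is designed to deliver frontier-level performance for reasoning, agentic workflows, coding, and multimodal understanding.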

2,262,752 470

Qwen2.5-0.5B

text-generation
Qwen · Qwen/Qwen2.5-0.5B

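Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 billion to 72 billion parameters. Compared with Qwen2, Qwen2.5 brings the following improvements: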

2,227,722 401

Llama-3.2-1B-Instruct-FP8-dynamic

text-generation
RedHatAI · RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic

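Model Overview - **Model Architecture:** Meta-Llama-3.2 - **Input:** Text - **Output:** Text - **Model Optimizations:** - **Weight quantization:** FP8 - **Activation quantization:** FP8 - **Intended Use Cases:** Intended for commercial and research use in multiple languages. Similarly to Lla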

2,114,374 4
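
The FP8 weight quantization listed for RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic stores each weight in 8 bits rather than the 16 bits used by FP16 or BF16. The sketch below gives a rough idea of what that means for weight storage. It assumes a nominal 1 billion parameters (a round number suggested by "1B" in the model name, not the exact count) and ignores quantization scales, activations, and the KV cache; `weight_bytes` is a hypothetical helper, not part of any library.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone at the given precision."""
    return num_params * bits_per_weight // 8

# Assumption: nominal parameter count taken from "1B" in the model name.
NOMINAL_PARAMS = 1_000_000_000

fp16_bytes = weight_bytes(NOMINAL_PARAMS, 16)
fp8_bytes = weight_bytes(NOMINAL_PARAMS, 8)

print(f"16-bit weights: {fp16_bytes / 1e9:.1f} GB")  # 16-bit weights: 2.0 GB
print(f"FP8 weights:    {fp8_bytes / 1e9:.1f} GB")   # FP8 weights:    1.0 GB
```

Under these assumptions, the FP8 weights take about half the space of a 16-bit checkpoint.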

Qwen2.5-14B-Instruct-AWQ

text-generation
Qwen · Qwen/Qwen2.5-14B-Instruct-AWQ

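Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 billion to 72 billion parameters. Compared with Qwen2, Qwen2.5 brings the following improvements: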

2,055,940 35

gpt2-large

text-generation
openai-community · openai-community/gpt2-large

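Table of Contents - Model Details - How to Get Started With the Model - Uses - Risks, Limitations and Biases - Training - Evaluation - Environmental Impact - Technical Specifications - Citation Information - Model Card Authors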

2,042,727 349