Massive online large models, compatible with the OpenAI API

All models

349 models · Updated 2025-12-17
TheDrummer: Rocinante 12B
$0.0007/1k
$0.0017/1k
thedrummer/rocinante-12b
Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives - Adventure-filled and captivating stories
2024-09-30 32,768 text->text TheDrummer
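Every model in this catalog is listed with an OpenAI-compatible slug (e.g. thedrummer/rocinante-12b above), so requests can be made with the standard OpenAI SDK pointed at the provider's gateway. The following is a minimal sketch; the base_url and API key are placeholders, not values taken from this page.

```python
# Minimal sketch of a chat completion against an OpenAI-compatible gateway.
# base_url and api_key are placeholders; substitute the provider's real
# endpoint and your own key.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-gateway.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="thedrummer/rocinante-12b",  # slug exactly as listed in the catalog
    messages=[
        {"role": "system", "content": "You are a creative storytelling assistant."},
        {"role": "user", "content": "Write the opening paragraph of an adventure story."},
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

The two prices shown for each entry are quoted per 1k tokens; by the usual convention the first figure is the input (prompt) rate and the second the output (completion) rate.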
Qwen: Qwen3 VL 32B Instruct
$0.0020/1k
$0.0060/1k
qwen/qwen3-vl-32b-instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding. It provides robust OCR in 32 languages and enhanced multimodal fusion through the Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance on complex real-world multimodal tasks.
2025-10-23 262,144 text+image->text Qwen
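For entries marked text+image->text, a request uses the OpenAI vision message format, where a single user message carries both text and image parts. A minimal sketch, reusing the hypothetical gateway and placeholder key from the example above:

```python
# Sketch of a multimodal (text + image) request in OpenAI chat format.
# base_url, api_key, and the image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-gateway.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen3-vl-32b-instruct",  # a text+image->text model from this list
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart and extract its key figures."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```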
Qwen: Qwen2.5 VL 7B Instruct (free)
$0.0000/1k
$0.0000/1k
qwen/qwen-2.5-vl-7b-instruct:free
Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images at various resolutions and aspect ratios: Qwen2.5-VL achieves state-of-the-art results on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding of videos over 20 minutes long, enabling high-quality video-based question answering, dialog, and content creation. - Agentic operation of phones, robots, and other devices: with complex reasoning and decision-making abilities, Qwen2.5-VL can be integrated with such devices for automatic operation based on the visual environment and text instructions. - Multilingual support: besides English and Chinese, Qwen2.5-VL understands text inside images in many other languages, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see the blog post and GitHub repo. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-08-28 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 7B Instruct
$0.0008/1k
$0.0008/1k
qwen/qwen-2.5-vl-7b-instruct
Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images at various resolutions and aspect ratios: Qwen2.5-VL achieves state-of-the-art results on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding of videos over 20 minutes long, enabling high-quality video-based question answering, dialog, and content creation. - Agentic operation of phones, robots, and other devices: with complex reasoning and decision-making abilities, Qwen2.5-VL can be integrated with such devices for automatic operation based on the visual environment and text instructions. - Multilingual support: besides English and Chinese, Qwen2.5-VL understands text inside images in many other languages, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see the blog post and GitHub repo. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-08-28 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 72B Instruct
$0.0001/1k
$0.0005/1k
qwen/qwen2.5-vl-72b-instruct
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
2025-02-01 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 32B Instruct
$0.0002/1k
$0.0009/1k
qwen/qwen2.5-vl-32b-instruct
Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual interpretation within images, and precise event localization in extended videos. Qwen2.5-VL-32B demonstrates state-of-the-art performance across multimodal benchmarks such as MMMU, MathVista, and VideoMME, while maintaining strong reasoning and clarity in text-based tasks like MMLU, mathematical problem-solving, and code generation.
2025-03-25 16,384 text+image->text Qwen
Qwen: Qwen2.5 Coder 7B Instruct
$0.0001/1k
$0.0004/1k
qwen/qwen2.5-coder-7b-instruct
Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for code-related tasks such as code generation, reasoning, and bug fixing. Based on the Qwen2.5 architecture, it incorporates enhancements like RoPE, SwiGLU, RMSNorm, and GQA attention with support for up to 128K tokens using YaRN-based extrapolation. It is trained on a large corpus of source code, synthetic data, and text-code grounding, providing robust performance across programming languages and agentic coding workflows. This model is part of the Qwen2.5-Coder family and offers strong compatibility with tools like vLLM for efficient deployment. Released under the Apache 2.0 license.
2025-04-16 32,768 text->text Qwen
Qwen: Qwen2.5 7B Instruct
$0.0002/1k
$0.0004/1k
qwen/qwen-2.5-7b-instruct
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains. - Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g. tables), and generating structured outputs, especially JSON (see the sketch after this entry). - More resilience to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots. - Long-context support up to 128K tokens, with generation of up to 8K tokens. - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-10-16 32,768 text->text Qwen
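The entry above highlights structured JSON output. With an OpenAI-compatible gateway this is typically requested via the response_format parameter, as sketched below; whether this particular provider forwards the parameter is an assumption, and the endpoint and key are placeholders as before.

```python
# Sketch: requesting JSON-formatted output from Qwen2.5 7B Instruct through an
# OpenAI-compatible endpoint. response_format support is assumed, not confirmed
# by this catalog; base_url and api_key are placeholders.
import json

from openai import OpenAI

client = OpenAI(base_url="https://example-gateway.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen-2.5-7b-instruct",
    messages=[
        {"role": "system", "content": "Reply only with a valid JSON object."},
        {"role": "user", "content": "Give three of Qwen2.5's improvements as a JSON array under the key 'improvements'."},
    ],
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
print(data["improvements"])
```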
Qwen: Qwen-Turbo
$0.0002/1k
$0.0008/1k
qwen/qwen-turbo
Qwen-Turbo, based on Qwen2.5, is a 1M-token-context model that offers high speed at low cost, suitable for simple tasks.
2025-02-01 1,000,000 text->text Qwen
Qwen: Qwen-Plus
$0.0016/1k
$0.0048/1k
qwen/qwen-plus
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K-token-context model that balances performance, speed, and cost.
2025-02-01 131,072 text->text Qwen
Qwen: Qwen-Max
$0.0064/1k
$0.026/1k
qwen/qwen-max
Qwen-Max, based on Qwen2.5, provides the best inference performance among Qwen models, especially on complex multi-step tasks. It is a large-scale MoE model pretrained on over 20 trillion tokens and post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Its parameter count has not been disclosed.
2025-02-01 32,768 text->text Qwen
Qwen: Qwen VL Plus
$0.0008/1k
$0.0025/1k
qwen/qwen-vl-plus
Qwen's enhanced large vision-language model, significantly upgraded for detailed visual recognition and text recognition, supporting image inputs at ultra-high resolutions (millions of pixels) and extreme aspect ratios. It delivers strong performance across a broad range of visual tasks.
2025-02-05 7,500 text+image->text Qwen