Massive online large models, compatible with the OpenAI API

All models

349 models · Updated 2025-12-17
TheDrummer: Rocinante 12B
$0.0007/1k
$0.0017/1k
thedrummer/rocinante-12b
Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary with unique and expressive word choices - Enhanced creativity for vivid narratives - Adventure-filled and captivating stories
2024-09-30 32,768 text->text TheDrummer
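Every model in this catalog is listed with an OpenAI-compatible slug (e.g. thedrummer/rocinante-12b above), so requests can be made with the standard OpenAI SDK pointed at the provider's gateway. The following is a minimal sketch; the base_url and API key are placeholders, not values taken from this page.

```python
# Minimal sketch of a chat completion against an OpenAI-compatible gateway.
# base_url and api_key are placeholders; substitute the provider's real
# endpoint and your own key.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-gateway.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="thedrummer/rocinante-12b",  # slug exactly as listed in the catalog
    messages=[
        {"role": "system", "content": "You are a creative storytelling assistant."},
        {"role": "user", "content": "Write the opening paragraph of an adventure story."},
    ],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

The two prices shown for each entry are quoted per 1k tokens; by the usual convention the first figure is the input (prompt) rate and the second the output (completion) rate.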
Qwen: Qwen3 VL 32B Instruct
$0.0020/1k
$0.0060/1k
qwen/qwen3-vl-32b-instruct
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding. It provides robust OCR in 32 languages and enhanced multimodal fusion through the Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance on complex real-world multimodal tasks.
2025-10-23 262,144 text+image->text Qwen
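For entries marked text+image->text, a request uses the OpenAI vision message format, where a single user message carries both text and image parts. A minimal sketch, reusing the hypothetical gateway and placeholder key from the example above:

```python
# Sketch of a multimodal (text + image) request in OpenAI chat format.
# base_url, api_key, and the image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-gateway.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen3-vl-32b-instruct",  # a text+image->text model from this list
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart and extract its key figures."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```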
Qwen: Qwen2.5 VL 7B Instruct (free)
$0.0000/1k
$0.0000/1k
qwen/qwen-2.5-vl-7b-instruct:free
Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images at various resolutions and aspect ratios: Qwen2.5-VL achieves state-of-the-art results on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding of videos over 20 minutes long, enabling high-quality video-based question answering, dialog, and content creation. - Agentic operation of phones, robots, and other devices: with complex reasoning and decision-making abilities, Qwen2.5-VL can be integrated with such devices for automatic operation based on the visual environment and text instructions. - Multilingual support: besides English and Chinese, Qwen2.5-VL understands text inside images in many other languages, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see the blog post and GitHub repo. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-08-28 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 7B Instruct
$0.0008/1k
$0.0008/1k
qwen/qwen-2.5-vl-7b-instruct
Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements: - SoTA understanding of images at various resolutions and aspect ratios: Qwen2.5-VL achieves state-of-the-art results on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding of videos over 20 minutes long, enabling high-quality video-based question answering, dialog, and content creation. - Agentic operation of phones, robots, and other devices: with complex reasoning and decision-making abilities, Qwen2.5-VL can be integrated with such devices for automatic operation based on the visual environment and text instructions. - Multilingual support: besides English and Chinese, Qwen2.5-VL understands text inside images in many other languages, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc. For more details, see the blog post and GitHub repo. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-08-28 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 72B Instruct
$0.0001/1k
$0.0005/1k
qwen/qwen2.5-vl-72b-instruct
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.
2025-02-01 32,768 text+image->text Qwen
Qwen: Qwen2.5 VL 32B Instruct
$0.0002/1k
$0.0009/1k
qwen/qwen2.5-vl-32b-instruct
Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual interpretation within images, and precise event localization in extended videos. Qwen2.5-VL-32B demonstrates state-of-the-art performance across multimodal benchmarks such as MMMU, MathVista, and VideoMME, while maintaining strong reasoning and clarity in text-based tasks like MMLU, mathematical problem-solving, and code generation.
2025-03-25 16,384 text+image->text Qwen
Qwen: Qwen2.5 Coder 7B Instruct
$0.0001/1k
$0.0004/1k
qwen/qwen2.5-coder-7b-instruct
Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for code-related tasks such as code generation, reasoning, and bug fixing. Based on the Qwen2.5 architecture, it incorporates enhancements like RoPE, SwiGLU, RMSNorm, and GQA attention with support for up to 128K tokens using YaRN-based extrapolation. It is trained on a large corpus of source code, synthetic data, and text-code grounding, providing robust performance across programming languages and agentic coding workflows. This model is part of the Qwen2.5-Coder family and offers strong compatibility with tools like vLLM for efficient deployment. Released under the Apache 2.0 license.
2025-04-16 32,768 text->text Qwen
Qwen: Qwen2.5 7B Instruct
$0.0002/1k
$0.0004/1k
qwen/qwen-2.5-7b-instruct
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains. - Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g. tables), and generating structured outputs, especially JSON (see the sketch after this entry). - More resilience to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots. - Long-context support up to 128K tokens, with generation of up to 8K tokens. - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
2024-10-16 32,768 text->text Qwen
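The entry above highlights structured JSON output. With an OpenAI-compatible gateway this is typically requested via the response_format parameter, as sketched below; whether this particular provider forwards the parameter is an assumption, and the endpoint and key are placeholders as before.

```python
# Sketch: requesting JSON-formatted output from Qwen2.5 7B Instruct through an
# OpenAI-compatible endpoint. response_format support is assumed, not confirmed
# by this catalog; base_url and api_key are placeholders.
import json

from openai import OpenAI

client = OpenAI(base_url="https://example-gateway.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen/qwen-2.5-7b-instruct",
    messages=[
        {"role": "system", "content": "Reply only with a valid JSON object."},
        {"role": "user", "content": "Give three of Qwen2.5's improvements as a JSON array under the key 'improvements'."},
    ],
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
print(data["improvements"])
```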
Qwen: Qwen-Turbo
$0.0002/1k
$0.0008/1k
qwen/qwen-turbo
Qwen-Turbo, based on Qwen2.5, is a 1M-token-context model that offers high speed at low cost, suitable for simple tasks.
2025-02-01 1,000,000 text->text Qwen
Qwen: Qwen-Plus
$0.0016/1k
$0.0048/1k
qwen/qwen-plus
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K-token-context model that balances performance, speed, and cost.
2025-02-01 131,072 text->text Qwen
Qwen: Qwen-Max
$0.0064/1k
$0.026/1k
qwen/qwen-max
Qwen-Max, based on Qwen2.5, provides the best inference performance among Qwen models, especially on complex multi-step tasks. It is a large-scale MoE model pretrained on over 20 trillion tokens and post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Its parameter count has not been disclosed.
2025-02-01 32,768 text->text Qwen
Qwen: Qwen VL Plus
$0.0008/1k
$0.0025/1k
qwen/qwen-vl-plus
Qwen's enhanced large vision-language model, significantly upgraded for detailed visual recognition and text recognition, supporting image inputs at ultra-high resolutions (millions of pixels) and extreme aspect ratios. It delivers strong performance across a broad range of visual tasks.
2025-02-05 7,500 text+image->text Qwen