海量在线大模型 兼容OpenAI API

全部大模型

350个模型 · 2026-04-03 更新
Xiaomi: MiMo-V2-Pro
$0.0040/1k
$0.012/1k
xiaomi/mimo-v2-pro
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.
2026-03-19 1,048,576 text->text Other
Xiaomi: MiMo-V2-Omni
$0.0016/1k
$0.0080/1k
xiaomi/mimo-v2-omni
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities, 256K context window.
2026-03-19 262,144 text+image+audio+video->text Other
Xiaomi: MiMo-V2-Flash
$0.0004/1k
$0.0012/1k
xiaomi/mimo-v2-flash
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the top #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs.
2025-12-15 262,144 text->text Other
Writer: Palmyra X5
$0.0024/1k
$0.024/1k
writer/palmyra-x5
Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million tokens, powered by a novel transformer architecture and hybrid attention mechanisms. This enables faster inference and expanded memory for processing large volumes of enterprise data, critical for scaling AI agents.
2026-01-21 1,040,000 text->text Other
cognitivecomputations/dolphin-mistral-24b-venice-edition:free
Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing default safety and alignment layers typically found in mainstream assistant models.
2025-07-10 32,768 text->text Other
Upstage: Solar Pro 3
$0.0006/1k
$0.0024/1k
upstage/solar-pro-3
Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized for Korean with English and Japanese support.
2026-01-27 128,000 text->text Other
Tongyi DeepResearch 30B A3B
$0.0004/1k
$0.0018/1k
alibaba/tongyi-deepresearch-30b-a3b
Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks and delivers state-of-the-art performance on benchmarks like Humanity's Last Exam, BrowserComp, BrowserComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES. This makes it superior for complex agentic search, reasoning, and multi-step problem-solving compared to prior models. The model includes a fully automated synthetic data pipeline for scalable pre-training, fine-tuning, and reinforcement learning. It uses large-scale continual pre-training on diverse agentic data to boost reasoning and stay fresh. It also features end-to-end on-policy RL with a customized Group Relative Policy Optimization, including token-level gradients and negative sample filtering for stable training. The model supports ReAct for core ability checks and an IterResearch-based 'Heavy' mode for max performance through test-time scaling. It's ideal for advanced research agents, tool use, and heavy inference workflows.
2025-09-18 131,072 text->text Other
TheDrummer: Skyfall 36B V2
$0.0022/1k
$0.0032/1k
thedrummer/skyfall-36b-v2
Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.
2025-03-11 32,768 text->text Other
TheDrummer: Cydonia 24B V4.1
$0.0012/1k
$0.0020/1k
thedrummer/cydonia-24b-v4.1
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.
2025-09-27 131,072 text->text Other
Tencent: Hunyuan A13B Instruct
$0.0006/1k
$0.0023/1k
tencent/hunyuan-a13b-instruct
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark performance across mathematics, science, coding, and multi-turn reasoning tasks, while maintaining high inference efficiency via Grouped Query Attention (GQA) and quantization support (FP8, GPTQ, etc.).
2025-07-08 131,072 text->text Other
Switchpoint Router
$0.0034/1k
$0.014/1k
switchpoint/router
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you always benefit from the industry's newest models without changing your workflow. This model is configured for a simple, flat rate per response here on OpenRouter. It's powered by the full routing engine from Switchpoint AI.
2025-07-12 131,072 text->text Other
stepfun/step-3.5-flash:free
Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token. It is a reasoning model that is incredibly speed efficient even at long contexts.
2026-01-30 256,000 text->text Other