A massive catalog of online large models, compatible with the OpenAI API

All models

326 models · updated 2025-09-17
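Because every model below sits behind an OpenAI-compatible endpoint, a request is just a standard chat-completions payload carrying the catalog's model id. A minimal sketch, assuming a placeholder base URL and API key (substitute the service's real values):

```python
import json

# Placeholder endpoint and key -- replace with the service's base URL and your key.
BASE_URL = "https://api.example.com/v1/chat/completions"
API_KEY = "sk-..."

# Standard OpenAI-style chat-completions payload; the model id is copied
# verbatim from a catalog entry below.
payload = {
    "model": "z-ai/glm-4-32b",
    "messages": [
        {"role": "user", "content": "Summarize the Mixture-of-Experts idea in one sentence."}
    ],
}

# Send it with any HTTP client, or point the official OpenAI SDK at BASE_URL:
#   client = OpenAI(base_url=BASE_URL, api_key=API_KEY)
#   client.chat.completions.create(**payload)
print(json.dumps(payload, indent=2))
```

The same payload shape works for every text model in the list; only the `model` field changes.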
Z.AI: GLM 4.5
$0.0016/1k
$0.0066/1k
z-ai/glm-4.5
GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly enhanced capabilities in reasoning, code generation, and agent alignment. It supports a hybrid inference mode with two options: a "thinking mode" designed for complex reasoning and tool use, and a "non-thinking mode" optimized for instant responses. Users can control the reasoning behaviour with a boolean `reasoning` flag in the request. Learn more in our docs.
2025-07-26 131,072 text->text Other
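The hybrid thinking/non-thinking switch described above maps onto a single boolean in the request. A sketch, assuming the flag is passed as a `reasoning.enabled` field (the exact parameter name should be confirmed against the service docs):

```python
import json

def glm45_request(prompt: str, thinking: bool) -> dict:
    """Build a chat-completions payload for GLM-4.5 with the reasoning toggle set.

    The 'reasoning' field shape is an assumption based on the catalog
    description; consult the provider docs for the exact parameter name.
    """
    return {
        "model": "z-ai/glm-4.5",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"enabled": thinking},
    }

# "Thinking mode" for complex reasoning and tool use:
deliberate = glm45_request("Plan a 3-step refactor of a legacy module.", thinking=True)
# "Non-thinking mode" for instant responses:
instant = glm45_request("What is 2 + 2?", thinking=False)
print(json.dumps(deliberate, indent=2))
```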
Z.AI: GLM 4 32B
$0.0004/1k
$0.0004/1k
z-ai/glm-4-32b
GLM 4 32B is a cost-effective foundation language model. It efficiently performs complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It is made by the same lab behind the THUDM models.
2025-07-25 128,000 text->text Other
Venice: Uncensored
Free to use
cognitivecomputations/dolphin-mistral-24b-venice-edition:free
Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an "uncensored" instruct-tuned LLM, preserving user control over alignment, system prompts, and behavior. Intended for advanced and unrestricted use cases, Venice Uncensored emphasizes steerability and transparent behavior, removing the default safety and alignment layers typically found in mainstream assistant models.
2025-07-10 32,768 text->text Other
TheDrummer: Skyfall 36B V2
$0.0002/1k
$0.0006/1k
thedrummer/skyfall-36b-v2
Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.
2025-03-11 32,768 text->text Other
TheDrummer: Anubis Pro 105B V1
$0.0020/1k
$0.0040/1k
thedrummer/anubis-pro-105b-v1
Anubis Pro 105B v1 is an expanded and refined variant of Meta’s Llama 3.3 70B, featuring 50% additional layers and further fine-tuning to leverage its increased capacity. Designed for advanced narrative, roleplay, and instructional tasks, it demonstrates enhanced emotional intelligence, creativity, nuanced character portrayal, and superior prompt adherence compared to smaller models. Its larger parameter count allows for deeper contextual understanding and extended reasoning capabilities, optimized for engaging, intelligent, and coherent interactions.
2025-03-11 131,072 text->text Other
Tencent: Hunyuan A13B Instruct (free)
Free to use
tencent/hunyuan-a13b-instruct:free
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark performance across mathematics, science, coding, and multi-turn reasoning tasks, while maintaining high inference efficiency via Grouped Query Attention (GQA) and quantization support (FP8, GPTQ, etc.).
2025-07-08 32,768 text->text Other
Tencent: Hunyuan A13B Instruct
$0.0001/1k
$0.0001/1k
tencent/hunyuan-a13b-instruct
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark performance across mathematics, science, coding, and multi-turn reasoning tasks, while maintaining high inference efficiency via Grouped Query Attention (GQA) and quantization support (FP8, GPTQ, etc.).
2025-07-08 32,768 text->text Other
THUDM: GLM Z1 32B
$0.0002/1k
$0.0006/1k
thudm/glm-z1-32b
GLM-Z1-32B-0414 is an enhanced reasoning variant of GLM-4-32B, built for deep mathematical, logical, and code-oriented problem solving. It applies extended reinforcement learning—both task-specific and general pairwise preference-based—to improve performance on complex multi-step tasks. Compared to the base GLM-4-32B model, Z1 significantly boosts capabilities in structured reasoning and formal domains. The model supports enforced “thinking” steps via prompt engineering and offers improved coherence for long-form outputs. It’s optimized for use in agentic workflows, and includes support for long context (via YaRN), JSON tool calling, and fine-grained sampling configuration for stable inference. Ideal for use cases requiring deliberate, multi-step reasoning or formal derivations.
2025-04-18 32,768 text->text Other
THUDM: GLM 4.1V 9B Thinking
$0.0001/1k
$0.0006/1k
thudm/glm-4.1v-9b-thinking
GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.
2025-07-11 65,536 text+image->text Other
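As a `text+image->text` model, GLM-4.1V-9B-Thinking takes the standard OpenAI multimodal message shape, where a message's content is a list of typed text and image parts. A sketch with a placeholder image URL:

```python
import json

# OpenAI-style multimodal message: content is a list of typed parts
# rather than a plain string.
payload = {
    "model": "thudm/glm-4.1v-9b-thinking",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is happening in this diagram?"},
                # Placeholder URL; base64 data URLs are typically accepted too.
                {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}},
            ],
        }
    ],
}
print(json.dumps(payload, indent=2))
```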
Switchpoint Router
$0.0034/1k
$0.014/1k
switchpoint/router
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you always benefit from the industry's newest models without changing your workflow. This model is configured for a simple, flat rate per response here on OpenRouter. It's powered by the full routing engine from Switchpoint AI.
2025-07-12 131,072 text->text Other
StepFun: Step3
$0.0023/1k
$0.0057/1k
stepfun-ai/step3
Step3 is a cutting-edge multimodal reasoning model—built on a Mixture-of-Experts architecture with 321B total parameters and 38B active. It is designed end-to-end to minimize decoding costs while delivering top-tier performance in vision–language reasoning. Through the co-design of Multi-Matrix Factorization Attention (MFA) and Attention-FFN Disaggregation (AFD), Step3 maintains exceptional efficiency across both flagship and low-end accelerators.
2025-08-29 65,536 text+image->text Other
Sonoma Sky Alpha
Free to use
openrouter/sonoma-sky-alpha
This is a cloaked model provided to the community to gather feedback. A maximally intelligent general-purpose frontier model with a 2 million token context window. Supports image inputs and parallel tool calling. Note: It’s free to use during this testing period, and prompts and completions are logged by the model creator for feedback and training.
2025-09-06 2,000,000 text+image->text Other
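The parallel tool calling advertised above uses the standard OpenAI `tools` array; a capable model may return several `tool_calls` in a single assistant turn. A request sketch with a hypothetical weather tool:

```python
import json

# Hypothetical tool, defined in the standard OpenAI function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "openrouter/sonoma-sky-alpha",
    "messages": [
        {"role": "user", "content": "Compare the weather in Oslo and Lisbon right now."}
    ],
    "tools": tools,
    # With parallel tool calling, the model can emit two get_weather calls
    # (one per city) in the same assistant turn instead of one at a time.
}
print(json.dumps(payload, indent=2))
```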