海量在线大模型 兼容OpenAI API

全部大模型

350个模型 · 2026-04-03 更新
Qwen: Qwen3 14B
$0.0002/1k
$0.0010/1k
qwen/qwen3-14b
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, programming, and logical inference, and a "non-thinking" mode for general-purpose conversation. The model is fine-tuned for instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.
2025-04-29 40,960 text->text Qwen3
Qwen: Qwen Plus 0728 (thinking)
$0.0010/1k
$0.0031/1k
qwen/qwen-plus-2025-07-28:thinking
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
2025-09-09 1,000,000 text->text Qwen3
Qwen: Qwen Plus 0728
$0.0010/1k
$0.0031/1k
qwen/qwen-plus-2025-07-28
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
2025-09-09 1,000,000 text->text Qwen3
Prime Intellect: INTELLECT-3
$0.0008/1k
$0.0044/1k
prime-intellect/intellect-3
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, consistently outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.
2025-11-27 131,072 text->text Other
perplexity/sonar-reasoning-pro
Note: Sonar Pro pricing includes Perplexity search pricing. See details here Sonar Reasoning Pro is a premier reasoning model powered by DeepSeek R1 with Chain of Thought (CoT). Designed for advanced use cases, it supports in-depth, multi-step queries with a larger context window and can surface more citations per search, enabling more comprehensive and extensible responses.
2025-03-07 128,000 text+image->text Other
Perplexity: Sonar Pro Search
$0.012/1k
$0.060/1k
perplexity/sonar-pro-search
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based on tokens plus $18 per thousand requests. This model powers the Pro Search mode on the Perplexity platform. Sonar Pro Search adds autonomous, multi-step reasoning to Sonar Pro. So, instead of just one query + synthesis, it plans and executes entire research workflows using tools.
2025-10-31 200,000 text+image->text Other
Perplexity: Sonar Pro
$0.012/1k
$0.060/1k
perplexity/sonar-pro
Note: Sonar Pro pricing includes Perplexity search pricing. See details here For enterprises seeking more advanced capabilities, the Sonar Pro API can handle in-depth, multi-step queries with added extensibility, like double the number of citations per search as Sonar on average. Plus, with a larger context window, it can handle longer and more nuanced searches and follow-up questions.
2025-03-07 200,000 text+image->text Other
perplexity/sonar-deep-research
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its approach as it gathers information. This enables comprehensive report generation across domains like finance, technology, health, and current events. Notes on Pricing (Source) - Input tokens comprise of Prompt tokens (user prompt) + Citation tokens (these are processed tokens from running searches) - Deep Research runs multiple searches to conduct exhaustive research. Searches are priced at $5/1000 searches. A request that does 30 searches will cost $0.15 in this step. - Reasoning is a distinct step in Deep Research since it does extensive automated reasoning through all the material it gathers during its research phase. Reasoning tokens here are a bit different than the CoTs in the answer - these are tokens that we use to reason through the research material prior to generating the outputs via the CoTs. Reasoning tokens are priced at $3/1M tokens
2025-03-07 128,000 text->text Other
Perplexity: Sonar
$0.0040/1k
$0.0040/1k
perplexity/sonar
Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-answer features optimized for speed.
2025-01-28 127,072 text+image->text Other
Nous: Hermes 4 405B
$0.0040/1k
$0.012/1k
nousresearch/hermes-4-405b
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with ... traces or respond directly, offering flexibility between speed and depth. Users can control the reasoning behaviour with the reasoning enabled boolean. Learn more in our docs The model is instruction-tuned with an expanded post-training corpus (~60B tokens) emphasizing reasoning traces, improving performance in math, code, STEM, and logical reasoning, while retaining broad assistant utility. It also supports structured outputs, including JSON mode, schema adherence, function calling, and tool use. Hermes 4 is trained for steerability, lower refusal rates, and alignment toward neutral, user-directed behavior.
2025-08-27 131,072 text->text Other
nvidia/nemotron-nano-9b-v2:free
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.
2025-09-06 128,000 text->text Other
NVIDIA: Nemotron Nano 9B V2
$0.0002/1k
$0.0006/1k
nvidia/nemotron-nano-9b-v2
NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.
2025-09-06 131,072 text->text Other