Inception: Mercury 2

$0.0010/1k

$0.0030/1k

inception/mercury-2

上下文长度: 128,000 text->text Other 2026-03-04 更新

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the blog post.

模型参数

架构信息

模态: text->text

Tokenizer: Other

限制信息

上下文长度: 128,000

最大回复长度: 50,000

Inception: Mercury 2

模型参数

架构信息

限制信息

相关模型

Z.ai: GLM 5V Turbo

Z.ai: GLM 5 Turbo

Z.ai: GLM 5

Z.ai: GLM 4.7 Flash

Z.ai: GLM 4.7

Z.ai: GLM 4.6V