Xiaomi: MiMo-V2-Omni

$0.0016/1k

$0.0080/1k

xiaomi/mimo-v2-omni

Context length: 262,144 | Modality: text+image+audio+video->text | Tokenizer: Other | Updated 2026-03-19

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities. It offers a 256K-token context window.

Model parameters

Architecture

Modality: text+image+audio+video->text

Tokenizer: Other

Limits

Context length: 262,144

Max response length: 65,536
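
The two limits above can be combined into a quick token-budget check. The sketch below is illustrative only: it assumes the prompt and the response share the same context window, which is a common convention but is not stated on this page, and the function name `max_output_budget` is hypothetical rather than part of any API.

```python
# Limits taken from the listing above.
CONTEXT_LENGTH = 262_144       # total context window, in tokens (256 * 1024)
MAX_RESPONSE_LENGTH = 65_536   # largest single response, in tokens (64 * 1024)


def max_output_budget(prompt_tokens: int) -> int:
    """Return the largest response budget that fits for a given prompt size.

    ASSUMPTION: prompt tokens and response tokens both count against
    CONTEXT_LENGTH. The listing does not confirm this.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    # The response is capped by both the remaining window and the response limit.
    return max(0, min(MAX_RESPONSE_LENGTH, remaining))


if __name__ == "__main__":
    for prompt in (0, 200_000, 262_144):
        print(prompt, max_output_budget(prompt))
```

Under that assumption, a prompt of up to 196,608 tokens still leaves room for a full-length response; beyond that, the available response budget shrinks token for token.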

Related models

Z.ai: GLM 5V Turbo

Z.ai: GLM 5 Turbo

Z.ai: GLM 5

Z.ai: GLM 4.7 Flash

Z.ai: GLM 4.7

Z.ai: GLM 4.6V