Massive Online LLMs, OpenAI API Compatible

All Models

350 models · Updated 2026-04-03
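Every model below is served behind an OpenAI-compatible endpoint, so a request only needs the model ID shown under each entry. A minimal sketch of building a standard chat-completion request body (the endpoint URL and API key are placeholders, not taken from this listing):

```python
# Minimal sketch: build an OpenAI-style chat-completion payload for any
# model ID in this catalog. POST it to <your-gateway>/v1/chat/completions
# with an Authorization: Bearer <key> header (URL and key are placeholders).
import json

def build_chat_payload(model_id: str, user_message: str, max_tokens: int = 256) -> str:
    """Return the JSON body for a /v1/chat/completions request."""
    payload = {
        "model": model_id,  # e.g. "bytedance-seed/seed-1.6-flash"
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload, ensure_ascii=False)

body = build_chat_payload("bytedance-seed/seed-1.6-flash", "Hello!")
print(body)
```

The same payload shape works for every entry; only the `model` field changes.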
ByteDance Seed: Seed-2.0-Lite
$0.0010/1k
$0.0080/1k
bytedance-seed/seed-2.0-lite
Seed-2.0-Lite is a versatile, cost-efficient enterprise workhorse that delivers strong multimodal and agent capabilities with noticeably lower latency. Engineered for high-frequency visual understanding and agentic workflows, it is a practical default for production workloads across text, vision, and tools at scale.
2026-03-10 262,144 text+image+video->text Other
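Each entry quotes two per-1k-token rates: the first for input (prompt) tokens, the second for output (completion) tokens. A quick sketch of estimating request cost from those rates, using the Seed-2.0-Lite prices above as an example:

```python
# Estimate request cost from per-1k-token rates, as quoted in each entry
# (first price = input tokens, second price = output tokens).
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Dollar cost of one request at the given per-1k rates."""
    return (input_tokens / 1000) * input_rate_per_1k \
         + (output_tokens / 1000) * output_rate_per_1k

# Seed-2.0-Lite: $0.0010/1k input, $0.0080/1k output
cost = estimate_cost(2000, 500, 0.0010, 0.0080)
print(f"${cost:.4f}")  # 2.0*0.0010 + 0.5*0.0080 = $0.0060
```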
ByteDance Seed: Seed 1.6 Flash
$0.0003/1k
$0.0012/1k
bytedance-seed/seed-1.6-flash
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of up to 16k tokens.
2025-12-23 262,144 text+image+video->text Other
ByteDance Seed: Seed 1.6
$0.0010/1k
$0.0080/1k
bytedance-seed/seed-1.6
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.
2025-12-23 262,144 text+image+video->text Other
Baidu: ERNIE 4.5 VL 424B A47B
$0.0017/1k
$0.0050/1k
baidu/ernie-4.5-vl-424b-a47b
ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data using a heterogeneous MoE architecture and modality-isolated routing to enable high-fidelity cross-modal reasoning, image understanding, and long-context generation (up to 131k tokens). Fine-tuned with techniques like SFT, DPO, UPO, and RLVR, this model supports both “thinking” and non-thinking inference modes. Designed for vision-language tasks in English and Chinese, it is optimized for efficient scaling and can operate under 4-bit/8-bit quantization.
2025-07-01 123,000 text+image->text Other
Baidu: ERNIE 4.5 VL 28B A3B
$0.0006/1k
$0.0022/1k
baidu/ernie-4.5-vl-28b-a3b
A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional text and vision understanding through its innovative heterogeneous MoE structure with modality-isolated routing. Built with scaling-efficient infrastructure for high-throughput training and inference, the model leverages advanced post-training techniques including SFT, DPO, and UPO for optimized performance, while supporting an impressive 131K context length and RLVR alignment for superior cross-modal reasoning and generation capabilities.
2025-08-13 30,000 text+image->text Other
Baidu: ERNIE 4.5 300B A47B
$0.0011/1k
$0.0044/1k
baidu/ernie-4.5-300b-a47b
ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE 4.5 series. It activates 47B parameters per token and supports text generation in both English and Chinese. Optimized for high-throughput inference and efficient scaling, it uses a heterogeneous MoE structure with advanced routing and quantization strategies, including FP8 and 2-bit formats. This version is fine-tuned for language-only tasks and supports reasoning, tool parameters, and extended context lengths up to 131k tokens. Suitable for general-purpose LLM applications with high reasoning and throughput demands.
2025-07-01 123,000 text->text Other
baidu/ernie-4.5-21b-a3b-thinking
ERNIE-4.5-21B-A3B-Thinking is Baidu's upgraded lightweight MoE model, refined to boost reasoning depth and quality for top-tier performance in logical puzzles, math, science, coding, text generation, and expert-level academic benchmarks.
2025-10-10 131,072 text->text Other
Baidu: ERNIE 4.5 21B A3B
$0.0003/1k
$0.0011/1k
baidu/ernie-4.5-21b-a3b
A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B activated per token, delivering strong text understanding and generation through its heterogeneous MoE structure. Supporting an extensive 131K token context length, the model achieves efficient inference via multi-expert parallel collaboration and quantization, while advanced post-training techniques including SFT, DPO, and UPO ensure optimized performance across diverse applications, with specialized routing and balancing losses for superior task handling.
2025-08-13 120,000 text->text Other
Arcee AI: Virtuoso Large
$0.0030/1k
$0.0048/1k
arcee-ai/virtuoso-large
Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72B parameters, tuned to tackle cross‑domain reasoning, creative writing, and enterprise QA. Unlike many 70B peers, it retains the 128k context inherited from Qwen 2.5, letting it ingest books, codebases, or financial filings wholesale. Training blended DeepSeek R1 distillation, multi‑epoch supervised fine‑tuning, and a final DPO/RLHF alignment stage, yielding strong performance on BIG‑Bench‑Hard, GSM‑8K, and long‑context Needle‑In‑Haystack tests. Enterprises use Virtuoso‑Large as the "fallback" brain in Conductor pipelines when other SLMs flag low confidence. Despite its size, aggressive KV‑cache optimizations keep first‑token latency in the low‑second range on 8× H100 nodes, making it a practical production‑grade powerhouse.
2025-05-06 131,072 text->text Other
arcee-ai/trinity-mini:free
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function calling and multi-step agent workflows.
2025-12-01 131,072 text->text Other
Arcee AI: Trinity Mini
$0.0002/1k
$0.0006/1k
arcee-ai/trinity-mini
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function calling and multi-step agent workflows.
2025-12-01 131,072 text->text Other
arcee-ai/trinity-large-thinking
Trinity Large Thinking is a powerful open-source reasoning model from the team at Arcee AI. It shows strong performance on PinchBench, agentic workloads, and reasoning tasks, and is free in OpenClaw for the first five days. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7
2026-04-01 262,144 text->text Other