ModelsHub Model Repository

Inflection: Inflection 3 Productivity

$0.010/1k

$0.040/1k

inflection/inflection-3-productivity

Inflection 3 Productivity is optimized for following instructions. It is better suited to tasks requiring JSON output or precise adherence to provided guidelines. It has access to recent news. For emotional intelligence similar to Pi, see Inflection 3 Pi. See Inflection's announcement for more details.

2024-10-11 8,000 text->text Other
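The two per-1k prices shown for each model are not labelled in this listing. As a rough sketch, assuming the first is the price per 1,000 input tokens and the second the price per 1,000 output tokens (an assumption, not confirmed by the page), the cost of a request could be estimated as follows. The function name `estimate_cost` and the token counts are hypothetical.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Estimate the cost of one request in USD from per-1k-token prices.

    Assumption (not confirmed by the listing): the first listed price
    applies to input tokens and the second to output tokens.
    """
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Prices listed for inflection/inflection-3-productivity: $0.010/1k and $0.040/1k.
cost = estimate_cost(input_tokens=2000, output_tokens=500,
                     input_price_per_1k=0.010, output_price_per_1k=0.040)
print(f"Estimated cost: ${cost:.4f}")
```

If the listing's prices turn out to be labelled differently, only the two price arguments need to change.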


Inflection: Inflection 3 Pi

$0.010/1k

$0.040/1k

inflection/inflection-3-pi

Inflection 3 Pi powers Inflection's Pi chatbot, including backstory, emotional intelligence, productivity, and safety. It has access to recent news and excels in scenarios like customer support and roleplay. Pi has been trained to mirror your tone and style: if you use more emojis, so will Pi! Try experimenting with various prompts and conversation styles.

2024-10-11 8,000 text->text Other


Inception: Mercury Coder

$0.0010/1k

$0.0040/1k

inception/mercury-coder

Mercury Coder is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models like Claude 3.5 Haiku and GPT-4o Mini while matching their performance. Mercury Coder's speed means that developers can stay in the flow while coding, with rapid chat-based iteration and responsive code completion suggestions. On Copilot Arena, Mercury Coder ranks 1st in speed and ties for 2nd in quality. Read more in the blog post.

2025-05-01 32,000 text->text Other


Inception: Mercury

$0.0010/1k

$0.0040/1k

inception/mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models like GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed enables developers to provide responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in the blog post.

2025-06-27 32,000 text->text Other


Google: Gemma 3n 4B (free)

Free to use

google/gemma-3n-e4b-it:free

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs (text, visual data, and audio), enabling diverse tasks such as text generation, speech recognition, translation, and image analysis. Leveraging innovations like Per-Layer Embedding (PLE) caching and the MatFormer architecture, Gemma 3n manages memory usage and computational load by selectively loading and activating model parameters according to the task or device capabilities, which significantly reduces runtime resource requirements. The model supports a wide linguistic range (trained on over 140 languages) and features a flexible 32K token context window, making it well suited for privacy-focused, offline-capable applications and on-device AI solutions. Read more in the blog post.

2025-05-21 8,192 text->text Other


Google: Gemma 3n 4B

$0.0001/1k

$0.0002/1k

google/gemma-3n-e4b-it

2025-05-21 32,768 text->text Other


Google: Gemma 3n 2B (free)

Free to use

google/gemma-3n-e2b-it:free

Gemma 3n E2B IT is a multimodal, instruction-tuned model developed by Google DeepMind, designed to operate efficiently at an effective parameter size of 2B while leveraging a 6B architecture. Based on the MatFormer architecture, it supports nested submodels and modular composition via the Mix-and-Match framework. Gemma 3n models are optimized for low-resource deployment, offering 32K context length and strong multilingual and reasoning performance across common benchmarks. This variant is trained on a diverse corpus including code, math, web, and multimodal data.

2025-07-09 8,192 text->text Other


EleutherAI: Llemma 7b

$0.0032/1k

$0.0048/1k

eleutherai/llemma_7b

Llemma 7B is a language model for mathematics. It was initialized with Code Llama 7B weights, and trained on the Proof-Pile-2 for 200B tokens. Llemma models are particularly strong at chain-of-thought mathematical reasoning and using computational tools for mathematics, such as Python and formal theorem provers.

2025-04-14 4,096 text->text Other
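The number in the metadata line above (4,096 here) appears to be the model's context length in tokens; the listing does not label the column, so that reading is an assumption. Under that assumption, a minimal sketch for checking whether a prompt plus the requested completion fits a model might look like this. `fits_context` and the token counts are hypothetical, and counting tokens accurately requires each model's own tokenizer.

```python
# Context lengths copied from this listing (assumed to be measured in tokens).
CONTEXT_LENGTHS = {
    "eleutherai/llemma_7b": 4_096,
    "inception/mercury-coder": 32_000,
    "deepseek/deepseek-r1-distill-qwen-1.5b": 131_072,
}


def fits_context(model_id: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the prompt plus the requested output fits the context window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_LENGTHS[model_id]


# A 3,500-token prompt with a 1,000-token completion budget:
print(fits_context("eleutherai/llemma_7b", 3_500, 1_000))     # exceeds 4,096
print(fits_context("inception/mercury-coder", 3_500, 1_000))  # within 32,000
```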


Dolphin3.0 R1 Mistral 24B (free)

Free to use

cognitivecomputations/dolphin3.0-r1-mistral-24b:free

Dolphin 3.0 R1 is the next generation of the Dolphin series of instruct-tuned models. It is designed to be the ultimate general-purpose local model, enabling coding, math, agentic, function calling, and general use cases. The R1 version has been trained for 3 epochs to reason using 800k reasoning traces from the Dolphin-R1 dataset. Dolphin aims to be a general-purpose reasoning instruct model, similar to the models behind ChatGPT, Claude, and Gemini. Part of the Dolphin 3.0 Collection. Curated and trained by Eric Hartford, Ben Gitter, BlouseJury, and Cognitive Computations.

2025-02-14 32,768 text->text Other


Dolphin3.0 R1 Mistral 24B

$0.0001/1k

cognitivecomputations/dolphin3.0-r1-mistral-24b

2025-02-14 32,768 text->text Other


Dolphin3.0 Mistral 24B (free)

Free to use

cognitivecomputations/dolphin3.0-mistral-24b:free

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. It is designed to be the ultimate general-purpose local model, enabling coding, math, agentic, function calling, and general use cases. Dolphin aims to be a general-purpose instruct model, similar to the models behind ChatGPT, Claude, and Gemini. Part of the Dolphin 3.0 Collection. Curated and trained by Eric Hartford, Ben Gitter, BlouseJury, and Cognitive Computations.

2025-02-13 32,768 text->text Other


DeepSeek: R1 Distill Qwen 1.5B

$0.0007/1k

deepseek/deepseek-r1-distill-qwen-1.5b

DeepSeek R1 Distill Qwen 1.5B is a distilled large language model based on Qwen 2.5 Math 1.5B and fine-tuned on outputs from DeepSeek R1. It is a very small and efficient model that outperforms GPT-4o-0513 on math benchmarks. Other benchmark results include AIME 2024 pass@1: 28.9; AIME 2024 cons@64: 52.7; MATH-500 pass@1: 83.9. Fine-tuning on DeepSeek R1's outputs enables performance competitive with larger frontier models.

2025-01-31 131,072 text->text Other
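For readers unfamiliar with the metrics quoted above: pass@1 is commonly read as accuracy when a single sampled answer is scored per problem, and cons@64 as accuracy after a majority vote over 64 sampled answers per problem. The sketch below illustrates only the majority-vote step on toy data. The sample answers are made up for illustration, and `majority_vote` and `consensus_accuracy` are hypothetical helpers, not DeepSeek's evaluation code.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def consensus_accuracy(samples_per_problem: list[list[str]],
                       references: list[str]) -> float:
    """Fraction of problems whose majority-vote answer matches the reference."""
    correct = sum(
        majority_vote(samples) == reference
        for samples, reference in zip(samples_per_problem, references)
    )
    return correct / len(references)


# Toy data: three problems with three sampled answers each (cons@3, not cons@64).
samples = [["42", "42", "41"], ["7", "8", "8"], ["3", "5", "9"]]
references = ["42", "8", "5"]
print(consensus_accuracy(samples, references))
```

In this toy run the first two problems are solved by the vote and the third is not, since its three samples all disagree and the first one seen is wrong.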

