AI Models
Browse 351 AI models from 68 providers
Claude 3.5 Sonnet
Anthropic
Anthropic's most balanced model, excelling at coding, technical writing, and complex reasoning with industry-leading safety.
Claude 3 Haiku
Anthropic
Anthropic's fastest and most cost-effective model, optimized for near-instant responses and high-throughput applications.
Claude 3 Opus
Anthropic
Anthropic's most capable and intelligent model, designed for complex analytical tasks, research, and deep reasoning.
DALL-E 3
OpenAI
OpenAI's state-of-the-art text-to-image generation model with exceptional prompt adherence and photorealistic output quality.
DeepSeek-R1
DeepSeek
A specialized reasoning model that uses chain-of-thought processing to tackle complex logic, math, and problem-solving tasks.
DeepSeek-V3
DeepSeek
A powerful open-weight language model with advanced reasoning capabilities and highly competitive pricing.
ElevenLabs Multilingual
ElevenLabs
Industry-leading AI voice synthesis with natural intonation, emotional range, and support for 29 languages.
Gemini 1.5 Flash
Google
A faster, lighter version of Gemini 1.5 optimized for high-throughput applications while retaining the 1 million token context window.
Gemini 1.5 Pro
Google
Google's most capable mid-size model featuring an unprecedented 1 million token context window and native multimodal understanding.
Gemini 2.0 Flash
Google
Google's next-generation Flash model with improved performance, lower latency, and enhanced multimodal capabilities.
GPT-4o Mini
OpenAI
A smaller, faster, and more affordable version of GPT-4o designed for lightweight tasks while maintaining strong reasoning capabilities.
GPT-4o
OpenAI
OpenAI's cutting-edge flagship multimodal model with vision, audio, and text capabilities. Delivers GPT-4-class intelligence with significantly improved speed and cost efficiency.
Grok-2
xAI
xAI's conversational model with real-time knowledge of the world through integrated X (Twitter) data access.
Llama 3.1 405B
Meta
Meta's largest open-weight language model, delivering frontier-level performance rivaling top closed-source models.
Midjourney V6
Midjourney
The latest iteration of Midjourney's renowned text-to-image model, known for exceptional artistic quality and stylistic versatility.
Mistral Large
Mistral
Mistral AI's flagship large language model with native multilingual capabilities and strong reasoning performance.
Sora
OpenAI
OpenAI's groundbreaking text-to-video generation model capable of creating photorealistic videos from text descriptions.
Stable Diffusion 3
Stability AI
An open-weight text-to-image model offering state-of-the-art quality with efficient scaling across different model sizes.
Suno V4
Suno
A state-of-the-art AI music generation model that creates full songs with vocals, lyrics, and instrumentation from text prompts.
Whisper Large V3
OpenAI
OpenAI's most accurate speech-to-text model supporting 99+ languages with robust noise handling and transcription quality.
MoonshotAI: Kimi K2.7 Code
moonshotai
MoonshotAI: Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end programming tasks reliably over long contexts.
Anthropic: Claude Fable Latest
~anthropic
This model always redirects to the latest model in the Claude Fable family.
Anthropic: Claude Fable 5
anthropic
Claude Fable 5 is a Mythos-class model from Anthropic, built for autonomous knowledge work and coding.
Nex AGI: Nex-N2-Pro (free)
nex-agi
Nex-N2-Pro is an agentic mixture-of-experts model from Nex AGI, with 17B active parameters out of 397B total.
NVIDIA: Nemotron 3.5 Content Safety (free)
nvidia
NVIDIA Nemotron 3.5 Content Safety is a compact 4B-parameter multimodal guardrail model from NVIDIA, fine-tuned from Google Gemma-3-4B.
NVIDIA: Nemotron 3 Ultra (free)
nvidia
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE).
NVIDIA: Nemotron 3 Ultra
nvidia
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE).
Qwen: Qwen3.7 Plus
qwen
Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series.
MiniMax: MiniMax M3
minimax
MiniMax-M3 is a multimodal foundation model from MiniMax.
StepFun: Step 3.7 Flash
stepfun
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model.
Anthropic: Claude Opus 4.8 (Fast)
anthropic
Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4.8. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode.
Anthropic: Claude Opus 4.8
anthropic
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family.
Qwen: Qwen3.7 Max
qwen
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series.
xAI: Grok Build 0.1
x-ai
Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows.
Google: Gemini 3.5 Flash
google
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed.
Anthropic: Claude Opus 4.7 (Fast)
anthropic
Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode.
OpenRouter: Fusion
openrouter
Fusion turns your prompt into a small multi-model deliberation.
Perceptron: Perceptron Mk1
perceptron
Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It accepts image and video inputs paired with natural language queries, and produces detailed visual understanding...
inclusionAI: Ring-2.6-1T
inclusionai
Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that require both strong capability and operational efficiency.
Google: Gemini 3.1 Flash Lite
google
Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads.
OpenAI: GPT Chat Latest
openai
GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always resolves to the latest Instant chat model used in ChatGPT.
xAI: Grok 4.3
x-ai
Grok 4.3 is a reasoning model from xAI.
IBM: Granite 4.1 8B
ibm-granite
Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family.
Mistral: Mistral Medium 3.5
mistralai
Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI.
Owl Alpha
openrouter
Owl Alpha is a high-performance foundation model designed for agentic workloads.
NVIDIA: Nemotron 3 Nano Omni (free)
nvidia
NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems.
Poolside: Laguna XS.2 (free)
poolside
Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://poolside.ai), their efficient coding agent series.
Poolside: Laguna M.1 (free)
poolside
Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai), optimized for complex software engineering tasks.
Anthropic Claude Haiku Latest
~anthropic
This model always redirects to the latest model in the Anthropic Claude Haiku family.
OpenAI GPT Mini Latest
~openai
This model always redirects to the latest model in the OpenAI GPT Mini family.
Google Gemini Pro Latest
~google
This model always redirects to the latest model in the Google Gemini Pro family.
MoonshotAI Kimi Latest
~moonshotai
This model always redirects to the latest model in the MoonshotAI Kimi family.
Google Gemini Flash Latest
~google
This model always redirects to the latest model in the Google Gemini Flash family.
Anthropic Claude Sonnet Latest
~anthropic
This model always redirects to the latest model in the Anthropic Claude Sonnet family.
OpenAI GPT Latest
~openai
This model always redirects to the latest model in the OpenAI GPT family.
Qwen: Qwen3.5 Plus 2026-04-20
qwen
Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba.
Qwen: Qwen3.6 Flash
qwen
Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series.
Qwen: Qwen3.6 35B A3B
qwen
Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token.
Qwen: Qwen3.6 Max Preview
qwen
Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters.
Qwen: Qwen3.6 27B
qwen
Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026.
OpenAI: GPT-5.5 Pro
openai
GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads.
OpenAI: GPT-5.5
openai
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks.
DeepSeek: DeepSeek V4 Pro
deepseek
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window.
DeepSeek: DeepSeek V4 Flash
deepseek
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B activated parameters, supporting a 1M-token context window.
inclusionAI: Ling-2.6-1T
inclusionai
Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-parameter flagship, designed for real-world agents that require fast execution and high efficiency at scale.
Tencent: Hy3 preview
tencent
Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use.
Xiaomi: MiMo-V2.5-Pro
xiaomi
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro...
Xiaomi: MiMo-V2.5
xiaomi
MiMo-V2.5 is a native omnimodal model by Xiaomi.
OpenAI: GPT-5.4 Image 2
openai
[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2.
inclusionAI: Ling-2.6-flash
inclusionai
Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameters and 7.4B active parameters, designed for real-world agents that require fast responses, strong execution, and high token efficiency...
Anthropic: Claude Opus Latest
~anthropic
This model always redirects to the latest model in the Claude Opus family.
Pareto Code Router
openrouter
The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) coding percentiles.
MoonshotAI: Kimi K2.6
moonshotai
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX generation, and multi-agent orchestration.
Anthropic: Claude Opus 4.7
anthropic
Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents.
Anthropic: Claude Opus 4.6 (Fast)
anthropic
Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing. Learn more in Anthropic's docs: https://platform.claude.com/docs/en/build-with-claude/fast-mode.
Z.ai: GLM 5.1
z-ai
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks.
Google: Gemma 4 26B A4B (free)
google
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind.
Google: Gemma 4 26B A4B
google
Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind.
Google: Gemma 4 31B (free)
google
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output.
Google: Gemma 4 31B
google
Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output.
Qwen: Qwen3.6 Plus
qwen
Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference.
Arcee AI: Trinity Large Thinking
arcee-ai
Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI.
xAI: Grok 4.20 Multi-Agent
x-ai
Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows.
xAI: Grok 4.20
x-ai
Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities.
Google: Lyria 3 Pro Preview
google
Full-length songs are priced at $0.08 per song.
Google: Lyria 3 Clip Preview
google
30-second clips are priced at $0.04 per clip.
Kwaipilot: KAT-Coder-Pro V2
kwaipilot
KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration.
Reka Edge
rekaai
Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs.
MiniMax: MiniMax M2.7
minimax
MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement.
OpenAI: GPT-5.4 Nano
openai
GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks.
OpenAI: GPT-5.4 Mini
openai
GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads.
Mistral: Mistral Small 4
mistralai
Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system.
Z.ai: GLM 5 Turbo
z-ai
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios.
NVIDIA: Nemotron 3 Super (free)
nvidia
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications.
NVIDIA: Nemotron 3 Super
nvidia
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-agent applications.
ByteDance Seed: Seed-2.0-Lite
bytedance-seed
Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering noticeably lower latency, making it a practical default choice for most production workloads across...
Qwen: Qwen3.5-9B
qwen
Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture.
OpenAI: GPT-5.4 Pro
openai
GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks.
OpenAI: GPT-5.4
openai
GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system.
Inception: Mercury 2
inception
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM).
OpenAI: GPT-5.3 Chat
openai
GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful.
Google: Gemini 3.1 Flash Lite Preview
google
Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases.
ByteDance Seed: Seed-2.0-Mini
bytedance-seed
Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment.
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
google
Gemini 3.1 Flash Image Preview, a.k.a.
Qwen: Qwen3.5-35B-A3B
qwen
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency.
Qwen: Qwen3.5-27B
qwen
The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance.
Qwen: Qwen3.5-122B-A10B
qwen
The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
Qwen: Qwen3.5-Flash
qwen
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
LiquidAI: LFM2-24B-A2B
liquid
LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment.
Google: Gemini 3.1 Pro Preview Custom Tools
google
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party...
OpenAI: GPT-5.3-Codex
openai
GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2.
AionLabs: Aion-2.0
aion-labs
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling.
Google: Gemini 3.1 Pro Preview
google
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.
Anthropic: Claude Sonnet 4.6
anthropic
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.
Qwen: Qwen3.5 Plus 2026-02-15
qwen
The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.
Qwen: Qwen3.5 397B A17B
qwen
The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
MiniMax: MiniMax M2.5
minimax
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.
Z.ai: GLM 5
z-ai
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows.
Qwen: Qwen3 Max Thinking
qwen
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning.
Anthropic: Claude Opus 4.6
anthropic
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Qwen: Qwen3 Coder Next
qwen
Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows.
Free Models Router
openrouter
The simplest way to get free inference.
StepFun: Step 3.5 Flash
stepfun
Step 3.5 Flash is StepFun's most capable open-source foundation model.
MoonshotAI: Kimi K2.5
moonshotai
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm.
Upstage: Solar Pro 3
upstage
Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model.
MiniMax: MiniMax M2-her
minimax
MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations.
Writer: Palmyra X5
writer
Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise.
LiquidAI: LFM2.5-1.2B-Thinking (free)
liquid
LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG, while still running comfortably on edge devices.
LiquidAI: LFM2.5-1.2B-Instruct (free)
liquid
LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI.
OpenAI: GPT Audio
openai
The gpt-audio model is OpenAI's first generally available audio model.
OpenAI: GPT Audio Mini
openai
A cost-efficient version of GPT Audio.
Z.ai: GLM 4.7 Flash
z-ai
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency.
OpenAI: GPT-5.2-Codex
openai
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows.
ByteDance Seed: Seed 1.6 Flash
bytedance-seed
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding.
ByteDance Seed: Seed 1.6
bytedance-seed
Seed 1.6 is a general-purpose model released by the ByteDance Seed team.
MiniMax: MiniMax M2.1
minimax
MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development.
Z.ai: GLM 4.7
z-ai
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.
Google: Gemini 3 Flash Preview
google
Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance.
Xiaomi: MiMo-V2-Flash
xiaomi
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.
NVIDIA: Nemotron 3 Nano 30B A3B (free)
nvidia
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems.
NVIDIA: Nemotron 3 Nano 30B A3B
nvidia
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems.
OpenAI: GPT-5.2 Chat
openai
GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence.
OpenAI: GPT-5.2 Pro
openai
GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro.
OpenAI: GPT-5.2
openai
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context performance compared to GPT-5.1.
Mistral: Devstral 2 2512
mistralai
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding.
Relace: Relace Search
relace
The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request.
Z.ai: GLM 4.6V
z-ai
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media.
EssentialAI: Rnj 1 Instruct
essentialai
Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained from scratch with a focus on programming, math, and scientific reasoning.
Body Builder (beta)
openrouter
Transform your natural language requests into structured OpenRouter API request objects.
OpenAI: GPT-5.1-Codex-Max
openai
GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks.
Amazon: Nova 2 Lite
amazon
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text.
Mistral: Ministral 3 14B 2512
mistralai
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart.
Mistral: Ministral 3 8B 2512
mistralai
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
Mistral: Ministral 3 3B 2512
mistralai
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.
Mistral: Mistral Large 3 2512
mistralai
Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license.
Arcee AI: Trinity Mini
arcee-ai
Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token.
DeepSeek: DeepSeek V3.2
deepseek
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.
Prime Intellect: INTELLECT-3
prime-intellect
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL).
Anthropic: Claude Opus 4.5
anthropic
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.
AllenAI: Olmo 3 32B Think
allenai
Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios.
Google: Nano Banana Pro (Gemini 3 Pro Image Preview)
google
Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.
Deep Cogito: Cogito v2.1 671B
deepcogito
Cogito v2.1 671B MoE represents one of the strongest open models globally, matching the performance of frontier closed and open models.
OpenAI: GPT-5.1
openai
GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5.
OpenAI: GPT-5.1 Chat
openai
GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence.
OpenAI: GPT-5.1-Codex
openai
GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows.
OpenAI: GPT-5.1-Codex-Mini
openai
GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex.
MoonshotAI: Kimi K2 Thinking
moonshotai
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning.
Amazon: Nova Premier 1.0
amazon
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for distilling custom models.
Perplexity: Sonar Pro Search
perplexity
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system.
Mistral: Voxtral Small 24B 2507
mistralai
Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance.
OpenAI: gpt-oss-safeguard-20b
openai
gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b.
NVIDIA: Nemotron Nano 12B 2 VL (free)
nvidia
NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence.
MiniMax: MiniMax M2
minimax
MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows.
Qwen: Qwen3 VL 32B Instruct
qwen
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video.
IBM: Granite 4.0 Micro
ibm-granite
Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models.
Microsoft: Phi 4 Mini Instruct
microsoft
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data.
OpenAI: GPT-5 Image Mini
openai
GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation.
Anthropic: Claude Haiku 4.5
anthropic
Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models.
Qwen: Qwen3 VL 8B Thinking
qwen
Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences.
Qwen: Qwen3 VL 8B Instruct
qwen
Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video.
OpenAI: GPT-5 Image
openai
[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities.
OpenAI: o3 Deep Research
openai
o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool, which adds additional cost.
OpenAI: o4 Mini Deep Research
openai
o4-mini-deep-research is OpenAI's faster, more affordable deep research model, ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool, which adds additional cost.
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5
nvidia
Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context.
Google: Nano Banana (Gemini 2.5 Flash Image)
google
Gemini 2.5 Flash Image, a.k.a.
Qwen: Qwen3 VL 30B A3B Thinking
qwen
Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos.
Qwen: Qwen3 VL 30B A3B Instruct
qwen
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos.
OpenAI: GPT-5 Pro
openai
GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience.
Z.ai: GLM 4.6
z-ai
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex...
Anthropic: Claude Sonnet 4.5
anthropic
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows.
DeepSeek: DeepSeek V3.2 Exp
deepseek
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures.
TheDrummer: Cydonia 24B V4.1
thedrummer
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.
Relace: Relace Apply 3
relaceRelace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files.
Google: Gemini 2.5 Flash Lite Preview 09-2025
googleGemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
Qwen: Qwen3 VL 235B A22B Thinking
qwenQwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video.
Qwen: Qwen3 VL 235B A22B Instruct
qwenQwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video.
Qwen: Qwen3 Max
qwenQwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version.
Qwen: Qwen3 Coder Plus
qwenQwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B.
OpenAI: GPT-5 Codex
openaiGPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows.
DeepSeek: DeepSeek V3.1 Terminus
deepseekDeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, further optimizing the model's....
Qwen: Qwen3 Coder Flash
qwenQwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen3 Coder Plus.
Qwen: Qwen3 Next 80B A3B Thinking
qwenQwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default.
Qwen: Qwen3 Next 80B A3B Instruct (free)
qwenQwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces.
Qwen: Qwen3 Next 80B A3B Instruct
qwenQwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces.
Qwen: Qwen Plus 0728 (thinking)
qwenQwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
Qwen: Qwen Plus 0728
qwenQwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.
NVIDIA: Nemotron Nano 9B V2 (free)
nvidiaNVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks.
MoonshotAI: Kimi K2 0905
moonshotaiKimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2).
Qwen: Qwen3 30B A3B Thinking 2507
qwenQwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking.
Nous: Hermes 4 70B
nousresearchHermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B.
Nous: Hermes 4 405B
nousresearchHermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research.
DeepSeek: DeepSeek V3.1
deepseekDeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates.
Mistral: Mistral Medium 3.1
mistralaiMistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.
Z.ai: GLM 4.5V
z-aiGLM-4.5V is a vision-language foundation model for multimodal agent applications.
AI21: Jamba Large 1.7
ai21Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency.
OpenAI: GPT-5 Chat
openaiGPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.
OpenAI: GPT-5
openaiGPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience.
OpenAI: GPT-5 Mini
openaiGPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks.
OpenAI: GPT-5 Nano
openaiGPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments.
OpenAI: gpt-oss-120b (free)
openaigpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.
OpenAI: gpt-oss-120b
openaigpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.
OpenAI: gpt-oss-20b (free)
openaigpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license.
OpenAI: gpt-oss-20b
openaigpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license.
Anthropic: Claude Opus 4.1
anthropicClaude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks.
Mistral: Codestral 2508
mistralaiMistral's cutting-edge language model for coding released end of July 2025.
Qwen: Qwen3 Coder 30B A3B Instruct
qwenQwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use.
Qwen: Qwen3 30B A3B Instruct 2507
qwenQwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference.
Z.ai: GLM 4.5
z-aiGLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications.
Z.ai: GLM 4.5 Air
z-aiGLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications.
Qwen: Qwen3 235B A22B Thinking 2507
qwenQwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks.
Qwen: Qwen3 Coder 480B A35B (free)
qwenQwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team.
Qwen: Qwen3 Coder 480B A35B
qwenQwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team.
ByteDance: UI-TARS 7B
bytedanceUI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games.
Google: Gemini 2.5 Flash Lite
googleGemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.
Qwen: Qwen3 235B A22B Instruct 2507
qwenQwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass.
Switchpoint Router
switchpointSwitchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library.
MoonshotAI: Kimi K2 0711
moonshotaiKimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass.
Venice: Uncensored (free)
cognitivecomputationsVenice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai.
Tencent: Hunyuan A13B Instruct
tencentHunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought.
Morph: Morph V3 Large
morphMorph's high-accuracy apply model for complex code edits.
Morph: Morph V3 Fast
morphMorph's fastest apply model for code edits.
Baidu: ERNIE 4.5 VL 424B A47B
baiduERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token.
Mistral: Mistral Small 3.2 24B
mistralaiMistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling.
MiniMax: MiniMax M1
minimaxMiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference.
Google: Gemini 2.5 Flash
googleGemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.
Google: Gemini 2.5 Pro
googleGemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.
OpenAI: o3 Pro
openaiThe o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning.
Google: Gemini 2.5 Pro Preview 06-05
googleGemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.
DeepSeek: R1 0528
deepseekMay 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens.
Anthropic: Claude Opus 4
anthropicClaude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows.
Anthropic: Claude Sonnet 4
anthropicClaude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability.
Google: Gemma 3n 4B
googleGemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets.
Mistral: Mistral Medium 3
mistralaiMistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost.
Google: Gemini 2.5 Pro Preview 05-06
googleGemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.
Arcee AI: Virtuoso Large
arcee-aiVirtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA.
Arcee AI: Coder Large
arcee-aiCoder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora.
Meta: Llama Guard 4 12B
meta-llamaLlama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification.
Qwen: Qwen3 30B A3B
qwenQwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks.
Qwen: Qwen3 8B
qwenQwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue.
Qwen: Qwen3 14B
qwenQwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue.
Qwen: Qwen3 32B
qwenQwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue.
Qwen: Qwen3 235B A22B
qwenQwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass.
OpenAI: o4 Mini High
openaiOpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high.
OpenAI: o3
openaio3 is a well-rounded and powerful model across domains.
OpenAI: o4 Mini
openaiOpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities.
OpenAI: GPT-4.1
openaiGPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning.
OpenAI: GPT-4.1 Mini
openaiGPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost.
OpenAI: GPT-4.1 Nano
openaiFor tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series.
Meta: Llama 4 Maverick
meta-llamaLlama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward....
Meta: Llama 4 Scout
meta-llamaLlama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B.
DeepSeek: DeepSeek V3 0324
deepseekDeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team.
OpenAI: o1-pro
openaiThe o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning.
Mistral: Mistral Small 3.1 24B
mistralaiMistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities.
Google: Gemma 3 4B
googleGemma 3 introduces multimodality, supporting vision-language input and text outputs.
Google: Gemma 3 12B
googleGemma 3 introduces multimodality, supporting vision-language input and text outputs.
Cohere: Command A
cohereCommand A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases.
OpenAI: GPT-4o-mini Search Preview
openaiGPT-4o mini Search Preview is a specialized model for web search in Chat Completions.
OpenAI: GPT-4o Search Preview
openaiGPT-4o Search Preview is a specialized model for web search in Chat Completions.
Reka Flash 3
rekaaiReka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka.
Google: Gemma 3 27B
googleGemma 3 introduces multimodality, supporting vision-language input and text outputs.
TheDrummer: Skyfall 36B V2
thedrummerSkyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.
Perplexity: Sonar Reasoning Pro
perplexityNote: Sonar Pro pricing includes Perplexity search pricing.
Perplexity: Sonar Pro
perplexityNote: Sonar Pro pricing includes Perplexity search pricing.
Perplexity: Sonar Deep Research
perplexitySonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics.
Mistral: Saba
mistralaiMistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance.
OpenAI: o3 Mini High
openaiOpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high.
AionLabs: Aion-1.0
aion-labsAion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding.
AionLabs: Aion-1.0-Mini
aion-labsAion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, designed for strong performance in reasoning domains such as mathematics, coding, and logic.
AionLabs: Aion-RP 1.0 (8B)
aion-labsAion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBench-Auto benchmark, a roleplaying-specific variant of Arena-Hard-Auto, where LLMs evaluate each other’s responses.
Qwen: Qwen2.5 VL 72B Instruct
qwenQwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects.
Qwen: Qwen-Plus
qwenQwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost combination.
OpenAI: o3 Mini
openaiOpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding.
Mistral: Mistral Small 3
mistralaiMistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks.
DeepSeek: R1 Distill Qwen 32B
deepseekDeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1).
Perplexity: Sonar
perplexitySonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources.
DeepSeek: R1 Distill Llama 70B
deepseekDeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1).
MiniMax: MiniMax-01
minimaxMiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding.
Microsoft: Phi 4
microsoft[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed.
Sao10K: Llama 3.1 70B Hanami x1
sao10kThis is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).
DeepSeek: DeepSeek V3
deepseekDeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions.
Sao10K: Llama 3.3 Euryale 70B
sao10kEuryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k).
OpenAI: o1
openaiThe latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding.
Cohere: Command R7B (12-2024)
cohereCommand R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024.
Meta: Llama 3.3 70B Instruct (free)
meta-llamaThe Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out).
Meta: Llama 3.3 70B Instruct
meta-llamaThe Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out).
Amazon: Nova Lite 1.0
amazonAmazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon focused on fast processing of image, video, and text inputs to generate text output.
Amazon: Nova Micro 1.0
amazonAmazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost.
Amazon: Nova Pro 1.0
amazonAmazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks.
OpenAI: GPT-4o (2024-11-20)
openaiThe 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability.
Mistral Large 2407
mistralaiThis is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407).
Qwen2.5 Coder 32B Instruct
qwenQwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen).
TheDrummer: UnslopNemo 12B
thedrummerUnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.
Anthropic: Claude 3.5 Haiku
anthropicClaude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use.
Magnum v4 72B
anthracite-orgThis is a series of models designed to replicate the prose quality of the Claude 3 models, specifically [Sonnet](https://openrouter.ai/anthropic/claude-3.5-sonnet) and [Opus](https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://openrouter.ai/qwen/qwen-2.5-72b-instruct).
Qwen: Qwen2.5 7B Instruct
qwenQwen2.5 7B is the latest series of Qwen large language models.
Inflection: Inflection 3 Productivity
inflectionInflection 3 Productivity is optimized for following instructions.
Inflection: Inflection 3 Pi
inflectionInflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety.
TheDrummer: Rocinante 12B
thedrummerRocinante 12B is designed for engaging storytelling and rich prose.
Meta: Llama 3.2 3B Instruct (free)
meta-llamaLlama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization.
Meta: Llama 3.2 3B Instruct
meta-llamaLlama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization.
Meta: Llama 3.2 1B Instruct
meta-llamaLlama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis.
Meta: Llama 3.2 11B Vision Instruct
meta-llamaLlama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data.
Qwen2.5 72B Instruct
qwenQwen2.5 72B is the latest series of Qwen large language models.
Cohere: Command R+ (08-2024)
coherecommand-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower latencies as compared to the previous Command R+ version, while keeping the hardware footprint....
Sao10K: Llama 3.1 Euryale 70B v2.2
sao10kEuryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k).
Nous: Hermes 3 70B Instruct
nousresearchHermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the....
Nous: Hermes 3 405B Instruct (free)
nousresearchHermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the....
Nous: Hermes 3 405B Instruct
nousresearchHermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the....
Sao10K: Llama 3 8B Lunaris
sao10kLunaris 8B is a versatile generalist and roleplaying model based on Llama 3.
OpenAI: GPT-4o (2024-08-06)
openaiThe 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the response_format.
Meta: Llama 3.1 8B Instruct
meta-llamaMeta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors.
Meta: Llama 3.1 70B Instruct
meta-llamaMeta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors.
Mistral: Mistral Nemo
mistralaiA 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA.
OpenAI: GPT-4o-mini (2024-07-18)
openaiGPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs.
Google: Gemma 2 27B
googleGemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini).
OpenAI: GPT-4o (2024-05-13)
openaiGPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs.
Meta: Llama 3 8B Instruct
meta-llamaMeta's latest class of model (Llama 3) launched with a variety of sizes & flavors.
Meta: Llama 3 70B Instruct
meta-llamaMeta's latest class of model (Llama 3) launched with a variety of sizes & flavors.
Mistral: Mixtral 8x22B Instruct
mistralaiMistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b).
WizardLM-2 8x22B
microsoftWizardLM-2 8x22B is Microsoft AI's most advanced Wizard model.
OpenAI: GPT-4 Turbo
openaiThe latest GPT-4 Turbo model with vision capabilities.
OpenAI: GPT-4 Turbo Preview
openaiThe preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more.
OpenAI: GPT-3.5 Turbo (older v0613)
openaiGPT-3.5 Turbo is OpenAI's fastest model.
Auto Router
openrouterYour prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output.
OpenAI: GPT-3.5 Turbo Instruct
openaiThis model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations.
OpenAI: GPT-3.5 Turbo 16k
openaiThis model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost.
Mancer: Weaver (alpha)
mancerAn attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory.
ReMM SLERP 13B
undi95A recreation trial of the original MythoMax-L2-B13 but with updated models.
MythoMax 13B
grypheOne of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay.
OpenAI: GPT-4
openaiOpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning....
OpenAI: GPT-3.5 Turbo
openaiGPT-3.5 Turbo is OpenAI's fastest model.