Every model, one place.

GptGet AI model API gateway: one OpenAI-compatible key for GPT, Claude, Gemini, DeepSeek, Qwen and more.

S StepFun: Step 3.7 Flash

stepfun/step-3.7-flash

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model.

256,000 tokens

Anthropic: Claude Opus 4.8 (Fast)

anthropic/claude-opus-4.8-fast

Fast-mode variant of [Opus 4.8](/anthropic/claude-opus-4.8) - identical capabilities wit

1,000,000 tokens

Anthropic: Claude Opus 4.8

anthropic/claude-opus-4.8

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family

1,000,000 tokens

Qwen: Qwen3.7 Max

qwen/qwen3.7-max

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input an

1,000,000 tokens

X xAI: Grok Build 0.1

x-ai/grok-build-0.1

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engi

256,000 tokens

Google: Gemini 3.5 Flash

google/gemini-3.5-flash

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level c

1,048,576 tokens

Anthropic: Claude Opus 4.7 (Fast)

anthropic/claude-opus-4.7-fast

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities wit

1,000,000 tokens

P Perceptron: Perceptron Mk1

perceptron/perceptron-mk1

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for vide

32,768 tokens

I inclusionAI: Ring-2.6-1T

inclusionai/ring-2.6-1t

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for

262,144 tokens

Google: Gemini 3.1 Flash Lite

google/gemini-3.1-flash-lite

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-

1,048,576 tokens

OpenAI: GPT Chat Latest

openai/gpt-chat-latest

GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always resolves t

400,000 tokens

X xAI: Grok 4.3

x-ai/grok-4.3

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text outpu

1,000,000 tokens

I IBM: Granite 4.1 8B

ibm-granite/granite-4.1-8b

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, par

131,072 tokens

M Mistral: Mistral Medium 3.5

mistralai/mistral-medium-3-5

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It suppo

262,144 tokens

O Owl Alpha

openrouter/owl-alpha

Owl Alpha is a high-performance foundation model designed for agentic workloads. Nativel

1,048,756 tokens

N NVIDIA: Nemotron 3 Nano Omni (free)

nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as

256,000 tokens

P Poolside: Laguna XS.2 (free)

poolside/laguna-xs.2:free

Laguna XS.2 is the second-generation model in the XS size class from [Poolside](https://

262,144 tokens

P Poolside: Laguna M.1 (free)

poolside/laguna-m.1:free

Laguna M.1 is the flagship coding agent model from [Poolside](https://poolside.ai), opti

262,144 tokens

Qwen: Qwen3.5 Plus 2026-04-20

qwen/qwen3.5-plus-20260420

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It ac

1,000,000 tokens

Qwen: Qwen3.6 Flash

qwen/qwen3.6-flash

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It sup

1,000,000 tokens

Qwen: Qwen3.6 35B A3B

qwen/qwen3.6-35b-a3b

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion to

262,144 tokens

Qwen: Qwen3.6 Max Preview

qwen/qwen3.6-max-preview

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse

262,144 tokens

Qwen: Qwen3.6 27B

qwen/qwen3.6-27b

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba

262,144 tokens

OpenAI: GPT-5.5 Pro

openai/gpt-5.5-pro

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy

1,050,000 tokens

OpenAI: GPT-5.5

openai/gpt-5.5

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building

1,050,000 tokens

DeepSeek: DeepSeek V4 Pro

deepseek/deepseek-v4-pro

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total

1,048,576 tokens

DeepSeek: DeepSeek V4 Flash

deepseek/deepseek-v4-flash

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with

1,048,576 tokens

I inclusionAI: Ling-2.6-1T

inclusionai/ling-2.6-1t

Ling-2.6-1T is an instant (instruct) model from inclusionAI and the company’s trillion-p

262,144 tokens

T Tencent: Hy3 preview

tencent/hy3-preview

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agen

262,144 tokens

X Xiaomi: MiMo-V2.5-Pro

xiaomi/mimo-v2.5-pro

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agent

1,048,576 tokens

X Xiaomi: MiMo-V2.5

xiaomi/mimo-v2.5

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performan

1,048,576 tokens

OpenAI: GPT-5.4 Image 2

openai/gpt-5.4-image-2

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model

272,000 tokens

I inclusionAI: Ling-2.6-flash

inclusionai/ling-2.6-flash

Ling-2.6-flash is an instant (instruct) model from inclusionAI with 104B total parameter

262,144 tokens

O Pareto Code Router

openrouter/pareto-code

The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artif

2,000,000 tokens

M MoonshotAI: Kimi K2.6

moonshotai/kimi-k2.6

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon c

262,144 tokens

M MoonshotAI: Kimi K2.6 (free)

moonshotai/kimi-k2.6:free

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon c

262,144 tokens

Anthropic: Claude Opus 4.7

anthropic/claude-opus-4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asyn

1,000,000 tokens

Anthropic: Claude Opus 4.6 (Fast)

anthropic/claude-opus-4.6-fast

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities wit

1,000,000 tokens

Z Z.ai: GLM 5.1

z-ai/glm-5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains

202,752 tokens

Google: Gemma 4 26B A4B

google/gemma-4-26b-a4b-it

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google De

262,144 tokens

Google: Gemma 4 26B A4B (free)

google/gemma-4-26b-a4b-it:free

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google De

262,144 tokens

Google: Gemma 4 31B

google/gemma-4-31b-it

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text a

262,144 tokens

Google: Gemma 4 31B (free)

google/gemma-4-31b-it:free

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text a

262,144 tokens

Qwen: Qwen3.6 Plus

qwen/qwen3.6-plus

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention w

1,000,000 tokens

Z Z.ai: GLM 5V Turbo

z-ai/glm-5v-turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-

202,752 tokens

A Arcee AI: Trinity Large Thinking

arcee-ai/trinity-large-thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee

262,144 tokens

X xAI: Grok 4.20 Multi-Agent

x-ai/grok-4.20-multi-agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-

2,000,000 tokens

X xAI: Grok 4.20

x-ai/grok-4.20

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool cal

2,000,000 tokens

Google: Lyria 3 Pro Preview

google/lyria-3-pro-preview

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music gene

1,048,576 tokens

Google: Lyria 3 Clip Preview

google/lyria-3-clip-preview

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of mus

1,048,576 tokens

K Kwaipilot: KAT-Coder-Pro V2

kwaipilot/kat-coder-pro-v2

KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, des

256,000 tokens

R Reka Edge

rekaai/reka-edge

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts ima

16,384 tokens

M MiniMax: MiniMax M2.7

minimax/minimax-m2.7

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-wor

204,800 tokens

OpenAI: GPT-5.4 Nano

openai/gpt-5.4-nano

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, o

400,000 tokens

OpenAI: GPT-5.4 Mini

openai/gpt-5.4-mini

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model o

400,000 tokens

M Mistral: Mistral Small 4

mistralai/mistral-small-2603

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capa

262,144 tokens

Z Z.ai: GLM 5 Turbo

z-ai/glm-5-turbo

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance

202,752 tokens

N NVIDIA: Nemotron 3 Super

nvidia/nemotron-3-super-120b-a12b

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B p

1,000,000 tokens

N NVIDIA: Nemotron 3 Super (free)

nvidia/nemotron-3-super-120b-a12b:free

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B p

1,000,000 tokens

B ByteDance Seed: Seed-2.0-Lite

bytedance-seed/seed-2.0-lite

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong m

262,144 tokens

Qwen: Qwen3.5-9B

qwen/qwen3.5-9b

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver

262,144 tokens

OpenAI: GPT-5.4 Pro

openai/gpt-5.4-pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture

1,050,000 tokens

OpenAI: GPT-5.4

openai/gpt-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a singl

1,050,000 tokens

I Inception: Mercury 2

inception/mercury-2

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLL

128,000 tokens

OpenAI: GPT-5.3 Chat

openai/gpt-5.3-chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations

128,000 tokens

Google: Gemini 3.1 Flash Lite Preview

google/gemini-3.1-flash-lite-preview

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volum

1,048,576 tokens

B ByteDance Seed: Seed-2.0-Mini

bytedance-seed/seed-2.0-mini

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios,

262,144 tokens

Qwen: Qwen3.5-35B-A3B

qwen/qwen3.5-35b-a3b

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid arch

262,144 tokens

Qwen: Qwen3.5-27B

qwen/qwen3.5-27b

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mecha

262,144 tokens

Qwen: Qwen3.5-122B-A10B

qwen/qwen3.5-122b-a10b

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture tha

262,144 tokens

Qwen: Qwen3.5-Flash

qwen/qwen3.5-flash-02-23

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that

1,000,000 tokens

L LiquidAI: LFM2-24B-A2B

liquid/lfm-2-24b-a2b

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed fo

128,000 tokens

Google: Gemini 3.1 Pro Preview Custom Tools

google/gemini-3.1-pro-preview-customtools

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool se

1,048,756 tokens

OpenAI: GPT-5.3-Codex

openai/gpt-5.3-codex

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier sof

400,000 tokens

A AionLabs: Aion-2.0

aion-labs/aion-2.0

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytell

131,072 tokens

Google: Gemini 3.1 Pro Preview

google/gemini-3.1-pro-preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced softwar

1,048,576 tokens

Anthropic: Claude Sonnet 4.6

anthropic/claude-sonnet-4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance

1,000,000 tokens

Qwen: Qwen3.5 Plus 2026-02-15

qwen/qwen3.5-plus-02-15

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture

1,000,000 tokens

Qwen: Qwen3.5 397B A17B

qwen/qwen3.5-397b-a17b

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architect

262,144 tokens

M MiniMax: MiniMax M2.5

minimax/minimax-m2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Traine

204,800 tokens

Z Z.ai: GLM 5

z-ai/glm-5

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems des

202,752 tokens

Qwen: Qwen3 Max Thinking

qwen/qwen3-max-thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for hig

262,144 tokens

Anthropic: Claude Opus 4.6

anthropic/claude-opus-4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.

1,000,000 tokens

Qwen: Qwen3 Coder Next

qwen/qwen3-coder-next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and

262,144 tokens

O Free Models Router

openrouter/free

The simplest way to get free inference. openrouter/free is a router that selects free mo

200,000 tokens

S StepFun: Step 3.5 Flash

stepfun/step-3.5-flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse

262,144 tokens

M MoonshotAI: Kimi K2.5

moonshotai/kimi-k2.5

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual c

262,144 tokens

U Upstage: Solar Pro 3

upstage/solar-pro-3

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B tot

128,000 tokens

M MiniMax: MiniMax M2-her

minimax/minimax-m2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, ch

65,536 tokens

W Writer: Palmyra X5

writer/palmyra-x5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI ag

1,040,000 tokens

L LiquidAI: LFM2.5-1.2B-Thinking (free)

liquid/lfm-2.5-1.2b-thinking:free

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic task

32,768 tokens

L LiquidAI: LFM2.5-1.2B-Instruct (free)

liquid/lfm-2.5-1.2b-instruct:free

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fa

32,768 tokens

OpenAI: GPT Audio

openai/gpt-audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot

128,000 tokens

OpenAI: GPT Audio Mini

openai/gpt-audio-mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for

128,000 tokens

Z Z.ai: GLM 4.7 Flash

z-ai/glm-4.7-flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance a

202,752 tokens

OpenAI: GPT-5.2-Codex

openai/gpt-5.2-codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering

400,000 tokens

B ByteDance Seed: Seed 1.6 Flash

bytedance-seed/seed-1.6-flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, suppor

262,144 tokens

B ByteDance Seed: Seed 1.6

bytedance-seed/seed-1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates

262,144 tokens

M MiniMax: MiniMax M2.1

minimax/minimax-m2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for codin

204,800 tokens

Z Z.ai: GLM 4.7

z-ai/glm-4.7

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced p

202,752 tokens

Google: Gemini 3 Flash Preview

google/gemini-3-flash-preview

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic w

1,048,576 tokens

X Xiaomi: MiMo-V2-Flash

xiaomi/mimo-v2-flash

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a M

262,144 tokens

N NVIDIA: Nemotron 3 Nano 30B A3B

nvidia/nemotron-3-nano-30b-a3b

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute effici

256,000 tokens

N NVIDIA: Nemotron 3 Nano 30B A3B (free)

nvidia/nemotron-3-nano-30b-a3b:free

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute effici

256,000 tokens

OpenAI: GPT-5.2 Chat

openai/gpt-5.2-chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized

128,000 tokens

OpenAI: GPT-5.2 Pro

openai/gpt-5.2-pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic codi

400,000 tokens

OpenAI: GPT-5.2

openai/gpt-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agenti

400,000 tokens

M Mistral: Devstral 2 2512

mistralai/devstral-2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic

262,144 tokens

R Relace: Relace Search

relace/relace-search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a

256,000 tokens

Z Z.ai: GLM 4.6V

z-ai/glm-4.6v

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and

131,072 tokens

N Nex AGI: DeepSeek V3.1 Nex N1

nex-agi/deepseek-v3.1-nex-n1

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model

131,072 tokens

E EssentialAI: Rnj 1 Instruct

essentialai/rnj-1-instruct

Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and

32,768 tokens

O Body Builder (beta)

openrouter/bodybuilder

Transform your natural language requests into structured OpenRouter API request objects.

128,000 tokens

OpenAI: GPT-5.1-Codex-Max

openai/gpt-5.1-codex-max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, hi

400,000 tokens

A Amazon: Nova 2 Lite

amazon/nova-2-lite-v1

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can pr

1,000,000 tokens

M Mistral: Ministral 3 14B 2512

mistralai/ministral-14b-2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilitie

262,144 tokens

M Mistral: Ministral 3 8B 2512

mistralai/ministral-8b-2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny

262,144 tokens

M Mistral: Ministral 3 3B 2512

mistralai/ministral-3b-2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient ti

131,072 tokens

M Mistral: Mistral Large 3 2512

mistralai/mistral-large-2512

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture

262,144 tokens

A Arcee AI: Trinity Mini

arcee-ai/trinity-mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model fea

131,072 tokens

DeepSeek: DeepSeek V3.2

deepseek/deepseek-v3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficie

131,072 tokens

P Prime Intellect: INTELLECT-3

prime-intellect/intellect-3

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from

131,072 tokens

Anthropic: Claude Opus 4.5

anthropic/claude-opus-4.5

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software e

200,000 tokens

A AllenAI: Olmo 3 32B Think

allenai/olmo-3-32b-think

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep rea

65,536 tokens

D Deep Cogito: Cogito v2.1 671B

deepcogito/cogito-v2.1-671b

Cogito v2.1 671B MoE represents one of the strongest open models globally, matching perf

128,000 tokens

OpenAI: GPT-5.1

openai/gpt-5.1

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger genera

400,000 tokens

OpenAI: GPT-5.1 Chat

openai/gpt-5.1-chat

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized f

128,000 tokens

OpenAI: GPT-5.1-Codex

openai/gpt-5.1-codex

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and

400,000 tokens

OpenAI: GPT-5.1-Codex-Mini

openai/gpt-5.1-codex-mini

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

400,000 tokens

M MoonshotAI: Kimi K2 Thinking

moonshotai/kimi-k2-thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending

262,144 tokens

A Amazon: Nova Premier 1.0

amazon/nova-premier-v1

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reason

1,000,000 tokens

P Perplexity: Sonar Pro Search

perplexity/sonar-pro-search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexi

200,000 tokens

OpenAI: gpt-oss-safeguard-20b

openai/gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. Th

131,072 tokens

N Nemotron Nano 12b V2 Vl

nvidia/nemotron-nano-12b-v2-vl

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model desi

128,000 tokens

N NVIDIA: Nemotron Nano 12B 2 VL (free)

nvidia/nemotron-nano-12b-v2-vl:free

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model desi

128,000 tokens

M MiniMax: MiniMax M2

minimax/minimax-m2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end c

204,800 tokens

Qwen: Qwen3 VL 32B Instruct

qwen/qwen3-vl-32b-instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for hig

262,144 tokens

I IBM: Granite 4.0 Micro

ibm-granite/granite-4.0-h-micro

Granite-4.0-H-Micro is a 3B parameter from the Granite 4 family of models. These models

131,000 tokens

M Microsoft: Phi 4 Mini Instruct

microsoft/phi-4-mini-instruct

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered p

131,072 tokens

OpenAI: GPT-5 Image Mini

openai/gpt-5-image-mini

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Min

400,000 tokens

Anthropic: Claude Haiku 4.5

anthropic/claude-haiku-4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-fronti

200,000 tokens

Qwen: Qwen3 VL 8B Thinking

qwen/qwen3-vl-8b-thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal mo

256,000 tokens

Qwen: Qwen3 VL 8B Instruct

qwen/qwen3-vl-8b-instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, bui

256,000 tokens

OpenAI: GPT-5 Image

openai/gpt-5-image

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with sta

400,000 tokens

OpenAI: o3 Deep Research

openai/o3-deep-research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle comple

200,000 tokens

OpenAI: o4 Mini Deep Research

openai/o4-mini-deep-research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for

200,000 tokens

N NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

nvidia/llama-3.3-nemotron-super-49b-v1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat mod

131,072 tokens

Qwen: Qwen3 VL 30B A3B Thinking

qwen/qwen3-vl-30b-a3b-thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with

131,072 tokens

Qwen: Qwen3 VL 30B A3B Instruct

qwen/qwen3-vl-30b-a3b-instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with

262,144 tokens

OpenAI: GPT-5 Pro

openai/gpt-5-pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, cod

400,000 tokens

Z Z.ai: GLM 4.6

z-ai/glm-4.6

Compared with GLM-4.5, this generation brings several key improvements: Longer context w

202,752 tokens

Anthropic: Claude Sonnet 4.5

anthropic/claude-sonnet-4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-

1,000,000 tokens

DeepSeek: DeepSeek V3.2 Exp

deepseek/deepseek-v3.2-exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an int

163,840 tokens

T TheDrummer: Cydonia 24B V4.1

thedrummer/cydonia-24b-v4.1

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, p

131,072 tokens

R Relace: Relace Apply 3

relace/relace-apply-3

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straigh

256,000 tokens

Google: Gemini 2.5 Flash Lite Preview 09-2025

google/gemini-2.5-flash-lite-preview-09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimiz

1,048,576 tokens

Qwen: Qwen3 VL 235B A22B Thinking

qwen/qwen3-vl-235b-a22b-thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation wi

131,072 tokens

Qwen: Qwen3 VL 235B A22B Instruct

qwen/qwen3-vl-235b-a22b-instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text

262,144 tokens

Qwen: Qwen3 Max

qwen/qwen3-max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements i

262,144 tokens

Qwen: Qwen3 Coder Plus

qwen/qwen3-coder-plus

Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A3

1,000,000 tokens

OpenAI: GPT-5 Codex

openai/gpt-5-codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and cod

400,000 tokens

DeepSeek: DeepSeek V3.1 Terminus

deepseek/deepseek-v3.1-terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) tha

163,840 tokens

Qwen: Qwen3 Coder Flash

qwen/qwen3-coder-flash

Qwen3 Coder Flash is Alibaba's fast and cost efficient version of their proprietary Qwen

1,000,000 tokens

Qwen: Qwen3 Next 80B A3B Thinking

qwen/qwen3-next-80b-a3b-thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that

262,144 tokens

Qwen: Qwen3 Next 80B A3B Instruct

qwen/qwen3-next-80b-a3b-instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series

262,144 tokens

Qwen: Qwen3 Next 80B A3B Instruct (free)

qwen/qwen3-next-80b-a3b-instruct:free

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series

262,144 tokens

Qwen: Qwen Plus 0728

qwen/qwen-plus-2025-07-28

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reaso

1,000,000 tokens

Qwen: Qwen Plus 0728 (thinking)

qwen/qwen-plus-2025-07-28:thinking

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reaso

1,000,000 tokens

N NVIDIA: Nemotron Nano 9B V2

nvidia/nemotron-nano-9b-v2

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDI

128,000 tokens

N NVIDIA: Nemotron Nano 9B V2 (free)

nvidia/nemotron-nano-9b-v2:free

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDI

128,000 tokens

M MoonshotAI: Kimi K2 0905

moonshotai/kimi-k2-0905

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a larg

262,144 tokens

Qwen: Qwen3 30B A3B Thinking 2507

qwen/qwen3-30b-a3b-thinking-2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimi

131,072 tokens

N Nous: Hermes 4 70B

nousresearch/hermes-4-70b

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B

131,072 tokens

N Nous: Hermes 4 405B

nousresearch/hermes-4-405b

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by N

131,072 tokens

DeepSeek: DeepSeek V3.1

deepseek/deepseek-chat-v3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that suppo

163,840 tokens

M Mistral: Mistral Medium 3.1

mistralai/mistral-medium-3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performanc

131,072 tokens

B Baidu: ERNIE 4.5 VL 28B A3B

baidu/ernie-4.5-vl-28b-a3b

A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with

131,072 tokens

Z Z.ai: GLM 4.5V

z-ai/glm-4.5v

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built

65,536 tokens

A AI21: Jamba Large 1.7

ai21/jamba-large-1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in g

256,000 tokens

OpenAI: GPT-5 Chat

openai/gpt-5-chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversation

128,000 tokens

OpenAI: GPT-5

openai/gpt-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code qu

400,000 tokens

OpenAI: GPT-5 Mini

openai/gpt-5-mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning ta

400,000 tokens

OpenAI: GPT-5 Nano

openai/gpt-5-nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for develo

400,000 tokens

OpenAI: gpt-oss-120b

openai/gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model f

131,072 tokens

OpenAI: gpt-oss-120b (free)

openai/gpt-oss-120b:free

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model f

131,072 tokens

OpenAI: gpt-oss-20b

openai/gpt-oss-20b

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.

131,072 tokens

OpenAI: gpt-oss-20b (free)

openai/gpt-oss-20b:free

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.

131,072 tokens

Anthropic: Claude Opus 4.1

anthropic/claude-opus-4.1

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved p

200,000 tokens

M Mistral: Codestral 2508

mistralai/codestral-2508

Mistral's cutting-edge language model for coding released end of July 2025. Codestral sp

256,000 tokens

Qwen: Qwen3 Coder 30B A3B Instruct

qwen/qwen3-coder-30b-a3b-instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 12

160,000 tokens

Qwen: Qwen3 30B A3B Instruct 2507

qwen/qwen3-30b-a3b-instruct-2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from

262,144 tokens

Z Z.ai: GLM 4.5

z-ai/glm-4.5

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applicati

131,072 tokens

Z Z.ai: GLM 4.5 Air

z-ai/glm-4.5-air

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose

131,072 tokens

Z Z.ai: GLM 4.5 Air (free)

z-ai/glm-4.5-air:free

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose

131,072 tokens

Qwen: Qwen3 235B A22B Thinking 2507

qwen/qwen3-235b-a22b-thinking-2507

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE

262,144 tokens

Z Z.ai: GLM 4 32B

z-ai/glm-4-32b

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform comp

128,000 tokens

Qwen: Qwen3 Coder 480B A35B

qwen/qwen3-coder

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model devel

1,048,576 tokens

Qwen: Qwen3 Coder 480B A35B (free)

qwen/qwen3-coder:free

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model devel

1,048,576 tokens

B ByteDance: UI-TARS 7B

bytedance/ui-tars-1.5-7b

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments,

128,000 tokens

Google: Gemini 2.5 Flash Lite

google/gemini-2.5-flash-lite

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimiz

1,048,576 tokens

Qwen: Qwen3 235B A22B Instruct 2507

qwen/qwen3-235b-a22b-2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts la

262,144 tokens

S Switchpoint Router

switchpoint/router

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI

131,072 tokens

M MoonshotAI: Kimi K2 0711

moonshotai/kimi-k2

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by M

131,072 tokens

C Venice: Uncensored (free)

cognitivecomputations/dolphin-mistral-24b-venice-edition:free

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-

32,768 tokens

T Tencent: Hunyuan A13B Instruct

tencent/hunyuan-a13b-instruct

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed

131,072 tokens

M Morph: Morph V3 Large

morph/morph-v3-large

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% acc

262,144 tokens

M Morph: Morph V3 Fast

morph/morph-v3-fast

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rap

81,920 tokens

B Baidu: ERNIE 4.5 VL 424B A47B

baidu/ernie-4.5-vl-424b-a47b

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE

131,072 tokens

M Mistral: Mistral Small 3.2 24B

mistralai/mistral-small-3.2-24b-instruct

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optim

128,000 tokens

M MiniMax: MiniMax M1

minimax/minimax-m1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context a

1,000,000 tokens

Google: Gemini 2.5 Flash

google/gemini-2.5-flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for

1,048,576 tokens

Google: Gemini 2.5 Pro

google/gemini-2.5-pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co

1,048,576 tokens

OpenAI: o3 Pro

openai/o3-pro

The o-series of models are trained with reinforcement learning to think before they answ

200,000 tokens

Google: Gemini 2.5 Pro Preview 06-05

google/gemini-2.5-pro-preview

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co

1,048,576 tokens

DeepSeek: R1 0528

deepseek/deepseek-r1-0528

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1) Performance on par

163,840 tokens

Anthropic: Claude Opus 4

anthropic/claude-opus-4

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bring

200,000 tokens

Anthropic: Claude Sonnet 4

anthropic/claude-sonnet-4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7,

1,000,000 tokens

Google: Gemma 3n 4B

google/gemma-3n-e4b-it

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices,

32,768 tokens

M Mistral: Mistral Medium 3

mistralai/mistral-medium-3

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliv

131,072 tokens

Google: Gemini 2.5 Pro Preview 05-06

google/gemini-2.5-pro-preview-05-06

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, co

1,048,576 tokens

A Arcee AI: Spotlight

arcee-ai/spotlight

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fi

131,072 tokens

A Arcee AI: Maestro Reasoning

arcee-ai/maestro-reasoning

Maestro Reasoning is Arcee's flagship analysis model: a 32 B‑parameter derivative of Qwe

131,072 tokens

A Arcee AI: Virtuoso Large

arcee-ai/virtuoso-large

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72 B parameters, tuned to tack

131,072 tokens

A Arcee AI: Coder Large

arcee-ai/coder-large

Coder‑Large is a 32 B‑parameter offspring of Qwen 2.5‑Instruct that has been further tra

32,768 tokens

M Meta: Llama Guard 4 12B

meta-llama/llama-guard-4-12b

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for con

163,840 tokens

Qwen: Qwen3 30B A3B

qwen/qwen3-30b-a3b

Qwen3, the latest generation in the Qwen large language model series, features both dens

131,072 tokens

Qwen: Qwen3 8B

qwen/qwen3-8b

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed

131,072 tokens

Qwen: Qwen3 14B

qwen/qwen3-14b

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, design

131,702 tokens

Qwen: Qwen3 32B

qwen/qwen3-32b

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimi

131,072 tokens

Qwen: Qwen3 235B A22B

qwen/qwen3-235b-a22b

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, ac

131,072 tokens

OpenAI: o4 Mini High

openai/o4-mini-high

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effor

200,000 tokens

OpenAI: o3

openai/o3

o3 is a well-rounded and powerful model across domains. It sets a new standard for math,

200,000 tokens

OpenAI: o4 Mini

openai/o4-mini

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-ef

200,000 tokens

OpenAI: GPT-4.1

openai/gpt-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following,

1,047,576 tokens

OpenAI: GPT-4.1 Mini

openai/gpt-4.1-mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at subs

1,047,576 tokens

OpenAI: GPT-4.1 Nano

openai/gpt-4.1-nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the

1,047,576 tokens

M Meta: Llama 4 Maverick

meta-llama/llama-4-maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from M

1,048,576 tokens

M Meta: Llama 4 Scout

meta-llama/llama-4-scout

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed

10,000,000 tokens

DeepSeek: DeepSeek V3 0324

deepseek/deepseek-chat-v3-0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the

163,840 tokens

OpenAI: o1-pro

openai/o1-pro

The o1 series of models are trained with reinforcement learning to think before they ans

200,000 tokens

M Mistral: Mistral Small 3.1 24B

mistralai/mistral-small-3.1-24b-instruct

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuri

128,000 tokens

Google: Gemma 3 4B

google/gemma-3-4b-it

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It

131,072 tokens

Google: Gemma 3 12B

google/gemma-3-12b-it

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It

131,072 tokens

C Cohere: Command A

cohere/command-a

Command A is an open-weights 111B parameter model with a 256k context window focused on

256,000 tokens

OpenAI: GPT-4o-mini Search Preview

openai/gpt-4o-mini-search-preview

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It

128,000 tokens

OpenAI: GPT-4o Search Preview

openai/gpt-4o-search-preview

GPT-4o Search Previewis a specialized model for web search in Chat Completions. It is tr

128,000 tokens

R Reka Flash 3

rekaai/reka-flash-3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billio

65,536 tokens

Google: Gemma 3 27B

google/gemma-3-27b-it

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It

131,072 tokens

T TheDrummer: Skyfall 36B V2

thedrummer/skyfall-36b-v2

Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned f

32,768 tokens

P Perplexity: Sonar Reasoning Pro

perplexity/sonar-reasoning-pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://d

128,000 tokens

P Perplexity: Sonar Pro

perplexity/sonar-pro

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://d

200,000 tokens

P Perplexity: Sonar Deep Research

perplexity/sonar-deep-research

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synth

128,000 tokens

M Mistral: Saba

mistralai/mistral-saba

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East

32,768 tokens

M Llama Guard 3 8B

meta-llama/llama-guard-3-8b

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classifi

131,072 tokens

OpenAI: o3 Mini High

openai/o3-mini-high

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effor

200,000 tokens

A AionLabs: Aion-1.0

aion-labs/aion-1.0

Aion-1.0 is a multi-model system designed for high performance across various tasks, inc

131,072 tokens

A AionLabs: Aion-1.0-Mini

aion-labs/aion-1.0-mini

Aion-1.0-Mini 32B parameter model is a distilled version of the DeepSeek-R1 model, desig

131,072 tokens

A AionLabs: Aion-RP 1.0 (8B)

aion-labs/aion-rp-llama-3.1-8b

Aion-RP-Llama-3.1-8B ranks the highest in the character evaluation portion of the RPBenc

32,768 tokens

Qwen: Qwen2.5 VL 72B Instruct

qwen/qwen2.5-vl-72b-instruct

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and

131,072 tokens

Qwen: Qwen-Plus

qwen/qwen-plus

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balance

1,000,000 tokens

OpenAI: o3 Mini

openai/o3-mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, pa

200,000 tokens

M Mistral: Mistral Small 3

mistralai/mistral-small-24b-instruct-2501

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance

32,768 tokens

DeepSeek: R1 Distill Qwen 32B

deepseek/deepseek-r1-distill-qwen-32b

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B]

128,000 tokens

P Perplexity: Sonar

perplexity/sonar

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and

127,072 tokens

DeepSeek: R1 Distill Llama 70B

deepseek/deepseek-r1-distill-llama-70b

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70

131,072 tokens

DeepSeek: R1

deepseek/deepseek-r1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced a

163,840 tokens

M MiniMax: MiniMax-01

minimax/minimax-01

MiniMax-01 is a combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image

1,000,192 tokens

M Microsoft: Phi 4

microsoft/phi-4

[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning

16,384 tokens

S Sao10K: Llama 3.1 70B Hanami x1

sao10k/l3.1-70b-hanami-x1

This is [Sao10K](/sao10k)'s experiment over [Euryale v2.2](/sao10k/l3.1-euryale-70b).

16,000 tokens

DeepSeek: DeepSeek V3

deepseek/deepseek-chat

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction fo

131,072 tokens

S Sao10K: Llama 3.3 Euryale 70B

sao10k/l3.3-euryale-70b

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com

131,072 tokens

OpenAI: o1

openai/o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time thi

200,000 tokens

C Cohere: Command R7B (12-2024)

cohere/command-r7b-12-2024

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in Dece

128,000 tokens

M Meta: Llama 3.3 70B Instruct

meta-llama/llama-3.3-70b-instruct

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instructi

131,072 tokens

M Meta: Llama 3.3 70B Instruct (free)

meta-llama/llama-3.3-70b-instruct:free

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instructi

131,072 tokens

A Amazon: Nova Lite 1.0

amazon/nova-lite-v1

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focused on fas

300,000 tokens

A Amazon: Nova Micro 1.0

amazon/nova-micro-v1

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in

128,000 tokens

A Amazon: Nova Pro 1.0

amazon/nova-pro-v1

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a com

300,000 tokens

OpenAI: GPT-4o (2024-11-20)

openai/gpt-4o-2024-11-20

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more

128,000 tokens

M Mistral Large 2407

mistralai/mistral-large-2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's

131,072 tokens

Qwen2.5 Coder 32B Instruct

qwen/qwen-2.5-coder-32b-instruct

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly

128,000 tokens

T TheDrummer: UnslopNemo 12B

thedrummer/unslopnemo-12b

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adven

32,768 tokens

Anthropic: Claude 3.5 Haiku

anthropic/claude-3.5-haiku

Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and to

200,000 tokens

A Magnum v4 72B

anthracite-org/magnum-v4-72b

This is a series of models designed to replicate the prose quality of the Claude 3 model

32,768 tokens

Qwen: Qwen2.5 7B Instruct

qwen/qwen-2.5-7b-instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the follow

131,072 tokens

I Inflection: Inflection 3 Pi

inflection/inflection-3-pi

Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, em

8,000 tokens

I Inflection: Inflection 3 Productivity

inflection/inflection-3-productivity

Inflection 3 Productivity is optimized for following instructions. It is better for task

8,000 tokens

T TheDrummer: Rocinante 12B

thedrummer/rocinante-12b

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have r

32,768 tokens

M Meta: Llama 3.2 11B Vision Instruct

meta-llama/llama-3.2-11b-vision-instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handl

131,072 tokens

M Meta: Llama 3.2 1B Instruct

meta-llama/llama-3.2-1b-instruct

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing n

131,072 tokens

M Meta: Llama 3.2 3B Instruct

meta-llama/llama-3.2-3b-instruct

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for a

131,072 tokens

M Meta: Llama 3.2 3B Instruct (free)

meta-llama/llama-3.2-3b-instruct:free

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for a

131,072 tokens

Qwen2.5 72B Instruct

qwen/qwen-2.5-72b-instruct

Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the follo

131,072 tokens

C Cohere: Command R (08-2024)

cohere/command-r-08-2024

command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improve

128,000 tokens

C Cohere: Command R+ (08-2024)

cohere/command-r-plus-08-2024

command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) w

128,000 tokens

S Sao10K: Llama 3.1 Euryale 70B v2.2

sao10k/l3.1-euryale-70b

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-f

131,072 tokens

N Nous: Hermes 3 70B Instruct

nousresearch/hermes-3-llama-3.1-70b

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/n

131,072 tokens

N Nous: Hermes 3 405B Instruct

nousresearch/hermes-3-llama-3.1-405b

Hermes 3 is a generalist language model with many improvements over Hermes 2, including

131,072 tokens

N Nous: Hermes 3 405B Instruct (free)

nousresearch/hermes-3-llama-3.1-405b:free

Hermes 3 is a generalist language model with many improvements over Hermes 2, including

131,072 tokens

S Sao10K: Llama 3 8B Lunaris

sao10k/l3-lunaris-8b

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a stra

8,192 tokens

OpenAI: GPT-4o (2024-08-06)

openai/gpt-4o-2024-08-06

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with

128,000 tokens

M Meta: Llama 3.1 70B Instruct

meta-llama/llama-3.1-70b-instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. Thi

131,072 tokens

M Meta: Llama 3.1 8B Instruct

meta-llama/llama-3.1-8b-instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. Thi

131,072 tokens

M Mistral: Mistral Nemo

mistralai/mistral-nemo

A 12B parameter model with a 128k token context length built by Mistral in collaboration

131,072 tokens

OpenAI: GPT-4o-mini

openai/gpt-4o-mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporti

128,000 tokens

OpenAI: GPT-4o-mini (2024-07-18)

openai/gpt-4o-mini-2024-07-18

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporti

128,000 tokens

Google: Gemma 2 27B

google/gemma-2-27b-it

Gemma 2 27B by Google is an open model built from the same research and technology used

8,192 tokens

OpenAI: GPT-4o

openai/gpt-4o

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inpu

128,000 tokens

OpenAI: GPT-4o (2024-05-13)

openai/gpt-4o-2024-05-13

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inpu

128,000 tokens

M Meta: Llama 3 70B Instruct

meta-llama/llama-3-70b-instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This

8,192 tokens

M Meta: Llama 3 8B Instruct

meta-llama/llama-3-8b-instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This

8,192 tokens

M Mistral: Mixtral 8x22B Instruct

mistralai/mixtral-8x22b-instruct

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixt

65,536 tokens

M WizardLM-2 8x22B

microsoft/wizardlm-2-8x22b

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly co

65,536 tokens

OpenAI: GPT-4 Turbo

openai/gpt-4-turbo

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON

128,000 tokens

Anthropic: Claude 3 Haiku

anthropic/claude-3-haiku

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsive

200,000 tokens

M Mistral Large

mistralai/mistral-large

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It'

128,000 tokens

OpenAI: GPT-3.5 Turbo (older v0613)

openai/gpt-3.5-turbo-0613

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language

4,095 tokens

OpenAI: GPT-4 Turbo Preview

openai/gpt-4-turbo-preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible out

128,000 tokens

OpenAI: GPT-4 Turbo (older v1106)

openai/gpt-4-1106-preview

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON

128,000 tokens

OpenAI: GPT-3.5 Turbo Instruct

openai/gpt-3.5-turbo-instruct

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting ch

4,095 tokens

OpenAI: GPT-3.5 Turbo 16k

openai/gpt-3.5-turbo-16k

This model offers four times the context length of gpt-3.5-turbo, allowing it to support

16,385 tokens

M Mancer: Weaver (alpha)

mancer/weaver

An attempt to recreate Claude-style verbosity, but don't expect the same level of cohere

8,000 tokens

U ReMM SLERP 13B

undi95/remm-slerp-l2-13b

A recreation trial of the original MythoMax-L2-B13 but with updated models. #merge

6,144 tokens

G MythoMax 13B

gryphe/mythomax-l2-13b

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich desc

4,096 tokens

OpenAI: GPT-3.5 Turbo

openai/gpt-3.5-turbo

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language

16,385 tokens

OpenAI: GPT-4

openai/gpt-4

OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of sol

8,191 tokens

A AlfredPros: CodeLLaMa 7B Instruct Solidity

alfredpros/codellama-7b-instruct-solidity

4,096 tokens

A Trinity Large Thinking:free

arcee-ai/trinity-large-thinking:free

4,096 tokens

B Cobuddy:free

baidu/cobuddy:free

4,096 tokens

B Baidu: ERNIE 4.5 21B A3B

baidu/ernie-4.5-21b-a3b

131,072 tokens

B Baidu: ERNIE 4.5 21B A3B Thinking

baidu/ernie-4.5-21b-a3b-thinking

131,072 tokens

B Baidu: ERNIE 4.5 300B A47B

baidu/ernie-4.5-300b-a47b

131,072 tokens

B Qianfan Ocr Fast:free

baidu/qianfan-ocr-fast:free

4,096 tokens

Chatgpt 4o Latest

chatgpt-4o-latest

4,096 tokens

Claude 3 5 Haiku

claude-3-5-haiku

4,096 tokens

Claude 3.5 Haiku 20241022

claude-3-5-haiku-20241022

4,096 tokens

Claude 3 5 Sonnet

claude-3-5-sonnet

4,096 tokens

Claude 3.5 Sonnet 20240620

claude-3-5-sonnet-20240620

4,096 tokens

Claude 3.7 Sonnet 20250219

claude-3-7-sonnet-20250219

4,096 tokens

Claude 3.7 Sonnet 20250219 Thinking

claude-3-7-sonnet-20250219-thinking

4,096 tokens

Claude 3 Sonnet 20240229

claude-3-sonnet-20240229

4,096 tokens

Claude 3.7 Sonnet Thinking

claude-3.7-sonnet-thinking

4,096 tokens

Claude 4 Opus Thinking

claude-4-opus-thinking

4,096 tokens

Claude 4 Sonnet Thinking

claude-4-sonnet-thinking

4,096 tokens

Claude Haiku 4.5 20251001

claude-haiku-4-5-20251001

4,096 tokens

Claude Opus 4.1 20250805

claude-opus-4-1-20250805

4,096 tokens

Claude Opus 4.1 20250805 Thinking

claude-opus-4-1-20250805-thinking

4,096 tokens

Claude Opus 4.20250514 Thinking

claude-opus-4-20250514-thinking

4,096 tokens

Claude Opus 4.5

claude-opus-4-5

4,096 tokens

Claude Opus 4 5 20251101

claude-opus-4-5-20251101

4,096 tokens

Claude Opus 4.5 20251101 Thinking

claude-opus-4-5-20251101-thinking

4,096 tokens

Claude Opus 4.6

claude-opus-4-6

4,096 tokens

Claude Opus 4.7

claude-opus-4-7

4,096 tokens

Claude Opus 4.7 Max

claude-opus-4-7-max

4,096 tokens

Claude Sonnet 4.20250514 Thinking

claude-sonnet-4-20250514-thinking

4,096 tokens

Claude Sonnet 4.5 20250929

claude-sonnet-4-5-20250929

4,096 tokens

Claude Sonnet 4.5 20250929 Thinking

claude-sonnet-4-5-20250929-thinking

4,096 tokens

Claude Sonnet 4.6

claude-sonnet-4-6

4,096 tokens

D DeepSeek V3

deepseek-ai/DeepSeek-V3

4,096 tokens

D DeepSeek V3 0324

deepseek-ai/DeepSeek-V3-0324

4,096 tokens

D DeepSeek V3.1

deepseek-ai/DeepSeek-V3.1

4,096 tokens

D DeepSeek: DeepSeek V3.2 Exp

deepseek-ai/DeepSeek-V3.2-Exp

163,840 tokens

DeepSeek R1 Searching

deepseek-r1-searching

4,096 tokens

DeepSeek V3 1.250821

deepseek-v3-1-250821

4,096 tokens

DeepSeek V3 1 Terminus

deepseek-v3-1-terminus

4,096 tokens

DeepSeek V3.1 0821

deepseek-v3.1-0821

4,096 tokens

DeepSeek V3.1 Think

deepseek-v3.1-think

4,096 tokens

DeepSeek V3.2 Think

deepseek-v3.2-think

4,096 tokens

DeepSeek: DeepSeek V4 Flash (free)

deepseek/deepseek-v4-flash:free

1,048,576 tokens

Dolphin3.0 R1 Mistral 24b

Dolphin3.0-R1-Mistral-24B

4,096 tokens

D Doubao Seedance 1.0 Pro 250528

doubao-seedance-1-0-pro-250528

4,096 tokens

D Doubao Seedance 1.0 Pro Fast 251015

doubao-seedance-1-0-pro-fast-251015

4,096 tokens

D Doubao Seedance 1 5 Pro

doubao-seedance-1-5-pro

4,096 tokens

D Doubao Seedance 1.5 Pro 251215

doubao-seedance-1-5-pro-251215

4,096 tokens

D Doubao Seedream 3.0 T2i 250415

doubao-seedream-3-0-t2i-250415

4,096 tokens

D Doubao Seedream 4.0

doubao-seedream-4-0

4,096 tokens

D Doubao Seedream 4.0 250828

doubao-seedream-4-0-250828

4,096 tokens

D Doubao Seedream 4.0 4k

doubao-seedream-4-0-4k

4,096 tokens

D Doubao Seedream 4.5

doubao-seedream-4-5

4,096 tokens

D Doubao Seedream 4.5 251128

doubao-seedream-4-5-251128

4,096 tokens

D Doubao Seedream 4.5 4k

doubao-seedream-4-5-4k

4,096 tokens

D Doubao Seedream 5.0

doubao-seedream-5-0

4,096 tokens

D Doubao Seedream 5.0 260128

doubao-seedream-5-0-260128

4,096 tokens

D Doubao Seedream 5.0 4k

doubao-seedream-5-0-4k

4,096 tokens

Gemini 1.5 Flash

gemini-1.5-flash

4,096 tokens

Gemini 1.5 Flash 002

gemini-1.5-flash-002

4,096 tokens

Gemini 1.5 Flash Exp 0827

gemini-1.5-flash-exp-0827

4,096 tokens

Gemini 2.0 Flash

gemini-2.0-flash

4,096 tokens

Gemini 2.0 Flash Exp

gemini-2.0-flash-exp

4,096 tokens

Gemini 2.0 Flash Lite

gemini-2.0-flash-lite

4,096 tokens

Gemini 2.0 Flash Lite Preview

gemini-2.0-flash-lite-preview

4,096 tokens

Gemini 2.0 Flash Thinking Exp 1219

gemini-2.0-flash-thinking-exp-1219

4,096 tokens

Google: Nano Banana (Gemini 2.5 Flash Image)

gemini-2.5-flash-image

32,768 tokens

Gemini 2.5 Flash Preview 04.17

gemini-2.5-flash-preview-04-17

4,096 tokens

Gemini 2.5 Flash Preview 09.2025

gemini-2.5-flash-preview-09-2025

4,096 tokens

Gemini 2.5 Pro Ci

gemini-2.5-pro-ci

4,096 tokens

Gemini 2.5 Pro Preview 03.25

gemini-2.5-pro-preview-03-25

4,096 tokens

Gemini 2.5 Pro Preview 06.05

gemini-2.5-pro-preview-06-05

4,096 tokens

Gemini 3.1 Pro High

gemini-3-1-pro-high

4,096 tokens

Gemini 3 Fast

gemini-3-fast

4,096 tokens

Gemini 3 Fast All

gemini-3-fast-all

4,096 tokens

Gemini 3 Fast Deepsearch

gemini-3-fast-deepsearch

4,096 tokens

Gemini 3 Flash All

gemini-3-flash-all

4,096 tokens

Gemini 3 Pro

gemini-3-pro

4,096 tokens

Gemini 3 Pro Canvas

gemini-3-pro-canvas

4,096 tokens

Gemini 3 Pro Ci

gemini-3-pro-ci

4,096 tokens

Gemini 3 Pro Deepsearch

gemini-3-pro-deepsearch

4,096 tokens

Gemini 3 Pro High Ci

gemini-3-pro-high-ci

4,096 tokens

Gemini 3 Pro Latest

gemini-3-pro-latest

4,096 tokens

Gemini 3 Thinking

gemini-3-thinking

4,096 tokens

Gemini 3.1 Fast

gemini-3.1-fast

4,096 tokens

Gemini 3.1 Pro

gemini-3.1-pro

4,096 tokens

Gemini 3.1 Pro Ci

gemini-3.1-pro-ci

4,096 tokens

Gemini 3.1 Thinking

gemini-3.1-thinking

4,096 tokens

Gemini 3.5 Flash Ci

gemini-3.5-flash-ci

4,096 tokens

Gemini 3.5 Thinking

gemini-3.5-thinking

4,096 tokens

G Glm 4

glm-4

4,096 tokens

G Glm 4 Airx

glm-4-airx

4,096 tokens

G Glm 4 Flash

glm-4-flash

4,096 tokens

G Glm 4 Long

glm-4-long

4,096 tokens

G Glm 4.5 X

glm-4.5-x

4,096 tokens

Google: Gemini 2.0 Flash

google/gemini-2.0-flash-001

1,048,576 tokens

Google: Gemini 2.0 Flash Lite

google/gemini-2.0-flash-lite-001

1,048,576 tokens

Gemini 3 Pro Image

google/gemini-3-pro-image

4,096 tokens

GPT 3.5 Turbo 0125

gpt-3.5-turbo-0125

4,096 tokens

GPT 3.5 Turbo 0301

gpt-3.5-turbo-0301

4,096 tokens

GPT 3.5 Turbo 1106

gpt-3.5-turbo-1106

4,096 tokens

GPT 3.5 Turbo 16k 0613

gpt-3.5-turbo-16k-0613

4,096 tokens

GPT 4.0125 Preview

gpt-4-0125-preview

4,096 tokens

GPT 4.0613

gpt-4-0613

4,096 tokens

GPT 4 Vision Preview

gpt-4-vision-preview

4,096 tokens

GPT 5 Codex Mini

gpt-5-codex-mini

4,096 tokens

Grok 3

grok-3

4,096 tokens

Grok 3 All

grok-3-all

4,096 tokens

Grok 3 Ci

grok-3-ci

4,096 tokens

Grok 3 Deepersearch

grok-3-deepersearch

4,096 tokens

Grok 3 Deepsearch

grok-3-deepsearch

4,096 tokens

Grok 3 Image

grok-3-image

4,096 tokens

Grok 3 Reasoning

grok-3-reasoning

4,096 tokens

Grok 3 Search

grok-3-search

4,096 tokens

Grok 4

grok-4

4,096 tokens

Grok 4.0709

grok-4-0709

4,096 tokens

Grok 4.1 Thinking 1129

grok-4-1-thinking-1129

4,096 tokens

Grok 4 Auto

grok-4-auto

4,096 tokens

Grok 4 Ci

grok-4-ci

4,096 tokens

Grok 4 Fast Ci

grok-4-fast-ci

4,096 tokens

Grok 4 Image

grok-4-image

4,096 tokens

Grok 4 Mini Thinking Tahoe

grok-4-mini-thinking-tahoe

4,096 tokens

Grok 4.1

grok-4.1

4,096 tokens

Grok 4.1 Ci

grok-4.1-ci

4,096 tokens

Grok 4.1 Fast

grok-4.1-fast

4,096 tokens

Grok 4.1 Fast Ci

grok-4.1-fast-ci

4,096 tokens

Grok 4.1 Image

grok-4.1-image

4,096 tokens

Grok 4.1 Thinking

grok-4.1-thinking

4,096 tokens

Grok 4.2

grok-4.2

4,096 tokens

Grok 4.2 Ci

grok-4.2-ci

4,096 tokens

Grok 4.2 Fast

grok-4.2-fast

4,096 tokens

Grok 4.2 Fast Ci

grok-4.2-fast-ci

4,096 tokens

Grok 4.2 Image

grok-4.2-image

4,096 tokens

Grok 4.3 Ci

grok-4.3-ci

4,096 tokens

Grok 420 Agents

grok-420-agents

4,096 tokens

Grok 420 Fast

grok-420-fast

4,096 tokens

Grok 420 Thinking

grok-420-thinking

4,096 tokens

Grok Code Fast 1

grok-code-fast-1

4,096 tokens

I Ring 2.6 1t:free

inclusionai/ring-2.6-1t:free

4,096 tokens

J Japanese Stable Diffusion Xl

japanese-stable-diffusion-xl

4,096 tokens

K Kimi K2 0711 Preview

kimi-k2-0711-preview

4,096 tokens

K Kimi K2 250905

kimi-k2-250905

4,096 tokens

K Kimi K2 250905 Ci

kimi-k2-250905-ci

4,096 tokens

K Kimi K2 Instruct 0905

kimi-k2-instruct-0905

4,096 tokens

Llama 2 13b

llama-2-13b

4,096 tokens

Llama 2 70b

llama-2-70b

4,096 tokens

Llama 3 Sonar Large 32k Chat

llama-3-sonar-large-32k-chat

4,096 tokens

Llama 3 Sonar Small 32k Chat

llama-3-sonar-small-32k-chat

4,096 tokens

Llama 3.1 405b

Llama-3.1-405B

4,096 tokens

Llama 3.1 405b Instruct

llama-3.1-405b-instruct

4,096 tokens

Meta Llama 3.3 70b Instruct

Meta-Llama-3-3-70B-Instruct

4,096 tokens

M Llama 3.1 405b Instruct:free

meta-llama/llama-3.1-405b-instruct:free

4,096 tokens

M Llama 3.2 90b Vision Instruct

meta-llama/llama-3.2-90b-vision-instruct

4,096 tokens

M Llama 3.2 90b Vision Instruct:free

meta-llama/llama-3.2-90b-vision-instruct:free

4,096 tokens

M Meta Llama 3.1 405b Instruct

meta-llama/Meta-Llama-3.1-405B-Instruct

4,096 tokens

M Phi 3 Medium 128k Instruct

microsoft/phi-3-medium-128k-instruct

4,096 tokens

M Phi 3 Medium 128k Instruct:free

microsoft/phi-3-medium-128k-instruct:free

4,096 tokens

M MiniMax: MiniMax M2.5 (free)

minimax/minimax-m2.5:free

204,800 tokens

Mistral Small 2407

mistral-small-2407

4,096 tokens

Mistral Small Latest

mistral-small-latest

4,096 tokens

M Mistral: Devstral Medium

mistralai/devstral-medium

131,072 tokens

M Mistral: Devstral Small 1.1

mistralai/devstral-small

131,072 tokens

M Mistral: Mistral 7B Instruct v0.1

mistralai/mistral-7b-instruct-v0.1

4,096 tokens

M Mistral Large 2411

mistralai/mistral-large-2411

131,072 tokens

M Mistral: Pixtral Large 2411

mistralai/pixtral-large-2411

131,072 tokens

Mixtral 8x22b Instruct V0.1

mixtral-8x22b-instruct-v0.1

4,096 tokens

M Kimi K2.5

moonshot/kimi-k2.5

4,096 tokens

N NousResearch: Hermes 2 Pro - Llama-3 8B

nousresearch/hermes-2-pro-llama-3-8b

8,192 tokens

o1 Mini

o1-mini

4,096 tokens

o3 Mini 2025.01 31 High

o3-mini-2025-01-31-high

4,096 tokens

o3 Mini All

o3-mini-all

4,096 tokens

Q Qvq 72b Preview 0310

qvq-72b-preview-0310

4,096 tokens

Qwen 3.5 Plus

qwen-3.5-plus

4,096 tokens

Qwen 3.5 Plus Search

qwen-3.5-plus-search

4,096 tokens

Qwen 3.5 Plus Think

qwen-3.5-plus-think

4,096 tokens

Qwen Image Max

qwen-image-max

4,096 tokens

Qwen Image Plus

qwen-image-plus

4,096 tokens

Qwen Max

qwen-max

4,096 tokens

Qwen Max Search

qwen-max-search

4,096 tokens

Qwen Plus 2025.09 11 Think

qwen-plus-2025-09-11-think

4,096 tokens

Qwen Plus Search

qwen-plus-search

4,096 tokens

Qwen Qwq 32b

Qwen-QwQ-32B

4,096 tokens

Qwen Turbo

qwen-turbo

4,096 tokens

Qwen Vl Max

qwen-vl-max

4,096 tokens

Qwen1.5 110b Chat

Qwen/Qwen1.5-110B-Chat

4,096 tokens

Qwen2.5 72b Instruct

Qwen/Qwen2.5-72B-Instruct

4,096 tokens

Qwen2.5 Coder 32b Instruct

Qwen/Qwen2.5-Coder-32B-Instruct

4,096 tokens

Qwen3 Max 2026

qwen/qwen3-max-2026

4,096 tokens

Qwq 32b

qwen/qwq-32b

4,096 tokens

Qwq 72b Preview

qwen/qwq-72b-preview

4,096 tokens

Qwen2.5 32b Instruct

qwen2.5-32b-instruct

4,096 tokens

Qwen3 235b A22b Search

qwen3-235b-a22b-search

4,096 tokens

Qwen3 235b A22b Thinking 2507 Search

qwen3-235b-a22b-thinking-2507-search

4,096 tokens

Qwen3 30b A3b Instruct 2507 Search

qwen3-30b-a3b-instruct-2507-search

4,096 tokens

Qwen3 30b A3b Think

qwen3-30b-a3b-think

4,096 tokens

Qwen3 32b Think

qwen3-32b-think

4,096 tokens

Qwen3 Coder 480b A35b Instruct

qwen3-coder-480b-a35b-instruct

4,096 tokens

Qwen3 Coder 480b A35b Instruct Search

qwen3-coder-480b-a35b-instruct-search

4,096 tokens

Qwen3 Max 2025.10 30 Think

qwen3-max-2025-10-30-think

4,096 tokens

Qwen3 Max Preview

qwen3-max-preview

4,096 tokens

Qwen3 Max Think

qwen3-max-think

4,096 tokens

Qwen3 Vl Plus

qwen3-vl-plus

4,096 tokens

Qwen3 Vl Plus Think

qwen3-vl-plus-think

4,096 tokens

Qwen3.6 Plus Preview

qwen3.6-plus-preview

4,096 tokens

Q Qwq 32b Search

qwq-32b-search

4,096 tokens

Q Qwq Plus Latest

qwq-plus-latest

4,096 tokens

Q Qwq Plus Latest Thinking

qwq-plus-latest-thinking

4,096 tokens

S Sao10k: Llama 3 Euryale 70B v2.1

sao10k/l3-euryale-70b

8,192 tokens

S Sora Image

sora_image

4,096 tokens

S Suno Lyrics

suno_lyrics

4,096 tokens

S Suno Music

suno_music

4,096 tokens

T Test

test

4,096 tokens

W Wan2.6 5s

wan2.6-5s

4,096 tokens

W Wan2.6 Video 5s

wan2.6-video-5s

4,096 tokens

X Grok 4 Fast

x-ai/grok-4-fast

4,096 tokens

X Xiaomi: MiMo-V2-Omni

xiaomi/mimo-v2-omni

262,144 tokens

X Xiaomi: MiMo-V2-Pro

xiaomi/mimo-v2-pro

1,048,576 tokens

Z Z Image Turbo

z-image-turbo

4,096 tokens

Z Z.ai: GLM 4.5

zai-org/glm-4.5

131,072 tokens

Z Z.ai: GLM 4.5 Air

zai-org/glm-4.5-air

131,072 tokens

A Alibaba: Wan 2.6

alibaba/wan-2.6

4,096 tokens

A Alibaba: Wan 2.7

alibaba/wan-2.7

4,096 tokens

B BAAI: bge-base-en-v1.5

baai/bge-base-en-v1.5

8,192 tokens

B BAAI: bge-large-en-v1.5

baai/bge-large-en-v1.5

8,192 tokens

B BAAI: bge-m3

baai/bge-m3

8,192 tokens

B Black Forest Labs: FLUX.2 Flex

black-forest-labs/flux.2-flex

67,344 tokens

B Black Forest Labs: FLUX.2 Klein 4B

black-forest-labs/flux.2-klein-4b

40,960 tokens

B Black Forest Labs: FLUX.2 Max

black-forest-labs/flux.2-max

46,864 tokens

B Black Forest Labs: FLUX.2 Pro

black-forest-labs/flux.2-pro

46,864 tokens

B ByteDance Seed: Seedream 4.5

bytedance-seed/seedream-4.5

4,096 tokens

B ByteDance: Seedance 1.5 Pro

bytedance/seedance-1-5-pro

4,096 tokens

B ByteDance: Seedance 2.0

bytedance/seedance-2.0

4,096 tokens

B ByteDance: Seedance 2.0 Fast

bytedance/seedance-2.0-fast

4,096 tokens

C Canopy Labs: Orpheus 3B

canopylabs/orpheus-3b-0.1-ft

4,096 tokens

C Cohere: Rerank 4 Fast

cohere/rerank-4-fast

32,768 tokens

C Cohere: Rerank 4 Pro

cohere/rerank-4-pro

32,768 tokens

C Cohere: Rerank v3.5

cohere/rerank-v3.5

4,096 tokens

Google: Chirp 3

google/chirp-3

4,096 tokens

Google: Gemini 3.1 Flash TTS Preview

google/gemini-3.1-flash-tts-preview

8,192 tokens

Google: Gemini Embedding 001

google/gemini-embedding-001

20,000 tokens

Google: Gemini Embedding 2

google/gemini-embedding-2

8,192 tokens

Google: Gemini Embedding 2 Preview

google/gemini-embedding-2-preview

8,192 tokens

Google: Veo 3.1

google/veo-3.1

4,096 tokens

Google: Veo 3.1 Fast

google/veo-3.1-fast

4,096 tokens

Google: Veo 3.1 Lite

google/veo-3.1-lite

4,096 tokens

H hexgrad: Kokoro 82M

hexgrad/kokoro-82m

4,096 tokens

I Intfloat: E5-Base-v2

intfloat/e5-base-v2

8,192 tokens

I Intfloat: E5-Large-v2

intfloat/e5-large-v2

8,192 tokens

I Intfloat: Multilingual-E5-Large

intfloat/multilingual-e5-large

8,192 tokens

K Kling: Video v3.0 Pro

kwaivgi/kling-v3.0-pro

4,096 tokens

K Kling: Video v3.0 Standard

kwaivgi/kling-v3.0-std

4,096 tokens

K Kling: Video O1

kwaivgi/kling-video-o1

4,096 tokens

M MiniMax: Hailuo 2.3

minimax/hailuo-2.3

4,096 tokens

M Mistral: Codestral Embed 2505

mistralai/codestral-embed-2505

8,192 tokens

M Mistral: Mistral Embed 2312

mistralai/mistral-embed-2312

8,192 tokens

M Mistral: Voxtral Mini Transcribe

mistralai/voxtral-mini-transcribe

4,096 tokens

M Mistral: Voxtral Mini TTS

mistralai/voxtral-mini-tts-2603

4,096 tokens

N NVIDIA: Llama Nemotron Embed VL 1B V2 (free)

nvidia/llama-nemotron-embed-vl-1b-v2

131,072 tokens

N NVIDIA: Parakeet TDT 0.6B v3

nvidia/parakeet-tdt-0.6b-v3

4,096 tokens

OpenAI: GPT-4o Mini Transcribe

openai/gpt-4o-mini-transcribe

4,096 tokens

OpenAI: GPT-4o Mini TTS

openai/gpt-4o-mini-tts-2025-12-15

4,096 tokens

OpenAI: GPT-4o Transcribe

openai/gpt-4o-transcribe

4,096 tokens

OpenAI: Sora 2 Pro

openai/sora-2-pro

4,096 tokens

OpenAI: Text Embedding 3 Large

openai/text-embedding-3-large

8,192 tokens

OpenAI: Text Embedding 3 Small

openai/text-embedding-3-small

8,192 tokens

OpenAI: Text Embedding Ada 002

openai/text-embedding-ada-002

8,192 tokens

OpenAI: Whisper 1

openai/whisper-1

4,096 tokens

OpenAI: Whisper Large V3 Turbo

openai/whisper-large-v3-turbo

4,096 tokens

P Perplexity: Embed V1 0.6B

perplexity/pplx-embed-v1-0.6b

32,000 tokens

P Perplexity: Embed V1 4B

perplexity/pplx-embed-v1-4b

32,000 tokens

Qwen: Qwen3 ASR Flash

qwen/qwen3-asr-flash-2026-02-10

4,096 tokens

Qwen: Qwen3 Embedding 4B

qwen/qwen3-embedding-4b

32,768 tokens

Qwen: Qwen3 Embedding 8B

qwen/qwen3-embedding-8b

32,000 tokens

R Recraft: Recraft V3

recraft/recraft-v3

65,536 tokens

R Recraft: Recraft V4

recraft/recraft-v4

65,536 tokens

R Recraft: Recraft V4 Pro

recraft/recraft-v4-pro

65,536 tokens

R Recraft: Recraft V4 Pro Vector

recraft/recraft-v4-pro-vector

65,536 tokens

R Recraft: Recraft V4 Vector

recraft/recraft-v4-vector

65,536 tokens

R Recraft: Recraft V4.1

recraft/recraft-v4.1

65,536 tokens

R Recraft: Recraft V4.1 Pro

recraft/recraft-v4.1-pro

65,536 tokens

R Recraft: Recraft V4.1 Pro Vector

recraft/recraft-v4.1-pro-vector

65,536 tokens

R Recraft: Recraft V4.1 Utility

recraft/recraft-v4.1-utility

65,536 tokens

R Recraft: Recraft V4.1 Utility Pro

recraft/recraft-v4.1-utility-pro

65,536 tokens

R Recraft: Recraft V4.1 Vector

recraft/recraft-v4.1-vector

65,536 tokens

S Sentence Transformers: all-MiniLM-L12-v2

sentence-transformers/all-minilm-l12-v2

8,192 tokens

S Sentence Transformers: all-MiniLM-L6-v2

sentence-transformers/all-minilm-l6-v2

8,192 tokens

S Sentence Transformers: all-mpnet-base-v2

sentence-transformers/all-mpnet-base-v2

8,192 tokens

S Sentence Transformers: multi-qa-mpnet-base-dot-v1

sentence-transformers/multi-qa-mpnet-base-dot-v1

8,192 tokens

S Sentence Transformers: paraphrase-MiniLM-L6-v2

sentence-transformers/paraphrase-minilm-l6-v2

8,192 tokens

S Sesame: CSM 1B

sesame/csm-1b

4,096 tokens

S Sourceful: Riverflow V2 Fast

sourceful/riverflow-v2-fast

8,192 tokens

S Sourceful: Riverflow V2 Fast Preview

sourceful/riverflow-v2-fast-preview

8,192 tokens

S Sourceful: Riverflow V2 Max Preview

sourceful/riverflow-v2-max-preview

8,192 tokens

S Sourceful: Riverflow V2 Pro

sourceful/riverflow-v2-pro

8,192 tokens

S Sourceful: Riverflow V2 Standard Preview

sourceful/riverflow-v2-standard-preview

8,192 tokens

T Thenlper: GTE-Base

thenlper/gte-base

8,192 tokens

T Thenlper: GTE-Large

thenlper/gte-large

8,192 tokens

X xAI: Grok Imagine Image Quality

x-ai/grok-imagine-image-quality

65,536 tokens

X xAI: Grok Imagine Video

x-ai/grok-imagine-video

4,096 tokens

X xAI: Grok Voice TTS 1.0

x-ai/grok-voice-tts-1.0

15,000 tokens

Z Zyphra: Zonos v0.1 Hybrid

zyphra/zonos-v0.1-hybrid

4,096 tokens

Z Zyphra: Zonos v0.1 Transformer

zyphra/zonos-v0.1-transformer

4,096 tokens

Gpt-realtime

openai/gpt-realtime

4,096 tokens

Gpt-realtime-mini

openai/gpt-realtime-mini

4,096 tokens

O MiniCPM O 4 5

openbmb/MiniCPM-o-4_5

4,096 tokens

K MoonshotAI: Kimi K2.5

kimi-k2.5

262,144 tokens

OpenAI: Whisper Large V3

openai/whisper-large-v3

4,096 tokens

Gpt-4o-mini-realtime-preview

openai/gpt-4o-mini-realtime-preview

4,096 tokens

Gpt-4o-realtime-preview

openai/gpt-4o-realtime-preview

4,096 tokens