AgMoDB
AgMoDB by @mistakeknot

Model picks

Current defaults by use case.

Product

Production assistants and internal tools.

Default
Claude Sonnet 4.6 (Non-reasoning, High Effort)

Anthropic

AgMoBench 86.3 | $6.56/M | 58 tok/s

Reliable product default.

Value
GPT-5.4 mini (xhigh)

OpenAI

AgMoBench 54.1 | $1.69/M | 153 tok/s

Lower-cost product lane.

Ceiling
GPT-5.5 (xhigh)

OpenAI

AgMoBench 65.8 | $11.25/M | 80 tok/s

Higher ceiling, higher spend.


Human frontier

Rank | Model | Provider | Human Frontier | Price | Speed
1 | Anthropic: Claude Opus 4.7 | Anthropic | 95.5 | $10.00/M | n/a
2 | Claude Opus 4.6 (Non-reasoning, High Effort) | Anthropic | 95.2 | $10.94/M | 58 tok/s
3 | GLM-5.1 (Reasoning) | Z AI | 93.7 | $2.15/M | 58 tok/s
4 | Claude Sonnet 4.6 (Non-reasoning, High Effort) | Anthropic | 93.4 | $6.56/M | 58 tok/s
5 | Claude Opus 4.5 (Non-reasoning) | Anthropic | 92.2 | $10.94/M | 64 tok/s
6 | Qwen3.6 Max Preview | Alibaba | 91.7 | $2.92/M | 37 tok/s

Worth discovering

Frontier value

Kimi K2.6

Kimi

Strong frontier/value ratio.

Cheap reasoning

DeepSeek V4 Flash (Reasoning, Max Effort)

DeepSeek

Aggressive reasoning price/performance.

Fast batch work

Gemini 3.1 Flash-Lite

Google

Fast, cheap high-throughput lane.

Open pressure

Qwen3.6 35B A3B (Reasoning)

Alibaba

Open-ish frontier compression.