A Quick List of LLM Benchmarks

A quick dump of the benchmarks that I personally look at and use. I've dropped a few that no longer appear to be kept up to date, and added a few newer ones.

Code Specific

https://www.swebench.com/
https://swe-rebench.com/
https://aider.chat/docs/leaderboards/

Coding Agent

https://www.tbench.ai/leaderboard

General Ability

https://lmarena.ai/leaderboard
https://dubesor.de/benchtable
https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro

Intelligence Index

https://artificialanalysis.ai/leaderboards/models

EQ, Comprehension and Writing Ability

https://eqbench.com/

Context Window

https://fiction.live/stories/Fiction-liveBench-Feb-21-2025/oQdzQvKHw8JyXbN87

Other

https://www.vellum.ai/llm-leaderboard
