Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

MMLU Pro

Exact-match accuracy on MMLU-Pro, an enhanced version of the MMLU dataset with more challenging, reasoning-focused questions and more answer choices per question (up to ten, versus four in MMLU).
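As a rough illustration of what an exact-match accuracy score measures, the sketch below counts a prediction as correct only when its answer letter equals the reference letter. The function name `exact_match_accuracy`, the whitespace and case normalization, and the sample answers are all assumptions made for this example; the page does not describe how the listed scores were actually computed.

```python
def exact_match_accuracy(predictions, references):
    """Return the fraction of predictions that match their reference exactly.

    Assumption: answers are single option letters, compared after
    stripping whitespace and ignoring case.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty set of questions")
    correct = sum(
        pred.strip().upper() == ref.strip().upper()
        for pred, ref in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical answers to five multiple-choice questions (options A-J).
predictions = ["A", "c", "J", "B", "D"]
references = ["A", "C", "J", "E", "D"]
score = exact_match_accuracy(predictions, references)
print(f"{score:.1%}")  # prints 80.0%
```

Four of the five hypothetical answers match, so the score is 80.0%. A percentage in the list below would be read the same way: the share of benchmark questions a model answered with exactly the right option.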

Model Performance

Rank  Score
#1    86.4%
#3    84.6%
#4    84.0%
#5    84.0%
#6    83.7%
#7    83.7%
#11   81.7%
#13   80.6%
#14   79.3%
#15   78.6%
#18   78.3%
#19   78.2%
#20   78.2%
#21   78.0%
#22   77.7%
#23   77.7%
#24   77.1%
#26   76.6%
#30   74.9%
#31   72.6%
#32   71.6%
#33   70.9%
#34   67.9%
#35   66.6%
#36   66.3%
#37   61.3%
#38   60.1%
#39   58.6%