Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

Chatbot Arena (Win Rate)

Measures a model's win rate in head-to-head comparisons against other models in the Chatbot Arena, where human judges evaluate the quality of the responses.
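
As a rough illustration of how a win rate could be derived from head-to-head results, the sketch below tallies pairwise battles per model. The page does not say how its figures are computed, so everything here is an assumption: the record format, the model names, the outcomes, and the choice to count a tie as half a win for each side are all hypothetical.

```python
from collections import defaultdict

# Hypothetical battle records: (model_a, model_b, winner), where winner is
# "a", "b", or "tie". Names and outcomes are made up for illustration only.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-z", "b"),
    ("model-y", "model-z", "tie"),
    ("model-x", "model-y", "a"),
]


def win_rates(records):
    """Return each model's win rate as a percentage of the battles it played.

    Assumption (not from the source): a tie counts as half a win per side.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in records:
        games[model_a] += 1
        games[model_b] += 1
        if winner == "a":
            wins[model_a] += 1.0
        elif winner == "b":
            wins[model_b] += 1.0
        else:
            # Tie: split the credit evenly between both models.
            wins[model_a] += 0.5
            wins[model_b] += 0.5
    return {model: 100.0 * wins[model] / games[model] for model in games}


rates = win_rates(battles)
# Print models from highest to lowest win rate.
for model, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {rate:.1f}%")
```

A plain ratio like this ignores opponent strength; a rating-based scheme would account for it, at the cost of a less direct interpretation.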

Model Performance

Rank    Win Rate
#4      90.4%
#5      90.4%
#6      90.3%
#7      90.3%
#9      89.1%
#10     89.0%
#11     88.0%
#12     88.0%
#13     87.3%
#17     83.0%
#18     83.0%
#19     82.4%
#20     82.4%
#23     81.9%
#24     81.4%
#25     81.3%
#26     80.9%
#27     80.9%
#28     80.9%
#32     77.1%
#33     76.9%
#35     76.4%
#37     75.3%
#38     74.4%
#39     74.3%
#40     74.0%
#43     71.7%
#44     71.6%
#46     66.4%
#48     60.4%
#49     49.3%
#50     41.9%