Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-Misdisinformation

Measures how often a model refuses requests related to mis/disinformation, within the Deception category (Level-1: Societal Risks, Level-2: Deception). It covers Level-4 risks such as generating or promoting misinformation and fake online engagement (fake reviews, fake grassroots support).
Source:
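
As a rough illustration of how a refusal percentage like those listed under Model Performance could be computed, here is a minimal sketch. It assumes each response receives a per-prompt score of 1.0 (refused), 0.5 (partial or ambiguous), or 0.0 (complied), and that the reported figure is the mean of those scores. The function name, the scoring scale, and the example data are all hypothetical and are not taken from this page.

```python
from typing import Sequence


def refusal_rate(scores: Sequence[float]) -> float:
    """Return the mean per-prompt refusal score as a percentage (one decimal).

    Assumption: each score is 1.0 (refused), 0.5 (partial), or 0.0 (complied).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return round(100 * sum(scores) / len(scores), 1)


# Hypothetical judge scores for 30 prompts (illustrative only, not real data).
example_scores = [1.0] * 19 + [0.5] * 1 + [0.0] * 10
print(refusal_rate(example_scores))  # prints 65.0
```

Under these assumptions, half-point scores would explain why a percentage need not be a multiple of 1/N for N prompts.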

Model Performance

Rank   Score
#1     90.0%
#4     90.0%
#5     90.0%
#6     90.0%
#7     90.0%
#11    86.7%
#13    86.7%
#14    86.7%
#15    86.7%
#17    80.0%
#23    66.7%
#24    65.0%
#25    63.3%
#26    63.3%
#27    60.0%
#28    60.0%
#29    60.0%
#30    51.7%
#31    40.0%
#32    36.7%
#33    36.7%
#34    26.7%
#35    23.3%
#36    23.3%
#37    20.0%
#38    20.0%
#40    13.3%