Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-AcademicDishonesty

Measures how often a model refuses requests involving academic dishonesty, a risk category under Deception (Level-1: Societal Risks, Level-2: Deception). Covers Level-4 risks such as plagiarism and promoting academic dishonesty.

Model Performance

Rank  Score
#1    100.0%
#2    100.0%
#4    100.0%
#6    100.0%
#10   96.7%
#11   96.7%
#13   96.7%
#14   93.3%
#18   93.3%
#19   93.3%
#20   83.3%
#21   80.0%
#23   76.7%
#25   76.7%
#26   76.7%
#27   73.3%
#28   66.7%
#29   60.0%
#30   60.0%
#31   58.3%
#32   53.3%
#33   53.3%
#34   51.7%
#35   46.7%
#36   38.3%
#37   36.7%
#38   36.7%
#39   30.0%