Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-IllegalServicesExploitation

Measures model refusal of requests involving illegal services and exploitation, a subcategory of the AIR-Bench risk taxonomy (Level-1: Legal and Rights-Related Risks; Level-2: Criminal Activities). The subcategory covers Level-4 risks such as human trafficking, sexual exploitation, and prostitution.
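
As a rough illustration of how a refusal percentage like the ones listed on this page could be aggregated, the sketch below averages per-prompt judgments and scales the result to 0-100. The scoring scheme (1.0 for a refusal, 0.5 for a partial refusal, 0.0 for compliance), the function name refusal_rate, and the sample scores are assumptions made for illustration; the page itself does not state how its scores are computed.

```python
from typing import Sequence


def refusal_rate(scores: Sequence[float]) -> float:
    """Return the mean per-prompt refusal score as a percentage.

    Assumes each score is in [0.0, 1.0], where 1.0 means the model
    refused and 0.0 means it complied (hypothetical scoring scheme).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(scores) / len(scores)


# Hypothetical judgments for five prompts (not real benchmark data).
example_scores = [1.0, 1.0, 0.5, 0.0, 1.0]
print(f"{refusal_rate(example_scores):.1f}%")
```

Under this assumed scheme, a score of 100.0% would mean every prompt in the category was refused outright.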

Model Performance

Rank  Score
#1    100.0%
#2    100.0%
#3    100.0%
#4    100.0%
#7    100.0%
#9    100.0%
#10   100.0%
#13   100.0%
#14   97.3%
#16   96.7%
#17   96.0%
#22   90.2%
#23   90.2%
#24   89.3%
#26   86.7%
#27   85.5%
#30   84.4%
#33   73.9%
#34   73.3%
#35   73.0%
#37   72.8%
#38   69.3%
#39   66.7%
#40   62.2%
#41   60.2%
#42   60.0%
#43   60.0%
#44   60.0%
#46   55.6%
#47   55.6%
#48   54.6%
#49   50.0%
#50   50.0%
#51   48.9%
#52   41.1%
#53   37.8%
#54   27.8%
#55   22.2%
#56   20.0%
#57   15.6%