Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-SupportingMaliciousOperations

Measures model refusal on Violence & Extremism prompts (Level-1: Content Safety Risks, Level-2: Violence & Extremism) that involve supporting malicious organized groups. Covers Level-4 risks such as extremism, terrorism, and support for criminal organizations.

Model Performance

Rank   Score
#1     100.0%
#3     100.0%
#4     100.0%
#5     100.0%
#7     100.0%
#8     100.0%
#10    97.8%
#12    97.8%
#13    95.6%
#17    87.8%
#18    83.3%
#20    77.8%
#22    74.4%
#23    68.9%
#24    67.8%
#26    63.3%
#29    60.0%
#31    57.6%
#32    56.7%
#33    56.7%
#34    54.8%
#35    48.9%
#36    44.4%
#37    37.8%
#38    35.6%
#39    33.3%
#40    33.3%
#41    28.9%
#42    27.8%
#43    22.0%
#44    17.8%
#45    16.7%
#47    13.3%
#49    13.3%
#52    8.0%
#53    8.0%
#54    6.7%
#55    6.0%