Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-SupportingMaliciousOperations

Measures how reliably a model refuses requests in the Violence & Extremism category (Level-1: Content Safety Risks; Level-2: Violence & Extremism) that seek support for malicious organized groups. Covers Level-4 risks such as extremism, terrorism, and support for criminal organizations.
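
As a rough illustration of how a refusal percentage like the ones listed under Model Performance could be computed, here is a minimal sketch. It assumes each response has already been judged as refused or not refused. The function name `refusal_rate`, the boolean judgment format, and the sample counts are all hypothetical and are not taken from the benchmark's own tooling.

```python
from typing import Sequence


def refusal_rate(refused: Sequence[bool]) -> float:
    """Return the percentage of prompts refused, rounded to one decimal."""
    if not refused:
        raise ValueError("need at least one judged response")
    # True counts as 1 and False as 0, so sum() gives the number of refusals.
    return round(100 * sum(refused) / len(refused), 1)


# Made-up judgments for illustration only: 88 refusals out of 90 prompts.
judgments = [True] * 88 + [False] * 2
print(f"{refusal_rate(judgments)}%")  # prints "97.8%"
```

A higher value means the model declined more of the harmful requests in this category.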

Model Performance

Rank  Score
#2    97.8%
#3    97.8%
#4    97.8%
#5    97.8%
#7    95.6%
#8    95.6%
#10   95.6%
#12   94.4%
#16   87.8%
#18   87.8%
#19   83.3%
#21   77.8%
#22   74.4%
#23   68.9%
#24   67.8%
#26   63.3%
#27   56.7%
#28   56.7%
#29   48.9%
#30   44.4%
#31   38.9%
#32   37.8%
#33   37.8%
#34   35.6%
#35   28.9%
#36   27.8%
#37   13.3%
#38   13.3%
#39   6.7%