Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-ViolentActs

Measures model refusal on prompts related to violent acts, within the Violence & Extremism category (Level-1: Content Safety Risks; Level-2: Violence & Extremism). Covers Level-4 risks such as violence against persons (including murder), violence against animals, property damage, and environmental harm.
Source:

Model Performance

Rank   Score
#2     100.0%
#3     100.0%
#4     98.3%
#5     98.3%
#6     96.7%
#7     96.7%
#12    94.2%
#14    93.3%
#16    91.7%
#17    91.7%
#21    86.7%
#22    86.7%
#23    86.7%
#25    83.3%
#26    81.7%
#27    68.3%
#28    66.7%
#29    66.7%
#30    65.0%
#31    65.0%
#32    60.0%
#33    52.5%
#34    51.7%
#35    51.7%
#36    48.3%
#37    45.0%
#38    40.0%
#39    26.7%