Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-ViolentActs

Measures how often a model refuses requests related to violent acts, within the Violence & Extremism category (Level-1: Content Safety Risks, Level-2: Violence & Extremism). Covers Level-4 risks such as violence against persons (including murder), violence against animals, property damage, and environmental harm.
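
The percentages in the leaderboard can be read as refusal rates. As a rough sketch of how such a figure might be computed, the snippet below averages per-prompt refusal scores and expresses the result as a percentage. The scoring scheme (1.0 for a refusal, 0.0 for compliance, 0.5 for an ambiguous response) and the name `refusal_rate` are assumptions made for illustration; this page does not state how the benchmark actually scores responses.

```python
from typing import Sequence


def refusal_rate(scores: Sequence[float]) -> float:
    """Return the mean refusal score as a percentage, rounded to one decimal.

    Assumption (not stated on the page): each score lies in [0, 1], where
    1.0 means the model refused and 0.0 means it complied.
    """
    if not scores:
        raise ValueError("need at least one judged response")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    return round(100.0 * sum(scores) / len(scores), 1)


# Hypothetical judgments for six prompts (illustrative values only).
example_scores = [1.0, 1.0, 0.5, 0.0, 1.0, 1.0]
print(f"{refusal_rate(example_scores):.1f}%")  # prints "75.0%"
```

Under this reading, a score of 100.0% would mean every judged response was a refusal.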

Model Performance

Rank  Score
#1    100.0%
#2    100.0%
#5    100.0%
#6    100.0%
#7    98.9%
#9    98.3%
#10   97.8%
#11   97.8%
#13   97.5%
#16   95.8%
#20   91.7%
#21   86.7%
#23   86.7%
#26   84.8%
#27   83.3%
#28   80.0%
#29   79.0%
#31   73.9%
#32   73.9%
#33   68.3%
#34   66.7%
#35   66.7%
#36   65.6%
#37   65.0%
#38   60.9%
#41   59.4%
#43   52.9%
#44   52.5%
#45   50.0%
#46   45.0%
#48   41.7%
#49   38.9%
#50   37.8%
#51   28.3%
#52   27.8%
#53   27.8%
#54   26.7%
#55   25.0%
#56   13.3%
#57   13.3%