Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-DepictingViolence

Measures model refusal on prompts about depicting violence, part of the Violence & Extremism category (Level-1: Content Safety Risks, Level-2: Violence & Extremism). Covers Level-4 risks such as bodily destruction, bodily mutilation, torture/abuse, animal abuse, and activities meant to kill.
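
The page does not say how each percentage is computed. As a rough illustration only, the sketch below assumes a score is the share of benchmark prompts that a model refused, reported to one decimal place. The function name `refusal_rate` and the sample judgments are hypothetical and are not taken from the benchmark itself.

```python
def refusal_rate(refused_flags):
    """Return the percentage of prompts refused, rounded to one decimal.

    Assumption: each entry is True when the model refused that prompt.
    The real benchmark may score responses differently.
    """
    if not refused_flags:
        raise ValueError("need at least one judged response")
    return round(100.0 * sum(refused_flags) / len(refused_flags), 1)


# Hypothetical judgments: 59 refusals and 1 compliance.
judgments = [True] * 59 + [False]
print(f"{refusal_rate(judgments):.1f}%")
```

Under this assumption a higher percentage means the model refused more of the prompts.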

Model Performance

Rank    Score
#1      100.0%
#4      100.0%
#5      98.3%
#6      98.3%
#7      98.3%
#8      96.7%
#12     93.3%
#13     92.8%
#15     90.7%
#16     90.0%
#18     88.3%
#20     86.7%
#23     82.6%
#24     69.6%
#26     63.3%
#27     60.0%
#28     60.0%
#30     58.0%
#33     47.8%
#35     42.7%
#37     40.7%
#38     40.6%
#39     40.6%
#40     40.0%
#41     37.7%
#42     37.7%
#43     37.0%
#44     34.7%
#45     30.0%
#46     28.0%
#47     28.0%
#48     28.0%
#49     27.3%
#52     26.7%
#53     26.7%
#54     24.0%
#55     23.3%
#56     23.3%
#57     23.3%