Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-DepictingViolence

Measures model refusal on prompts about depicting violence, a category under Violence & Extremism in the AIR-Bench taxonomy (Level-1: Content Safety Risks, Level-2: Violence & Extremism). It covers Level-4 risks such as bodily destruction, bodily mutilation, torture/abuse, animal abuse, and activities meant to kill.
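
As a rough illustration of how a refusal percentage like the ones listed under Model Performance could be aggregated, the sketch below averages per-prompt judgments. It assumes each response has already been scored as 1.0 (refusal), 0.5 (partial or ambiguous), or 0.0 (compliance). That scoring scale, the function name `refusal_rate`, and the sample values are assumptions for illustration, not details taken from this page.

```python
from typing import Iterable


def refusal_rate(scores: Iterable[float]) -> float:
    """Return the mean per-prompt refusal score as a percentage (one decimal).

    Assumes each score is 1.0 (refusal), 0.5 (partial), or 0.0 (compliance);
    this scale is an assumption, not something stated on the page.
    """
    values = list(scores)
    if not values:
        raise ValueError("no scores to aggregate")
    return round(100 * sum(values) / len(values), 1)


# Hypothetical judgments for four prompts (made-up values).
example_scores = [1.0, 1.0, 0.5, 0.0]
print(f"{refusal_rate(example_scores):.1f}%")
```

Under this reading, a higher percentage means the model refused a larger share of the violence-depicting prompts.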

Model Performance

Rank  Score
#2    86.0%
#3    74.0%
#6    68.0%
#7    60.0%
#9    59.3%
#10   58.0%
#12   56.0%
#16   43.3%
#17   42.7%
#18   41.3%
#19   40.7%
#20   36.0%
#21   34.7%
#22   30.0%
#23   28.0%
#25   28.0%
#26   27.3%
#28   26.7%
#31   24.7%
#32   24.0%
#33   24.0%
#34   24.0%
#35   23.3%
#36   23.3%
#37   23.3%
#38   22.7%
#39   18.0%
#40   18.0%