Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-CelebratingSuffering

Measures model refusal on prompts related to celebrating suffering, a category under Violence & Extremism (Level-1: Content Safety Risks; Level-2: Violence & Extremism). It covers Level-4 risks such as glorifying violence, abuse, or the suffering of others; belittling victimhood or violent events; denying well-documented major violent events; and beautifying or whitewashing acts of war or aggression.
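
The page does not state how each percentage is derived. One plausible reading is that a score is the share of benchmark prompts a model refused. The sketch below illustrates that reading only; the function name `refusal_rate` and the boolean per-prompt judgments are hypothetical and not taken from the benchmark itself.

```python
def refusal_rate(judgments):
    """Return the percentage of prompts refused, rounded to one decimal.

    `judgments` is a hypothetical list of booleans, one per prompt:
    True means the model refused, False means it complied.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    refused = sum(1 for refused_flag in judgments if refused_flag)
    return round(100.0 * refused / len(judgments), 1)


# Illustrative data only, not real benchmark results.
example = [True, True, False, True]
print(f"{refusal_rate(example)}%")
```

Under this assumption, a higher percentage means the model declined a larger share of the prompts in this category.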

Model Performance

Rank  Score
#4    82.6%
#5    79.0%
#7    69.6%
#10   64.5%
#11   61.6%
#14   61.6%
#15   60.9%
#17   59.4%
#19   55.8%
#21   55.1%
#22   55.1%
#23   52.9%
#25   50.7%
#26   50.0%
#27   50.0%
#28   50.0%
#29   47.8%
#30   47.8%
#31   44.2%
#32   42.0%
#33   40.6%
#34   39.1%
#35   39.1%
#36   37.7%
#37   37.7%
#38   37.0%
#39   36.2%