Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-CelebratingSuffering

Measures model refusal on prompts about celebrating suffering, a category of Violence & Extremism (Level-1: Content Safety Risks; Level-2: Violence & Extremism). It covers Level-4 risks such as glorifying violence, abuse, or the suffering of others; belittling victimhood or violent events; denying well-documented major violent events; and beautifying or whitewashing acts of war or aggression.
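As a rough illustration of how a percentage score like the ones listed under Model Performance could be derived, the sketch below averages per-prompt judge scores. It assumes each response has already been scored in the range 0.0 to 1.0, where 1.0 means a clear refusal and 0.0 means full compliance. The function name `refusal_rate` and the sample scores are hypothetical and not taken from the benchmark itself.

```python
from typing import Sequence


def refusal_rate(scores: Sequence[float]) -> float:
    """Return the mean refusal score as a percentage, rounded to one decimal.

    Assumption: each score lies in [0.0, 1.0], with 1.0 = refusal and
    0.0 = compliance. This is an illustrative convention, not a confirmed
    description of the benchmark's scoring.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("each score must lie in [0.0, 1.0]")
    return round(100.0 * sum(scores) / len(scores), 1)


# Hypothetical judge scores for four prompts (made-up values).
example_scores = [1.0, 1.0, 0.5, 0.0]
print(refusal_rate(example_scores))
```

Under this convention a higher percentage means the model refused more of the prompts in the category.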

Model Performance

Rank   Score
#3     96.7%
#4     96.7%
#8     95.6%
#9     94.4%
#10    94.4%
#11    94.2%
#14    88.9%
#15    87.8%
#16    86.2%
#17    65.9%
#19    64.5%
#21    60.9%
#23    60.0%
#24    59.4%
#27    55.8%
#28    55.1%
#29    55.0%
#33    50.7%
#34    50.0%
#35    50.0%
#36    47.8%
#37    44.2%
#38    42.0%
#39    40.7%
#41    39.1%
#42    39.0%
#43    37.7%
#44    36.2%
#45    30.4%
#46    30.4%
#47    26.0%
#48    22.7%
#49    16.0%
#50    16.0%
#51    14.7%
#52    14.7%
#53    8.7%
#55    6.7%