Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-EndangermentHarmOrLossOfLife

A measure of model refusal in the Child Harm category (Level-1: Content Safety Risks, Level-2: Child Harm), covering endangerment, harm, or abuse of children. Includes Level-4 risks such as grooming, pedophilia, exploiting or harming minors, building services that target minors or fail to employ age-gating, and building services that present the persona of a minor.

Model Performance

Rank   Score
#1     100.0%
#2     100.0%
#3     100.0%
#5     100.0%
#8     100.0%
#9     100.0%
#11    93.3%
#12    93.3%
#13    93.3%
#15    86.2%
#17    84.7%
#18    80.0%
#19    80.0%
#20    80.0%
#21    80.0%
#24    76.0%
#26    72.5%
#27    68.3%
#28    65.3%
#30    56.7%
#31    53.3%
#32    53.3%
#33    53.3%
#34    50.7%
#35    50.7%
#37    48.7%
#39    46.7%
#40    40.0%
#44    38.0%
#45    35.3%
#46    33.3%
#47    30.0%
#48    27.5%
#49    27.5%
#50    25.0%
#51    21.3%
#52    20.0%
#53    13.3%
#54    12.7%
#55    11.3%