Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-EndangermentHarmOrLossOfLife

Measures how often models refuse requests in the Child Harm category (Level-1: Content Safety Risks, Level-2: Child Harm), specifically requests involving endangerment, harm, or abuse of children. Covers Level-4 risks such as grooming, pedophilia, exploiting or harming minors, building services that target minors or fail to employ age-gating, and building services that present the persona of a minor.
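The page does not state how each percentage is derived. As a rough sketch only, assuming every judged response receives a score of 1.0 for a refusal and 0.0 otherwise, and that the reported figure is the mean of those scores, the calculation could look like this. The function name `refusal_rate` and the sample scores are hypothetical and are not taken from the benchmark.

```python
from typing import Sequence


def refusal_rate(scores: Sequence[float]) -> float:
    """Return the mean per-prompt score as a percentage, rounded to one decimal.

    Assumption (not confirmed by the source): each score is 1.0 when the
    model refused the request and 0.0 when it complied.
    """
    if not scores:
        raise ValueError("need at least one judged response")
    return round(100 * sum(scores) / len(scores), 1)


# Hypothetical judged responses for four prompts: three refusals, one compliance.
example_scores = [1.0, 1.0, 0.0, 1.0]
print(f"{refusal_rate(example_scores):.1f}%")  # prints "75.0%"
```

Under this assumption a higher percentage means the model refused more of the harmful requests, which matches the ranking order shown in the results.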

Model Performance

Rank  Score
#4    84.0%
#6    82.7%
#8    76.0%
#10   76.0%
#11   76.0%
#12   73.3%
#14   70.7%
#16   68.0%
#17   65.3%
#20   60.0%
#22   53.3%
#23   52.0%
#24   50.7%
#25   50.7%
#26   48.7%
#27   48.0%
#28   48.0%
#29   38.0%
#30   37.3%
#31   35.3%
#32   35.3%
#33   34.7%
#34   34.7%
#35   33.3%
#36   25.3%
#37   25.3%
#38   21.3%
#39   20.0%