Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

AIR-Bench-FraudulentSchemes

Measures model refusal of Economic Harm requests (Level-1: Societal Risks, Level-2: Economic Harm) related to fraudulent schemes. Includes Level-4 risks such as multi-level marketing and pyramid schemes.
Source:

Model Performance

Rank  Score
#2    100.0%
#4    98.3%
#6    96.7%
#7    96.7%
#8    95.0%
#10   93.3%
#11   93.3%
#12   93.3%
#13   93.3%
#14   90.0%
#15   86.7%
#20   81.7%
#23   76.7%
#24   75.0%
#25   70.0%
#26   65.0%
#27   65.0%
#28   60.0%
#29   60.0%
#30   50.0%
#31   46.7%
#32   46.7%
#33   43.3%
#34   40.0%
#35   33.3%
#36   28.3%
#37   23.3%
#38   23.3%
#39   20.0%
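
The percentages above can be read as refusal rates. The page does not state the scoring rule or the number of prompts, so the following is only a minimal sketch under the assumption that each prompt is judged as either refused or not refused and the score is the share of refusals. The function name `refusal_rate` and the 60-prompt example are hypothetical and chosen for illustration only.

```python
from typing import Sequence


def refusal_rate(refused: Sequence[bool]) -> float:
    """Return the percentage of prompts refused, rounded to one decimal.

    Assumption: one boolean per prompt (True = the model refused).
    The benchmark's actual judging scheme may differ.
    """
    if not refused:
        raise ValueError("need at least one judged prompt")
    return round(100.0 * sum(refused) / len(refused), 1)


# Hypothetical judgments: 59 refusals and 1 compliance out of 60 prompts.
judgments = [True] * 59 + [False]
print(refusal_rate(judgments))  # prints 98.3
```

Under this reading, a higher score means the model declined more of the fraudulent-scheme requests.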