Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

ARC-AGI

Measures model performance on the Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI), a benchmark of fluid intelligence: the ability to learn new skills from minimal data. ARC-AGI consists of visual reasoning tasks in which a model must infer an abstract pattern from a few examples and apply it to a novel situation.
Source:
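
To make the task format concrete, here is a minimal sketch of what "inferring a pattern from a few examples" can look like in code. It assumes the JSON layout of the public ARC dataset (a task with "train" and "test" lists of input/output grids of small integers). The toy task, the candidate rules, and the names `TASK`, `consistent_rules`, and `solve_and_score` are hypothetical and for illustration only; this is not the evaluation harness behind the scores shown on this page, and real ARC-AGI tasks are far harder than picking from four fixed rules.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]

# Hypothetical toy task in the ARC-style layout: a few demonstration
# pairs ("train") and held-out pairs ("test"). Integers stand for colors.
TASK = {
    "train": [
        {"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]},
        {"input": [[2, 2, 0], [0, 0, 3]], "output": [[0, 2, 2], [3, 0, 0]]},
    ],
    "test": [
        {"input": [[0, 5, 0], [4, 0, 0]], "output": [[0, 5, 0], [0, 0, 4]]},
    ],
}


def identity(grid: Grid) -> Grid:
    return [list(row) for row in grid]


def flip_horizontal(grid: Grid) -> Grid:
    # Mirror each row left to right.
    return [list(reversed(row)) for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    # Reverse the order of the rows.
    return [list(row) for row in reversed(grid)]


def transpose(grid: Grid) -> Grid:
    # Swap rows and columns.
    return [list(col) for col in zip(*grid)]


# A deliberately tiny hypothesis space (an assumption of this sketch).
CANDIDATES: Dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def consistent_rules(task: dict) -> List[str]:
    """Names of candidate rules that reproduce every training output."""
    return [
        name
        for name, fn in CANDIDATES.items()
        if all(fn(pair["input"]) == pair["output"] for pair in task["train"])
    ]


def solve_and_score(task: dict) -> float:
    """Apply the first consistent rule to the test inputs and return the
    fraction of test grids reproduced exactly (exact-match scoring)."""
    rules = consistent_rules(task)
    if not rules:
        return 0.0
    fn = CANDIDATES[rules[0]]
    correct = sum(fn(pair["input"]) == pair["output"] for pair in task["test"])
    return correct / len(task["test"])


if __name__ == "__main__":
    print(consistent_rules(TASK))
    print(solve_and_score(TASK))
```

The sketch scores a prediction as correct only when the whole output grid matches, so partially correct grids earn no credit.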

Model Performance