Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

MMMU

A benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning. Includes 11.5K multimodal questions collected from college exams, quizzes, and textbooks, covering six core disciplines: Art & Design, Business, Science, Health & Medicine, Humanities & Social Science, and Tech & Engineering.

Model Performance