Benchmark Explorer

Explore how models perform on various benchmarks

Benchmarks

FinanceAgent

A benchmark developed in collaboration with Stanford researchers, a Global Systemically Important Bank, and industry experts to evaluate AI agents' ability to perform the tasks expected of an entry-level financial analyst. The dataset consists of 537 questions that assess a range of financial analysis skills.

Model Performance