Benchmark.toString - Search News

News

'Humanity's Last Exam' benchmark is stumping top AI models - ZDNET

On Thursday, Scale AI and the Center for AI Safety (CAIS) released Humanity's Last Exam (HLE), a new academic benchmark aiming to "test the limits of AI knowledge at the frontiers of human ...
