Today we’re introducing GDPval, a new evaluation that measures AI performance on real-world, economically valuable tasks.
Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most.
https://t.co/uKPPDldVNS