New secret math benchmark stumps AI models and PhDs alike
FrontierMath's difficult questions remain unpublished so that AI companies can't train against them.
https://arstechnica.com/ai/2024/11/new-secret-math-benchmark-stumps-ai-models-and-phds-alike/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social