Conversation

Ars Technica (arstechnica@mastodon.social)'s status on Thursday, 14-Dec-2023 03:14:16 JST

Turing test on steroids: Chatbot Arena crowdsources ratings for 45 AI models
Over 130K blind ratings show ChatGPT-4 Turbo outclassing the competition.
https://arstechnica.com/ai/2023/12/turing-test-on-steroids-chatbot-arena-crowdsources-ratings-for-45-ai-models/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social
Attachments
1. Untitled attachment
  https://files.mastodon.social/media_attachments/files/111/574/466/168/750/373/original/a34766f0d96c7780.jpg
- Nazo (nazokiyoubinbou@mastodon.social)'s status on Thursday, 14-Dec-2023 11:44:36 JST
  @arstechnica Do we seriously need running reports of which LLMs are currently winning which tests? If nothing else, it changes too fast to really matter even if you are really heavily invested in that.
  
