Turing test on steroids: Chatbot Arena crowdsources ratings for 45 AI models
Over 130K blind ratings show ChatGPT-4 Turbo outclassing the competition.
@arstechnica Do we seriously need running reports of which LLMs are currently winning which tests? If nothing else, the rankings change too fast to really matter, even if you are heavily invested in the field.