That's a really useful post, but I think they're wrong on one point.
Training data leakage actually seems to be _the norm_. Most of the field ignores the cardinal rule of not testing on your training data, and it's caused a reproducibility crisis in ML-based science https://reproducible.cs.princeton.edu/
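To see why evaluating on training data is so misleading, here's a toy sketch (not tied to any real benchmark or paper): a nearest-neighbour "model" that simply memorizes random, unlearnable data scores perfectly on its own training set but only at chance on fresh data.

```python
import random

random.seed(0)

# Toy data: random binary labels on random points -- nothing learnable.
def make_data(n):
    return [([random.random() for _ in range(5)], random.randint(0, 1))
            for _ in range(n)]

train = make_data(200)
fresh = make_data(200)

# A 1-nearest-neighbour "model" that just memorizes its training set.
def predict(x):
    nearest = min(train,
                  key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]

def accuracy(data):
    return sum(predict(x) == y for x, y in data) / len(data)

# Each training point's nearest neighbour is itself, so this looks perfect:
print(f"accuracy on training data: {accuracy(train):.2f}")  # 1.00
# On held-out data the "model" is no better than a coin flip:
print(f"accuracy on fresh data:    {accuracy(fresh):.2f}")
```

Any evaluation where test examples overlap the training set is inflated in exactly this way, which is why leakage quietly breaks so many published results.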
That OpenAI pulled this stunt isn't a mistake. It's par for the course. This is how the AI industry oversells the capabilities of its products.