That's a really useful post, but I think they're wrong on one point.
Training data leakage actually seems to be _the norm_. Most of the field ignores the cardinal rule of never testing on your training data, and it's caused a reproducibility crisis in ML-based science: https://reproducible.cs.princeton.edu/
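For anyone who hasn't seen how easy this mistake is to make, here's a minimal sketch (scikit-learn, toy dataset; the specific model and dataset are just illustrative) contrasting a leaky evaluation with an honest held-out one:

```python
# Minimal sketch: evaluating on training data vs. on a proper held-out split.
# The point is the gap between the two accuracy numbers, not the model itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Leaky evaluation: scoring on the same rows the model was fit on.
print("accuracy on training data:", accuracy_score(y_train, model.predict(X_train)))

# Honest evaluation: scoring on data the model never saw during training.
print("accuracy on held-out data:", accuracy_score(y_test, model.predict(X_test)))
```

The first number looks near-perfect; only the second says anything about generalization. Benchmark contamination in LLM training corpora is the same failure at a much bigger scale, and much harder to audit.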
OpenAI pulling this stunt isn't a mistake. It's par for the course: this is how the AI industry oversells the capabilities of its products.